A machine that learns by playing games may acquire knowledge either from external expertise (playing against a human or a human-programmed trainer), or by engaging in self-play.
Tesauro obtained strong Backgammon players by having a neural network play against itself and adjusting its weights with a variant of Sutton's temporal-difference (TD) algorithm. Although this worked for Backgammon, self-play has failed in other domains. Our group obtained results similar to Tesauro's using hill-climbing, a much simpler algorithm. This demonstrated that features peculiar to Backgammon, rather than the TD method, enable learning to succeed. Self-play remains an attractive idea because it requires no external experience. In most cases, however, the learning agent explores only a narrow portion of the problem domain and fails to generalize to the game as humans perceive it.
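To make the hill-climbing variant of self-play concrete, the following sketch evolves a linear evaluation function for tic-tac-toe: a mutated copy of the current champion replaces it only if it wins a head-to-head match. The game choice, the raw-board feature encoding, the mutation scale, and all function names (`hill_climb`, `play`, `greedy_move`) are illustrative assumptions, not the actual setup used in the experiments discussed above.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return +1 or -1 if that side has three in a row, else 0."""
    for i, j, k in LINES:
        if board[i] != 0 and board[i] == board[j] == board[k]:
            return board[i]
    return 0

def greedy_move(board, weights, player, rng):
    """Pick the empty square whose resulting position scores highest
    under a linear evaluation (dot product of weights and board)."""
    best, best_score = None, None
    for m in (i for i in range(9) if board[i] == 0):
        board[m] = player
        score = sum(w * c * player for w, c in zip(weights, board))
        board[m] = 0
        score += rng.random() * 1e-6   # random tie-breaking
        if best_score is None or score > best_score:
            best, best_score = m, score
    return best

def play(w_plus, w_minus, rng):
    """Play one game between two weight vectors; return +1, -1, or 0."""
    board = [0] * 9
    player, weights = 1, {1: w_plus, -1: w_minus}
    for _ in range(9):
        board[greedy_move(board, weights[player], player, rng)] = player
        if winner(board):
            return player
        player = -player
    return 0  # draw

def hill_climb(generations=200, matches=20, seed=0):
    """Self-play hill climbing: mutate the champion's weights and keep
    the mutant only if it outscores the champion head-to-head."""
    rng = random.Random(seed)
    champ = [rng.uniform(-1, 1) for _ in range(9)]
    for _ in range(generations):
        mutant = [w + rng.gauss(0, 0.1) for w in champ]
        score = 0
        for g in range(matches):  # alternate colors each game
            if g % 2 == 0:
                score += play(mutant, champ, rng)
            else:
                score -= play(champ, mutant, rng)
        if score > 0:
            champ = mutant
    return champ
```

Note that the only feedback signal is the game outcome between two versions of the same player, which is precisely why such a learner can converge on a narrow region of the game space.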
Attaining knowledge from human experience has proven difficult as well. Today's algorithms would require millions of games, making training against a live human impractical. Programmed trainers have also led to the exploration of an insufficient subset of the game space: Tesauro tried to learn Backgammon from a database of human expert examples, but self-play yielded better results. Angeline and Pollack showed that a genetic program which learned to play tic-tac-toe against several fixed heuristic players was outperformed by the winner of a self-playing population. Most of today's expert computer players are programmed by humans; some employ no learning at all, and some use it only in a final stage to fine-tune a few internal parameters. A recent exception is Fogel's checkers player, which achieved a ``Class A'' rating through coevolutionary self-play alone.
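The coevolutionary scheme underlying a ``self-playing population'' can be sketched abstractly: each individual's fitness is defined only by games played against the other members, the fitter half survives, and offspring are mutated copies of survivors. The toy Colonel Blotto game below (allocate 20 troops over 5 fields; win a field by outnumbering the opponent) is an illustrative stand-in, not the game studied by Angeline and Pollack or by Fogel, and all names and parameters here are assumptions.

```python
import random

FIELDS, TROOPS = 5, 20

def battle(a, b):
    """Colonel Blotto match: win a field by allocating more troops;
    win the match by winning more fields. Returns +1, -1, or 0."""
    wins = sum((x > y) - (x < y) for x, y in zip(a, b))
    return (wins > 0) - (wins < 0)

def random_allocation(rng):
    """Random split of TROOPS troops over FIELDS fields."""
    cuts = sorted(rng.randint(0, TROOPS) for _ in range(FIELDS - 1))
    return [b - a for a, b in zip([0] + cuts, cuts + [TROOPS])]

def mutate(alloc, rng):
    """Move one troop between two randomly chosen fields."""
    alloc = list(alloc)
    i, j = rng.randrange(FIELDS), rng.randrange(FIELDS)
    if alloc[i] > 0:
        alloc[i] -= 1
        alloc[j] += 1
    return alloc

def coevolve(pop_size=20, generations=50, seed=0):
    """Fitness of each individual = round-robin score inside the
    population; the fitter half survives and breeds mutated offspring."""
    rng = random.Random(seed)
    pop = [random_allocation(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [sum(battle(p, q) for q in pop if q is not p) for p in pop]
        ranked = [p for _, p in sorted(zip(scores, pop),
                                       key=lambda t: -t[0])]
        survivors = ranked[: pop_size // 2]
        pop = survivors + [mutate(rng.choice(survivors), rng)
                           for _ in range(pop_size - len(survivors))]
    return pop
```

Because no fixed opponent or human example ever enters the loop, the population defines its own learning gradient; this is the sense in which Fogel's checkers player learned ``by coevolutionary self-play alone.''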
Real-time, interactive games (e.g. video games) have distinctive features that differentiate them from board games. Koza and others evolved players for the game of Pacman. There has been important research in pursuer-evader games [114,101,25], as well as contests in simulated physics environments. But these games do not involve human participants, because their environments are either supplied by the game itself or emerge from coevolutionary interactions within a population of agents.