Game Learning
Introduction:
Game learning is a subfield of AI focused on how to create and improve artificial game players. It brings together machine learning, search algorithms, and data modeling as techniques for building better game players.
A question to consider when thinking about game learning is: what is the allure of playing games? We play games to relax, compete, and socialize. Games are an integral part of human recreation, and because game playing is social in nature we usually have regular game-playing partners, people with whom we often play. They know how we play and we know how they play.
Who studies game learning?
To spend a great amount of time studying any particular subject, most people have to be genuinely interested in it. I believe that most of the people researching game learning techniques also avidly play the games they build players for.
That leads me to my last question: why study game learning? There are the obvious academic answers: the study of learning techniques may aid research in other “important” areas of AI or academics. Another answer is the challenge of creating something that can outperform any human at a task. I believe that although the academic value of game learning could be reason enough to study it, there is another, more immediate value: playing games is fun, and computers can now be interesting game-playing partners.
Main Sections of Game Learning:
Book Learning
A book in game learning is a set of definite or almost definite (only a few) choices given a specific game state, for example the opening moves of a chess game. Once a specific opening has been chosen, the program can consult the “book” for the next moves. When using a book you must often use some evaluation function to choose between the different options. Another use of books is building one from previous game experience. In his paper, Furnkranz suggests saving positions where the program made a mistake.
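The book lookup described above can be sketched as a simple table keyed by the game so far. The positions, moves, and tie-breaking evaluator below are all invented for illustration:

```python
# A minimal sketch of an opening book, assuming positions are keyed
# by the moves played so far (a simplification; real books hash the
# board state). All positions and moves here are just illustrations.
OPENING_BOOK = {
    "": ["e4", "d4", "c4"],      # candidate first moves
    "e4": ["e5", "c5", "e6"],    # common replies to 1.e4
    "e4 e5": ["Nf3"],            # a near-definite continuation
}

def book_move(position, evaluate):
    """Look up the position; if the book lists several moves,
    fall back on an evaluation function to pick between them."""
    candidates = OPENING_BOOK.get(position)
    if candidates is None:
        return None              # out of book: switch to search
    if len(candidates) == 1:
        return candidates[0]     # a definite book choice
    return max(candidates, key=evaluate)
```

For example, `book_move("e4 e5", my_eval)` returns `"Nf3"` without running any search at all, which is the main appeal of a book.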
Evaluation Functions
One major technique in game learning is evaluating game states in order to choose the next move. In AI terms this means asking how you can change the weights of the evaluation function to improve performance.
Furnkranz discusses three major types of evaluation function training.
Supervised learning – Where each game state is assigned a value. If the evaluation function outputs a value that is far off from the “correct” value, then training occurs. The main problem with this technique is that it is often hard to assign specific values to particular game states. An article on improving minimax search using supervised learning: http://www.cs.ualberta.ca/~mburo/ps/logaij.pdf
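The supervised setup can be sketched with a linear evaluation function; the features, target score, and learning rate below are invented for illustration:

```python
# A minimal sketch of supervised training for a linear evaluation
# function: a state has feature values, a hand-assigned "correct"
# score, and the weights are nudged toward that target.

def evaluate(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def supervised_update(weights, features, target, lr=0.01):
    """One gradient step on the squared error between the
    evaluation output and the hand-assigned target value."""
    error = evaluate(weights, features) - target
    return [w - lr * error * f for w, f in zip(weights, features)]

weights = [0.0, 0.0]        # e.g. material and mobility weights
state = [2.0, 1.0]          # invented feature values for one state
for _ in range(200):
    weights = supervised_update(weights, state, target=1.5)
# evaluate(weights, state) is now close to the target 1.5
```

The hard part, as noted above, is not the update rule but producing the target value 1.5 in the first place.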
Comparison training – Where pairs of states are compared and one is selected as preferable. There is some evaluation function here that indicates which move is better.
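A perceptron-style sketch of the pairwise idea, with invented feature vectors: no absolute value is ever assigned, only a judgment that one state is preferable, and the weights shift until the preferred state scores higher.

```python
# A minimal sketch of comparison training: weights are adjusted only
# when the preferred state fails to out-score the other state.

def evaluate(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def comparison_update(weights, preferred, other, lr=0.1):
    """If the preferred state does not already score higher,
    shift the weights in the direction of its features."""
    if evaluate(weights, preferred) <= evaluate(weights, other):
        weights = [w + lr * (p - o)
                   for w, p, o in zip(weights, preferred, other)]
    return weights

weights = [0.0, 0.0]
better, worse = [3.0, 1.0], [1.0, 2.0]   # invented feature vectors
for _ in range(20):
    weights = comparison_update(weights, better, worse)
# the preferred state now evaluates higher than the other one
```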
Reinforcement training – Where weight changes occur depending on whether there was a win, loss, or draw, and on the margin of the win or loss. The main problem here is that not all moves of a game are equal: the application will reward good moves poorly if it loses and reward bad moves well if it wins. This is what led to temporal difference learning. A link on the history of reinforcement learning: http://www-anw.cs.umass.edu/~rich/book/1/node7.html
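The outcome-based scheme, and its flaw, can be sketched in a few lines; the move names, margin, and learning rate are invented:

```python
# A minimal sketch of outcome-based reinforcement: every move played
# gets the same credit, scaled by the margin of the result. It also
# shows the flaw noted above: good moves in a lost game are punished
# just as hard as the bad ones.

def reinforce(move_values, moves, outcome, margin, lr=0.1):
    """outcome is +1 for a win, -1 for a loss, 0 for a draw."""
    for move in moves:
        move_values[move] = move_values.get(move, 0.0) + lr * outcome * margin
    return move_values

values = reinforce({}, ["e4", "Nf3", "Bc4"], outcome=-1, margin=2)
# every move of the lost game now carries the same negative value,
# regardless of which move actually threw the game away
```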
Temporal Difference – A subsection of reinforcement training. Here the training is still based on the final outcome of the game, but it works differently; in my eyes it seems similar to backpropagation. The error value for each state is determined by the state that came after it, and then the weights for the evaluation function are trained. So instead of all moves being marked as bad if the player loses, each move is now rated in comparison to the move made after it. It seems almost like trickle-down economics: if you win big, everybody below you wins. A very short page on TD: http://eric_rollins.home.mindspring.com/search/neuralNetworks.htm A link to TD-Gammon: http://www.research.ibm.com/massive/tdl.html
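The backward flow of credit can be sketched as a TD(0)-style table update; the state labels and learning rate are invented, and a real program like TD-Gammon would update evaluation weights rather than a table:

```python
# A minimal sketch of a TD(0)-style update: each state's value is
# pulled toward the value of the state that followed it, and only
# the final state is pulled toward the actual game outcome.

def td_update(values, states, outcome, lr=0.5):
    for i, s in enumerate(states):
        nxt = outcome if i == len(states) - 1 else values.get(states[i + 1], 0.0)
        values[s] = values.get(s, 0.0) + lr * (nxt - values.get(s, 0.0))
    return values

values = {}
game = ["opening", "middlegame", "endgame"]   # invented state labels
for _ in range(50):                           # replay the same won game
    values = td_update(values, game, outcome=1.0)
# all three values converge toward the outcome 1.0, with credit
# flowing backward from the final state rather than being handed
# out all at once
```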
Opponent Modeling
I went to the University of Alberta’s game playing page to find out more on opponent modeling. In “Improved Opponent Modeling in Poker” [postscript] the group discusses using neural networks to model generic and specific opponents through previous game play. They use probabilistic information to generate the inputs to different neural networks that predict successive moves, and they report 80% to 90% accuracy in predicting opponents’ moves. Apparently their Poki program has produced some very interesting results. To learn more about it, see the Alberta poker page http://www.cs.ualberta.ca/~games/poker/ .
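The core idea can be sketched with something far simpler than the neural networks the Alberta group used: count what an opponent did in each situation and predict their most frequent action. The situation and action labels below are invented.

```python
# A minimal frequency-based stand-in for opponent modeling:
# record the opponent's action in each observed situation and
# predict the action seen most often.
from collections import Counter, defaultdict

class OpponentModel:
    def __init__(self):
        self.history = defaultdict(Counter)

    def observe(self, situation, action):
        self.history[situation][action] += 1

    def predict(self, situation):
        counts = self.history[situation]
        return counts.most_common(1)[0][0] if counts else None

model = OpponentModel()
for action in ["raise", "raise", "fold"]:    # invented past hands
    model.observe("strong_board", action)
# model.predict("strong_board") now returns "raise"
```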
Throughout my research on game learning I consistently came across the idea that opponent modeling is only useful in games with a high level of unpredictability or chance. By this argument it would not be useful for chess, since existing techniques have already produced a world-class chess player. I believe opponent modeling could be very useful in games like chess, however, where the purpose of the program is not to beat you but to teach you. Using these concepts, it could show you your weaknesses and in that sense make you a better player. My personal experience with chess programs is that you pick your opponent's style; your opponent does not adapt to you. I think it would be more advantageous to the chess player if the program picked out playing strategies that you needed to work on to become a better player.
LINKS:
A whole book on reinforcement learning: http://www-anw.cs.umass.edu/~rich/book/the-book.html
AAAI page on games and puzzles: http://www.aaai.org/AITopics/html/games.html
University of Alberta’s game page: http://www.cs.ualberta.ca/~games/
A funny link from the author of today's paper: http://www.ai.univie.ac.at/~juffi/sara/