Game Learning



Game learning is a subfield of AI focused on how to create and improve artificial game players. It brings together machine learning, search algorithms, and data modeling to build better players.


A question to consider when thinking about game learning is: what is the allure of playing games? We play games to relax, compete, and socialize. Games are an integral part of human recreation. And because game playing is social in nature, we usually have game-playing partners, people with whom we play often. They know how we play and we know how they play.


Who studies game learning? To spend a great amount of time studying any particular subject, most people have to be genuinely interested in it. I believe that most of the people researching game-learning techniques also avidly play the games they build players for.


That leads me to my last question: why study game learning? There are the obvious academic answers: the study of learning techniques may aid research in other “important” areas of AI or academia. Another answer could be the challenge of creating something that can outperform any human at a task. I believe that although the academic value of game learning could be reason enough to study it, there is another value that is more immediate: playing games is fun, and computers can now be interesting game-playing partners.


Main Sections of Game Learning:


Book Learning


A book in game learning is a set of definite or nearly definite (only a few) choices given a specific game state, for example the opening moves of a chess game. Once a specific opening has been chosen, the program can consult the “book” for the next moves. When a book offers several options, you often need some evaluation function to choose between them. Another use of books is building a book from previous game experience; in his paper, Furnkranz suggests saving positions where the program made a mistake.
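
Here is a minimal sketch of an opening book as a lookup table, assuming positions can be serialized to hashable keys. The position strings, moves, and the caller-supplied score_move function are all hypothetical, just to illustrate the lookup-plus-tiebreak idea:

    # A hypothetical opening book: a position key maps to a few trusted moves.
    opening_book = {
        "start": ["e2e4", "d2d4"],        # a couple of sound first moves
        "start e2e4 e7e5": ["g1f3"],      # a near-definite reply
    }

    def book_move(position_key, score_move):
        """Return a book move for a known position, or None if out of book.

        When the book lists several options, a caller-supplied evaluation
        function (score_move) breaks the tie.
        """
        options = opening_book.get(position_key)
        if not options:
            return None
        return max(options, key=score_move) if len(options) > 1 else options[0]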


Evaluation Functions


One major technique in game learning is evaluating game states in order to choose the next move. In AI terms, this means asking how you can change the weights of the evaluation function to improve performance. Furnkranz discusses three major ways of training an evaluation function.
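
As a common reference point for the training schemes below, here is a minimal sketch of a linear evaluation function (a weighted sum of features), assuming each state can be reduced to a fixed-length feature vector; the feature extraction is a placeholder:

    import numpy as np

    def features(state) -> np.ndarray:
        """Map a game state to a feature vector (placeholder: for chess this
        might be material balance, mobility, king safety, and so on)."""
        return np.asarray(state, dtype=float)

    def evaluate(weights: np.ndarray, state) -> float:
        """Score a state as a weighted sum of its features."""
        return float(weights @ features(state))

Learning then means adjusting the weights so that the evaluator prefers moves that lead to good outcomes.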


Supervised learning – Each game state is assigned a value. If the evaluation function's output for a state is far off from the “correct” value, training occurs. The main problem with this technique is that it is often hard to assign specific values to particular game states. An article on improving min-max searches using supervised learning:
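
A sketch of the supervised update, building on the linear evaluator above: each training state carries a hand-assigned target value, and a simple gradient step on the squared error nudges the weights toward it.

    def supervised_update(weights, state, target_value, lr=0.01):
        """One gradient step toward a hand-labeled state value."""
        phi = features(state)                  # feature extractor from above
        error = target_value - weights @ phi   # how far off the evaluation is
        return weights + lr * error * phi      # move prediction toward target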


Comparison training – Pairs of states are compared and one is selected as preferable. Some training signal (for example, the move a master actually chose) indicates which state is better, and the evaluation function is trained to rank that state higher.
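
A sketch under the same linear-evaluator assumption: comparison training never needs absolute values, only a correct ranking, so a perceptron-style update adjusts the weights whenever the evaluator ranks a pair the wrong way.

    def comparison_update(weights, better_state, worse_state, lr=0.01):
        """Adjust weights when the preferred state is not ranked higher."""
        diff = features(better_state) - features(worse_state)
        if weights @ diff <= 0:     # evaluator ranks the pair wrong (or ties)
            weights = weights + lr * diff
        return weights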


Reinforcement training – Weight changes occur depending on whether the game ended in a win, loss, or draw, and on the margin of the win or loss. The main problem here is that not all moves of a game are equal: the program will punish good moves if it loses and reward bad moves if it wins. This credit-assignment problem is what led to temporal difference learning. A link on the history of reinforcement learning:
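
To make the flaw concrete, here is a sketch of the naive scheme: every visited state is pushed toward the final outcome, so a brilliant move in a lost game is still punished.

    def game_outcome_update(weights, visited_states, outcome, lr=0.01):
        """Push every visited state toward the final result
        (e.g. outcome = +1 for a win, 0 for a draw, -1 for a loss)."""
        for state in visited_states:
            phi = features(state)
            error = outcome - weights @ phi    # same target for every move
            weights = weights + lr * error * phi
        return weights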


Temporal Difference – A subsection of reinforcement training. Here the training is still based on the final outcome of the game, but it works differently. In my eyes it seems similar to backpropagation. The error value for each state is determined by the state that came after it, and then the weights of the evaluation function are trained. So instead of all moves being marked as bad when the player loses, each move is now rated in comparison to the move made after it. It seems almost like trickle-down economics: if you win big, everybody below you wins. A very short page on TD. A link to TD-Gammon.
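
A sketch of a TD(0)-style pass over one game, again assuming the linear evaluator above: each state's target is the evaluation of its successor, and only the final state is scored against the actual outcome, so credit trickles backward one step at a time.

    def td_update(weights, visited_states, outcome, lr=0.01):
        """TD(0)-style pass: each state learns from its successor's value."""
        for state, next_state in zip(visited_states, visited_states[1:]):
            phi = features(state)
            target = weights @ features(next_state)  # successor's evaluation
            weights = weights + lr * (target - weights @ phi) * phi
        phi = features(visited_states[-1])
        return weights + lr * (outcome - weights @ phi) * phi  # ground the end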


Opponent Modeling


I went to the University of Alberta’s game-playing page to find out more about opponent modeling.

In “Improved Opponent Modeling in Poker” [postscript] the group discusses using neural networks to model generic and specific opponents through previous game play. They use probabilistic information to generate the inputs to different neural networks that predict successive moves. They report 80% to 90% accuracy in predicting opponents’ moves. Apparently their Poki program has produced some very interesting results. To learn more about it, see the Alberta Poker page.
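
This is not the Alberta group's actual architecture, but a rough sketch of the idea: a small network maps hand-context features (the three inputs here, such as pot odds or the opponent's observed raise frequency, are hypothetical) to a probability distribution over the opponent's next action.

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(0.0, 0.1, (3, 8))   # 3 context features -> 8 hidden units
    W2 = rng.normal(0.0, 0.1, (8, 3))   # 8 hidden units -> fold / call / raise

    def predict_action(x):
        """Return P(fold), P(call), P(raise) for one feature vector."""
        h = np.tanh(x @ W1)             # hidden layer
        z = h @ W2                      # raw action scores
        e = np.exp(z - z.max())
        return e / e.sum()              # softmax over the three actions

Trained on logged hands, the weights would be updated toward the action the opponent actually took, letting the program exploit individual tendencies.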


Throughout my research on game learning I consistently came across the idea that opponent modeling is only useful in games with a high level of unpredictability or chance. So it would not be useful for chess, where existing techniques have already produced a world-class player. I believe opponent modeling could be very useful in games like chess where the purpose of the program is not to beat you but to teach you. Using these concepts, it could show you your weaknesses and in that sense make you a better player. My personal experience with chess programs is that you pick your opponent's style; your opponent does not adapt to you. But I think it would be more advantageous to the chess player if the program picked out playing strategies that you needed to work on to become a better player.





A whole book on reinforcement learning:

AAAI page on games and puzzles:

University of Alberta’s games page:

A funny link from the author of today’s paper: