Next: The Huge Round-Robin Agent Up: Evolving Agents Without Human Previous: Tuning up the Novelty

Test Against Humans

To test the hypothesis that selection against humans is not irrelevant, we selected a group of 10 players produced by the control experiment and introduced them manually into the main population, so that they would be tested against humans. We ran this generation (no. 250) for longer than our usual generations to obtain an accurate measurement.

Table 3.2: Evaluation of control agents (evolved without human intervention) after being introduced into the main population, and evaluated against humans. A robot's performance against our evaluation set does not predict how it will measure up against humans. As a coevolving population wanders through behavior space, it finds both good and bad players.
Control          Performance vs.           Statistical      Percent of
generation no.   evaluation set (% wins)   strength (RS)    robots below
360              10.0                      -4.7              0.1
387              46.7                       0.4             80.6
401              54.4                      -0.2             59.7
354              61.1                       0.4             80.0
541              63.3                       0.1             70.5
462              66.7                       0.1             67.8
570              70.0                      -0.2             60.0
416              75.6                      -1.0             40.7
410              78.9                       0.4             80.3
535              96.7                       0.3             77.2

Table 3.2 summarizes the results of this test. We chose a group of 10 robots, each one the best of its generation among the 600 generations for which group B (t=100) ran. The group includes the robot that performed worst against the evaluation set (generation 360), the one that performed best (generation 535), and eight others chosen for their differing performances against the evaluation set.

The last column of the table shows how these robots compare with all other ranked robots, as measured by their performance against humans (RS): it gives the percentage of ranked robots that fall below each one.
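One way to quantify how weakly the evaluation-set score tracks performance against humans is a rank correlation between the second and third columns of Table 3.2. The sketch below uses Spearman's coefficient with average ranks for ties; this statistic is our choice for illustration and is not part of the original analysis, and the names `TABLE_3_2`, `average_ranks`, and `spearman` are hypothetical.

```python
from math import sqrt

# Rows of Table 3.2: (control generation, % wins vs evaluation set, RS vs humans)
TABLE_3_2 = [
    (360, 10.0, -4.7),
    (387, 46.7, 0.4),
    (401, 54.4, -0.2),
    (354, 61.1, 0.4),
    (541, 63.3, 0.1),
    (462, 66.7, 0.1),
    (570, 70.0, -0.2),
    (416, 75.6, -1.0),
    (410, 78.9, 0.4),
    (535, 96.7, 0.3),
]


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


wins = [row[1] for row in TABLE_3_2]
rs = [row[2] for row in TABLE_3_2]
rho = spearman(wins, rs)
print(f"Spearman rank correlation (% wins vs RS): {rho:.3f}")
```

On these ten rows the coefficient comes out low (roughly 0.19), in line with the caption's claim. With so few robots, it should be read as an illustration rather than a significance test.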

From the internal point of view of robot-robot coevolution alone, all these agents should be equal: each of them is number one within its own generation. If anything, those of later generations should be better. But this is not the case: performance against a training set suggests that after 100 generations the population is wandering, without reaching higher absolute performance levels. This wandering also occurs with respect to the human performance space.

We conclude that a coevolving population of agents explores a subspace of strategies that is not identical to the subspace of human strategies, and consequently that coevolutionary fitness differs from fitness against people. Without further testing against humans, self-play alone provides a weaker evolutionary measure.

Pablo Funes