Next: Concluding Remarks Up: Learning Cooperative Procedures Previous: Learning to Interleave Goals

Section 6. Results and Analysis

Our collective memory test-bed system solves randomly generated MOVERS-WORLD problems, in isolation or in sequences. Individual problems are constructed by randomly selecting subsets from the pool of permanent MOVERS-WORLD objects (agents, hand-trucks, and trucks) that will be active for that problem. Then a random group of boxes and locations is constructed and a list of goals involving them is generated.

A database was created of 60 problems, whose goals were always to move all boxes to the truck. The number of boxes was uniformly distributed between 3 and 5. The average number of rounds required to solve these problems without using collective memory to learn was 42.6, with a standard deviation of 0.64.

It should be noted that there are many sources of randomness that can effect the number of rounds of activity it takes a community to solve any given problem. The experiments underlying the data presented here are designed to control as many sources of randomness as possible in order to make comparisons meaningful. For example, it is not feasible to determine learning curves by running the system on all possible permutations of the database problems, so the system is run on four predetermined groups of sequences. Each group of sequences is balanced in the following way: each of the database problems occurs once as the first problem of some sequence in the group, once as the second of a different sequence in the group, et cetera.

There are presently two components of collective memory. Besides the procedural knowledge case-base described in this paper, agents can use operator probabilities trees to estimate the probability of success for operators an agent may attempt. Accurate estimates improve agent performance since the agents' first-principles planner is a hierarchical planner [Sacerdoti1974] which uses probabilities to guide search, including role-binding selections (c.f. [Kushmerick, Hanks, & Weld1995]). Operator probabilities trees are incrementally updated as the agent interacts with the domain and this enables the planner to improve during the course of solving a problem. Results will be presented verifying this intra-problem learning, but the reader is directed to [Garland & Alterman1996] for more technical details on operator probability trees.

Case-based reasoning has been previously demonstrated to be an effective technique for reducing CPU usage during planning in a standard distributed AI planner in [Sugawara1995]. In our domain though, CPU usage is an inappropriate measure of community performance because we are primarily interested in improving the community's run-time behavior, not the speed at which they plan (or retrieve plans from memory). However, since primitive actions and communication are simulated, roughly 90% of the CPU usage for our system is devoted to agent planning. The statistic we consider best suited to our domain is the number of rounds (defined in Section 2), which views primitive action and communication as taking the same order of time as a single planning session.


[The number of rounds declines.]
Figure 7: Overall community performance improves.

Figure 7 shows how the two collective memory structures improve community performance. Both the operator probabilities tree (abbreviated OP) and the procedural knowledge case-base (abbreviated PK) lead to statistically significant improvements. OP shows substantial intra-problem learning, reducing the average number of rounds for solving the first problem of a sequence to 37.3. This number continues to go down, albeit slowly, reaching 32.0 for the fifth problem. PK has less dramatic improvement, dropping to 35.7 for the fifth problem. Using both structures produces much better results than either of the two alone, as the average number of rounds drops steadily, ending at 25.9, which is close to a 40% improvement over no learning. Standard deviations for these points ranged from 0.56 to 0.90.


[Read caption.]
Figure 8: Community primitive action decreases.

In some domains (e.g. robot agents), the time it will take an agent to attempt primitive actions will far exceed the amount of time the agent spends planning or communicating. And in other domains (e.g. web-based applications), the communication costs will dwarf the costs of planning or acting. Figure 8 shows that the use of collective memory in MOVERS-WORLD reduces the number of primitive actions needed to solve problems. Figure 9 shows that the use of collective memory in MOVERS-WORLD reduces run-time communication.


[Read caption.]
Figure 9: Community communication decreases.


[Read caption.]
Figure 10: Agents search more to find role bindings.

The improved run-time performance of the community is a direct result of the fact that the planner produces plans that are either more efficient (e.g. interleaves goals) or more likely to be successful or both. The question must be asked whether this improvement in plan quality comes at a high price in increased planner effort due to increased match costs and/or increased branching factors.2 Using collective memory in MOVERS-WORLD, there is no high price to pay. There is a performance decline in the amount of effort the planner requires in order to instantiate local role-binding variables (see Figure 10). However, this is more than compensated by a dramatic reduction in the number of planning search nodes expanded during calls to the planner as shown in Figure 11.


[Read caption.]
Figure 11: PK reduces planning search.


Footnotes

... factors.2
There are other concerns regarding case-bases, principally increased retrieval time, but there are techniques to handle this such as [Minton1990,Smyth & Keane1995].

Next: Concluding Remarks Up: Learning Cooperative Procedures Previous: Learning to Interleave Goals
Andrew Garland
1998-05-22