Remembrance of Things Past

Leslie Pack Kaelbling
MIT Artificial Intelligence Lab

Thursday, October 14, Volen 101, 2:00-3:00 pm

Behaving effectively in partially observable environments requires remembering something about the past. Partially observable Markov decision processes (POMDPs) provide a formal model for describing and evaluating control problems that require memory. An agent in an unknown POMDP has two main strategies: learn a model of the POMDP and then solve it for a good policy, or learn a policy directly.
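The "memory" an agent keeps in a POMDP is typically a belief state: a probability distribution over states, updated by Bayes' rule after each action and observation. The sketch below illustrates this update on a tiny two-state example; the transition and observation probabilities are purely illustrative and are not taken from the talk.

```python
def belief_update(belief, action, obs, T, O):
    """Bayes-filter belief update for a POMDP:
    b'(s') is proportional to O[s'][obs] * sum_s T[action][s][s'] * b(s).
    """
    n = len(belief)
    new_belief = []
    for s2 in range(n):
        p = O[s2][obs] * sum(T[action][s][s2] * belief[s] for s in range(n))
        new_belief.append(p)
    total = sum(new_belief)           # normalize so the belief sums to 1
    return [p / total for p in new_belief]

# Illustrative model: two states, one action, two observations.
T = {0: [[0.9, 0.1],                  # T[a][s][s'] = P(s' | s, a)
         [0.2, 0.8]]}
O = [[0.8, 0.2],                      # O[s'][o] = P(o | s')
     [0.3, 0.7]]

# Starting from a uniform belief, take action 0 and observe 0.
b = belief_update([0.5, 0.5], action=0, obs=0, T=T, O=O)
```

Repeating this update after every step is what lets a memoryless-looking controller act on the full observable history through the belief state.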

I will begin by describing an algorithm for learning POMDP models for robot navigation, which, coupled with previous work on controlling POMDPs, yields a behavior learning system. Then, I'll talk about some very recent work on direct approaches to learning policies for POMDPs without first learning a model. I'll conclude with a description of a new project on learning models for visual navigation in humans and robots.

Host: Jordan Pollack