Temporal-difference Learning and the Coming of Artificial Intelligence

Richard Sutton - University of Alberta

Oct. 3, 2014, 2:30 p.m.

MC 12

When mankind finally comes to understand the principles of intelligence, and how they can be embodied in machines, it will be the most important discovery of our age, perhaps of any age. The coming of AI is not imminent, but for almost everybody now alive the chance of it happening in their lifetimes is non-negligible. For AI researchers, it is a great prize; though we rarely talk about it, we should be discussing now how our efforts might contribute to attaining it. In this talk, I review these considerations and how they have led me to focus on general learning algorithms for real-time prediction and control. In particular, I focus on temporal-difference (TD) learning, a method specialized for making long-term predictions from unprepared data, such as could be obtained by a robot interacting with its environment without human supervision. I present recent results that have deepened our understanding of TD learning and that suggest how it may be relevant to perception and to the acquisition of world knowledge generally. TD learning may not be key to the coming of AI, but it is a good example of the kind of research that could make a fundamental contribution to it.
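To make the idea concrete, here is a minimal sketch of tabular TD(0), the simplest temporal-difference method, on an illustrative five-state random walk (the environment, step size, and initialization are this sketch's assumptions, not specifics from the talk): the agent updates each state's long-term prediction toward a one-step bootstrapped target, learning purely from its own experience with no supervision.

```python
# Tabular TD(0) on a 5-state random walk (a standard illustrative example):
# non-terminal states 0..4, terminate off either end, reward 1 only when
# terminating off the right end. True values are 1/6, 2/6, ..., 5/6.
import random

random.seed(0)

N_STATES = 5           # non-terminal states 0..4
ALPHA = 0.1            # step size (illustrative choice)
GAMMA = 1.0            # undiscounted episodic task

V = [0.5] * N_STATES   # value estimates, initialized at 0.5

def run_episode(V):
    s = N_STATES // 2  # start in the middle state
    while True:
        s_next = s + random.choice([-1, 1])  # random walk
        if s_next < 0:             # terminated left: reward 0, no bootstrap
            target = 0.0
        elif s_next >= N_STATES:   # terminated right: reward 1
            target = 1.0
        else:                      # bootstrap from the next state's estimate
            target = GAMMA * V[s_next]
        # The TD(0) update: move V(s) toward the one-step target
        V[s] += ALPHA * (target - V[s])
        if s_next < 0 or s_next >= N_STATES:
            return
        s = s_next

for _ in range(5000):
    run_episode(V)

print([round(v, 2) for v in V])  # estimates approach 1/6, 2/6, ..., 5/6
```

Note that the update occurs on every step, from unprepared moment-to-moment experience, rather than waiting for an episode's final outcome; this is the property the abstract highlights as suited to a robot learning from its own interaction.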

Richard S. Sutton is a professor and iCORE chair in the Department of Computing Science at the University of Alberta. He is a fellow of the Association for the Advancement of Artificial Intelligence and co-author of the textbook Reinforcement Learning: An Introduction from MIT Press. Before joining the University of Alberta in 2003, he worked in industry at AT&T and GTE Labs, and in academia at the University of Massachusetts. He received a PhD in computer science from the University of Massachusetts in 1984 and a BA in psychology from Stanford University in 1978. Rich's research interests center on the learning problems facing a decision-maker interacting with its environment, which he sees as central to artificial intelligence. He is also interested in animal learning psychology, in connectionist networks, and generally in systems that continually improve their representations and models of the world.