Reinforcement Learning and Markov Decision Processes

Fall 2009

When and where 

Wed, 4:00PM-7:00PM, Room Whitaker Lab 203 

Instructor 

Héctor Muñoz-Avila, munoz@cse.lehigh.edu

Instructor's office hours 

Mon, 4:00PM-5:00PM, Room 252 Packard Lab 

Texts

Required:

Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto

Description

This course will study Reinforcement Learning (RL), a general learning technique where an agent (e.g., a robot) learns from its interactions with an environment (e.g., a sewer system) to accomplish some task (e.g., find locations in the sewer with dangerous gas concentration levels). The agent learns through rewards and punishments it gets from the environment. Here is a simple video introducing the three basic concepts of reinforcement learning: rewards, states, and actions.

RL is motivated by fields such as behavioral psychology. The following video illustrates this motivation: it shows the initial training of a dog to take the action of staying put. The incentive, or reward, is some food.

RL has been shown to be useful for solving a wide variety of tasks, including (click the links to see some videos): autonomous vehicle navigation, robotics, and programming game AI (under “downloads”, check the videos from before and after learning).

Following the presentation of Sutton and Barto’s book, we will formalize the reinforcement learning problem as a Markov decision process (MDP). We will study techniques for solving this problem, along with their limitations and open research issues. Concepts such as Markov states, the Markov property, dynamic programming, and Monte Carlo methods will be covered. For further details, please read Chapter 1 of the book.
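
To give a flavor of what "formalizing the problem as an MDP" and "dynamic programming" mean in practice, here is a minimal sketch in Python. It is not taken from the course materials or from the book's code: the three-state sewer MDP, its rewards, the discount factor of 0.9, and the function names value_iteration and greedy_policy are all assumptions made up for illustration. The sketch applies value iteration, one standard dynamic programming method, to compute how valuable each state is and which action looks best in it.

```python
from typing import Dict, List, Tuple

# transitions[state][action] -> list of (probability, next_state, reward).
# A state with no actions is treated as terminal.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

# Hypothetical toy MDP (an assumption for illustration only): a robot moves
# through a sewer and receives reward 1 when it reaches the gas site.
SEWER_MDP: Transitions = {
    "entrance": {
        "forward": [(1.0, "tunnel", 0.0)],
        "stay": [(1.0, "entrance", 0.0)],
    },
    "tunnel": {
        "forward": [(1.0, "gas_site", 1.0)],
        "stay": [(1.0, "tunnel", 0.0)],
    },
    "gas_site": {},  # terminal state
}


def action_value(outcomes: List[Tuple[float, str, float]],
                 values: Dict[str, float], gamma: float) -> float:
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8) -> Dict[str, float]:
    """Sweep over all states until the values change by less than tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal states keep value 0
                continue
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(transitions: Transitions, values: Dict[str, float],
                  gamma: float = 0.9) -> Dict[str, str]:
    """Pick, in each non-terminal state, the action with the highest value."""
    return {
        s: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for s, actions in transitions.items()
        if actions
    }


values = value_iteration(SEWER_MDP)
policy = greedy_policy(SEWER_MDP, values)
print(values)
print(policy)
```

Note that the next state and reward depend only on the current state and action, which is the Markov property mentioned above. Dynamic programming of this kind assumes the transition table is known in advance; Monte Carlo methods, by contrast, estimate values from sampled experience.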

Announcements

All announcements will be made on this web page:

www.cse.lehigh.edu/~munoz/RLMDP

Topics covered so far

Communication

All announcements, handouts, etc. will be posted on this web site:

www.cse.lehigh.edu/~munoz/RLMDP

Homework

There will be homework assignments.

Attendance

Class attendance is required.

Exams

There will be two exams but no final exam.

Final Project

The final project will be a programming project of the student's choice, provided that it has been approved by the instructor. In addition, the student must hand in a final report describing the properties and an empirical evaluation of their implementation (details will be agreed upon with the instructor beforehand).

Class presentation

Each student will prepare and give a presentation on a topic selected from a list provided by the instructor.

Grading

Last update: Mon Aug 17 16:47:57 EDT 2009