ECE586RL: Markov Decision Processes and Reinforcement Learning
Course Information
Course Description
The course will discuss techniques to solve dynamic optimization problems where the system dynamics are unknown. The course will first introduce dynamic programming techniques for Markov decision process (MDP) problems and then focus on solving the dynamic programming equations approximately when the underlying parameters of the Markov chain are unknown. While the emphasis will be on techniques for which one can prove performance bounds, heuristics used in reinforcement learning will also be presented to show their relationship to existing theory, and to identify open theoretical problems.
Outline
Required Materials
There is no required textbook for the class. All course material will be presented in class and/or provided online as notes. Links to relevant papers will be listed on the course website. One useful reference is the book “Dynamic Programming and Optimal Control, Vol. II: Approximate Dynamic Programming” by D. Bertsekas.
Prerequisites
ECE 534; ECE 555 is recommended, but not required.
Grading
10% class participation; 30% homework (2 sets); 60% final project