Lectures

Lecture 1: Markov Chain, Part I

Note: Lecture Note 1 from Prof. Dimitrios Katselis

Lecture 2: Markov Chain, Part II

Note: Lecture Note 2 from Prof. Dimitrios Katselis

Lecture 3: MDPs, Part I

Note: Sections 9.1-9.2 in Lecture Note 9 from Prof. Dimitrios Katselis

Lecture 4: MDPs, Part II

Note: Sections 9.2-9.3 in Lecture Note 9 from Prof. Dimitrios Katselis

Lecture 5: MDPs, Part III

Note: Section 9.3 in Lecture Note 9 from Prof. Dimitrios Katselis

Lecture 6: VI, PI, and Neural Networks

Note: Section 9.4 in Lecture Note 9 and Section 7.1 in Lecture Note 7 from Prof. Dimitrios Katselis

Lecture 7: A High-Level Introduction to RL, Q-Learning, Policy Optimization, and Zeroth-Order Optimization

Ref: Lilian Weng's Blog on RL: A (Long) Peek into Reinforcement Learning

Lecture 8: TD Learning for Policy Evaluation

Note: Section 10.6 in Lecture Note 10 from Prof. Dimitrios Katselis, Section 3.1 of Prof. Srikant's Paper on TD Learning

Lecture 9: Q Factors

Ref: Lilian Weng's Blog on RL: A (Long) Peek into Reinforcement Learning

Lecture 10: Q-Learning, SARSA, Approximate PI

Notes: Algorithms 1-8 in the survey paper by Busoniu et al., Sections 1.6-1.7 in the LQR note, Algorithm 1 in Prof. Matni's Class Note on API

Lecture 11: Natural Policy Gradient, TRPO, PPO, Robust Adversarial RL

Ref: Prof. Katerina Fragkiadaki's Lecture Note on NPG, TRPO, and PPO; the PPO paper by John Schulman et al.; the RARL paper by Lerrel Pinto et al.

Lecture 12: Imitation Learning

Ref: A List of Useful Materials for Imitation Learning

Lecture 13: Model-Based RL

Ref: A List of Useful Materials for Model-Based Reinforcement Learning

Lecture 14: Inverse RL

Ref: A List of Useful Materials for Inverse Reinforcement Learning

Lecture 15: Implementation Details for Actor-Critic

Ref: A List of Useful Materials for Implementing Actor-Critic

Lecture 16: Implementation Details for TRPO and PPO

Ref: A List of Useful Materials for Implementing TRPO and PPO

Lecture 17: Practical Value-Based Methods for Continuous Space

Ref: A List of Useful Materials for Implementing Value-Based Methods

Lecture 18: Transfer Learning

Ref: A List of Useful Materials for Transfer Learning

Lecture 19: Overview of RL Theory

Ref: A List of Useful Materials for RL Theory

Lecture 20: Average Cost MDPs

Ref: A List of Useful Materials for Average Cost MDPs

Lecture 21: The ODE Method

Ref: A List of Useful Materials for the ODE Method

Lecture 22: Finite Time Analysis

Ref: A List of Useful Materials for Finite Time Analysis

Lecture 23: Convergence Theory for Policy Gradient

Ref: A List of Useful Materials for Convergence Theory of Policy Gradient on LQR