June 10, 2025
Summer25 Reading Group: Reinforcement Learning (RL) Theory
Instructor: Aadirupa Saha
Course Description
This summer reading group explores foundational and advanced topics in Reinforcement Learning theory, closely following the RL Theory Monograph by Agarwal, Jiang, Kakade, and Sun. Participants take turns presenting key concepts each week, with occasional discussions drawing on the classic text Reinforcement Learning: An Introduction by Sutton and Barto. The group aims to build theoretical intuition while fostering informal collaboration around RL and broader ML theory.
Timing: Tuesdays and Fridays, 5:30-7 PM Central
Sessions
Date | Presenter | Topics | Resource | Notes |
---|---|---|---|---|
2025-06-13 | Zhengyao | MDP Basics, Values, Policies, Bellman Consistency Equation | RLM (Chap 1.1.1-1.1.3), AK (Lec 5) | - |
2025-06-17 | Zhengyao | Bellman Optimality Equations, Value Iteration, Policy Iteration, Convergence Results | RLM (Thm 1.7, 1.8. Chap 1.3.1-1.3.3), AK (Lec 5) | - |
2025-06-20 | Aniket | Policy Iteration, Convergence Guarantee, Episodic, Generative, and Offline RL Settings, Performance Difference Lemma | RLM (Thm 1.14, Lem 1.16. Chap 1.3.2, 1.4, 1.5), AK (Lec 6) | - |
2025-06-24 | TBA | Examples of Policy Classes, Policy Gradient Methods, Non-convexity and Convergence of Value Functions under Softmax Parameterizations | RLM (Lem 11.4, 11.5, 11.6. Chap 11.1, 11.2), AK (Lec 6) | - |
Core References
- SB: Reinforcement Learning: An Introduction by Sutton & Barto
- RLM: RL: Theory & Algorithms by Agarwal, Jiang, Kakade, Sun
- FRL: Foundations of Reinforcement Learning and Interactive Decision Making by Foster & Rakhlin
- MFRL: Mathematical Foundations of Reinforcement Learning by Shiyu Zhao
- AK: COMS6998-11: Bandits and Reinforcement Learning, by Akshay Krishnamurthy
- NJ: CS 542: Statistical Reinforcement Learning, by Nan Jiang