----- The page is under construction, please check again for updates. Suggestions are most welcome! -----
Battle of Bandits (Active Learning from Preferences)
Some exciting blogs: (1). Reward design problem in RL (OpenAI), (2). Stop your Robots from making a mess! (BAIR), (3). Learning to summarize with human preferences (OpenAI), (4). Aligning language models with preference feedback! (OpenAI). There are a lot more available online to get you excited!
Some breakthrough results (non-exhaustive): (1). First Work! (2). Dueling Bandits with simple UCB, (3). Reducing Dueling Bandits to Standard MAB, (4). Dynamic Dueling, (5). Adversarial Dueling Bandits, (6). Best of Both Dueling Bandits, (7). Optimization with Dueling Bandits, (8). Optimal Rates for Contextual Dueling Bandits, (9). RL with Dueling Feedback.
- Excellent Survey: Preference-based Online Learning with Dueling Bandits
- Book: Preference Learning (by Johannes Fürnkranz, Eyke Hüllermeier)
- Video talks: (1). Preference Learning (by Eyke Hüllermeier), (2). Learning with Humans in the Loop (with Preferences) (by Thorsten Joachims).
- A great course (with many references and lecture notes): Online Prediction & Learning (by Aditya Gopalan)
- Book: Prediction, Learning, and Games (by Nicolo Cesa-Bianchi and Gabor Lugosi)
- Video lectures on Online Learning and Bandits (Simons Institute): (1). Part-I (by Wouter Koolen), (2). Part-II (by Alan Malek), (3). Online Learning (by Nicolo Cesa-Bianchi)
Bandits and Reinforcement Learning
- A few amazing books: (1). Bandit Algorithms (by Tor Lattimore, Csaba Szepesvari), (2). Introduction to Multi-Armed Bandits, (3). Reinforcement Learning: An Introduction (by Richard Sutton & Andrew Barto), (4). Concentration Inequalities (by Stephane Boucheron, Gabor Lugosi, Pascal Massart).
- Courses with comprehensive notes: (1). Bandits and Reinforcement Learning (by Akshay Krishnamurthy), (2). Bandits, Experts, and Games (by Alex Slivkins)
- Video lectures: Theory of Reinforcement Learning Boot Camp (hosted by Simons Institute)
- Books: (1). Convex Optimization (by Stephen P. Boyd and Lieven Vandenberghe), (2). Non-convex Optimization for Machine Learning (by Prateek Jain, Purushottam Kar), (3). Numerical Optimization (by Jorge Nocedal and Stephen J. Wright)
- Notes on Convex Optimization: Convex Optimization: Algorithms and Complexity (by Sébastien Bubeck). See also his blog, I'm a Bandit.
- Notes on Online Optimization: Introduction to Online Convex Optimization (by Elad Hazan)
- Video lectures: Part-I, Part-II (by Ben Recht), <a href="https://www.youtube.com/watch?v=WvxNGy-RLy4">Online Convex Optimization</a> (by Nicolo Cesa-Bianchi)
- An awesome blog (with many references): Convex Optimization: Algorithms and Complexity (from CMU, ML)
- Recent Surveys: (1). Advances and Open Problems in Federated Learning, (2). A Field Guide to Federated Optimization
- Video lectures: Federated Learning and Analytics (workshop by Google)
- Distributed Optimization: (1). Federated Learning One World (FLOW) Seminar, (2). A compilation of theoretical developments (by Brian Bullins)
- Books: (1). The Algorithmic Foundations of Differential Privacy (by Cynthia Dwork & Aaron Roth), (2). The Complexity of Differential Privacy (by Salil Vadhan), (3). Composition of Differential Privacy & Privacy Amplification by Subsampling (by Thomas Steinke)
- One-stop blog: DifferentialPrivacy.org. Another nice blog: Differential Privacy (by OpenMined)
- Courses (with video lectures): (1). Algorithms for Private Data Analysis (by Gautam Kamath), (2). The Algorithmic Foundations of Data Privacy (by Aaron Roth). Microsoft Research also has a diverse playlist on Privacy Research.