Aadirupa Saha

[Selected Papers] [Full List] [Google Scholar] [DBLP] [arXiv]

Collaborators.

Throughout my journey, I have been incredibly fortunate to work alongside some of the most brilliant minds in the field, whose wisdom has been instrumental in shaping my work and fundamentally enriching my knowledge: Chiranjib Bhattacharyya, Avrim Blum, Christos Dimitrakakis, Simon Du, Maryam Fazel, Vitaly Feldman, Pierre Gaillard, Aditya Gopalan, Katja Hofmann, Eyke Hüllermeier, Prateek Jain, Tomer Koren, Akshay Krishnamurthy, Branislav Kveton, Haipeng Luo, Shie Mannor, Yishay Mansour, Praneeth Netrapalli, Vianney Perchet, Lev Reyzin, Rob Schapire, Nati Srebro, Michal Valko, Matthew Walter, Haifeng Xu, Brian Ziebart (in alphabetical order).

Preprints:

Sample Efficient Policy Optimization with Mixture of Feedback
Aadirupa Saha, Pierre Gaillard
Expert Advice with Costly Observations [Arxiv] (coming soon!)
Lev Reyzin, Aadirupa Saha, Shuo Wu (alphabetical)
Optimal Rates for Learning Quantum States with Linear Tomography [Arxiv] (coming soon!)
Moise Blanchard, Aadirupa Saha, Dmitry Ostrovsky
Double-Monster: Efficient Min-Max Strategy for Personalized Prediction under General Preferences.
Aadirupa Saha, Robert Schapire
Efficient Predictive Models without Compromising User Privacy [Arxiv version]
Aadirupa Saha, Hilal Asi
Learning to Allocate Resources with Censored Feedback [Arxiv version]
Giovanni Montanari, Côme Fiegel, Aadirupa Saha, Vianney Perchet
Best Arm Identification in Linear MNL-Bandits. [Arxiv Version]
Shubham Gupta, Aadirupa Saha, Sumeet Katariya

2026

One Good Source is All You Need: Near-Optimal Regret for Bandits under Heterogeneous Noise [Arxiv version]
Amith Bhat, Haipeng Luo, Aadirupa Saha
International Conference on Machine Learning, ICML 2026
LLM-as-a-Judge on a Budget [Arxiv version]
Aadirupa Saha, Branislav Kveton, Aniket Wagde
International Conference on Artificial Intelligence and Statistics, AIStats 2026
Stochastically Dominant Preference Optimization: Policy Improvement for All [Arxiv version]
Ali Farajzadeh, Syed M Abbas, Aadirupa Saha, Brian D Ziebart
International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2026 (Extended Abstract)

2025

Efficient and Near-Optimal Algorithm for General Contextual Dueling Bandits with Offline Regression Oracles [NeurIPS version]
Aadirupa Saha, Robert Schapire
In Neural Information Processing Systems, NeurIPS 2025
Imitation Beyond Expectation Using Pluralistic Stochastic Dominance [NeurIPS version]
Ali Farajzadeh, Danyal Saeed, Syed M Abbas, Rushit N. Shah, Aadirupa Saha, Brian D Ziebart
In Neural Information Processing Systems, NeurIPS 2025 (*Spotlight*)
Source Adaptive Online Learning under Heteroscedastic Noise [OPT-ML version]
Amith Bhat, Aadirupa Saha, Thomas Kleine Buening, Haipeng Luo
In OPT for ML Workshop, Neural Information Processing Systems, NeurIPS 2025
Efficient Algorithms for Combinatorial-Bandits with Monotonicity. [OPT-ML version]
Aniket Wagde, Aadirupa Saha
In OPT for ML Workshop, Neural Information Processing Systems, NeurIPS 2025
HPO: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration [Arxiv version] [Workshop version]
Avinandan Bose, Zhihan Xiong, Aadirupa Saha, Simon Shaolei Du, Maryam Fazel
In The Next Frontier in Reliable AI Workshop, International Conference on Learning Representations (ICLR), 2025
Tracking the Best Expert Privately [Arxiv version]
Hilal Asi, Vinod Raman, Aadirupa Saha (alphabetical)
International Conference on Machine Learning, ICML 2025
Dueling Convex Optimization for General Preferences: An Unified Framework for Optimal Convergence Rates. [Arxiv Version]
Aadirupa Saha, Tomer Koren, Yishay Mansour
International Conference on Machine Learning, ICML 2025
Stop Relying on No-Choice and Do not Repeat the Moves: Optimal, Efficient and Practical Algorithms for Assortment Optimization [Arxiv version]
Aadirupa Saha, Pierre Gaillard
International Conference on Learning Representations (ICLR), 2025

1 min pitch!
If you've ever encountered the MNL Assortment Optimization problem, you know the go-to approach: offer the same set of products repeatedly until your customer gets really annoyed and selects no item! In fact, it requires a "belief" that no selection is their most preferred choice :-( Oh no!

But why be so pessimistic? And why annoy your customers by repeatedly offering the same items and hoping they leave (i.e., choose none of the offered items)? We have a new idea with no such issues. How? We simply found better concentration tricks! Resolving this efficiently was a long-standing wish.

2024

Strategic Linear Contextual Bandits. [Arxiv version]
Thomas Kleine Buening, Aadirupa Saha, Haifeng Xu, Christos Dimitrakakis
In Neural Information Processing Systems, NeurIPS 2024
Dueling in the Dark: An Efficient and Optimal O(√T) Mirror Descent Approach for Competing against Adversarial Preferences. [Arxiv: Coming soon!]
Aadirupa Saha, Barry-John Theobald, Yonathan Efroni
In OPT for ML Workshop, Neural Information Processing Systems, NeurIPS 2024
A Graph Theoretic Approach for Preference Learning with Feature Information. [Arxiv Version]
Aadirupa Saha, Arun Rajkumar
In Uncertainty in Artificial Intelligence, UAI 2024 (*Oral*)
Social Welfare for RecSys: Bandits Meet Mechanism Design to Combat Clickbait in Online Recommendation. [Arxiv version]
Thomas Kleine Buening, Aadirupa Saha, Haifeng Xu, Christos Dimitrakakis
International Conference on Learning Representations (ICLR), 2024 (*Spotlight*)
Only Pay for What Is Uncertain: Variance-Adaptive Thompson Sampling. [Arxiv version]
Aadirupa Saha, Branislav Kveton
International Conference on Learning Representations (ICLR), 2024

1 min pitch!
We lay the foundations for Thompson sampling in Bayesian multi-armed bandits with known and unknown heterogeneous reward variances. Our regret analysis shows improved performance with lower reward variances, implying faster learning in low-variance regimes. So why regret if you are already confident? Only Pay for What Is Uncertain!

Efficient Private Federated Non-Convex Optimization With Shuffled Model. [Workshop Version]
Lingxiao Wang, Xingyu Zhou, Kumar Kshitij Patel, Lawrence Tang, Aadirupa Saha
Privacy Regulation and Protection in ML Workshop, International Conference on Learning Representations (ICLR), 2024
Think Before You Duel: Understanding Complexities of Preference Learning under Constrained Resources. [Arxiv version]
Rohan Deb, Aadirupa Saha
International Conference on Artificial Intelligence and Statistics, AIStats 2024
On the Vulnerability of Fairness Constrained Learning to Malicious Noise. [Arxiv version]
Avrim Blum, Princewill Okoroafor, Aadirupa Saha, Kevin Stangl (alphabetical)
International Conference on Artificial Intelligence and Statistics, AIStats 2024
Faster Convergence with MultiWay Preferences. [Arxiv version]
Aadirupa Saha, Vitaly Feldman, Tomer Koren, Yishay Mansour
International Conference on Artificial Intelligence and Statistics, AIStats 2024
Dueling Optimization with a Monotone Adversary
Avrim Blum, Meghal Gupta, Gene Li, Naren Sarayu Manoj, Aadirupa Saha, Yuanyuan Yang
Algorithmic Learning Theory, ALT, 2024 (*Outstanding Paper Award*)

2023

Eliciting User Preferences for Personalized Multi-Objective Decision Making through Comparative Feedback [Arxiv Version]
Han Shao, Lee Cohen, Avrim Blum, Yishay Mansour, Aadirupa Saha, Matthew Walter
In Neural Information Processing Systems, NeurIPS 2023
Dueling Optimization with a Monotone Adversary. [Arxiv version]
Avrim Blum, Meghal Gupta, Gene Li, Naren Sarayu Manoj, Aadirupa Saha, Yuanyuan Yang
NeurIPS OPT+ML Workshop, NeurIPS, 2023 (Oral)
On the Vulnerability of Fairness Constrained Learning to Malicious Noise. [Arxiv version]
Avrim Blum, Princewill Okoroafor, Aadirupa Saha, Kevin Stangl
Algorithmic Fairness through the Lens of Time Workshop, NeurIPS, 2023
Federated Online and Bandit Convex Optimization [Arxiv Version]
Kumar Kshitij Patel, Lingxiao Wang, Aadirupa Saha, Nati Srebro
In the International Conference on Machine Learning, ICML 2023
Bandits Meet Mechanism Design to Combat Clickbait in Online Recommendation [Arxiv Version]
Thomas Kleine Buening, Aadirupa Saha, Haifeng Xu, Christos Dimitrakakis
Interactive Learning with Implicit Human Feedback Workshop, ICML 2023
One Arrow, Two Kills: An Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits [Arxiv Version] [Talk]
Pierre Gaillard, Aadirupa Saha, Soham Dan
In International Conference on Artificial Intelligence and Statistics, AIStats 2023

1 min pitch!

Sleeping Bandits are as interesting as they sound, but what is the right measure of Sleeping Regret? Many different notions of regret have been studied in the literature (Sleeping External regret, Ordering regret, Policy regret), and it is confusing to keep track of what each one implies for every pairing of stochastic or adversarial losses with stochastic or adversarial availabilities.

Can we unify them under a single measure? We found one in this work: Sleeping Internal Regret! One of our main contributions is unifying the existing notions of regret in sleeping bandits and exploring their implications for each other.

Our proposed algorithm achieves sublinear Internal Regret even when losses and availabilities are both adversarial, which is the hardest sleeping setup! Further, our results show how low internal regret leads to both low external regret and low policy regret: One Arrow, Two Kills!

Our unified notion of sleeping regret also leads to a general notion of Sleeping Dueling Bandits that is stronger than the existing regret definitions used in the contemporary dueling bandits literature and overcomes the issue of repeated draws if needed. This is the first bound of this kind in the dueling literature, with a lot of potential!

ANACONDA: Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits [Arxiv Version]
Thomas Kleine Buening, Aadirupa Saha
In International Conference on Artificial Intelligence and Statistics, AIStats 2023
Dueling RL: Reinforcement Learning with Trajectory Preferences [Arxiv Version]
Aadirupa Saha*, Aldo Pacchiano*, Jonathan Lee (*Equal contribution)
In International Conference on Artificial Intelligence and Statistics, AIStats 2023

2022

Distributed Online and Bandit Convex Optimization
Kumar Kshitij Patel, Aadirupa Saha, Lingxiao Wang, Nati Srebro
In OPT ML Workshop, Neural Information Processing Systems, NeurIPS 2022
Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences. [Arxiv Version]
Aadirupa Saha, Pierre Gaillard
In International Conference on Machine Learning, ICML 2022
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits. [Arxiv Version]
Aadirupa Saha*, Shubham Gupta* (*Equal contribution)
In International Conference on Machine Learning, ICML 2022
Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models. [Arxiv Version]
Viktor Bengs, Aadirupa Saha, Eyke Hüllermeier
In International Conference on Machine Learning, ICML 2022
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability [Arxiv Version]
Aadirupa Saha, Akshay Krishnamurthy
In Algorithmic Learning Theory, ALT 2022
Exploiting Correlation to Achieve Faster Learning Rates in Low-Rank Preference Bandits [Arxiv Version]
Aadirupa Saha*, Suprovat Ghoshal* (*Equal contribution)
In International Conference on Artificial Intelligence and Statistics, AIStats 2022

2021

Dueling Bandits with Adversarial Sleeping [Arxiv Version]
Aadirupa Saha, Pierre Gaillard
In Neural Information Processing Systems, NeurIPS 2021
Optimal Algorithms for Stochastic Contextual Dueling Bandits
Aadirupa Saha
In Neural Information Processing Systems, NeurIPS 2021
Dueling Convex Optimization
Aadirupa Saha, Tomer Koren, Yishay Mansour
In International Conference on Machine Learning, ICML 2021
Adversarial Dueling Bandits [Arxiv Version]
Aadirupa Saha, Tomer Koren, Yishay Mansour
In International Conference on Machine Learning, ICML 2021
Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [Arxiv Version]
Aadirupa Saha, Nagarajan Natarajan, Praneeth Netrapalli, Prateek Jain
In International Conference on Machine Learning, ICML 2021
Confidence-Budget Matching for Sequential Budgeted Learning [Arxiv Version]
Yonathan Efroni, Nadav Merlis, Aadirupa Saha, Shie Mannor
In International Conference on Machine Learning, ICML 2021
Strategically Efficient Exploration in Competitive Multi-agent Reinforcement Learning [Arxiv Version]
Robert Loftin, Aadirupa Saha, Sam Devlin, Katja Hofmann
In Uncertainty in Artificial Intelligence, UAI 2021

2020

From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model [Arxiv Version]
Aadirupa Saha, Aditya Gopalan
In International Conference on Machine Learning, ICML 2020
Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards [Arxiv Version]
Aadirupa Saha, Pierre Gaillard, Michal Valko
In International Conference on Machine Learning, ICML 2020
Best-item Learning in Random Utility Models with Subset Choices [Arxiv Version]
Aadirupa Saha, Aditya Gopalan
In International Conference on Artificial Intelligence and Statistics, AIStats 2020
Polytime Decomposition of Generalized Submodular Base Polytopes with Efficient Sampling
Aadirupa Saha
In Asian Conference on Machine Learning, ACML 2020

2019

Combinatorial Bandits with Relative Feedback [Arxiv Version]
Aadirupa Saha, Aditya Gopalan
In Neural Information Processing Systems, NeurIPS 2019
Be Greedy: How Chromatic Number meets Regret Minimization in Graph Bandits
Shreyas Seshadri*, Aadirupa Saha*, Chiranjib Bhattacharyya (*Equal Contribution)
In Uncertainty in Artificial Intelligence, UAI 2019
Active Ranking with Subset-wise Preferences [Arxiv Version]
Aadirupa Saha, Aditya Gopalan
In International Conference on Artificial Intelligence and Statistics, AIStats 2019
PAC Battling Bandits in the Plackett-Luce Model [Arxiv Version]
Aadirupa Saha, Aditya Gopalan
In Algorithmic Learning Theory, ALT 2019
How Many Pairwise Preferences Do We Need to Rank A Graph Consistently? [Arxiv Version]
Aadirupa Saha, Rakesh Shivanna, Chiranjib Bhattacharyya
In AAAI Conference on Artificial Intelligence, AAAI 2019

2018

Battle of Bandits
Aadirupa Saha, Aditya Gopalan
In Uncertainty in Artificial Intelligence, UAI 2018
Online Learning for Structured Loss Spaces [Arxiv Version]
Siddharth Barman, Aditya Gopalan, Aadirupa Saha (alphabetical)
In AAAI Conference on Artificial Intelligence, AAAI 2018

2015

Consistent Multiclass Algorithms for Complex Performance Measures
Harikrishna Narasimhan, Harish Ramaswamy, Aadirupa Saha, Shivani Agarwal
In International Conference on Machine Learning, ICML 2015

2014

Learning Score Systems for Patient Mortality Prediction in Intensive Care Units via Orthogonal Matching Pursuit
Aadirupa Saha, Chandrahas Dewangan, Harikrishna Narasimhan, Sriram Sampath, Shivani Agarwal
In International Conference on Machine Learning and Applications, ICMLA 2014

2013

Energy Saving Replay Attack Prevention in Clustered Wireless Sensor Networks
Amrita Ghosal, Aadirupa Saha, Sipra Das Bit
In Pacific-Asia Workshop on Intelligence and Security Informatics, PAISI 2013

2011

Energy-Balancing and Lifetime Enhancement of Wireless Sensor Network with Archimedes Spiral
Subir Halder, Amrita Ghosal, Aadirupa Saha, Sipra DasBit
In International Conference on Ubiquitous Intelligence and Computing, ICUIC 2011
