Andrea Zanette

Assistant Professor
Carnegie Mellon University

last_name at cmu.edu

I joined Carnegie Mellon University as an Assistant Professor in the Electrical and Computer Engineering (ECE) department in Fall 2024, with a courtesy appointment in the Machine Learning Department (MLD). I am broadly interested in Foundation Models, from theory to practice. Topics of interest include reasoning, alignment, efficiency, and optimization, among others.

Before joining the faculty, I was a postdoctoral scholar at UC Berkeley, where I collaborated with Martin Wainwright, Peter Bartlett, and Sergey Levine. Before that, I received my PhD from Stanford University, where I established several fundamental results in the theory of reinforcement learning under the supervision of Emma Brunskill and Mykel J. Kochenderfer. I also had active collaborations with labs at Microsoft Research and Facebook.

I am actively looking for strong and motivated PhD students to join our group! If you are interested in working with me, please apply to the PhD program and mention my name in your application. (Note: if you missed the application deadline but are still interested in a PhD position, please email me directly.) If you are an existing CMU student in any department, feel free to reach out to me directly.

Postdocs: I can supervise one postdoc starting in Fall 2026 for two years under the Carnegie Bosch Institute fellowship. Please get in touch with me before submitting your application at this link; the application deadline is January 31 at 5pm.

I am happy to host remote interns and visitors; please email me your CV and a short description of what you are interested in. I apologize that I am generally unable to answer individual emails regarding PhD applications, internships, teaching or research assistantships, and general inquiries.

Current PhD Students and Close Collaborators

Yuda Song (primary advisors Aarti Singh and J. Andrew Bagnell)
Fahim Tajwar (primary advisors Ruslan Salakhutdinov and Jeff Schneider)
Tong Yang (PhD, primary advisor Yuejie Chi)
Daman Arora
Zhaoyi Zhou
Andy Zhou (on leave, co-founded Intology)
Guanning Zeng (Incoming)

Former Students and Collaborators

Gokul Swamy (CMU Robotics PhD → ?)
Guanning Zeng (Intern → PhD at CMU)
Hyunho Kook (Intern → PhD at University of Southern California)
Sheikh Shafayat (Intern → PhD at Max Planck Institute for Intelligent Systems)
Anmol Agarwal (MS → AI Research Scientist at Mistral AI)
Ming Yin (Princeton Postdoc → Assistant Professor at Georgia Tech)
Ruiqi Zhang (Berkeley PhD → Quant Analyst at Citadel Securities)
Yifei Zhou (Berkeley PhD → Member of Technical Staff at Anthropic)
Hanshi Sun (MS → Research Scientist at ByteDance)
Huitao Yang (Intern → MS at UCLA Stats)
Jiahao Shi (Intern → PhD at Princeton)

Foundation Models

Fahim Tajwar*, Guanning Zeng*, Yueer Zhou, Yuda Song, Daman Arora, Yiding Jiang, Jeff Schneider, Ruslan Salakhutdinov, Haiwen Feng, Andrea Zanette
Maximum Likelihood Reinforcement Learning [Paper][Project Website]
Yuda Song*, Lili Chen*, Fahim Tajwar, Remi Munos, Deepak Pathak, J. Andrew Bagnell, Aarti Singh, Andrea Zanette
Expanding the Capabilities of Reinforcement Learning via Text Feedback [Paper][Project Website]
Guanning Zeng, Zhaoyi Zhou, Daman Arora, Andrea Zanette
Shrinking the Variance: Shrinkage Baselines for Reinforcement Learning with Verifiable Rewards [Paper][Blog][Code]
Sheikh Shafayat*, Fahim Tajwar*, Ruslan Salakhutdinov, Jeff Schneider, Andrea Zanette
(* indicates equal contribution)
Can Large Reasoning Models Self-Train? [Paper][Blog][Code]
Daman Arora, Andrea Zanette
Training Language Models to Reason Efficiently [Paper][Blog][Code]
NeurIPS (Neural Information Processing Systems), 2025
Zhaoyi Zhou, Yuda Song, Andrea Zanette
Accelerating Unbiased LLM Evaluation via Synthetic Feedback [Paper][Code]
ICML (International Conference on Machine Learning), 2025
Hanshi Sun*, Momin Haider*†, Ruiqi Zhang*, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, Andrea Zanette*
(* indicates core authors, † rest in peace)
Fast Best-of-N Decoding via Speculative Rejection [Paper][Blog][Code]
NeurIPS (Neural Information Processing Systems), 2024
Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL [Paper]
ICML (International Conference on Machine Learning), 2024

Foundations of RL

Ruiqi Zhang, Andrea Zanette
Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement [Paper]
Ruiqi Zhang, Andrea Zanette
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
NeurIPS (Neural Information Processing Systems), 2023 [Paper]
Andrea Zanette
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
ICML (International Conference on Machine Learning), 2023 [Paper]
Andrea Zanette, Martin J. Wainwright
Bellman Residual Orthogonalization for Offline Reinforcement Learning [Paper]
NeurIPS (Neural Information Processing Systems), Full Oral, 2022
Andrea Zanette, Martin J. Wainwright
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [Paper]
ICML (International Conference on Machine Learning), 2022
Andrea Zanette, Martin J. Wainwright, Emma Brunskill
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning [Paper]
Spotlight presentation in ICML 2021 Workshop on Reinforcement Learning Theory
NeurIPS (Neural Information Processing Systems), 2021
Andrea Zanette*, Kefan Dong*, Jonathan Lee*, Emma Brunskill
Design of Experiments for Stochastic Contextual Linear Bandits [Paper]
NeurIPS (Neural Information Processing Systems), 2021
(* denotes equal contribution)
Andrea Zanette
Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL
ICML (International Conference on Machine Learning), 2021, Long Oral [Paper][Csaba’s Class Explanation]
Andrea Zanette, Ching-An Cheng, Alekh Agarwal
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
COLT (Conference on Learning Theory), 2021 [Paper]
Andrea Zanette, Alessandro Lazaric, Mykel Kochenderfer, Emma Brunskill
Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration
NeurIPS (Neural Information Processing Systems), 2020 [Paper]
Andrea Zanette, Alessandro Lazaric, Mykel Kochenderfer, Emma Brunskill
Learning Near Optimal Policies with Low Inherent Bellman Error
ICML (International Conference on Machine Learning), 2020 [Paper]
Andrea Zanette*, David Brandfonbrener*, Emma Brunskill, Matteo Pirotta, Alessandro Lazaric
Frequentist Regret Bounds for Randomized Least-Squares Value Iteration
AISTATS (International Conference on Artificial Intelligence and Statistics), 2020 [Paper]
(* denotes equal contribution)
Andrea Zanette, Mykel J. Kochenderfer, Emma Brunskill
Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model
NeurIPS (Neural Information Processing Systems), 2019 [Paper]
Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill
Limiting Extrapolation in Linear Approximate Value Iteration
NeurIPS (Neural Information Processing Systems), 2019 [Paper]
Andrea Zanette, Emma Brunskill
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
ICML (International Conference on Machine Learning), 2019 [Paper]
Andrea Zanette, Junzi Zhang, Mykel J. Kochenderfer
Robust Super-Level Set Estimation using Gaussian Processes
ECML-PKDD (European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases), 2018 [Paper]
Andrea Zanette, Emma Brunskill
Problem Dependent Reinforcement Learning Bounds Which Can Identify Bandit Structure in MDPs
ICML (International Conference on Machine Learning), 2018, Long Oral [Paper]
Andrea Zanette, Massimiliano Ferronato, Carlo Janna
Enriching the finite element method with meshfree techniques in structural mechanics
IJNME (International Journal for Numerical Methods in Engineering), 2017 [Paper]
Recognized by Advances in Engineering as a key scientific article contributing to excellence in science and engineering research [Award]
Andrea Zanette, Massimiliano Ferronato, Carlo Janna
Enriching the Finite Element Method with meshfree particles in structural mechanics
PAMM (Proceedings in Applied Mathematics and Mechanics), 2015, Oral
Best Poster Award at International CAE Conference 2014 [Award]
Featured in Enginsoft 2014, issue number 4 [Media]

Awards

Solberg Academic Excellence Scholar (named professorship), Purdue University, 2023 (declined)
Nomination for ACM Doctoral Dissertation Award, 2021 (two per school)
Nomination for AAAI/ACM SIGAI Doctoral Dissertation Award, 2021 (one per school)
Outstanding Reviewer Award, NeurIPS 2021
Gene Golub Dissertation Award (Stanford ICME's highest departmental doctoral recognition), 2021
Foundations of Data Science Institute (postdoctoral fellowship), 2021-2023
Institute for the Foundations of Machine Learning (postdoctoral fellowship), 2021-2023 (declined)
TOTAL Innovation Fellowship (industrial PhD fellowship), awarded twice, 2018-2020
Key scientific article contributing to excellence in science and engineering research, awarded by the committee of Advances in Engineering for the paper ‘Enriching the finite element method with meshfree techniques in structural mechanics’
CAE Best Poster Award, International CAE conference 2014
Lifelong Learning Program fellowship, von Karman Institute for Fluid Dynamics, 2013
Top score (ranked 1 of 2,484 students) on the admission exam for the School of Engineering at the University of Padova