Publications
2026
Autellix: An Efficient Serving Engine for LLM Agents as General Programs
Michael Luo, Xiaoxiang Shi, Colin Cai, Tianjun Zhang, Justin Wong, Yichuan Wang, Chi Wang, Yanping Huang, Zhifeng Chen, Joseph E. Gonzalez, Ion Stoica
USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2026
Arxiv | Paper
2025
rLLM: A Framework for Post-Training Language Agents
Sijun Tan*, Michael Luo*, Colin Cai*, Tarun Venkat, Kyle Montgomery, Aaron Hao, Tianhao Wu, Arnav Balyan, Manan Roongta, Chenguang Wang, Li Erran Li, Raluca Ada Popa, Ion Stoica
Notion Blog, 2025
Blog | Code
DeepScaleR: Surpassing o1-preview with a 1.5B Model by Scaling RL
Michael Luo, Sijun Tan, Justin Wong, Xiaoxiang Shi, William Y. Tang, Manan Roongta, Colin Cai, Jeffrey Luo, Li Erran Li, Raluca Ada Popa, Ion Stoica
Notion Blog, 2025
Blog
DeepCoder: A Fully Open-Source 14B Coder at o3-mini Level
Michael Luo, Sijun Tan, Roy Huang, Ameen Patel, Alpay Ariyak, Qingyang Wu, Xiaoxiang Shi, Rachel Xin, Colin Cai, Maurice Weber, Ce Zhang, Li Erran Li, Raluca Ada Popa, Ion Stoica
Notion Blog, 2025
Blog
DeepSWE: Training a State-of-the-Art Coding Agent from Scratch by Scaling RL
Michael Luo, Naman Jain, Jaskirat Singh, Sijun Tan, Ameen Patel, Qingyang Wu, Alpay Ariyak, Colin Cai, Tarun Venkat, Shang Zhu, Ben Athiwaratkun, Manan Roongta, Ce Zhang, Li Erran Li, Raluca Ada Popa, Koushik Sen, Ion Stoica
Notion Blog, 2025
Blog
WorldModelBench: Judging Video Generation Models as World Models
Dacheng Li, Yunhao Fang, Yukang Chen, Shuo Yang, Shiyi Cao, Justin Wong, Michael Luo, Xiaolong Wang, Hongxu Yin, Joseph E. Gonzalez, Ion Stoica, Song Han, Yao Lu
arXiv preprint, 2025
Arxiv
DiT-Serve: An Efficient Serving Engine for Diffusion Transformers
Michael Luo, Aolin Hao, Zhiyu Yan, Chuan Cao, Quang Luong Ngoc Nguyen
Preprint, 2025
OpenReview
2024
Stylus: Automatic Adapter Selection for Diffusion Models
Michael Luo, Justin Wong, Brandon Trabucco, Yanping Huang, Joseph E. Gonzalez, Zhifeng Chen, Ruslan Salakhutdinov, Ion Stoica
Neural Information Processing Systems (NeurIPS), 2024 (Oral)
Arxiv | Talk
Starburst: A Cost-aware Scheduler for Hybrid Cloud
Michael Luo, Siyuan Zhuang, Suryaprakash Vengadesan, Romil Bhardwaj, Justin Chang, Eric J. Friedman, Scott Shenker, Ion Stoica
USENIX Annual Technical Conference (USENIX ATC), 2024 (Best Paper Award)
Paper
SimpleStrat: Diversifying Language Model Generation with Stratification
Justin Wong, Yury Orlovskiy, Michael Luo, Sanjit A. Seshia, Joseph E. Gonzalez
arXiv preprint, 2024
Arxiv
2023
SkyPilot: An Intercloud Broker for Sky Computing
Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, Ion Stoica
USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2023
2022
Balsa: Learning a Query Optimizer Without Expert Demonstrations
Zongheng Yang, Wei-Lin Chiang, Frank Luan, Gautam Mittal, Michael Luo, Ion Stoica
ACM SIGMOD International Conference on Management of Data (SIGMOD), 2022
Arxiv | Code
2021
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones
Brijen Thananjeyan*, Ashwin Balakrishna*, Suraj Nair, Michael Luo, Krishnan Srinivasan, Minho Hwang, Joseph E. Gonzalez, Julian Ibarz, Chelsea Finn, Ken Goldberg
IEEE Robotics and Automation Letters / International Conference on Robotics and Automation (ICRA), 2021
Website | Arxiv | Code
Distributed Reinforcement Learning is a Dataflow Problem
Eric Liang*, Zhanghao Wu*, Michael Luo, Sven Mika, Joseph E. Gonzalez, Ion Stoica
Neural Information Processing Systems (NeurIPS), 2021
Arxiv | Code
Accelerating Quadratic Optimization with Reinforcement Learning
Jeffrey Ichnowski, Paras Jain, Bartolomeo Stellato, Goran Banjac, Michael Luo, Francesco Borrelli, Joseph E. Gonzalez, Ion Stoica, Ken Goldberg
Neural Information Processing Systems (NeurIPS), 2021
Website | Arxiv | Code
Discovering Non-monotonic Autoregressive Orderings with Variational Inference
Xuanlin Li*, Brandon Trabucco*, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao
International Conference on Learning Representations (ICLR), 2021
Arxiv | Code
MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance
Michael Luo, Ashwin Balakrishna, Brijen Thananjeyan, Suraj Nair, Julian Ibarz, Jie Tan, Chelsea Finn, Ion Stoica, Ken Goldberg
Neural Information Processing Systems (NeurIPS) Safe Control Workshop, 2021
Website | Arxiv | Code
LazyDAgger: Reducing Context Switching in Interactive Robot Imitation Learning
Ryan Hoque, Ashwin Balakrishna, Carl Putterman, Michael Luo, Daniel S. Brown, Daniel Seita, Brijen Thananjeyan, Ellen Novoseller, Ken Goldberg
Conference on Automation Science and Engineering (CASE), 2021
Website | Arxiv
AlphaGarden: Learning Seed Placement and Automation Policies for Polyculture Farming with Companion Plants
Yahav Avigal, Anna Deza, William Wong, Sebastian Oehme, Mark Presten, Mark Theis, Jackson Chui, Paul Shao, Huang Huang, Atsunobu Kotani, Satvik Sharma, Michael Luo, Stefano Carpin, Joshua Viers, Stavros Vougioukas, Ken Goldberg
International Conference on Robotics and Automation (ICRA), 2021
Website | Code
2020
IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks
Michael Luo, Jiahao Yao, Richard Liaw, Eric Liang, Ion Stoica
International Conference on Learning Representations (ICLR), 2020
Arxiv | Code
Connecting Context-specific Adaptation in Humans to Meta-learning
Rachit Dubey*, Erin Grant*, Michael Luo*, Karthik Narasimhan, Thomas L. Griffiths
Preprint
Arxiv | Code