Autellix: An Efficient Serving Engine for LLM Agents as General Programs
Michael Luo, Xiaoxiang Shi, Colin Cai, Tianjun Zhang, Justin Wong, Yichuan Wang, Chi Wang, Yanping Huang, Zhifeng Chen, Joseph E. Gonzalez, Ion Stoica
USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2026
Arxiv | Paper
|
rLLM: A Framework for Post-Training Language Agents
Sijun Tan*, Michael Luo*, Colin Cai*, Tarun Venkat, Kyle Montgomery, Aaron Hao, Tianhao Wu, Arnav Balyan, Manan Roongta, Chenguang Wang, Li Erran Li, Raluca Ada Popa, Ion Stoica
Notion Blog, 2025
Blog | Code
|
DeepScaleR: Surpassing o1-preview with a 1.5B Model by Scaling RL
Michael Luo, Sijun Tan, Justin Wong, Xiaoxiang Shi, William Y. Tang, Manan Roongta, Colin Cai, Jeffrey Luo, Li Erran Li, Raluca Ada Popa, Ion Stoica
Notion Blog, 2025
Blog
|
DeepCoder: A Fully Open-Source 14B Coder at o3-mini Level
Michael Luo, Sijun Tan, Roy Huang, Ameen Patel, Alpay Ariyak, Qingyang Wu, Xiaoxiang Shi, Rachel Xin, Colin Cai, Maurice Weber, Ce Zhang, Li Erran Li, Raluca Ada Popa, Ion Stoica
Notion Blog, 2025
Blog
|
DeepSWE: Training a State-of-the-Art Coding Agent from Scratch by Scaling RL
Michael Luo, Naman Jain, Jaskirat Singh, Sijun Tan, Ameen Patel, Qingyang Wu, Alpay Ariyak, Colin Cai, Tarun Venkat, Shang Zhu, Ben Athiwaratkun, Manan Roongta, Ce Zhang, Li Erran Li, Raluca Ada Popa, Koushik Sen, Ion Stoica
Notion Blog, 2025
Blog
|
WorldModelBench: Judging Video Generation Models as World Models
Dacheng Li, Yunhao Fang, Yukang Chen, Shuo Yang, Shiyi Cao, Justin Wong, Michael Luo, Xiaolong Wang, Hongxu Yin, Joseph E. Gonzalez, Ion Stoica, Song Han, Yao Lu
arXiv preprint, 2025
Arxiv
|
DiT-Serve: An Efficient Serving Engine for Diffusion Transformers
Michael Luo, Aolin Hao, Zhiyu Yan, Chuan Cao, Quang Luong Ngoc Nguyen
Preprint, 2025
OpenReview
|
Stylus: Automatic Adapter Selection for Diffusion Models
Michael Luo, Justin Wong, Brandon Trabucco, Yanping Huang, Joseph E. Gonzalez, Zhifeng Chen, Ruslan Salakhutdinov, Ion Stoica
Neural Information Processing Systems (NeurIPS), 2024 (Oral)
Arxiv | Talk
|
Starburst: A Cost-aware Scheduler for Hybrid Cloud
Michael Luo, Siyuan Zhuang, Suryaprakash Vengadesan, Romil Bhardwaj, Justin Chang, Eric J. Friedman, Scott Shenker, Ion Stoica
USENIX Annual Technical Conference (USENIX ATC), 2024 (Best Paper Award)
Paper
|
SimpleStrat: Diversifying Language Model Generation with Stratification
Justin Wong, Yury Orlovskiy, Michael Luo, Sanjit A. Seshia, Joseph E. Gonzalez
arXiv preprint, 2024
Arxiv
|
SkyPilot: An Intercloud Broker for Sky Computing
Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, Ion Stoica
USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2023
|
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones
Brijen Thananjeyan*, Ashwin Balakrishna*, Suraj Nair, Michael Luo, Krishnan Srinivasan, Minho Hwang, Joseph E. Gonzalez, Julian Ibarz, Chelsea Finn, Ken Goldberg
IEEE Robotics and Automation Letters / International Conference on Robotics and Automation (ICRA), 2021
Website | Arxiv | Code
|
Distributed Reinforcement Learning is a Dataflow Problem
Eric Liang*, Zhanghao Wu*, Michael Luo, Sven Mika, Joseph E. Gonzalez, Ion Stoica
Neural Information Processing Systems (NeurIPS), 2021
Arxiv | Code
|
Accelerating Quadratic Optimization with Reinforcement Learning
Jeffrey Ichnowski, Paras Jain, Bartolomeo Stellato, Goran Banjac, Michael Luo, Francesco Borrelli, Joseph E. Gonzalez, Ion Stoica, Ken Goldberg
Neural Information Processing Systems (NeurIPS), 2021
Website | Arxiv | Code
|
Discovering Non-monotonic Autoregressive Orderings with Variational Inference
Xuanlin Li*, Brandon Trabucco*, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao
International Conference on Learning Representations (ICLR), 2021
Arxiv | Code
|
MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance
Michael Luo, Ashwin Balakrishna, Brijen Thananjeyan, Suraj Nair, Julian Ibarz, Jie Tan, Chelsea Finn, Ion Stoica, Ken Goldberg
Neural Information Processing Systems (NeurIPS) Safe Control Workshop, 2021
Website | Arxiv | Code
|
LazyDAgger: Reducing Context Switching in Interactive Robot Imitation Learning
Ryan Hoque, Ashwin Balakrishna, Carl Putterman, Michael Luo, Daniel S. Brown, Daniel Seita, Brijen Thananjeyan, Ellen Novoseller, Ken Goldberg
Conference on Automation Science and Engineering (CASE), 2021
Website | Arxiv
|
AlphaGarden: Learning Seed Placement and Automation Policies for Polyculture Farming with Companion Plants
Yahav Avigal, Anna Deza, William Wong, Sebastian Oehme, Mark Presten, Mark Theis, Jackson Chui, Paul Shao, Huang Huang, Atsunobu Kotani, Satvik Sharma, Michael Luo, Stefano Carpin, Joshua Viers, Stavros Vougioukas, Ken Goldberg
International Conference on Robotics and Automation (ICRA), 2021
Website | Code
|