About Me

Zhewei Yao is a senior scientist and SDE II at Snowflake, where he is a founding member of Snowflake large-scale training. Before this, Zhewei was a principal researcher and R&D manager at Microsoft, working on efficient large-scale training and inference. He obtained his Ph.D. from the University of California, Berkeley, where he was a Ph.D. researcher in BAIR, RISELab (formerly AMPLab), BDD, and the Math Department. He was advised by Michael Mahoney and worked very closely with Kurt Keutzer. His research interests lie in computational statistics, optimization, and machine learning. Before joining UC Berkeley, he received his B.S. in Math from Zhiyuan Honor College at Shanghai Jiao Tong University (last updated 4/08/2024).

Publications (See full list at Google Scholar)

ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers.
Z. Yao, R.Y. Aminabadi, M. Zhang, X. Wu, C. Li, Y. He
arXiv
Accepted for publication, Proc. NeurIPS 2022
Extreme Compression for Pre-trained Transformers Made Simple and Efficient.
X. Wu^*, Z. Yao^*, M. Zhang^*, C. Li, Y. He
arXiv
Accepted for publication, Proc. NeurIPS 2022
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale.
S. Rajbhandari, C. Li, Z. Yao, M. Zhang, R. Y. Aminabadi, A. A. Awan, J. Rasley, Y. He
arXiv
Accepted for publication, Proc. ICML 2022

Selected Talks

ICCV’23 (LBQNN)
Online (Oct, 2023), slides
ICML’21 (ICML)
Online (Jul, 2021)
SIAM CSE’21: Beyond First Order Methods in Machine Learning Systems (CSE)
Online (Mar, 2021)
AAAI’21 (AAAI)
Online (Feb, 2021)
IEEE BigData’20 (BigData)
Online (Dec, 2020), slides
Berkeley Real-time Intelligent Secure Explainable Systems Lab Camp (RiseLab)
Online (Oct, 2020), slides1 and slides2, video
Fast.AI (Fast.AI)
Online (Oct, 2020), slides, video
Scalable Parallel Computing Lab (SPCL)
Online (Oct, 2020), slides, video
ICML’20 Workshop on Beyond First-Order Optimization Methods in Machine Learning (Beyond)
Online (July, 2020), slides, video
Berkeley Real-time Intelligent Secure Explainable Systems Lab Sponsor Retreat (RiseLab)
Lake Tahoe, CA, USA (May, 2020), slides
NeurIPS’19 Workshop on Beyond First-Order Optimization Methods in Machine Learning (Beyond)
Vancouver, Canada (December, 2019)
DIMACS Workshop on Randomized Numerical Linear Algebra, Statistics, and Optimization (DIMACS)
Rutgers University, New Jersey, USA (September, 2019), slides
Computer Vision Panel (IJCAI)
Macau, China (August, 2019), slides
Randomized Algorithms for Optimization Problems in Statistics (JSM)
Colorado Convention Center, Denver, Colorado, USA (July, 2019), slides
Berkeley Scientific Computing and Matrix Computations Seminar (Link)
Berkeley, CA, USA (November, 2018), slides
Berkeley Real-time Intelligent Secure Explainable Systems Lab Sponsor Retreat (RiseLab)
Lake Tahoe, CA, USA (August, 2018), slides