About Me

I am a Ph.D. student in the RISELab (formerly AMPLab), BDD, and the Math Department at the University of California, Berkeley, advised by Michael Mahoney. My research interests lie in computational statistics, optimization, and machine learning. Currently, I am interested in leveraging tools from randomized linear algebra to provide efficient and scalable solutions for large-scale optimization and learning problems. I am also working on the theory and application of deep learning. Before joining UC Berkeley, I received my B.S. in Math from Zhiyuan Honor College at Shanghai Jiao Tong University.

CV

Publications

Papers

  • ANODEV2: A Coupled Neural ODE Evolution Framework.
    T. Zhang, Z. Yao, A. Gholami, K. Keutzer, J. Gonzalez, G. Biros, and M. W. Mahoney
    arXiv preprint
  • Residual Networks as Nonlinear Systems: Stability Analysis using Linearization.
    K. Rothauge, Z. Yao, Z. Hu, and M. W. Mahoney
    arXiv preprint
  • HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision.
    Z. Dong, Z. Yao, A. Gholami, M. W. Mahoney, and K. Keutzer
    arXiv preprint
  • Inefficiency of K-FAC for Large Batch Size Training.
    L. Ma, G. Montague, J. Ye, Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney
    arXiv preprint
  • Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data.
    N. B. Erichson, L. Mathelin, Z. Yao, S. L. Brunton, M. W. Mahoney, and J. N. Kutz
    arXiv preprint
  • JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks.
    N. B. Erichson, Z. Yao, and M. W. Mahoney
    arXiv preprint
  • Trust Region Based Adversarial Attack on Neural Networks.
    Z. Yao, A. Gholami, P. Xu, K. Keutzer, and M. W. Mahoney
    arXiv preprint
    Accepted for publication, Proc. CVPR 2019.
  • Parameter Re-Initialization through Cyclical Batch Scheduling.
    N. Mu, Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney
    arXiv preprint
    Accepted for publication, Proc. MLSys Workshop at NeurIPS 2018.
  • On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent.
    N. Golmant, N. Vemuri, Z. Yao, V. Feinberg, A. Gholami, K. Rothauge, M. W. Mahoney, and J. Gonzalez
    arXiv preprint
  • Large batch size training of neural networks with adversarial training and second-order information.
    Z. Yao, A. Gholami, K. Keutzer, and M. W. Mahoney
    arXiv preprint
  • Hessian-based Analysis of Large Batch Training and Robustness to Adversaries.
    Z. Yao, A. Gholami, Q. Lei, K. Keutzer, and M. W. Mahoney
    arXiv preprint
    Accepted for publication, Proc. NeurIPS 2018.
  • Inexact non-convex Newton-type methods.
    Z. Yao, P. Xu, F. Roosta-Khorasani, and M. W. Mahoney
    arXiv preprint
  • A hybrid adaptive MCMC algorithm in function spaces.
    Q. Zhou, Z. Hu, Z. Yao, and J. Li
    arXiv preprint
    SIAM/ASA Journal on Uncertainty Quantification 5 (1), 621-639.
  • On an adaptive preconditioned Crank–Nicolson MCMC algorithm for infinite dimensional Bayesian inference.
    Z. Hu, Z. Yao, and J. Li
    arXiv preprint
    Journal of Computational Physics 332, 492-503.
  • A TV-Gaussian prior for infinite-dimensional Bayesian inverse problems and its numerical implementation.
    Z. Yao, Z. Hu, and J. Li
    arXiv preprint
    Inverse Problems 32 (7), 075006.

Workshops

  • An Empirical Exploration of Gradient Correlations in Deep Learning.
    D. Rothchild, R. Fox, N. Golmant, J. Gonzalez, M. W. Mahoney, K. Rothauge, I. Stoica, and Z. Yao
    Integration of Deep Learning Theories, NeurIPS 2018

Teaching