About Me

I am a Ph.D. student in the RISELab (formerly AMPLab), BDD, and the Math Department at the University of California, Berkeley. I am advised by Michael Mahoney. My research interests lie in computational statistics, optimization, and machine learning. Currently, I am interested in leveraging tools from randomized linear algebra to provide efficient and scalable solutions for large-scale optimization and learning problems. I am also working on the theory and application of deep learning. Before joining UC Berkeley, I received my B.S. in Math from Zhiyuan Honor College at Shanghai Jiao Tong University. CV

Publications

Papers

  • Inefficiency of K-FAC for Large Batch Size Training.
    L Ma, G Montague, J Ye, Z Yao, A Gholami, K Keutzer, MW Mahoney
    arXiv preprint arXiv
  • Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data.
    NB Erichson, L Mathelin, Z Yao, SL Brunton, MW Mahoney, JN Kutz
    arXiv preprint arXiv
  • JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks.
    NB Erichson, Z Yao, MW Mahoney
    Under review
  • Trust Region Based Adversarial Attack on Neural Networks.
    Z Yao, A Gholami, P Xu, K Keutzer, MW Mahoney
    arXiv preprint arXiv
    Accepted for publication, Proc. CVPR 2019.
  • Parameter Re-Initialization through Cyclical Batch Scheduling.
    N Mu, Z Yao, A Gholami, K Keutzer, MW Mahoney
    arXiv preprint arXiv
    Accepted for publication, Proc. MLSYS Workshop at NIPS 2018
  • On the Computational Inefficiency of Large Batch Sizes for Stochastic Gradient Descent.
    N Golmant, N Vemuri, Z Yao, V Feinberg, A Gholami, K Rothauge, MW Mahoney, J Gonzalez
    arXiv preprint arXiv
  • Large batch size training of neural networks with adversarial training and second-order information.
    Z Yao, A Gholami, K Keutzer, MW Mahoney
    arXiv preprint arXiv
  • Hessian-based Analysis of Large Batch Training and Robustness to Adversaries.
    Z Yao, A Gholami, Q Lei, K Keutzer, MW Mahoney
    arXiv preprint arXiv
    Accepted for publication, Proc. NIPS 2018.
  • Inexact non-convex Newton-type methods.
    Z Yao, P Xu, F Roosta-Khorasani, MW Mahoney
    arXiv preprint arXiv
  • A hybrid adaptive MCMC algorithm in function spaces.
    Q Zhou, Z Hu, Z Yao, J Li
    arXiv preprint arXiv
    SIAM/ASA Journal on Uncertainty Quantification 5 (1), 621-639
  • On an adaptive preconditioned Crank–Nicolson MCMC algorithm for infinite dimensional Bayesian inference.
    Z Hu, Z Yao, J Li
    arXiv preprint arXiv
    Journal of Computational Physics 332, 492-503
  • A TV-Gaussian prior for infinite-dimensional Bayesian inverse problems and its numerical implementation.
    Z Yao, Z Hu, J Li
    arXiv preprint arXiv
    Inverse Problems 32 (7), 075006

Workshops

  • An Empirical Exploration of Gradient Correlations in Deep Learning.
    D Rothchild, R Fox, N Golmant, J Gonzalez, MW Mahoney, K Rothauge, I Stoica, Z Yao
    Integration of Deep Learning Theories, NeurIPS 2018

Teaching