Yaoyu Zhang ( 张耀宇 )

Brief Introduction

In 2016, Yaoyu Zhang graduated from Shanghai Jiao Tong University with a doctorate in applied mathematics. From 2016 to 2019, he carried out research as a postdoc at New York University Abu Dhabi and the Courant Institute of Mathematical Sciences, New York University. From September 2019 to July 2020, he was a member of the Institute for Advanced Study, Princeton, New Jersey. In 2020, he joined the faculty of the Institute of Natural Sciences / School of Mathematical Sciences, Shanghai Jiao Tong University.

Brain & AI
Robustness of the primary visual cortex

Online Talks
- Dynamics of Deep Neural Networks--A Fourier Analysis Perspective. Short talks by postdoctoral members at IAS, 2019.
- A Type of Generalization Error Induced by Initialization in Deep Neural Networks. MSML2020 paper presentation.
- Embedding Principle of Loss Landscape of Deep Neural Networks. Machine Learning Joint Seminar Series (机器学习联合研讨计划), 2021.
- The Embedding Principle of the Loss Landscape of Deep Learning (深度学习损失景观的嵌入原则). 2021 NeurIPS MeetUp China.

Research Areas

Theory of Deep Learning:
- Linear stability hypothesis and rank stratification
- Loss landscape and the embedding principle
- Implicit regularization

Recent Papers


Highlights in Deep Learning Theory

Linear stability hypothesis and rank stratification
Linear stability hypothesis: linearly stable functions are preferred by the nonlinear training of general nonlinear models.
Rank stratification: a procedure that stratifies the function space of a nonlinear model into function sets with different "effective sizes of parameters" (i.e., model ranks). A numerical sketch appears after the Embedding Principle entries below.
- Yaoyu Zhang*, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu*, Linear Stability Hypothesis and Rank Stratification for Nonlinear Models, arXiv:2211.11623, (2022).

Embedding Principle
Embedding Principle: the loss landscape of any DNN contains all critical points of all narrower DNNs. A numerical check follows these entries.
- [Short paper] Yaoyu Zhang*, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu*, Embedding Principle of Loss Landscape of Deep Neural Networks, NeurIPS 2021 (spotlight). Proves the Embedding Principle via the multi-step compositional embedding and unravels its practical implications for optimization, training & generalization, and pruning.
- [Long paper] Yaoyu Zhang*, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu*, Embedding Principle: a hierarchical structure of loss landscape of deep neural networks, arXiv:2111.15527, (2021). Extends the results of the short paper, formally defines the critical embedding, and discovers a wider class of general compatible critical embeddings.
- [Embedding Principle in depth] Zhiwei Bai, Tao Luo, Zhi-Qin John Xu*, Yaoyu Zhang*, Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks, arXiv:2205.13283, (2022). Extends the Embedding Principle from width to depth.
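The one-step "splitting" embedding behind the Embedding Principle is easy to check numerically. Below is a minimal sketch (our own illustration, not code from the papers; the network, data, and names such as `a_wide`, `w_wide` are arbitrary choices): it duplicates one hidden neuron of a two-layer tanh network, splits its output weight as (beta, 1 - beta), and verifies that the network function is preserved while the gradient at the embedded point consists of scaled copies of the narrow-net gradient, so narrow critical points map to critical points of the wider net.

```python
import torch

torch.manual_seed(1)
x = torch.linspace(-1, 1, 20)
y = torch.sin(3 * x)

def loss(a, w):
    # Two-layer tanh network f(x) = sum_k a_k * tanh(w_k * x), MSE loss.
    pred = torch.tanh(torch.outer(x, w)) @ a
    return ((pred - y) ** 2).mean()

m = 4
a = torch.randn(m, requires_grad=True)
w = torch.randn(m, requires_grad=True)
ga, gw = torch.autograd.grad(loss(a, w), (a, w))

# One-step splitting embedding: duplicate neuron 0, split its output weight.
beta = 0.3
a_wide = torch.cat([torch.stack([beta * a[0], (1 - beta) * a[0]]), a[1:]])
a_wide = a_wide.detach().requires_grad_()
w_wide = torch.cat([w[:1], w[:1], w[1:]]).detach().requires_grad_()

# (i) The embedded point represents the same function, hence the same loss.
print(torch.allclose(loss(a, w), loss(a_wide, w_wide)))
# (ii) The wide-net gradient is built from scaled copies of the narrow one:
ga_w, gw_w = torch.autograd.grad(loss(a_wide, w_wide), (a_wide, w_wide))
print(torch.allclose(ga_w[0], ga[0]), torch.allclose(ga_w[1], ga[0]))
print(torch.allclose(gw_w[0], beta * gw[0]),
      torch.allclose(gw_w[1], (1 - beta) * gw[0]))
# Hence if (ga, gw) = 0, the embedded gradient vanishes too: a critical point
# of the width-4 net embeds as a critical point of the width-5 net.
```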
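Returning to the rank stratification highlight above: a rough intuition for "model rank" is the number of effectively independent parameter directions the model has at a given point in function space. A hedged proxy (our simplification, not the formal procedure of the paper) is the numerical rank of the parameter Jacobian of the model outputs over a sample set; in the sketch below the rank drops when two neurons share an input weight, i.e., on a lower stratum.

```python
import torch

torch.manual_seed(0)
m = 8                                   # hidden width
xs = torch.linspace(-2.0, 2.0, 41)      # 1-d sample inputs

def model(theta):
    # Two-layer tanh net, theta = (a_1..a_m, w_1..w_m): f(x) = sum_k a_k tanh(w_k x)
    a, w = theta[:m], theta[m:]
    return torch.tanh(torch.outer(xs, w)) @ a

def model_rank(theta, tol=1e-6):
    # Numerical rank of the tangent map d f_theta(x_i) / d theta over samples.
    J = torch.autograd.functional.jacobian(model, theta)   # shape (41, 2m)
    s = torch.linalg.svdvals(J)
    return int((s > tol * s[0]).sum())

theta = torch.randn(2 * m)
print(model_rank(theta))        # generic point: full rank, typically 2m = 16

theta_low = theta.clone()
theta_low[m + 1] = theta_low[m]         # duplicate input weights: w_2 = w_1
print(model_rank(theta_low))    # lower stratum: rank drops (typically to 14)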
Phase diagram
Phase diagram: a diagram showing the transition between the linear (NTK/kernel/lazy) regime, the critical (mean-field) regime, and the condensed regime, depending on the initialization hyperparameters of NNs at the infinite-width limit. An illustrative experiment appears after the Frequency Principle entries below.
- [Two-layer] Tao Luo#, Zhi-Qin John Xu#, Zheng Ma, Yaoyu Zhang*, Phase diagram for two-layer ReLU neural networks at infinite-width limit, Journal of Machine Learning Research (JMLR) 22(71):1-47, (2021). A map for realizing different training and implicit regularization effects.
- [Three-layer (empirical)] Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu*, Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width, NeurIPS 2022.

Frequency Principle
Frequency Principle: DNNs often learn a target function from low to high frequencies. A toy reproduction follows these entries.
- [First paper] Zhiqin Xu, Yaoyu Zhang, Yanyang Xiao, Training Behavior of Deep Neural Network in Frequency Domain, International Conference on Neural Information Processing (ICONIP), pp. 264-274, 2019 (arXiv:1807.01251, Jul 2018). Empirically discovers the Frequency Principle on simple datasets (specifically 1-d synthetic data).
- [2021 World Artificial Intelligence Conference Youth Outstanding Paper Nomination Award] Zhi-Qin John Xu*, Yaoyu Zhang, Tao Luo, Yanyang Xiao, Zheng Ma, Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks, Communications in Computational Physics (CiCP) 28(5), pp. 1746-1767, 2020. Extensively demonstrates the Frequency Principle on high-dimensional real datasets, with an intuitive theoretical explanation.
- [Linear Frequency Principle] Yaoyu Zhang, Tao Luo, Zheng Ma, Zhi-Qin John Xu*, Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks, Chinese Physics Letters (CPL) 38(3), 038701, 2021. Proposes a linear frequency principle (LFP) model to quantitatively understand the training and generalization consequences of the Frequency Principle.
- [Linear Frequency Principle in detail] (Alphabetic order) Tao Luo*, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang, On the exact computation of linear frequency principle dynamics and its generalization, SIAM Journal on Mathematics of Data Science (SIMODS), to appear; arXiv:2010.08153, (2020).
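The Frequency Principle is straightforward to reproduce on 1-d synthetic data, in the spirit of the first paper above. The following minimal sketch (our illustration; the target function, width, and hyperparameters are arbitrary choices) fits y(x) = sin(x) + sin(4x) on a uniform grid and tracks the relative spectral error at the low (k = 1) and high (k = 4) Fourier modes; the low mode typically converges first.

```python
import torch

torch.manual_seed(0)
n = 64
x = torch.linspace(-torch.pi, torch.pi, n + 1)[:-1].unsqueeze(1)  # uniform grid
y = torch.sin(x) + torch.sin(4 * x)     # low (k=1) and high (k=4) frequencies
Y = torch.fft.rfft(y.squeeze())         # target spectrum

net = torch.nn.Sequential(
    torch.nn.Linear(1, 200), torch.nn.Tanh(), torch.nn.Linear(200, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(1, 3001):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        P = torch.fft.rfft(net(x).detach().squeeze())
        err = (P - Y).abs() / Y.abs().clamp(min=1e-8)
        # Relative spectral error at the low (k=1) and high (k=4) modes:
        print(step, float(err[1]), float(err[4]))
# Typically err[1] shrinks well before err[4]: low frequencies are fit first.
```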
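And for the phase diagram highlight above, the lazy-versus-condensed contrast can be felt in a toy experiment. This is our own sketch, with scalings chosen in the spirit of, but not identical to, the parameterization of the JMLR paper: under NTK-style scaling with O(1) initialization the parameters barely move relative to their initial norm during training, whereas with a small initialization they travel far from the initialization, as in the nonlinear/condensed regime.

```python
import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 30).unsqueeze(1)
y = torch.sin(2 * x).squeeze()

def train(alpha, init_std, lr=0.5, steps=3000, m=100):
    # Two-layer tanh net f(x) = (1/alpha) * sum_k a_k tanh(w_k x),
    # with a_k, w_k ~ N(0, init_std^2) at initialization.
    w = (init_std * torch.randn(m, 1)).requires_grad_()
    a = (init_std * torch.randn(m)).requires_grad_()
    theta0 = torch.cat([w.detach().flatten(), a.detach()]).clone()
    opt = torch.optim.SGD([w, a], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = torch.tanh(x @ w.t()) @ a / alpha
        ((pred - y) ** 2).mean().backward()
        opt.step()
    theta = torch.cat([w.detach().flatten(), a.detach()])
    # Relative distance traveled from initialization.
    return ((theta - theta0).norm() / theta0.norm()).item()

m = 100
# NTK/lazy-style scaling: alpha = sqrt(m), O(1) init -> small relative change.
print(train(alpha=m ** 0.5, init_std=1.0))
# Small initialization, alpha = 1 -> parameters move far from initialization,
# a condensed-like, strongly nonlinear training regime.
print(train(alpha=1.0, init_std=1e-2))
```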
Deep Learning Theory
- [Data-dependency of implicit bias] Leyang Zhang, Zhi-Qin John Xu, Tao Luo*, Yaoyu Zhang*, Limitation of characterizing implicit regularization by data-independent functions, arXiv:2201.12198, (2022).
- [Initial condensation] Hanxu Zhou, Qixuan Zhou, Tao Luo, Yaoyu Zhang*, Zhi-Qin John Xu*, Towards Understanding the Condensation of Neural Networks at Initial Training, NeurIPS 2022.
- [DNN vs. Ritz-Galerkin] (Alphabetic order) Jihong Wang, Zhi-Qin John Xu*, Jiwei Zhang*, Yaoyu Zhang, Implicit bias with Ritz-Galerkin method in understanding deep learning for solving PDEs, CSIAM Trans. Appl. Math. 3(2), pp. 299-317, 2022.
- [F-Principle theory for general DNNs] (Alphabetic order) Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang, Theory of the Frequency Principle for General Deep Neural Networks, CSIAM Trans. Appl. Math. 2, pp. 484-507, 2021.
- [Initialization effect] Yaoyu Zhang, Zhi-Qin John Xu*, Tao Luo, Zheng Ma, A type of generalization error induced by initialization in deep neural networks, Mathematical and Scientific Machine Learning (MSML), 2020.
- [Derivation of LFP model] (Alphabetic order) Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang, On the exact computation of linear frequency principle dynamics and its generalization, arXiv:2010.08153, (2020).
- [Upper limit of convergence-rate decay for the Frequency Principle] (Alphabetic order) Tao Luo*, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang, An Upper Limit of Decaying Rate with Respect to Frequency in Deep Neural Network, MSML 2022.

Deep Learning for Science & Engineering
- [Deep learning for combustion] Tianhan Zhang*, Yuxiao Yi, Yifan Xu, Zhi X. Chen, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu*, A multi-scale sampling method for accurate and robust deep neural network to predict combustion chemical kinetics, Combustion and Flame 245, 112319, 2022.
- [DiDo for industrial design] Lulu Zhang, Zhi-Qin John Xu*, Yaoyu Zhang*, Data-informed Deep Optimization, PLoS ONE 17(6), e0270191, 2022.
- [MOD-Net for PDEs] Lulu Zhang, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu*, Zheng Ma*, MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs, Communications in Computational Physics (CiCP), to appear; arXiv:2107.03673, (2021).
- [DeePMR for chemical kinetics reduction] Zhiwei Wang, Yaoyu Zhang, Yiguang Ju, Weinan E, Zhi-Qin John Xu*, Tianhan Zhang*, A deep learning-based model reduction (DeePMR) method for simplifying chemical kinetics, arXiv:2201.02025, (2021).
- [DeepCombustion0.0 for combustion] Tianhan Zhang, Yaoyu Zhang*, Weinan E, Yiguang Ju, A deep learning-based ODE solver for chemical kinetics, arXiv:2012.12654, (2020).

Computational Neuroscience
- Yaoyu Zhang, Lai-Sang Young, DNN-Assisted Statistical Analysis of a Model of Local Cortical Circuits, Scientific Reports 10, 20139, 2020.
- Yaoyu Zhang, Yanyang Xiao, Douglas Zhou, David Cai, Spike-Triggered Regression for Synaptic Connectivity Reconstruction in Neuronal Networks, Frontiers in Computational Neuroscience 11, 101, 2017.
- Yaoyu Zhang, Yanyang Xiao, Douglas Zhou, David Cai, Granger Causality Analysis with Nonuniform Sampling and Its Application to Pulse-coupled Nonlinear Dynamics, Physical Review E 93, 042217, 2016.
- Douglas Zhou, Yaoyu Zhang, Yanyang Xiao, David Cai, Analysis of Sampling Artifacts on the Granger Causality Analysis for Topology Extraction of Neuronal Dynamics, Frontiers in Computational Neuroscience 8, 75, 2014.
- Douglas Zhou, Yaoyu Zhang, Yanyang Xiao, David Cai, Reliability of the Granger Causality Inference, New Journal of Physics 16(4), 043016, 2014.
- Douglas Zhou, Yanyang Xiao, Yaoyu Zhang, Zhiqin Xu, David Cai, Granger Causality Network Reconstruction of Conductance-Based Integrate-and-Fire Neuronal Systems, PLoS ONE 9(2), e87636, 2014.
- Douglas Zhou, Yanyang Xiao, Yaoyu Zhang, Zhiqin Xu, David Cai, Causal and Structural Connectivity of Pulse-coupled Nonlinear Networks, Physical Review Letters 111(5), 054102, 2013.
