Infinite‐width limit of deep linear neural networks
Communications on Pure and Applied Mathematics (IF 3.1). Pub Date: 2024-05-06. DOI: 10.1002/cpa.22200
Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli

This paper studies the infinite‐width limit of deep linear neural networks (NNs) initialized with random parameters. We obtain that, when the number of parameters diverges, the training dynamics converge (in a precise sense) to the dynamics obtained from a gradient descent on an infinitely wide deterministic linear NN. Moreover, even if the weights remain random, we get their precise law along the training dynamics, and prove a quantitative convergence result of the linear predictor in terms of the number of parameters. We finally study the continuous‐time limit obtained for infinitely wide linear NNs and show that the linear predictors of the NN converge at an exponential rate to the minimal ℓ2-norm minimizer of the risk.
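As a rough numerical illustration of these claims (not the authors' construction), the sketch below trains a depth-3 linear network f(x) = W3 W2 W1 x by plain gradient descent on an underdetermined least-squares risk, with i.i.d. Gaussian initialization of variance 1/width at every layer, an assumption chosen so that the end-to-end linear map concentrates as the width grows. The problem sizes, depth, learning rate, and step count are all illustrative; the printout compares the trained end-to-end predictor against the minimal ℓ2-norm minimizer pinv(X) y at several widths.

```python
# Illustrative sketch only (not the paper's construction): gradient descent on
# a deep linear network f(x) = W_L ... W_1 x with i.i.d. N(0, 1/width) entries
# at every layer, compared against the minimal l2-norm minimizer of the risk.
# Widths, depth, learning rate, and data sizes are assumptions for this demo.
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20                          # underdetermined: many risk minimizers
X = rng.standard_normal((n, d))
y = X @ (rng.standard_normal(d) / np.sqrt(d))   # noiseless linear targets

def train_deep_linear(width, depth=3, lr=0.002, steps=50_000):
    dims = [d] + [width] * (depth - 1) + [1]
    # Variance-1/width init so the end-to-end map starts near zero and
    # concentrates as width grows (mimicking the deterministic limit).
    Ws = [rng.standard_normal((dims[i + 1], dims[i])) / np.sqrt(width)
          for i in range(depth)]
    for _ in range(steps):
        acts = [X.T]                  # forward pass, keeping activations
        for W in Ws:
            acts.append(W @ acts[-1])
        grad = (acts[-1].ravel() - y)[None, :] / n   # d(risk)/d(output)
        for i in reversed(range(depth)):             # backprop the product
            gW = grad @ acts[i].T
            grad = Ws[i].T @ grad
            Ws[i] -= lr * gW
    beta = np.eye(d)                  # collapse layers into one linear map
    for W in Ws:
        beta = W @ beta
    return beta.ravel()

beta_star = np.linalg.pinv(X) @ y     # minimal l2-norm minimizer of the risk
for m in (4, 16, 64, 256):
    gap = np.linalg.norm(train_deep_linear(m) - beta_star)
    print(f"width {m:4d}: ||beta - beta*|| = {gap:.4f}")
```

Under these assumptions one expects the reported gap to shrink roughly like width^(-1/2) as the width grows, a finite-width analogue of the quantitative convergence result, while the long training horizon stands in for the exponential-in-time convergence to the minimal ℓ2-norm minimizer stated in the abstract.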
