On the Existence of Minimizers in Shallow Residual ReLU Neural Network Optimization Landscapes
SIAM Journal on Numerical Analysis (IF 2.8), Pub Date: 2024-11-26, DOI: 10.1137/23m1556241. Steffen Dereich, Arnulf Jentzen, Sebastian Kassing
SIAM Journal on Numerical Analysis, Volume 62, Issue 6, Pages 2640–2666, December 2024.
Abstract. In this article, we show the existence of minimizers in the loss landscape for residual artificial neural networks (ANNs) with a multidimensional input layer and one hidden layer with ReLU activation. Our work contrasts with earlier results in [D. Gallon, A. Jentzen, and F. Lindner, preprint, arXiv:2211.15641, 2022] and [P. Petersen, M. Raslan, and F. Voigtlaender, Found. Comput. Math., 21 (2021), pp. 375–444], which showed that in many situations minimizers do not exist for common smooth activation functions, even when the target functions are polynomials. The proof of the existence property makes use of a closure of the search space that contains all functions generated by ANNs together with additional discontinuous generalized responses. As we will show, these additional generalized responses in the larger space are suboptimal, so that the minimum is attained in the original function class.
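To make the setting concrete, here is one standard parametrization of the architecture the abstract refers to; the notation below is illustrative and not taken verbatim from the paper. A shallow residual ReLU network with input dimension $d$ and $n$ hidden neurons realizes a response of the form

\[
\mathcal{N}_\theta(x) \;=\; \langle a, x \rangle + c + \sum_{i=1}^{n} v_i \,\max\{\langle w_i, x \rangle + b_i,\, 0\}, \qquad x \in \mathbb{R}^d,
\]

where $\theta = \bigl(a, c, (v_i, w_i, b_i)_{i=1}^{n}\bigr)$ with $a, w_i \in \mathbb{R}^d$ and $c, v_i, b_i \in \mathbb{R}$ collects the trainable parameters, and the affine term $\langle a, x \rangle + c$ plays the role of the residual (skip) connection. The existence question is then whether a loss of the form

\[
\theta \;\mapsto\; \int \lvert \mathcal{N}_\theta(x) - f(x) \rvert^2 \, \mu(\mathrm{d}x)
\]

attains its infimum over all parameter vectors $\theta$, for a given target function $f$ and input distribution $\mu$ (the paper's exact loss may differ from this squared-error sketch). The "generalized responses" mentioned in the abstract are the discontinuous limit functions one must adjoin to make this search space closed.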
Updated: 2024-11-27