Neural network approximation
Acta Numerica (IF 16.3), Pub Date: 2021-08-04, DOI: 10.1017/s0962492921000052
Ronald DeVore, Boris Hanin, Guergana Petrova

Neural networks (NNs) are the method of choice for building learning algorithms. They are now being investigated for other numerical tasks such as solving high-dimensional partial differential equations. Their popularity stems from their empirical success on several challenging learning problems (computer chess/Go, autonomous navigation, face recognition). However, most scholars agree that a convincing theoretical explanation for this success is still lacking. Since these applications revolve around approximating an unknown function from data observations, part of the answer must involve the ability of NNs to produce accurate approximations.

This article surveys the known approximation properties of the outputs of NNs with the aim of uncovering properties that are not present in the more traditional methods of approximation used in numerical analysis, such as approximations using polynomials, wavelets, rational functions and splines. Comparisons are made with traditional approximation methods from the viewpoint of rate distortion, i.e. error versus the number of parameters used to create the approximant. Another major component in the analysis of numerical approximation is the computational time needed to construct the approximation, and this in turn is intimately connected with the stability of the approximation algorithm. So the stability of numerical approximation using NNs is a large part of the analysis put forward.

The survey, for the most part, is concerned with NNs using the popular ReLU activation function. In this case the outputs of the NNs are piecewise linear functions on rather complicated partitions of the domain of f into cells that are convex polytopes. When the architecture of the NN is fixed and the parameters are allowed to vary, the set of output functions of the NN is a parametrized nonlinear manifold. It is shown that this manifold has certain space-filling properties, leading to an increased ability to approximate (better rate distortion) but at the expense of numerical stability. This space-filling behaviour creates a challenge for numerical methods: finding best, or even good, parameter choices when trying to approximate a given function.
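To make the ReLU picture concrete, the following minimal NumPy sketch (not from the paper; the network and its weights are random and purely illustrative) shows that the output of a one-hidden-layer ReLU network in one dimension is exactly piecewise linear, with breakpoints where individual hidden units switch on or off.

import numpy as np

rng = np.random.default_rng(0)

# A tiny one-hidden-layer ReLU network f(x) = W2 @ relu(W1*x + b1) + b2,
# mapping R -> R, with randomly drawn (illustrative) weights.
width = 8
W1 = rng.standard_normal((width, 1))
b1 = rng.standard_normal(width)
W2 = rng.standard_normal((1, width))
b2 = rng.standard_normal(1)

def f(x):
    # Evaluate the network at a 1-D array of inputs x.
    h = np.maximum(W1 @ x[None, :] + b1[:, None], 0.0)  # hidden ReLU layer
    return (W2 @ h + b2[:, None]).ravel()

# Hidden unit i switches on/off where W1[i]*x + b1[i] = 0, so in one
# dimension the cells of the partition are the intervals between:
breaks = np.sort(-b1 / W1[:, 0])
print("breakpoints:", np.round(breaks, 3))

# Between consecutive breakpoints f is exactly affine: second differences
# on an equispaced grid inside one cell vanish up to round-off.
a, b = breaks[2], breaks[3]
xs = np.linspace(a + 1e-6, b - 1e-6, 50)
print("max |second difference| inside one cell:",
      np.abs(np.diff(f(xs), n=2)).max())

In higher dimensions the same mechanism yields the convex polytope cells mentioned above: each hidden unit contributes a hyperplane on which it switches, and the cells are the regions cut out by these hyperplanes (and, for deeper networks, by their piecewise linear images).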
