How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model
Physical Review X (IF 11.6), Pub Date: 2024-07-01, DOI: 10.1103/physrevx.14.031001
Francesco Cagnetta, Leonardo Petrini, Umberto M. Tomasini, Alessandro Favero, Matthieu Wyart

Deep learning algorithms demonstrate a surprising ability to learn high-dimensional tasks from limited examples. This is commonly attributed to the depth of neural networks, enabling them to build a hierarchy of abstract, low-dimensional data representations. However, how many training examples are required to learn such representations remains unknown. To quantitatively study this question, we introduce the random hierarchy model: a family of synthetic tasks inspired by the hierarchical structure of language and images. The model is a classification task where each class corresponds to a group of high-level features, chosen among several equivalent groups associated with the same class. In turn, each feature corresponds to a group of subfeatures chosen among several equivalent groups and so on, following a hierarchy of composition rules. We find that deep networks learn the task by developing internal representations invariant to exchanging equivalent groups. Moreover, the number of data required corresponds to the point where correlations between low-level features and classes become detectable. Overall, our results indicate how deep networks overcome the curse of dimensionality by building invariant representations and provide an estimate of the number of data required to learn a hierarchical task.
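The generative structure described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: all parameter names (`num_synonyms` for the number of equivalent groups, `branching` for the group size, `depth` for the number of levels) are hypothetical, and for simplicity a single shared vocabulary is used for classes and features at every level.

```python
import random

def build_rules(vocab_size, num_synonyms, branching, depth, rng):
    """For each level of the hierarchy, map every symbol to `num_synonyms`
    equivalent groups (tuples) of `branching` lower-level symbols."""
    rules = []
    for _ in range(depth):
        level = {
            symbol: [tuple(rng.randrange(vocab_size) for _ in range(branching))
                     for _ in range(num_synonyms)]
            for symbol in range(vocab_size)
        }
        rules.append(level)
    return rules

def sample_input(label, rules, rng):
    """Expand a class label into a string of low-level features by repeatedly
    replacing each symbol with one of its equivalent groups, chosen at random."""
    sequence = [label]
    for level in rules:
        sequence = [sym
                    for parent in sequence
                    for sym in level[parent][rng.randrange(len(level[parent]))]]
    return sequence

rng = random.Random(0)
rules = build_rules(vocab_size=4, num_synonyms=2, branching=2, depth=3, rng=rng)
x = sample_input(label=1, rules=rules, rng=rng)
# With branching 2 and depth 3, each label expands into 2**3 = 8 low-level features.
```

Two inputs generated from the same label can differ at every position yet share the class, since each substitution may pick a different equivalent group; a network solves the task only by becoming invariant to these exchanges.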

Updated: 2024-07-02