Object-Centric Learning with Capsule Networks: A Survey
ACM Computing Surveys (IF 23.8), Pub Date: 2024-06-21, DOI: 10.1145/3674500
Fabio De Sousa Ribeiro 1, Kevin Duarte 2, Miles Everett 3, Georgios Leontidis 3, Mubarak Shah 2

Capsule networks emerged as a promising alternative to convolutional neural networks for learning object-centric representations. The idea is to explicitly model part-whole hierarchies by using groups of neurons called capsules to encode visual entities, then learn the relationships between these entities dynamically from data. However, a major hurdle for capsule network research has been the lack of a reliable point of reference for understanding their foundational ideas and motivations. This survey provides a comprehensive and critical overview of capsule networks which aims to serve as a main point of reference going forward. To that end, we introduce the fundamental concepts and motivations behind capsule networks, such as equivariant inference. We then cover various technical advances in capsule routing algorithms as well as alternative geometric and generative formulations. We provide a detailed explanation of how capsule networks relate to the attention mechanism in Transformers and uncover non-trivial conceptual similarities between them in the context of object-centric representation learning. We also review the extensive applications of capsule networks in computer vision, video and motion, graph representation learning, natural language processing, medical imaging, and many others. To conclude, we provide an in-depth discussion highlighting promising directions for future work.




Updated: 2024-06-21