Interweaving Insights: High-Order Feature Interaction for Fine-Grained Visual Recognition
International Journal of Computer Vision (IF 11.6), Pub Date: 2024-10-20, DOI: 10.1007/s11263-024-02260-y
Arindam Sikdar, Yonghuai Liu, Siddhardha Kedarisetty, Yitian Zhao, Amr Ahmed, Ardhendu Behera

This paper presents a novel approach for Fine-Grained Visual Classification (FGVC) by exploring Graph Neural Networks (GNNs) to facilitate high-order feature interactions, with a specific focus on constructing both inter- and intra-region graphs. Unlike previous FGVC techniques that often isolate global and local features, our method combines both features seamlessly during learning via graphs. Inter-region graphs capture long-range dependencies to recognize global patterns, while intra-region graphs delve into finer details within specific regions of an object by exploring high-dimensional convolutional features. A key innovation is the use of shared GNNs with an attention mechanism coupled with the Approximate Personalized Propagation of Neural Predictions (APPNP) message-passing algorithm, enhancing information propagation efficiency for better discriminability and simplifying the model architecture for computational efficiency. Additionally, the introduction of residual connections improves performance and training stability. Comprehensive experiments showcase state-of-the-art results on benchmark FGVC datasets, affirming the efficacy of our approach. This work underscores the potential of GNNs in modeling high-level feature interactions, distinguishing it from previous FGVC methods that typically focus on singular aspects of feature representation. Our source code is available at https://github.com/Arindam-1991/I2-HOFI.
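
To make the APPNP message-passing step named in the abstract more concrete, the following is a minimal sketch of the generic APPNP update, Z^(0) = H, Z^(k+1) = (1 - alpha) * A_hat * Z^(k) + alpha * H, where A_hat is the symmetrically normalized adjacency matrix with self-loops. It is not the authors' implementation (see their repository for that): the attention mechanism, the shared GNN weights, the residual connections, and the construction of the inter- and intra-region graphs are all omitted, and the final softmax of the original APPNP formulation is left out as well. The function names, the toy graph, and the values of `alpha` and `num_steps` are assumptions chosen for illustration, and plain Python lists are used only to keep the sketch dependency-free.

```python
def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix A."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own signal.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / ((deg[i] ** 0.5) * (deg[j] ** 0.5))
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def appnp_propagate(adj, h, alpha=0.1, num_steps=10):
    """Run num_steps APPNP power-iteration steps starting from features h.

    alpha is the teleport probability: the weight with which each step
    pulls the node representation back toward its initial features.
    """
    a_hat = normalized_adjacency(adj)
    z = [row[:] for row in h]  # Z^(0) = H
    for _ in range(num_steps):
        az = matmul(a_hat, z)
        z = [[(1.0 - alpha) * az[i][j] + alpha * h[i][j]
              for j in range(len(h[0]))] for i in range(len(h))]
    return z


# Hypothetical toy input: three "region" nodes on a path graph, 2-D features.
toy_adj = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
toy_features = [[1.0, 0.0],
                [0.0, 1.0],
                [0.5, 0.5]]
toy_output = appnp_propagate(toy_adj, toy_features, alpha=0.1, num_steps=10)
```

Because the propagation itself has no trainable parameters, many propagation steps can be stacked without adding weights, which is one plausible reading of the abstract's claim that APPNP simplifies the architecture. With `alpha=1.0` the function returns the input features unchanged, while smaller values of `alpha` let information from more distant nodes mix in.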




Updated: 2024-10-20