Dynamic collaborative learning with heterogeneous knowledge transfer for long-tailed visual recognition
Information Fusion (IF 14.7), Pub Date: 2024-10-15, DOI: 10.1016/j.inffus.2024.102734. Hao Zhou, Tingjin Luo, Yongming He
Solving long-tailed visual recognition with deep convolutional neural networks remains a challenging task. As a mainstream approach, multi-expert models achieve state-of-the-art accuracy on this problem, but the uncertainty of network learning and the complexity of fusion inference constrain their performance and practicality. To remedy this, we propose a novel dynamic collaborative learning model with heterogeneous knowledge transfer (DCHKT), in which experts with different expertise collaborate to make predictions. DCHKT consists of two core components: dynamic adaptive weight adjustment and heterogeneous knowledge transfer learning. First, dynamic adaptive weight adjustment shifts the focus of model training between the global expert and the domain experts via a dynamic adaptive weight. By modulating the trade-off between feature learning and classifier learning, it enhances the discriminative ability of each expert and alleviates the uncertainty of model learning. Second, heterogeneous knowledge transfer learning measures the distribution differences between the fusion logits of the multiple experts and the predicted logits of each individual expert; this enables message passing between experts and enhances the consistency of the ensemble prediction across training and inference, promoting their collaboration. Finally, extensive experimental results on the public long-tailed datasets CIFAR-LT, ImageNet-LT, Places-LT, and iNaturalist2018 demonstrate the effectiveness and superiority of our DCHKT.
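To make the first component concrete, the sketch below shows one plausible form of a dynamic adaptive weight: a scalar that decays over training so the loss emphasis shifts from the global expert toward the domain experts. The polynomial schedule and the function names (`adaptive_weight`, `combined_loss`) are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def adaptive_weight(epoch, total_epochs, power=2.0):
    """Hypothetical schedule (not from the paper): the weight starts at 1
    (training focuses on the global expert) and decays polynomially to 0
    (training focuses on the domain experts)."""
    return 1.0 - (epoch / total_epochs) ** power

def combined_loss(global_loss, expert_losses, epoch, total_epochs):
    """Blend the global expert's loss with the mean domain-expert loss
    using the current adaptive weight."""
    w = adaptive_weight(epoch, total_epochs)
    return w * global_loss + (1.0 - w) * float(np.mean(expert_losses))
```

Early in training `combined_loss` is dominated by the global expert's objective; late in training the domain experts' objectives take over, which is one simple way to realize the described shift of training focus.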
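The second component, heterogeneous knowledge transfer, compares the fused ensemble distribution against each expert's own predicted distribution. A minimal sketch, assuming the fusion is a mean over expert logits and the distribution difference is a KL divergence (the paper's exact fusion rule and divergence may differ):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def transfer_loss(expert_logits):
    """expert_logits: array of shape (E, N, C) for E experts, N samples,
    C classes. Returns the mean KL(fused || expert) over experts, which
    penalizes each expert for straying from the ensemble prediction."""
    fused = softmax(np.mean(expert_logits, axis=0))  # fused ensemble distribution
    eps = 1e-12
    losses = []
    for logits in expert_logits:
        p = softmax(logits)
        kl = np.sum(fused * (np.log(fused + eps) - np.log(p + eps)), axis=-1)
        losses.append(kl.mean())
    return float(np.mean(losses))
```

When all experts agree, the loss is zero; the more an expert's distribution deviates from the fused one, the larger its penalty, which matches the stated goal of keeping ensemble predictions consistent between training and inference.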
Updated: 2024-10-15