Feature Mixture on Pre-Trained Model for Few-Shot Learning
IEEE Transactions on Image Processing (IF 10.8) | Pub Date: 7-2-2024 | DOI: 10.1109/tip.2024.3411452 | Shuo Wang, Jinda Lu, Haiyang Xu, Yanbin Hao, Xiangnan He
Few-shot learning (FSL) aims to recognize novel objects from only a few training samples. A robust feature extractor (backbone) can significantly improve the recognition performance of an FSL model. However, training an effective backbone is challenging because 1) designing and validating backbone architectures is a time-consuming and expensive process, and 2) a backbone trained on the known (base) categories tends to focus on the textures of the objects it has learned, which describe novel samples poorly. To address these problems, we propose a feature mixture operation on the pre-trained (fixed) features: 1) we replace part of the values of a feature map from a novel category with content from other feature maps, which increases the generalizability and diversity of the training samples while avoiding the high computational cost of retraining a complex backbone; 2) we use the similarities between features to constrain the mixture operation, which helps the classifier focus on representations of the novel object that remain hidden in the features produced by the pre-trained backbone with its biased training. Experimental studies on five benchmark datasets in both inductive and transductive settings demonstrate the effectiveness of our feature mixture (FM). Specifically, compared with the baseline on the Mini-ImageNet dataset, it achieves accuracy improvements of 3.8% and 4.2% for 1 and 5 training samples, respectively. The proposed mixture operation can also be used to improve other existing FSL methods that rely on backbone training.
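To make the two steps of the abstract concrete, below is a minimal sketch of one plausible reading of the mixture operation: a fraction of the channels of a novel-class feature map (from a frozen backbone) is replaced with the corresponding channels of another feature map, and the cosine similarity between the two pooled features constrains how large that fraction is. The function name `feature_mixture` and the `max_ratio` hyper-parameter are hypothetical illustrations, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def feature_mixture(novel_feat: torch.Tensor,
                    base_feat: torch.Tensor,
                    max_ratio: float = 0.3) -> torch.Tensor:
    """Replace a similarity-constrained fraction of the novel feature's
    channels with the corresponding channels of another feature map.

    novel_feat, base_feat: (C, H, W) feature maps from a frozen backbone.
    max_ratio: assumed upper bound on the fraction of replaced channels.
    """
    c = novel_feat.shape[0]
    # Similarity between globally pooled features constrains the mixture:
    # dissimilar pairs borrow fewer channels (an assumption of this sketch).
    sim = F.cosine_similarity(novel_feat.mean(dim=(1, 2)),
                              base_feat.mean(dim=(1, 2)), dim=0)
    k = int(c * max_ratio * sim.clamp(min=0.0).item())
    if k == 0:
        return novel_feat.clone()
    mixed = novel_feat.clone()
    idx = torch.randperm(c)[:k]   # channels chosen for replacement
    mixed[idx] = base_feat[idx]   # part of the values replaced with
    return mixed                  # content from the other feature map

# Usage: augment a 1-shot novel sample without retraining the backbone.
novel = torch.randn(64, 5, 5)    # e.g., a ResNet-12 feature map
base = torch.randn(64, 5, 5)
augmented = feature_mixture(novel, base)
```

Because the operation acts only on cached backbone outputs, each augmented sample costs a tensor copy rather than a forward pass, which is what lets the method avoid retraining the backbone.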
Updated: 2024-08-19