Small-sample cucumber disease identification based on multimodal self-supervised learning
Crop Protection (IF 2.5), Pub Date: 2024-10-30, DOI: 10.1016/j.cropro.2024.107006
Yiyi Cao, Guangling Sun, Yuan Yuan, Lei Chen

Obtaining large-scale, labeled crop disease data in agriculture is difficult and costly, so learning features from small samples of unlabeled data has become an urgent problem. Self-supervised contrastive learning and self-supervised mask learning can both address the lack of labels in training data, but each paradigm has its own advantages and drawbacks. In addition, features learned from a single modality are limited because they ignore correlations with information from other modalities. This paper therefore introduces a multimodal self-supervised learning framework, denoted MMSSL, for identifying cucumber diseases from small samples. By integrating image self-supervised mask learning, image self-supervised contrastive learning, and multimodal image-text contrastive learning, the model learns disease features from different modalities and captures both global and local disease features. The mask learning branch is further enhanced with a prompt learning module based on a cross-attention network. This module approximately locates the masked regions of the image in advance, which helps the decoder make accurate predictions. Experimental results show that the proposed method achieves 95% accuracy in cucumber disease identification in the absence of labels, and that it effectively uncovers high-level semantic features in multimodal small-sample cucumber disease data. GradCAM is also used for visual analysis to better understand the model's decision-making process in disease identification. In conclusion, the proposed method improves the classification accuracy of small-sample cucumber data in a multimodal, unlabeled setting and demonstrates good generalization performance.
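The abstract does not give the loss formulation, so the following is only a minimal sketch of what an image-text contrastive objective of the kind mentioned above commonly looks like: a symmetric InfoNCE loss over a batch of paired embeddings. The function names, the temperature value of 0.07, and the toy embeddings are assumptions for illustration and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def image_text_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss (generic sketch, not the authors' implementation).

    image_embs[i] and text_embs[i] are assumed to describe the same sample;
    every other pairing in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j]: cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Image-to-text direction: the correct text for image i sits at column i.
    i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    logits_t = [list(col) for col in zip(*logits)]
    t2i = -sum(log_softmax(logits_t[j])[j] for j in range(n)) / n
    return 0.5 * (i2t + t2i)


# Toy 2-sample batch (hypothetical embeddings).
images = [[1.0, 0.0], [0.0, 1.0]]
matched_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = image_text_contrastive_loss(images, matched_texts)
loss_swapped = image_text_contrastive_loss(images, swapped_texts)
print(loss_matched < loss_swapped)
```

The loss is small when each image embedding is closest to its own text embedding and large when the pairs are mismatched, which is the property a multimodal contrastive branch relies on. In a framework like the one described, such a term would be combined with the image-only contrastive and mask-reconstruction objectives; how they are weighted is not stated in the abstract.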

Updated: 2024-10-30