A Fuzzy Multi-Granularity Convolutional Neural Network with Double Attention Mechanisms for Measuring Semantic Textual Similarity,IEEE Transactions on Fuzzy Systems

A Fuzzy Multi-Granularity Convolutional Neural Network with Double Attention Mechanisms for Measuring Semantic Textual Similarity
IEEE Transactions on Fuzzy Systems (IF 10.7), Pub Date: 2024-07-15, DOI: 10.1109/tfuzz.2024.3427801
Butian Zhao ₁ , Runtong Zhang ₁ , Kaiyuan Bai ₂

Semantic textual similarity (STS) is a fundamental task in natural language processing (NLP). Recent advances demonstrate that deep learning-based approaches can measure STS with high accuracy. However, existing studies do not use attention mechanisms to capture the spatial location of important information, do not model sentences at the level of the whole sentence, and neglect semantic fuzziness. In this paper, we propose a novel double attentive fuzzy convolutional neural network (DAFCNN) that measures STS more accurately by taking semantic fuzziness into account. First, this paper introduces a spatial attention module and combines it with improved attentive convolutions to create a multi-granularity convolutional neural network in DAFCNN, which not only extracts critical spatial location information but also models sentences from multiple perspectives at the word and sentence levels. Second, DAFCNN pioneers a fuzzy learning module (FLM) to extract fuzzy semantic features. By using a fuzzy membership function, a fuzzy aggregation operator, and trainable parameters and weights, FLM maps sentence representations to a fuzzy space to form representations with more accurate and richer semantics. Third, compared with various state-of-the-art STS models, DAFCNN reduces mean squared error by 14.57% and increases Pearson's r by 4.61% and Spearman's ρ by 8.57% on STS score datasets, and increases accuracy by 3.39% and F1 score by 2.41% on a semantic classification dataset. The ablation experiment demonstrates the effectiveness of each module of DAFCNN. Finally, the experimental results also indicate that FLM is a promising new attempt to incorporate fuzzy set theory into the NLP field.
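The three regression metrics named in the abstract (mean squared error, Pearson's r, and Spearman's ρ) can be illustrated with a minimal sketch in plain Python. This is not the authors' evaluation code; the function names and the gold/predicted scores below are hypothetical and chosen only to show how each metric is computed.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between gold and predicted similarity scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical gold and predicted STS scores on a 0-5 scale (not from the paper).
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
pred = [0.4, 1.0, 3.5, 3.9, 4.8]

print("MSE:", mse(gold, pred))
print("Pearson r:", pearson_r(gold, pred))
print("Spearman rho:", spearman_rho(gold, pred))
```

Lower MSE is better, while higher values of both correlation coefficients are better. In this toy example the predictions preserve the ordering of the gold scores exactly, so the rank-based Spearman's ρ reaches its maximum even though the predicted values themselves differ from the gold values.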

Updated: 2024-08-22
