Complex & Intelligent Systems (IF 5.0), Pub Date: 2024-11-09, DOI: 10.1007/s40747-024-01631-9
Authors: Lal Khan, Atika Qazi, Hsien-Tsung Chang, Mousa Alhajlah, Awais Mahmood
Sentiment analysis (SA) has gained popularity as a research field among researchers around the globe over the past 10 years. Deep neural networks (DNNs) and word vector models are now widely employed and perform well in sentiment analysis. Among the deep neural networks used for SA, bi-directional long short-term memory (Bi-LSTM), BERT, and CNN models have received the most attention. Although these models can process a wide range of text types, DNNs treat all features the same, so using these models in the feature learning phase of a DNN produces a feature space with very high dimensionality. To overcome these limitations, we propose an attention-based, stacked, two-layer CNN-Bi-LSTM DNN. After local feature extraction, the proposed model applies a stacked two-layer Bi-LSTM that captures incoming and outgoing sequences by reading the sequential data in both the forward and backward directions. The second Bi-LSTM layer is built on top of the first to improve performance. The output of the stacked two-layer Bi-LSTM is passed to an attention layer, which assigns different weights to different words. We conducted various experiments to evaluate the proposed model on two Urdu sentiment analysis datasets, UCSA-21 and UCSA, and it achieved accuracies of 83.12% and 78.91%, respectively.
Title: Empowering Urdu sentiment analysis: an attention-based stacked CNN-Bi-LSTM DNN with multilingual BERT
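The abstract says the attention layer gives different words different weights but does not state the scoring function. The sketch below is a minimal illustration of one common form of attention pooling, not the authors' implementation: the `attention_pool` function, the `context` query vector, and the toy hidden states are all assumptions introduced here, and the hidden states merely stand in for Bi-LSTM outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Weight each word's hidden vector and sum them into one vector.

    hidden_states: list of T vectors, one per word (stand-ins for
        Bi-LSTM outputs; assumed, not taken from the paper).
    context: hypothetical learned query vector of the same width.
    Returns (weights, pooled_vector).
    """
    # Score each word: dot product of tanh(h_t) with the context vector.
    scores = [
        sum(math.tanh(h_i) * c_i for h_i, c_i in zip(h, context))
        for h in hidden_states
    ]
    # Normalise the scores so the word weights sum to 1.
    weights = softmax(scores)
    # Weighted sum of the hidden vectors, computed per dimension.
    width = len(context)
    pooled = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(width)
    ]
    return weights, pooled


# Toy input: three "words", each with a 2-dimensional hidden state.
hidden = [[0.1, 0.0], [2.0, 1.5], [0.2, -0.1]]
context = [1.0, 1.0]
weights, pooled = attention_pool(hidden, context)
```

In this toy input the second word has the largest activations, so it receives the largest weight and dominates the pooled vector. In a trained model the context vector would be learned along with the rest of the network rather than fixed by hand.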