Triple disentangled representation learning for multimodal affective analysis
Information Fusion (IF 14.7), Pub Date: 2024-09-03, DOI: 10.1016/j.inffus.2024.102663
Ying Zhou, Xuefeng Liang, Han Chen, Yin Zhao, Xin Chen, Lida Yu

In multimodal affective analysis (MAA) tasks, heterogeneity among modalities has made disentanglement methods a pivotal area of research. Many emerging studies focus on disentangling modality-invariant and modality-specific representations from input data and then fusing them for prediction. However, our study shows that modality-specific representations may contain information that is irrelevant to, or in conflict with, the task, which degrades the effectiveness of the learned multimodal representations. We revisit the disentanglement issue and propose a novel triple disentanglement approach, TriDiRA, which disentangles modality-invariant, effective modality-specific, and ineffective modality-specific representations from input data. By fusing only the modality-invariant and effective modality-specific representations, TriDiRA can significantly alleviate the impact of irrelevant and conflicting information across modalities during model training and prediction. Extensive experiments conducted on four benchmark datasets demonstrate the effectiveness and generalization of our triple disentanglement, which outperforms SOTA methods. The code is available at .
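
The fusion rule described in the abstract (keep the modality-invariant and effective modality-specific parts, discard the ineffective modality-specific part) can be illustrated with a minimal sketch. This is not the authors' TriDiRA implementation: the modality names (`text`, `audio`, `vision`), the dimensions, the `TripleSplit` class, the untrained random linear heads, and fusion by concatenation are all assumptions made for illustration, and the training objectives that would actually make the three parts disentangled are omitted.

```python
import random

def linear(in_dim, out_dim, rng):
    """Return a random (untrained) weight matrix with out_dim rows."""
    return [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

def project(weights, x):
    """Apply a weight matrix to a feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

class TripleSplit:
    """Hypothetical per-modality encoder with three output heads."""

    PARTS = ("invariant", "effective", "ineffective")

    def __init__(self, in_dim, rep_dim, rng):
        self.heads = {name: linear(in_dim, rep_dim, rng) for name in self.PARTS}

    def __call__(self, x):
        return {name: project(w, x) for name, w in self.heads.items()}

def fuse(parts_by_modality):
    """Concatenate only the invariant and effective parts of each modality.

    The ineffective modality-specific part is deliberately left out.
    Concatenation is an assumed fusion operator, chosen for simplicity.
    """
    fused = []
    for parts in parts_by_modality.values():
        fused.extend(parts["invariant"])
        fused.extend(parts["effective"])
    return fused

rng = random.Random(0)
# Toy inputs; modality names and feature sizes are illustrative assumptions.
inputs = {"text": [0.1] * 8, "audio": [0.2] * 6, "vision": [0.3] * 4}
encoders = {m: TripleSplit(len(x), 4, rng) for m, x in inputs.items()}
parts = {m: encoders[m](x) for m, x in inputs.items()}
fused = fuse(parts)
print(len(fused))  # 3 modalities * 2 kept parts * 4 dims = 24
```

In this sketch the ineffective part is still computed for each modality but never reaches the fused vector, which mirrors the abstract's claim that only two of the three representations are used for prediction.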

Updated: 2024-09-03