Disagreement Matters: Exploring Internal Diversification for Redundant Attention in Generic Facial Action Analysis
IEEE Transactions on Affective Computing (IF 9.6), Pub Date: 2023-06-16, DOI: 10.1109/taffc.2023.3286838
Xiaotian Li, Zheng Zhang, Xiang Zhang, Taoyue Wang, Zhihua Li, Huiyuan Yang, Umur Ciftci, Qiang Ji, Jeffrey Cohn, Lijun Yin
This paper demonstrates the effectiveness of a diversification mechanism for building a more robust multi-attention system in generic facial action analysis. While previous multi-attention (e.g., visual attention and self-attention) research on facial expression recognition (FER) and Action Unit (AU) detection has focused on “external attention diversification”, where attention branches localize different facial areas, we explore “internal attention diversification” and study the impact of diverse attention patterns within the same Region of Interest (RoI). Our experiments reveal that variability in attention patterns significantly affects model performance, indicating that unconstrained multi-attention is plagued by redundancy and over-parameterization, leading to sub-optimal results. To tackle this issue, we propose a compact module that guides the model toward self-diversified multi-attention. Our method is applied to both CNN-based and Transformer-based models and benchmarked on popular databases such as BP4D and DISFA for AU detection, as well as CK+, MMI, BU-3DFE, and BP4D+ for facial expression recognition. We also evaluate the mechanism on self-attention and channel-wise attention designs to improve their adaptive capabilities in multi-modal feature fusion tasks. The multi-modal evaluation is conducted on BP4D, BP4D+, and our newly developed large-scale comprehensive emotion database BP4D++, which contains well-synchronized and aligned sensor modalities, addressing the scarcity of annotations and identities in human affective computing. We plan to release the new database to the research community to foster further advances in this field.
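The abstract gives no implementation details, so the following PyTorch-style sketch is only an illustration of the general idea, not the paper's actual module: one common way to encourage diverse attention patterns within the same RoI is to add a pairwise-similarity penalty across the attention maps produced by different heads. The names diversity_loss, attn_maps, and lambda_div are hypothetical.

import torch
import torch.nn.functional as F

def diversity_loss(attn_maps: torch.Tensor) -> torch.Tensor:
    """attn_maps: (batch, n_heads, H*W) attention weights over one RoI.

    Returns the mean pairwise cosine similarity between heads; adding it
    to the task loss pushes heads toward distinct attention patterns."""
    a = F.normalize(attn_maps, dim=-1)          # unit-normalize each head's map
    sim = torch.matmul(a, a.transpose(1, 2))    # (batch, n_heads, n_heads) cosine similarities
    n = a.size(1)
    # zero out the diagonal so each head's self-similarity is ignored
    off_diag = sim - torch.diag_embed(torch.diagonal(sim, dim1=1, dim2=2))
    return off_diag.sum(dim=(1, 2)).mean() / (n * (n - 1))

# Hypothetical usage inside a training step:
# total_loss = task_loss + lambda_div * diversity_loss(attn_maps)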

Updated: 2023-06-16