A simple yet effective self-debiasing framework for transformer models
Artificial Intelligence (IF 5.1) Pub Date: 2024-12-02, DOI: 10.1016/j.artint.2024.104258
Xiaoyue Wang, Xin Liu, Lijie Wang, Suhang Wu, Jinsong Su, Hua Wu

Current Transformer-based natural language understanding (NLU) models rely heavily on dataset biases and fail to handle real-world out-of-distribution (OOD) instances. Many methods have been proposed to address this issue, but they overlook the fact that different layers of Transformer-based NLU models learn different features. In this paper, we first conduct preliminary studies that yield two conclusions: 1) both low- and high-layer sentence representations encode common biased features during training; 2) low-layer sentence representations encode fewer unbiased features than high-layer ones. Based on these conclusions, we propose a simple yet effective self-debiasing framework for Transformer-based NLU models. Concretely, we first stack a classifier on a selected low layer. Then, we introduce a residual connection that feeds the low-layer sentence representation into the top-layer classifier. In this way, the top-layer sentence representation is trained to ignore the common biased features encoded by the low-layer sentence representation and to focus on task-relevant unbiased features. During inference, we remove the residual connection and directly use the top-layer sentence representation to make predictions. Extensive experiments and in-depth analyses on NLU tasks demonstrate the superiority of our framework, which achieves new state-of-the-art (SOTA) results on three OOD test sets.
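The abstract describes the framework only at a high level. The following is a minimal PyTorch-style sketch of one plausible reading, not the authors' implementation: it assumes a BERT-like encoder with [CLS] pooling, a hypothetical SelfDebiasingModel class, a classifier stacked on a low layer, and a residual connection realized as an element-wise sum of the low- and top-layer [CLS] representations before the top classifier during training. The layer index, the sum-based residual, and the loss handling are all illustrative assumptions; at inference the residual is simply dropped.

# Sketch only: low_layer index, sum-based residual, and [CLS] pooling are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

class SelfDebiasingModel(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_labels=3, low_layer=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name, output_hidden_states=True)
        hidden = self.encoder.config.hidden_size
        self.low_layer = low_layer
        self.low_classifier = nn.Linear(hidden, num_labels)   # classifier stacked on a selected low layer
        self.top_classifier = nn.Linear(hidden, num_labels)   # main task classifier on the top layer

    def forward(self, input_ids, attention_mask, debias=True):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        hidden_states = out.hidden_states                     # tuple: embeddings + every encoder layer
        low_cls = hidden_states[self.low_layer][:, 0]         # low-layer [CLS] representation
        top_cls = hidden_states[-1][:, 0]                     # top-layer [CLS] representation

        low_logits = self.low_classifier(low_cls)
        if debias:
            # Training: residual connection feeds the low-layer representation to the top classifier,
            # so the top-layer representation can ignore the biased features it already carries.
            top_logits = self.top_classifier(top_cls + low_cls)
        else:
            # Inference: residual connection removed; predict from the top-layer representation alone.
            top_logits = self.top_classifier(top_cls)
        return low_logits, top_logits

During training one would optimize cross-entropy losses on both heads (weighting is another assumption); at inference, calling the model with debias=False and taking the argmax of top_logits corresponds to the paper's described prediction step.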

Updated: 2024-12-02