Mixup Regularized Adversarial Networks for Multi-Domain Text Classification
arXiv - CS - Computation and Language. Pub Date: 2021-01-31, DOI: arxiv-2102.00467
Yuan Wu, Diana Inkpen, Ahmed El-Roby
The shared-private paradigm and adversarial training have significantly
improved the performance of multi-domain text classification (MDTC) models.
However, existing methods have two issues. First, instances from
the multiple domains are not sufficient for domain-invariant feature
extraction. Second, aligning on the marginal distributions may lead to fatal
mismatching. In this paper, we propose a mixup regularized adversarial network
(MRAN) to address these two issues. More specifically, domain and category
mixup regularizations are introduced to enrich the intrinsic features in the
shared latent space and to enforce consistent predictions in between training
instances, so that the learned features are more domain-invariant and
discriminative. We conduct experiments on two benchmarks: the Amazon review
dataset and the FDU-MTL dataset. On these two datasets, our approach yields
average accuracies of 87.64% and 89.0%, respectively, outperforming all
relevant baselines.
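
To illustrate the idea of predictions "in between training instances", here is a minimal sketch of generic mixup interpolation. It is not the MRAN implementation: the abstract does not give the paper's formulation, so the function name `mixup_pair`, the Beta-distributed coefficient, and the default `alpha` are all assumptions made for this example.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=0.2, rng=random):
    """Generic mixup: convex combination of two examples and their labels.

    x_i, x_j: feature vectors (lists of floats) of equal length.
    y_i, y_j: one-hot or soft label vectors of equal length.
    alpha:    hypothetical Beta concentration, not a value from the paper.
    rng:      the random module or a random.Random instance.
    """
    # Mixing coefficient in [0, 1], drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Interpolate features and labels with the same coefficient, so the
    # mixed label describes the mixed input.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam


if __name__ == "__main__":
    # Two toy examples with one-hot labels (category labels or domain labels).
    x_mix, y_mix, lam = mixup_pair([1.0, 0.0], [1.0, 0.0],
                                   [0.0, 1.0], [0.0, 1.0])
    print(lam, x_mix, y_mix)
```

Training a classifier to predict `y_mix` from `x_mix` encourages it to behave linearly between the two originals. Under the assumptions above, the same interpolation could be applied to category labels or to domain labels, which is one plausible reading of the "category" and "domain" mixup named in the abstract.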
Updated: 2021-02-02