Disentangled Representation Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 7-1-2024, DOI: 10.1109/tpami.2024.3420937
Xin Wang, Hong Chen, Si'ao Tang, Zihao Wu, Wenwu Zhu

Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in observable data and encoding them in representation form. Separating the underlying factors of variation into variables with semantic meaning facilitates learning explainable representations of data, imitating the meaningful understanding process of humans when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving model explainability, controllability, robustness, and generalization capacity in a wide range of scenarios such as computer vision, natural language processing, and data mining. In this article, we comprehensively investigate DRL from various aspects including motivations, definitions, methodologies, evaluations, applications, and model designs. We first present two well-recognized definitions, i.e., the Intuitive Definition and the Group Theory Definition of disentangled representation learning. We further categorize the methodologies for DRL into four groups from the following perspectives: model type, representation structure, supervision signal, and independence assumption. We also analyze principles for designing different DRL models that may benefit different tasks in practical applications. Finally, we point out challenges in DRL as well as potential research directions deserving future investigation. We believe this work may provide insights for promoting DRL research in the community.
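The abstract itself contains no code. As a minimal illustration of the VAE-based family of DRL methods that the survey categorizes under "model type" and "independence assumption", the sketch below implements a β-VAE-style objective in PyTorch, where an up-weighted KL term pushes the posterior toward a factorized prior. The architecture, hyperparameters (e.g., beta=4.0), and names are illustrative assumptions, not the authors' implementation.

```python
# Minimal beta-VAE sketch (illustrative only): a VAE-based DRL method that
# encourages factorized (disentangled) latents by up-weighting the KL term.
# All architecture choices and hyperparameters below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaVAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=10, hidden_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    # Reconstruction term plus beta-weighted KL(q(z|x) || N(0, I)).
    # Setting beta > 1 pressures the posterior toward the factorized prior,
    # reflecting the independence assumption behind this DRL variant.
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

if __name__ == "__main__":
    model = BetaVAE()
    x = torch.rand(32, 784)  # toy batch of inputs scaled to [0, 1]
    x_recon, mu, logvar = model(x)
    print(beta_vae_loss(x, x_recon, mu, logvar).item())
```

Whether each latent dimension actually captures a distinct factor of variation would be checked with the disentanglement metrics the survey reviews under "evaluations"; the loss alone only encodes the factorized-prior pressure.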

Updated: 2024-08-22