Building the Bridge: Topic Modeling for Comparative Research
Communication Methods and Measures (IF 11.4), Pub Date: 2021-09-07, DOI: 10.1080/19312458.2021.1965973
Fabienne Lind, Jakob-Moritz Eberl, Olga Eisele, Tobias Heidenreich, Sebastian Galyga, Hajo G. Boomgaarden

ABSTRACT

In communication research, topic modeling is primarily used for discovering systematic patterns in monolingual text corpora. To advance the usage, we provide an overview of recently presented strategies to extract topics from multilingual text collections for the purpose of comparative research. Moreover, we discuss, demonstrate, and facilitate the usability of the “Polylingual Topic Model” (PLTM) for such analyses. The appeal of this model is that it derives lists of related clustered words in different languages with little reliance on translation or multilingual dictionaries and without the need for manual post-hoc matching of topics. PLTM bridges the gap between languages by making use of document connections in training documents. As these training documents are the crucial resource for the model, we compare model evaluation metrics for different strategies to build training documents. By discussing the advantages and limitations of the different strategies in respect to different scenarios, our study contributes to the methodological discussion on automated content analysis of multilingual text corpora.




Updated: 2021-09-07