Seeded Topic Models in Digital Archives: Analyzing Interpretations of Immigration in Swedish Newspapers, 1945–2019
Sociological Methods & Research (IF 6.5), Pub Date: 2024-08-22, DOI: 10.1177/00491241241268453
Miriam Hurtado Bodell 1, Måns Magnusson 1, 2, Marc Keuschnigg 1, 3
Sociologists are discussing the need for more formal ways to extract meaning from digital text archives. We focus attention on the seeded topic model, a semi-supervised extension to the standard topic model that allows sociological knowledge to be infused into the computational learning of meaning structures. Seed words help crystallize topics around known concepts, while utilizing topic models’ functionality to identify associations in text based on word co-occurrences. The method estimates a concept’s shared interpretation (or framing) via its associations with other frequently co-occurring topics. In a case study, we extract longitudinal measures of media frames regarding immigration from a vast corpus of millions of Swedish newspaper articles from the period 1945–2019. We infer turning points that partition the immigration discourse into meaningful eras and locate Sweden’s era of multicultural ideals that coined its tolerant reputation.
Updated: 2024-08-22