Communication Methods and Measures (IF 6.3), Pub Date: 2021-12-27, DOI: 10.1080/19312458.2021.2015574
Christian Baden, Christian Pipal, Martijn Schoonvelde, Mariken A. C. G. van der Velden
ABSTRACT
We identify three gaps that limit the utility and obstruct the progress of computational text analysis methods (CTAM) for social science research. First, we contend that CTAM development has prioritized technological over validity concerns, giving limited attention to the operationalization of social scientific measurements. Second, we identify a mismatch between CTAMs’ focus on extracting specific contents and document-level patterns, and social science researchers’ need for measuring multiple, often complex contents in the text. Third, we argue that the dominance of English language tools depresses comparative research and inclusivity toward scholarly communities examining languages other than English. We substantiate our claims by drawing upon a broad review of methodological work in the computational social sciences, as well as an inventory of leading research publications using quantitative textual analysis. Subsequently, we discuss implications of these three gaps for social scientists’ uneven uptake of CTAM, as well as the field of computational social science text research as a whole. Finally, we propose a research agenda intended to bridge the identified gaps and improve the validity, utility, and inclusiveness of CTAM.
Three Gaps in Computational Text Analysis Methods for Social Sciences: A Research Agenda