Assessing Data Quality in the Age of Digital Social Research: A Systematic Review
Social Science Computer Review (IF 3.0), Pub Date: 2024-04-27, DOI: 10.1177/08944393241245395
Jessica Daikeler 1, Leon Fröhling 1, Indira Sen 2, Lukas Birkenmaier 1, Tobias Gummer 1, 3, Jan Schwalbach 1, Henning Silber 1, 3, Bernd Weiß 1, Katrin Weller 1, Clemens Lechner 1
While survey data has long been the focus of quantitative social science analyses, observational and content data, although long-established, are gaining renewed attention, especially when this type of data is obtained by and for observing digital content and behavior. Today, digital technologies allow social scientists to track “everyday behavior” and to extract opinions from public discussions on online platforms. These new types of digital traces of human behavior, together with computational methods for analyzing them, have opened new avenues for analyzing, understanding, and addressing social science research questions. However, even the most innovative and extensive amounts of data are hollow if they are not of high quality. But what does data quality mean for modern social science data? To investigate this rather abstract question, the present study focuses on four objectives. First, we provide researchers with a decision tree to identify appropriate data quality frameworks for a given use case. Second, we determine which data types and quality dimensions are already addressed in the existing frameworks. Third, we identify gaps with respect to different data types and data quality dimensions within the existing frameworks that need to be filled. And fourth, we provide a detailed literature overview for the intrinsic and extrinsic perspectives on data quality. By conducting a systematic literature review based on text mining methods, we identified and reviewed 58 data quality frameworks. In our decision tree, three categories (the data type a framework covers, the perspective it takes, and its level of granularity) help researchers find appropriate data quality frameworks. Furthermore, we discovered gaps in the available frameworks with respect to visual and especially linked data, and we point out in our review that even well-known frameworks might miss important aspects. The article ends with a critical discussion of the current state of the literature and potential future research avenues.
Updated: 2024-04-27