The dire disregard of measurement invariance testing in psychological science.
Psychological Methods (IF 7.6), Pub Date: 2023-12-25, DOI: 10.1037/met0000624
Esther Maassen, E Damiano D'Urso, Marcel A L M van Assen, Michèle B Nuijten, Kim De Roover, Jelte M Wicherts

Self-report scales are widely used in psychology to compare means in latent constructs across groups, experimental conditions, or time points. However, for these comparisons to be meaningful and unbiased, the scales must demonstrate measurement invariance (MI) across compared time points or (experimental) groups. MI testing determines whether the latent constructs are measured equivalently across groups or time, which is essential for meaningful comparisons. We conducted a systematic review of 426 psychology articles with openly available data, to (a) examine common practices in conducting and reporting of MI testing, (b) assess whether we could reproduce the reported MI results, and (c) conduct MI tests for the comparisons that enabled sufficiently powerful MI testing. We identified 96 articles that contained a total of 929 comparisons. Results showed that only 4% of the 929 comparisons underwent MI testing, and the tests were generally poorly reported. None of the reported MI tests were reproducible, and only 26% of the 174 newly performed MI tests reached sufficient (scalar) invariance, with MI failing completely in 58% of tests. Exploratory analyses suggested that in nearly half of the comparisons where configural invariance was rejected, the number of factors differed between groups. These results indicate that MI tests are rarely conducted and poorly reported in psychological studies. We observed frequent violations of MI, suggesting that reported differences between (experimental) groups may not be solely attributed to group differences in the latent constructs. We offer recommendations aimed at improving reporting and computational reproducibility practices in psychology. (PsycInfo Database Record (c) 2024 APA, all rights reserved).

Updated: 2023-12-25