A comparison of scoring algorithms for the NIH Toolbox executive function tasks in a U.S. norming sample.
Psychological Assessment (IF 3.3), Pub Date: 2024-12-01, DOI: 10.1037/pas0001350
Yusuke Shono, Berivan Ece, Emily H Ho, Aaron J Kaat, Erica M LaForte, Ezgi Ayturk, Richard Gershon
Executive function (EF) has been extensively linked to various behavioral, clinical, and educational outcomes. There have been, however, few systematic investigations into how best to score EF tasks using speed and accuracy performance, particularly how to generate a summary and norm-referenced score. Using data from an updated norming study for the NIH Toolbox Version 3 (NIHTB V3) with the general U.S. population aged between 3 and 85 (N = 3,794; 52.3% female; M_age = 25.06, SD_age = 22.92), we empirically evaluated and compared several scoring algorithms for two EF tests: the Dimensional Change Card Sort (a test of cognitive flexibility) and Flanker (a test of inhibitory control) Tests. Results showed that joint scoring algorithms integrating speed and accuracy into single scores (namely, rate-correct score, linear integrated speed-accuracy score, and speed-accuracy additive score) provided more robust psychometric evidence for the EF tests than single-index scores of accuracy and speed. These integrated speed-accuracy scores were consistent and stable within and across tasks and time; similar to that of another well-validated EF measure, but as predicted, not related to a crystallized intelligence measure score; and increased rapidly from early childhood through late adolescence/early adulthood and then declined toward late adulthood. The rate-correct score was particularly free from ceiling effects and sensitive to age-related changes and variability in EF performance. Among various scoring algorithms, we recommend rate-correct score, which served as the basis for generating new NIHTB V3 norm-referenced scores, with good test-retest reliability (Dimensional Change Card Sort = .77, Flanker = .81) and acceptable convergent and discriminant validity. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
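To make the idea of a joint speed-accuracy score concrete, the following is a minimal sketch of a rate-correct score as it is commonly defined in the literature: the number of correct responses divided by the total response time summed over all trials, giving correct responses per second. The abstract does not state the exact formula or any trial-filtering rules used for NIHTB V3, so this should be read as an illustration under that assumption. The names `Trial` and `rate_correct_score` and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Trial:
    """One response on a speeded task (hypothetical structure for illustration)."""
    rt_seconds: float  # response time in seconds
    correct: bool      # whether the response was correct


def rate_correct_score(trials: Sequence[Trial]) -> float:
    """Correct responses per second of total response time.

    Assumes the common literature definition (correct count / summed RT);
    not confirmed as the exact NIHTB V3 computation.
    """
    total_time = sum(t.rt_seconds for t in trials)
    if total_time <= 0:
        raise ValueError("total response time must be positive")
    n_correct = sum(1 for t in trials if t.correct)
    return n_correct / total_time


# Made-up example: 3 correct responses over 3.0 seconds of responding.
trials = [
    Trial(0.5, True),
    Trial(0.75, True),
    Trial(1.0, False),
    Trial(0.75, True),
]
print(rate_correct_score(trials))  # 3 / 3.0 = 1.0
```

Because the score rises with both faster and more accurate responding, two participants with perfect accuracy can still differ in score, which is consistent with the abstract's remark that the rate-correct score was largely free from ceiling effects.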
Updated: 2024-12-01