Language Testing (IF 2.2), Pub Date: 2022-02-03, DOI: 10.1177/02655322211066822
Christopher Nicklin, Joseph P. Vitta
Instrument measurement conducted with Rasch analysis is a common process in language assessment research. A recent systematic review of 215 studies involving Rasch analysis in language testing and applied linguistics research reported that 23 different software packages had been utilized. However, none of the analyses were conducted with one of the numerous R-based Rasch analysis software packages, which generally employ one of the three estimation methods: conditional maximum likelihood estimation (CMLE), joint maximum likelihood estimation (JMLE), or marginal maximum likelihood estimation (MMLE). For this study, eRm, a CMLE-based R package, was utilized to conduct a dichotomous Rasch analysis of a Yes/No vocabulary test based on the academic word list. The resulting parameters and diagnostic statistics were compared with the equivalent results from four other R-based Rasch measurement software packages and Winsteps. Finally, all of the packages were utilized in the analysis of 1000 simulated datasets to investigate the extent to which results generated from the contrasting estimation methods converged or diverged. Overall, the differences between the results produced with the three estimation methods were negligible, and the discrepancies observed between datasets were attributable to the software choice as opposed to the estimation method.
Assessing Rasch measurement estimation methods across R packages with yes/no vocabulary test data
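To make the estimation idea in the abstract concrete, the following is a minimal sketch of a dichotomous Rasch analysis using a simplified joint maximum likelihood (JMLE) routine on simulated Yes/No data. It is not the algorithm used by eRm, Winsteps, or any other package named above, and it applies no bias correction. The function names (`rasch_prob`, `simulate`, `jmle`), the sample sizes, and the true difficulty values are all illustrative assumptions.

```python
import math
import random


def rasch_prob(theta, b):
    """P(response = 1) under the dichotomous Rasch model: logistic(theta - b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def simulate(thetas, bs, rng):
    """Simulate a persons-by-items 0/1 matrix from person abilities and item difficulties."""
    return [[1 if rng.random() < rasch_prob(t, b) else 0 for b in bs] for t in thetas]


def jmle(data, n_iter=100, tol=1e-6):
    """Simplified JMLE: alternate Newton steps for items and persons.

    Assumption: persons with extreme raw scores (all 0 or all 1) are dropped,
    because their maximum likelihood ability estimates are infinite.
    Returns (item_difficulties, person_abilities) with difficulties centered at 0.
    """
    n_items = len(data[0])
    rows = [r for r in data if 0 < sum(r) < n_items]
    person_scores = [sum(r) for r in rows]
    item_scores = [sum(r[i] for r in rows) for i in range(n_items)]
    if any(s == 0 or s == len(rows) for s in item_scores):
        raise ValueError("an item has an extreme score and cannot be estimated")

    thetas = [0.0] * len(rows)
    bs = [0.0] * n_items
    for _ in range(n_iter):
        max_step = 0.0
        # Item update: raise difficulty when expected score exceeds observed score.
        for i in range(n_items):
            p = [rasch_prob(t, bs[i]) for t in thetas]
            info = sum(q * (1.0 - q) for q in p)
            step = (sum(p) - item_scores[i]) / info
            step = max(-1.0, min(1.0, step))  # clamp for numerical stability
            bs[i] += step
            max_step = max(max_step, abs(step))
        # Identify the scale by centering item difficulties at zero.
        mean_b = sum(bs) / n_items
        bs = [b - mean_b for b in bs]
        # Person update: raise ability when observed score exceeds expected score.
        for n in range(len(rows)):
            p = [rasch_prob(thetas[n], b) for b in bs]
            info = sum(q * (1.0 - q) for q in p)
            step = (person_scores[n] - sum(p)) / info
            step = max(-1.0, min(1.0, step))
            thetas[n] += step
            max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    return bs, thetas


if __name__ == "__main__":
    rng = random.Random(2022)
    true_bs = [-2.0, -1.0, 0.0, 1.0, 2.0]           # hypothetical item difficulties
    true_thetas = [rng.gauss(0.0, 1.0) for _ in range(300)]
    est_bs, _ = jmle(simulate(true_thetas, true_bs, rng))
    for true_b, est_b in zip(true_bs, est_bs):
        print(f"true {true_b:+.2f}  estimated {est_b:+.2f}")
```

Uncorrected JMLE estimates are known to be somewhat more spread out than the generating values on short tests, so comparing them with CMLE or MMLE estimates on the same simulated data is one way to see the kind of convergence or divergence the study investigates.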