Data aggregation can lead to biased inferences in Bayesian linear mixed models and Bayesian analysis of variance.



Psychological Methods (IF 7.6), Pub Date: 2024-01-25, DOI: 10.1037/met0000621
Daniel J Schad, Bruno Nicenboim, Shravan Vasishth


Bayesian linear mixed-effects models (LMMs) and Bayesian analysis of variance (ANOVA) are increasingly being used in the cognitive sciences to perform null hypothesis tests, where a null hypothesis that an effect is zero is compared with an alternative hypothesis that the effect exists and is different from zero. While software tools for Bayes factor null hypothesis tests are easily accessible, how to specify the data and the model correctly is often not clear. In Bayesian approaches, many authors aggregate data at the by-subject level and estimate Bayes factors on the aggregated data. Here, we use simulation-based calibration for model inference applied to several example experimental designs to demonstrate that, as with frequentist analysis, such null hypothesis tests on aggregated data can be problematic in Bayesian analysis. Specifically, when random slope variances differ (i.e., the sphericity assumption is violated), Bayes factors are too conservative for contrasts where the variance is small and too liberal for contrasts where the variance is large. Running Bayesian ANOVA on aggregated data can likewise lead to biased Bayes factor results if the sphericity assumption is violated. Moreover, Bayes factors for by-subject aggregated data are biased (too liberal) when random item slope variance is present but ignored in the analysis. These problems can be circumvented or reduced by running Bayesian LMMs on nonaggregated data, such as individual trials, and by explicitly modeling the full random effects structure. Reproducible code is available from https://osf.io/mjf47/. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
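The sphericity point in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code (that is the R material at the OSF link); it is a minimal illustration under assumed, hypothetical settings: a three-condition within-subject design (A, B, C), true condition effects of zero, and by-subject random slopes whose standard deviations differ (`sd_slope_b` small, `sd_slope_c` large). It aggregates trial-level data to subject-by-condition means and then compares the variances of the pairwise difference scores, which would be roughly equal if sphericity held. It does not compute Bayes factors.

```python
import random
import statistics


def simulate_trials(n_subj=200, n_trials=20, sd_slope_b=5.0, sd_slope_c=50.0,
                    sd_intercept=40.0, sd_resid=30.0, seed=1):
    """Simulate trial-level data under the null (no condition effects).

    All parameter values are hypothetical and chosen only for illustration.
    Each subject gets a random intercept and independent random slopes for
    the B-vs-A and C-vs-A contrasts, with different standard deviations.
    """
    rng = random.Random(seed)
    rows = []
    for subj in range(n_subj):
        u0 = rng.gauss(0.0, sd_intercept)   # by-subject random intercept
        u_b = rng.gauss(0.0, sd_slope_b)    # small random slope (B vs. A)
        u_c = rng.gauss(0.0, sd_slope_c)    # large random slope (C vs. A)
        shift = {"A": 0.0, "B": u_b, "C": u_c}
        for cond in ("A", "B", "C"):
            for _ in range(n_trials):
                y = 400.0 + u0 + shift[cond] + rng.gauss(0.0, sd_resid)
                rows.append((subj, cond, y))
    return rows


def aggregate_by_subject(rows):
    """Collapse trials to one mean per subject and condition."""
    cells = {}
    for subj, cond, y in rows:
        cells.setdefault((subj, cond), []).append(y)
    return {key: statistics.fmean(ys) for key, ys in cells.items()}


def difference_score_variances(means):
    """Across-subject variance of each pairwise difference of condition means."""
    subjects = sorted({subj for subj, _ in means})
    out = {}
    for hi, lo in (("B", "A"), ("C", "A"), ("C", "B")):
        diffs = [means[(s, hi)] - means[(s, lo)] for s in subjects]
        out[f"{hi}-{lo}"] = statistics.variance(diffs)
    return out


rows = simulate_trials()
variances = difference_score_variances(aggregate_by_subject(rows))
for name, v in variances.items():
    print(f"var({name}) = {v:.1f}")
```

With these assumed settings, the variance of the B-A difference scores is much smaller than that of the C-A and C-B difference scores. An analysis of the aggregated means that assumes a single common error variance would therefore use an error term that is too large for the low-variance contrast and too small for the high-variance contrasts, which matches the direction of the bias described in the abstract.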


Updated: 2024-01-25
