Trying to outrun causality with machine learning: Limitations of model explainability techniques for exploratory research.
Psychological Methods (IF 7.6), Pub Date: 2024-09-09, DOI: 10.1037/met0000699, Matthew J Vowels
Machine learning explainability techniques have been proposed as a means for psychologists to "explain" or interrogate a model in order to gain an understanding of a phenomenon of interest. Researchers concerned with imposing overly restrictive functional form (e.g., as would be the case in a linear regression) may be motivated to use machine learning algorithms in conjunction with explainability techniques, as part of exploratory research, with the goal of identifying important variables that are associated with/predictive of an outcome of interest. However, and as we demonstrate, machine learning algorithms are highly sensitive to the underlying causal structure in the data. The consequences of this are that predictors which are deemed by the explainability technique to be unrelated/unimportant/unpredictive, may actually be highly associated with the outcome. Rather than this being a limitation of explainability techniques per se, we show that it is rather a consequence of the mathematical implications of regression, and the interaction of these implications with the associated conditional independencies of the underlying causal structure. We provide some alternative recommendations for psychologists wanting to explore the data for important variables. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
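To make the abstract's central claim concrete, below is a minimal simulation sketch (not taken from the paper itself) under an assumed mediation structure x → m → y. A flexible learner given both x and m assigns near-zero permutation importance to x, even though x is strongly associated with y marginally. The variable names, the data-generating coefficients, and the choice of estimator are illustrative assumptions.

```python
# Minimal sketch: conditional independence makes an associated predictor look "unimportant".
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 5_000
x = rng.normal(size=n)               # predictor of substantive interest
m = 2.0 * x + rng.normal(size=n)     # mediator, caused by x
y = 3.0 * m + rng.normal(size=n)     # outcome, caused only by m

# Marginally, x is clearly associated with the outcome (corr ~ 0.88).
print("marginal corr(x, y):", np.corrcoef(x, y)[0, 1])

features = np.column_stack([x, m])
model = GradientBoostingRegressor(random_state=0).fit(features, y)

# Because y is conditionally independent of x given m, the fitted model relies
# almost entirely on m, and the explainability output flags x as unimportant.
imp = permutation_importance(model, features, y, n_repeats=10, random_state=0)
print("permutation importance [x, m]:", imp.importances_mean)
```

The same behavior would be expected from other explainability techniques (e.g., SHAP) applied to the same fitted model, since it reflects the conditional independencies implied by the causal structure rather than a quirk of any one method.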
Updated: 2024-09-09