Linking confidence biases to reinforcement-learning processes.
Psychological Review (IF 5.1), Pub Date: 2023-05-08, DOI: 10.1037/rev0000424
Nahuel Salem-Garcia, Stefano Palminteri, Maël Lebreton

We systematically misjudge our own performance in simple economic tasks. First, we generally overestimate our ability to make correct choices, a bias called overconfidence. Second, we are more confident in our choices when we seek gains than when we try to avoid losses, a bias we refer to as the valence-induced confidence bias. Strikingly, these two biases are also present in reinforcement-learning (RL) contexts, even though outcomes are provided trial by trial and could, in principle, be used to recalibrate confidence judgments online. How confidence biases emerge and are maintained in reinforcement-learning contexts is thus puzzling and still unaccounted for. To explain this paradox, we propose that confidence biases stem from learning biases, and we test this hypothesis using data from multiple experiments in which we concomitantly assessed instrumental choices and confidence judgments during learning and transfer phases. Our results first show that participants' choices in both tasks are best accounted for by a reinforcement-learning model featuring context-dependent learning and confirmatory updating. We then demonstrate that the complex, biased pattern of confidence judgments elicited during both tasks can be explained by an overweighting of the learned value of the chosen option in the computation of confidence judgments. Finally, we show that, as a consequence, the individual learning-model parameters responsible for the learning biases (confirmatory updating and outcome context-dependency) are predictive of the individual metacognitive biases. We conclude by suggesting that the metacognitive biases originate from fundamentally biased learning computations. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
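
The abstract names confirmatory updating without stating its equations. As a rough illustration only, the sketch below assumes a standard delta-rule value update with two learning rates: a larger one for outcomes that confirm the choice and a smaller one for outcomes that contradict it. The function name, the parameter names (`alpha_conf`, `alpha_disc`), and the default values are hypothetical and are not taken from the paper; context-dependent learning and the confidence computation are not modeled here.

```python
def confirmatory_update(q_chosen, q_unchosen, r_chosen, r_unchosen,
                        alpha_conf=0.3, alpha_disc=0.1):
    """One trial of a delta-rule update with confirmation-biased learning rates.

    Assumption (not from the abstract): outcomes of both options are observed,
    and alpha_conf > alpha_disc produces the confirmation bias.
    """
    # Prediction errors: observed outcome minus current value estimate.
    pe_chosen = r_chosen - q_chosen
    pe_unchosen = r_unchosen - q_unchosen

    # Chosen option: a positive prediction error confirms the choice.
    lr_chosen = alpha_conf if pe_chosen > 0 else alpha_disc
    # Unchosen option: a negative prediction error confirms the choice.
    lr_unchosen = alpha_conf if pe_unchosen < 0 else alpha_disc

    new_q_chosen = q_chosen + lr_chosen * pe_chosen
    new_q_unchosen = q_unchosen + lr_unchosen * pe_unchosen
    return new_q_chosen, new_q_unchosen


if __name__ == "__main__":
    # Confirming feedback moves values further than disconfirming feedback.
    print(confirmatory_update(0.0, 0.0, 1.0, -1.0))
    print(confirmatory_update(0.0, 0.0, -1.0, 1.0))
```

Under these assumed rates, feedback that supports the choice shifts the value estimates three times as far as feedback that contradicts it, which is one way a learner could end up with an inflated value for the chosen option.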

Updated: 2023-05-08