Eye tracking insights into physician behaviour with safe and unsafe explainable AI recommendations

npj Digital Medicine (IF 12.4) | Pub Date: 2024-08-02 | DOI: 10.1038/s41746-024-01200-x
Myura Nagendran, Paul Festor, Matthieu Komorowski, Anthony C. Gordon, Aldo A. Faisal
We studied clinical AI-supported decision-making as an example of a high-stakes setting in which explainable AI (XAI) has been proposed as useful (in theory, by providing physicians with context for the AI suggestion and thereby helping them to reject unsafe AI recommendations). Here, we used objective neurobehavioural measures (eye-tracking) to examine how N = 19 ICU physicians responded to XAI in a hospital's clinical simulation suite. Physicians made prescription decisions both before and after the reveal of either a safe or an unsafe AI recommendation, presented together with four different types of XAI. We used overt visual attention as a marker for where physicians' mental attention was directed during the simulations. Unsafe AI recommendations attracted significantly more attention than safe AI recommendations. However, attention to any of the four explanation types was not appreciably higher during unsafe AI scenarios (i.e. XAI did not appear to 'rescue' decision-makers). Furthermore, the usefulness of explanations as self-reported by physicians did not correlate with the attention they devoted to those explanations, reinforcing the notion that evaluating XAI tools through self-reports alone misses key aspects of the interaction behaviour between human and machine.
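To make the analysis concrete, below is a minimal, hypothetical sketch in Python of the two kinds of analysis the abstract describes: comparing dwell time on the AI recommendation between safe and unsafe scenarios, and correlating self-reported explanation usefulness with attention to the explanations. All column names, AOI (area-of-interest) labels, data values, and the choice of a paired t-test and Spearman correlation are illustrative assumptions, not the authors' actual pipeline or data.

```python
# Hypothetical sketch: dwell-time comparison and self-report correlation
# on toy eye-tracking data. Names and numbers are illustrative only.
import pandas as pd
from scipy import stats

# One row per eye-tracking fixation, labelled with the AOI it landed on.
fixations = pd.DataFrame({
    "physician_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "scenario":     ["unsafe", "unsafe", "safe",
                     "unsafe", "safe", "safe",
                     "unsafe", "unsafe", "safe"],
    "aoi":          ["ai_recommendation", "explanation", "ai_recommendation",
                     "ai_recommendation", "explanation", "ai_recommendation",
                     "ai_recommendation", "explanation", "ai_recommendation"],
    "duration_ms":  [850, 420, 300, 910, 380, 290, 700, 200, 350],
})

# Total dwell time on the AI recommendation, per physician and scenario type.
dwell = (fixations[fixations["aoi"] == "ai_recommendation"]
         .groupby(["physician_id", "scenario"])["duration_ms"]
         .sum()
         .unstack("scenario"))

# Paired comparison: did unsafe recommendations attract more attention?
t_stat, p_value = stats.ttest_rel(dwell["unsafe"], dwell["safe"])
print(f"unsafe vs safe dwell: t = {t_stat:.2f}, p = {p_value:.3f}")

# Correlate self-reported usefulness (e.g. a Likert rating per physician)
# with dwell time on the explanation AOI; a null result here would mirror
# the abstract's finding that self-reports miss attention behaviour.
usefulness = pd.Series({1: 4, 2: 2, 3: 5})  # hypothetical Likert ratings
expl_dwell = (fixations[fixations["aoi"] == "explanation"]
              .groupby("physician_id")["duration_ms"].sum())
rho, p_corr = stats.spearmanr(usefulness, expl_dwell)
print(f"usefulness vs explanation dwell: rho = {rho:.2f}, p = {p_corr:.3f}")
```

In a real study with repeated scenarios per physician, a mixed-effects model would likely be preferred over the simple paired t-test used in this sketch.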