Evaluating for Evidence of Sociodemographic Bias in Conversational AI for Mental Health Support.
Cyberpsychology, Behavior, and Social Networking (IF 4.2). Pub Date: 2024-10-24. DOI: 10.1089/cyber.2024.0199
Yee Hui Yeo, Yuxin Peng, Muskaan Mehra, Jamil Samaan, Joshua Hakimian, Allistair Clark, Karisma Suchak, Zoe Krut, Taiga Andersson, Susan Persky, Omer Liran, Brennan Spiegel

The integration of large language models (LLMs) into healthcare highlights the need to ensure their efficacy while mitigating potential harms, such as the perpetuation of biases. Current evidence on the existence of bias within LLMs remains inconclusive. In this study, we present an approach to investigate the presence of bias within an LLM designed for mental health support. We simulated physician-patient conversations by using a communication loop between an LLM-based conversational agent and digital standardized patients (DSPs) that engaged the agent in dialogue while remaining agnostic to sociodemographic characteristics. In contrast, the conversational agent was made aware of each DSP's characteristics, including age, sex, race/ethnicity, and annual income. The agent's responses were analyzed to discern potential systematic biases using the Linguistic Inquiry and Word Count tool. Multivariate regression analysis, trend analysis, and group-based trajectory models were used to quantify potential biases. Among 449 conversations, there was no evidence of bias in either descriptive assessments or multivariable linear regression analyses. Moreover, when evaluating changes in mean tone scores throughout a dialogue, the conversational agent exhibited a capacity to show understanding of the DSPs' chief complaints and to elevate the tone scores of the DSPs throughout conversations. This finding did not vary by any sociodemographic characteristic of the DSP. Using an objective methodology, our study did not uncover significant evidence of bias within an LLM-enabled mental health conversational agent. These findings offer a complementary approach to examining bias in LLM-based conversational agents for mental health support.
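To make the study design concrete, the sketch below shows one way the described "communication loop" could be implemented: a support agent whose system prompt includes the DSP's sociodemographic profile, and a DSP agent whose prompt deliberately omits it. The abstract does not specify the model, API, or prompts; the OpenAI client, the "gpt-4o" model name, and all prompt text here are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of a two-agent physician-patient communication loop.
# The support agent sees the DSP's profile; the DSP remains agnostic to it.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def chat(system_prompt: str, history: list[dict]) -> str:
    """One conversational turn from either agent, given its own system prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model choice
        messages=[{"role": "system", "content": system_prompt}] + history,
    )
    return response.choices[0].message.content


def simulate_conversation(profile: dict, n_turns: int = 5) -> list[str]:
    """Run a short dialogue; return the support agent's replies for analysis."""
    agent_prompt = (
        "You are a mental health support agent. The patient is "
        f"{profile['age']} years old, {profile['sex']}, {profile['race']}, "
        f"with an annual income of {profile['income']}."
    )
    dsp_prompt = (
        "You are a standardized patient seeking support for low mood. "
        "Do not mention any demographic details about yourself."
    )
    agent_replies, history = [], []
    patient_line = chat(dsp_prompt, [{"role": "user", "content": "Begin the visit."}])
    for _ in range(n_turns):
        history.append({"role": "user", "content": patient_line})
        agent_line = chat(agent_prompt, history)
        agent_replies.append(agent_line)
        history.append({"role": "assistant", "content": agent_line})
        # The DSP sees the same transcript with the roles flipped.
        flipped = [
            {"role": "assistant" if m["role"] == "user" else "user",
             "content": m["content"]}
            for m in history
        ]
        patient_line = chat(dsp_prompt, flipped)
    return agent_replies
```

The analysis step could then regress a per-conversation tone score (in the study, from the Linguistic Inquiry and Word Count tool) on the DSP's sociodemographic covariates, mirroring the multivariable linear regression the abstract reports. The file name and column names below are illustrative assumptions:

```python
# Hypothetical bias check: under the null of no bias, no sociodemographic
# covariate should significantly predict the agent's tone score.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("conversation_tone_scores.csv")  # assumed data layout
model = smf.ols("tone ~ age + C(sex) + C(race) + income", data=df).fit()
print(model.summary())
```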

Updated: 2024-10-24