Detection of suicidality from medical text using privacy-preserving large language models
The British Journal of Psychiatry (IF 8.7) Pub Date: 2024-11-05, DOI: 10.1192/bjp.2024.134
Isabella Catharina Wiest, Falk Gerrik Verhees, Dyke Ferber, Jiefu Zhu, Michael Bauer, Ute Lewitzka, Andrea Pfennig, Pavol Mikolas, Jakob Nikolas Kather

Background

Attempts to use artificial intelligence (AI) for psychiatric disorders have shown moderate success, highlighting the potential of incorporating information from clinical assessments to improve the models. This study focuses on using large language models (LLMs) to detect suicide risk from medical text in psychiatric care.

Aims

To extract information about suicidality status from the admission notes in electronic health records (EHRs) using privacy-sensitive, locally hosted LLMs, specifically evaluating the efficacy of Llama-2 models.

Method

We compared the performance of several variants of the open-source LLM Llama-2 in extracting suicidality status from 100 psychiatric reports against a ground truth defined by human experts, assessing accuracy, sensitivity, specificity and F1 score across different prompting strategies.
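
To make the setup concrete, here is a minimal sketch of the kind of pipeline the Method describes: prompting a locally hosted Llama-2 checkpoint to label an admission note. The model ID, prompt wording and yes/no parsing below are illustrative assumptions, not the study's actual prompts or code.

# Minimal sketch (assumed: model ID, prompt wording, label parsing).
# Requires: pip install torch transformers accelerate, plus access to
# the gated Llama-2 weights on Hugging Face.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-2-7b-chat-hf"  # one of several variants compared

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

PROMPT = (
    "Read the psychiatric admission note below and answer with exactly one "
    "word, yes or no: does the note document suicidality (suicidal ideation, "
    "plans or a suicide attempt)?\n\nNote:\n{note}\n\nAnswer:"
)

def classify(note: str) -> bool:
    """True if the model labels the note as indicating suicidality."""
    inputs = tokenizer(PROMPT.format(note=note), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=3, do_sample=False)
    answer = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                              skip_special_tokens=True)
    return answer.strip().lower().startswith("yes")

Running the model locally, rather than sending notes to a hosted API, is what keeps the clinical text on-premises; this is the privacy-preserving aspect the title refers to.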

Results

A German fine-tuned Llama-2 model showed the highest accuracy (87.5%), sensitivity (83.0%) and specificity (91.8%) in identifying suicidality, with significant improvements in sensitivity and specificity across various prompt designs.
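
For reference, the reported accuracy, sensitivity, specificity and F1 score follow the standard confusion-matrix definitions; a short sketch (the counts in the usage line are placeholders, not the study's data):

# Standard confusion-matrix metrics; tp/fp/tn/fn are illustrative counts.
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    sensitivity = tp / (tp + fn)                # share of true positives found
    specificity = tn / (tn + fp)                # share of true negatives found
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1}

print(metrics(tp=40, fp=5, tn=45, fn=10))  # placeholder counts for illustration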

Conclusions

The study demonstrates the capability of LLMs, particularly Llama-2, to accurately extract information on suicidality from psychiatric records while preserving data privacy. This suggests their potential use in surveillance systems for psychiatric emergencies and in improving the clinical management of suicidality through systematic quality control and research.



