Nature Medicine (IF 58.7), Pub Date: 2024-08-07, DOI: 10.1038/s41591-024-03185-2
Kai Zhang, Rong Zhou, Eashan Adhikarla, Zhiling Yan, Yixin Liu, Jun Yu, Zhengliang Liu, Xun Chen, Brian D Davison, Hui Ren, Jing Huang, Chen Chen, Yuyin Zhou, Sunyang Fu, Wei Liu, Tianming Liu, Xiang Li, Yong Chen, Lifang He, James Zou, Quanzheng Li, Hongfang Liu, Lichao Sun
Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize holistic information. Generalist AI holds the potential to address these limitations due to its versatility in interpreting different data types and generating tailored outputs for diverse needs. However, existing biomedical generalist AI solutions are typically heavyweight and closed source to researchers, practitioners and patients. Here, we describe BiomedGPT, the first open-source and lightweight vision–language foundation model, designed as a generalist capable of performing various biomedical tasks. BiomedGPT achieved state-of-the-art results in 16 out of 25 experiments while maintaining a computing-friendly model scale. We also conducted human evaluations to assess the capabilities of BiomedGPT in radiology visual question answering, report generation and summarization. BiomedGPT exhibits robust prediction ability with a low error rate of 3.8% in question answering, satisfactory performance with an error rate of 8.3% in writing complex radiology reports, and competitive summarization ability with a nearly equivalent preference score to human experts. Our method demonstrates that effective training with diverse data can lead to more practical biomedical AI for improving diagnosis and workflow efficiency.
Title: A generalist vision–language foundation model for diverse biomedical tasks