npj Digital Medicine (IF 12.4), Pub Date: 2024-11-07, DOI: 10.1038/s41746-024-01302-6
Agathe Zecevic, Laurence Jackson, Xinyue Zhang, Polychronis Pavlidis, Jason Dunn, Nigel Trudgill, Shahd Ahmed, Pierfrancesco Visaggi, Zanil YoonusNizar, Angus Roberts, Sebastian S. Zeki
Manual decisions regarding the timing of surveillance endoscopy for premalignant Barrett’s oesophagus (BO) are error-prone. This leads to inefficient resource usage and safety risks. To automate decision-making, we fine-tuned Bidirectional Encoder Representations from Transformers (BERT) models to categorize BO length (EndoBERT) and worst histopathological grade (PathBERT) on 4,831 endoscopy and 4,581 pathology reports from Guy’s and St Thomas’ Hospital (GSTT). The accuracies for EndoBERT test sets from GSTT, King’s College Hospital (KCH), and Sandwell and West Birmingham Hospitals (SWB) were 0.95, 0.86, and 0.99, respectively. Average accuracies for PathBERT were 0.93, 0.91, and 0.92, respectively. A retrospective analysis of 1,640 GSTT reports revealed a 27% discrepancy between endoscopists’ decisions and model recommendations. This study describes the development and deployment of NLP-based software for BO surveillance and demonstrates high performance at multiple sites. The analysis highlights the potential of automation to improve precision and guideline adherence in clinical decision-making.
Title: Automated decision-making in Barrett’s oesophagus: development and deployment of a natural language processing tool
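The abstract reports two kinds of figures: per-site classification accuracy and a discrepancy rate between endoscopists' decisions and model recommendations. The Python sketch below shows one plausible way such figures could be computed. It is not the authors' code. The function names, the label strings, and the toy data are all hypothetical, and the report count derived from the 27% figure is only approximate because the published percentage is rounded.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of reports whose predicted label matches the reference label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one report is required")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def discrepancy_rate(clinician: Sequence[str], model: Sequence[str]) -> float:
    """Fraction of reports where the clinician's decision differs from the model's."""
    return 1.0 - accuracy(clinician, model)


# Hypothetical toy data: these interval labels are invented for illustration
# and are not taken from the paper or from any guideline.
clinician_decisions = ["3y", "5y", "3y", "discharge"]
model_recommendations = ["3y", "3y", "3y", "discharge"]
rate = discrepancy_rate(clinician_decisions, model_recommendations)

# Rough scale of the reported discrepancy: 27% of 1,640 reports.
# The 27% is a rounded figure, so the true count may differ slightly.
approx_discrepant = round(0.27 * 1640)
```

With the toy data, one of the four decisions differs, so `rate` is 0.25. Applied to the reported figures, a 27% discrepancy over 1,640 reports corresponds to roughly 440 reports where the endoscopist and the model disagreed.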