Deep Learning Algorithms for Breast Cancer Detection in a UK Screening Cohort: As Stand-alone Readers and Combined with Human Readers.
Radiology (IF 12.1) Pub Date: 2024-11-01, DOI: 10.1148/radiol.233147
Sarah E Hickman, Nicholas R Payne, Richard T Black, Yuan Huang, Andrew N Priest, Sue Hudson, Bahman Kasmai, Arne Juette, Muzna Nanaa, Fiona J Gilbert
Background: Deep learning (DL) algorithms have shown promising results in mammographic screening either compared to a single reader or, when deployed in conjunction with a human reader, compared with double reading.

Purpose: To externally validate the performance of three DL algorithms as mammographic screen readers in an independent UK data set.

Materials and Methods: Three commercial DL algorithms (DL-1, DL-2, and DL-3) were retrospectively investigated from January 2022 to June 2022 using consecutive full-field digital mammograms collected at two UK sites during 1 year (2017). Normal cases with 3-year follow-up and histopathologically proven cancer cases detected either at screening (that round or next) or within the 3-year interval were included. A preset specificity threshold equivalent to a single reader was applied. Performance was evaluated for stand-alone DL reading compared with single human reading, and for DL reading combined with human reading compared with double reading, using sensitivity and specificity as the primary metrics. P < .025 was considered to indicate statistical significance for noninferiority testing.

Results: A total of 26 722 cases (median patient age, 59.0 years [IQR, 54.0-63.0 years]) with mammograms acquired using machines from two vendors were included. Cases included 332 screen-detected, 174 interval, and 254 next-round cancers. Two of three stand-alone DL algorithms achieved noninferior sensitivity (DL-1: 64.8%, P < .001; DL-2: 56.7%, P = .03; DL-3: 58.9%, P < .001) compared with the single first reader (62.8%), and specificity was noninferior for DL-1 (92.8%; P < .001) and DL-2 (96.8%; P < .001) and superior for DL-3 (97.9%; P < .001) compared with the single first reader (96.5%). Combining the DL algorithms with human readers achieved noninferior sensitivity (67.0%, 65.6%, and 65.4% for DL-1, DL-2, and DL-3, respectively; P < .001 for all) compared with double reading (67.4%), and superior specificity (97.4%, 97.6%, and 97.6%; P < .001 for all) compared with double reading (97.1%).

Conclusion: Use of stand-alone DL algorithms in combination with a human reader could maintain screening accuracy while reducing workload.

Published under a CC BY 4.0 license. Supplemental material is available for this article.
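The abstract's primary metrics and its one-sided significance threshold can be illustrated with a minimal Python sketch. The function names, the noninferiority margin, and the unpaired Wald z-test are assumptions made for illustration only; the abstract does not state the margin or the exact test, and a paired analysis would be more appropriate when all readers assess the same cases. Only the cancer counts and the .025 threshold are taken from the abstract.

```python
import math

ALPHA = 0.025  # one-sided threshold stated in the abstract


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of cancer cases flagged as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of normal cases correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


def noninferiority_p(p_new: float, p_ref: float,
                     n_new: int, n_ref: int, margin: float) -> float:
    """One-sided Wald z-test of H0: p_new - p_ref <= -margin.

    Hypothetical, unpaired approximation; not the study's method.
    """
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    z = (p_new - p_ref + margin) / se
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Cancer counts reported in the abstract.
total_cancers = 332 + 174 + 254  # screen-detected + interval + next-round

# Illustrative counts only, not study data.
example_sensitivity = sensitivity(true_pos=80, false_neg=20)
```

Under these assumptions, a new reader would be called noninferior when `noninferiority_p(...)` falls below `ALPHA` for the chosen margin.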
Updated: 2024-11-01