Nature (IF 50.5) · Pub Date: 2024-09-04 · DOI: 10.1038/s41586-024-07894-z
Xiyue Wang 1,2, Junhan Zhao 1,3, Eliana Marostica 1,4, Wei Yuan 5, Jietian Jin 6, Jiayu Zhang 5, Ruijiang Li 2, Hongping Tang 7, Kanran Wang 8, Yu Li 9, Fang Wang 10, Yulong Peng 11, Junyou Zhu 12, Jing Zhang 5, Christopher R Jackson 1,13,14, Jun Zhang 15, Deborah Dillon 16, Nancy U Lin 17, Lynette Sholl 16,18, Thomas Denize 16,18, David Meredith 16, Keith L Ligon 16,18, Sabina Signoretti 16,18, Shuji Ogino 16,19,20, Jeffrey A Golden 16,21, MacLean P Nasrallah 22, Xiao Han 15, Sen Yang 1,2, Kun-Hsing Yu 1,16,23
Histopathology image evaluation is indispensable for cancer diagnoses and subtype classification. Standard artificial intelligence methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task1,2. Although such methods have achieved some success, they often have limited generalizability to images generated by different digitization protocols or samples collected from different populations3. Here, to address this challenge, we devised the Clinical Histopathology Imaging Evaluation Foundation (CHIEF) model, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation. CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. We developed CHIEF using 60,530 whole-slide images spanning 19 anatomical sites. Through pretraining on 44 terabytes of high-resolution pathology imaging datasets, CHIEF extracted microscopic representations useful for cancer cell detection, tumour origin identification, molecular profile characterization and prognostic prediction. We successfully validated CHIEF using 19,491 whole-slide images from 32 independent slide sets collected from 24 hospitals and cohorts internationally. Overall, CHIEF outperformed the state-of-the-art deep learning methods by up to 36.1%, showing its ability to address domain shifts observed in samples from diverse populations and processed by different slide preparation methods. CHIEF provides a generalizable foundation for efficient digital pathology evaluation for patients with cancer.
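The two-stage design described above (an unsupervised tile-level encoder feeding a weakly supervised whole-slide aggregator) follows the general shape of attention-based multiple-instance learning. The sketch below illustrates that generic pattern only; the random-projection encoder and attention weights are stand-in assumptions, not CHIEF's actual architecture or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_tiles(tiles, dim=64):
    """Stand-in for a pretrained tile-level encoder.

    In a real pipeline this would be a self-supervised vision model;
    here a fixed random projection maps each tile to a feature vector.
    """
    flat = tiles.reshape(len(tiles), -1)
    proj = rng.normal(size=(flat.shape[1], dim)) / np.sqrt(flat.shape[1])
    return flat @ proj

def attention_pool(features, w):
    """Weakly supervised slide-level aggregation: attention-weighted
    mean of tile features, so only a slide-level label is needed."""
    scores = features @ w                    # one relevance score per tile
    weights = np.exp(scores - scores.max())  # softmax over tiles
    weights /= weights.sum()
    return weights @ features, weights

# A toy "slide" of 100 tiles, each 16x16 pixels
tiles = rng.random((100, 16, 16))
feats = encode_tiles(tiles)                  # (100, 64) tile features
w = rng.normal(size=feats.shape[1])          # hypothetical attention vector
slide_vec, attn = attention_pool(feats, w)   # single slide-level embedding
print(slide_vec.shape)                       # (64,)
```

The slide-level vector would then feed a task head (cancer detection, tumour origin, prognosis), which is how one set of tile features can serve many downstream evaluations.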
A pathology foundation model for cancer diagnosis and prognosis prediction