Metadata-enhanced contrastive learning from retinal optical coherence tomography images
Medical Image Analysis (IF 10.7), Pub Date: 2024-08-10, DOI: 10.1016/j.media.2024.103296
Robbie Holland, Oliver Leingang, Hrvoje Bogunović, Sophie Riedl, Lars Fritsche, Toby Prevost, Hendrik P N Scholl, Ursula Schmidt-Erfurth, Sobha Sivaprasad, Andrew J Lotery, Daniel Rueckert, Martin J Menten

Deep learning has the potential to automate screening, monitoring and grading of disease in medical images. Pretraining with contrastive learning enables models to extract robust and generalisable features from natural image datasets, facilitating label-efficient downstream image analysis. However, the direct application of conventional contrastive methods to medical datasets introduces two domain-specific issues. Firstly, several image transformations that have been shown to be crucial for effective contrastive learning do not translate from the natural image to the medical image domain. Secondly, the assumption made by conventional methods, that any two images are dissimilar, is systematically misleading in medical datasets depicting the same anatomy and disease. This is exacerbated in longitudinal image datasets that repeatedly image the same patient cohort to monitor their disease progression over time. In this paper, we tackle these issues by extending conventional contrastive frameworks with a novel metadata-enhanced strategy. Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships. To this end, we use records of patient identity, eye position (i.e. left or right) and time-series information. In experiments using two large longitudinal datasets containing 170,427 retinal optical coherence tomography (OCT) images of 7,912 patients with age-related macular degeneration (AMD), we evaluate the utility of using metadata to incorporate the temporal dynamics of disease progression into pretraining. Our metadata-enhanced approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six image-level downstream tasks related to AMD. We find benefits in both low-data and high-data regimes across tasks ranging from AMD stage and type classification to prediction of visual acuity. Owing to its modularity, our method can be quickly and cost-effectively tested to establish the potential benefits of including available metadata in contrastive pretraining.
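To make the metadata-enhanced idea concrete, the sketch below shows one plausible way to let patient metadata define the positive pairs in a contrastive batch loss: scans sharing the same patient identity and eye position are treated as positives, in the style of supervised contrastive learning. This is an illustrative sketch, not the paper's implementation; the function and argument names are hypothetical, and the authors' use of time-series information is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def metadata_contrastive_loss(embeddings: torch.Tensor,
                              patient_ids: torch.Tensor,
                              eyes: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """SupCon-style loss where positives share (patient id, eye position).

    embeddings:  (N, D) projections of N OCT scans
    patient_ids: (N,)   integer patient identifiers
    eyes:        (N,)   0 = left eye, 1 = right eye
    """
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T / temperature                      # pairwise similarities

    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)

    # Metadata-derived positives: same patient AND same eye, excluding
    # the anchor itself. Conventional contrastive methods would instead
    # treat every other image in the batch as a negative.
    pos = (patient_ids.unsqueeze(0) == patient_ids.unsqueeze(1)) \
        & (eyes.unsqueeze(0) == eyes.unsqueeze(1)) \
        & ~self_mask

    # Log-softmax over all other samples in the batch.
    sim = sim.masked_fill(self_mask, float('-inf'))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Average log-probability over each anchor's positives; anchors with
    # no positives in the batch are skipped.
    pos_counts = pos.sum(dim=1)
    has_pos = pos_counts > 0
    mean_log_prob = log_prob.masked_fill(~pos, 0.0).sum(dim=1)[has_pos] \
        / pos_counts[has_pos]
    return -mean_log_prob.mean()

# Toy batch of four scans; the first two come from the same patient and eye,
# so they form a metadata-defined positive pair.
emb = torch.randn(4, 128)
pids = torch.tensor([0, 0, 1, 2])
eyes = torch.tensor([0, 0, 1, 0])
loss = metadata_contrastive_loss(emb, pids, eyes)
```

In the full method, time-series records would additionally relate scans of the same eye across visits, incorporating the temporal dynamics of disease progression into the set of contrastive relationships.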
