Image-level supervision and self-training for transformer-based cross-modality tumor segmentation
Medical Image Analysis (IF 10.7), Pub Date: 2024-07-31, DOI: 10.1016/j.media.2024.103287
Malo Alefsen de Boisredon d'Assier, Aloys Portafaix, Eugene Vorontsov, William Trung Le, Samuel Kadoury

Deep neural networks are commonly used for automated medical image segmentation, but these models frequently struggle to generalize across different imaging modalities. This issue is compounded by the limited availability of annotated data in both the target and the source modality, making it difficult to deploy such models at scale. To overcome these challenges, we propose a new semi-supervised training strategy called MoDATTS. Our approach is designed for accurate cross-modality 3D tumor segmentation on unpaired bi-modal datasets. An image-to-image translation strategy between modalities is used to produce synthetic yet annotated images and labels in the desired modality and to improve generalization to the unannotated target modality. We also use powerful vision transformer architectures for both the image translation (TransUNet) and segmentation (Medformer) tasks, and introduce an iterative self-training procedure in the latter task to further close the domain gap between modalities, thereby also training on unlabeled images in the target modality. MoDATTS additionally makes it possible to exploit image-level labels through a semi-supervised objective that encourages the model to disentangle tumors from the background. This semi-supervised methodology is particularly helpful for maintaining downstream segmentation performance when pixel-level labels are also scarce in the source-modality dataset, or when the source dataset contains healthy controls. The proposed model achieves superior performance compared to the other participating teams' methods in the CrossMoDA 2022 vestibular schwannoma (VS) segmentation challenge, as evidenced by its top reported Dice score of 0.87±0.04 for VS segmentation. MoDATTS also yields consistent improvements in Dice scores over baselines on a cross-modality adult brain glioma segmentation task composed of four different contrasts from the BraTS 2020 challenge dataset, reaching 95% of the performance of a target-supervised model when no target-modality annotations are available. We report that 99% and 100% of this maximum performance can be attained if 20% and 50% of the target data, respectively, are additionally annotated, which further demonstrates that MoDATTS can be leveraged to reduce the annotation burden.
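To make the training recipe concrete, the minimal PyTorch sketch below mirrors the two stages summarized in the abstract: supervised segmentation training on source images translated into the target modality, followed by iterative self-training with pseudo-labels on unlabeled target-modality volumes. The tiny convolutional stand-ins, toy tensors, loop counts, and 0.5 confidence threshold are illustrative assumptions only; they are not the authors' TransUNet/Medformer implementation and omit the image-level supervision objective.

# Illustrative sketch only -- not the authors' released code.
import torch
import torch.nn as nn

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss on sigmoid probabilities (binary tumor vs. background)."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Stand-ins for the TransUNet translator and Medformer segmenter described in the paper.
translator = nn.Conv3d(1, 1, kernel_size=3, padding=1)   # source -> pseudo-target modality
segmenter = nn.Sequential(nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv3d(8, 1, 3, padding=1))
optimizer = torch.optim.Adam(segmenter.parameters(), lr=1e-4)

# Toy data: annotated source-modality volumes and unlabeled target-modality volumes.
src_images = torch.randn(4, 1, 16, 16, 16)
src_labels = (torch.rand(4, 1, 16, 16, 16) > 0.9).float()
tgt_images = torch.randn(4, 1, 16, 16, 16)                # no annotations available

# Stage 1: supervised training on translated (synthetic target-style) annotated images.
for _ in range(5):
    synth = translator(src_images).detach()               # translator assumed pre-trained
    pred = torch.sigmoid(segmenter(synth))
    loss = dice_loss(pred, src_labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()

# Stage 2: iterative self-training on the unlabeled target modality.
for round_idx in range(3):
    with torch.no_grad():                                  # generate pseudo-labels
        pseudo = (torch.sigmoid(segmenter(tgt_images)) > 0.5).float()
    for _ in range(5):                                     # retrain on pseudo-labeled volumes
        pred = torch.sigmoid(segmenter(tgt_images))
        loss = dice_loss(pred, pseudo)
        optimizer.zero_grad(); loss.backward(); optimizer.step()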

Updated: 2024-07-31