STERN: Attention-driven Spatial Transformer Network for abnormality detection in chest X-ray images
Artificial Intelligence in Medicine (IF 6.1). Pub Date: 2023-11-30. DOI: 10.1016/j.artmed.2023.102737. Joana Rocha, Sofia Cardoso Pereira, João Pedrosa, Aurélio Campilho, Ana Maria Mendonça
Chest X-ray scans are frequently requested to detect the presence of abnormalities, due to their low cost and non-invasive nature. The interpretation of these images can be automated with deep learning models to prioritize more urgent exams, but the presence of image artifacts, e.g. lettering, often introduces a harmful bias in the classifiers and an increase in false-positive results. Consequently, healthcare would benefit from a system that selects the thoracic region of interest before deciding whether an image is possibly pathologic. The current work tackles this binary classification task, in which an image is either normal or abnormal, using an attention-driven and spatially unsupervised Spatial Transformer Network (STERN) that takes advantage of a novel domain-specific loss to better frame the region of interest. Unlike the state of the art, in which this type of network is usually employed for image alignment, this work proposes a spatial transformer module used specifically for attention, as an alternative to the standard object detection models that typically precede the classifier to crop out the region of interest. In sum, the proposed end-to-end architecture dynamically scales and aligns the input images to maximize the classifier's performance, selecting the thorax with translation and non-isotropic scaling transformations and thus eliminating artifacts. Additionally, this paper provides an extensive and objective analysis of the selected regions of interest by proposing a set of mathematical evaluation metrics. The results indicate that STERN achieves results similar to those obtained with YOLO-cropped images, at a reduced computational cost and without the need for localization labels. More specifically, the system is able to distinguish abnormal frontal images from the CheXpert dataset with a mean AUC of 85.67% - a 2.55% improvement over a standard baseline classifier, compared to the 0.98% improvement achieved by the YOLO-based counterpart. At the same time, the STERN approach requires less than 2/3 of the training parameters, while increasing the inference time per batch by less than 2 ms. Code is available via GitHub.
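For readers unfamiliar with spatial transformers constrained in this way, the sketch below illustrates the general idea: an affine transform whose rotation and shear terms are fixed at zero, leaving only translation (tx, ty) and per-axis scaling (sx, sy). This is a minimal PyTorch illustration of the technique the abstract names, not the authors' released code; the localization network, its layer sizes, and all variable names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConstrainedSTN(nn.Module):
    """Spatial transformer restricted to translation and
    non-isotropic (per-axis) scaling, i.e. no rotation or shear.
    Hypothetical sketch; not the paper's implementation."""

    def __init__(self):
        super().__init__()
        # Assumed localization network: regresses 4 parameters
        # (sx, sy, tx, ty) from a grayscale chest X-ray.
        self.localization = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, 4),
        )
        # Initialize the regressor to the identity transform
        # (sx = sy = 1, tx = ty = 0) so training starts from the full image.
        self.localization[-1].weight.data.zero_()
        self.localization[-1].bias.data.copy_(torch.tensor([1.0, 1.0, 0.0, 0.0]))

    def forward(self, x):
        sx, sy, tx, ty = self.localization(x).unbind(dim=1)
        zeros = torch.zeros_like(sx)
        # Affine matrix with off-diagonal (rotation/shear) terms fixed at zero:
        # [[sx, 0, tx],
        #  [ 0, sy, ty]]
        theta = torch.stack([
            torch.stack([sx, zeros, tx], dim=1),
            torch.stack([zeros, sy, ty], dim=1),
        ], dim=1)  # shape (N, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        # Resample the input so the classifier only sees the selected region.
        return F.grid_sample(x, grid, align_corners=False)

stn = ConstrainedSTN()
xray = torch.randn(8, 1, 224, 224)  # batch of grayscale chest X-rays
focused = stn(xray)                 # thorax-focused views, same shape
```

Because the transform is differentiable end-to-end, the classifier's loss (plus the paper's domain-specific framing loss, not reproduced here) can drive the localization parameters without any bounding-box labels, which is what makes the approach spatially unsupervised.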