DACG: Dual Attention and Context Guidance model for radiology report generation
Medical Image Analysis (IF 10.7), Pub Date: 2024-10-23, DOI: 10.1016/j.media.2024.103377
Wangyu Lang, Zhi Liu, Yijia Zhang
Medical images are an essential basis for radiologists writing radiology reports and greatly help subsequent clinical treatment. Automatic radiology report generation aims to ease the burden on clinicians of writing reports; it has received increasing attention this year and has become an important research hotspot. However, the medical field suffers from severe visual and textual data bias and from the difficulty of long text generation. First, abnormal areas account for only a small portion of a radiological image, and most radiology reports describe only normal findings. Second, generating longer and more accurate descriptive text remains a significant challenge for radiology report generation. In this paper, we propose a new Dual Attention and Context Guidance (DACG) model to alleviate visual and textual data bias and to promote the generation of long texts. We use a Dual Attention Module, comprising a Position Attention Block and a Channel Attention Block, to extract finer position and channel features from medical images, enhancing the encoder's image feature extraction. We use the Context Guidance Module to integrate contextual information into the decoder and to supervise the generation of long texts. Experimental results show that the proposed model achieves state-of-the-art performance on the most commonly used datasets, IU X-ray and MIMIC-CXR. Further analysis also shows that the model improves reports through more accurate anomaly detection and more detailed descriptions. The source code is available at https://github.com/LangWY/DACG.
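The abstract does not spell out the internals of the Position Attention Block or the Channel Attention Block. As a rough illustration only, the sketch below shows one generic way such blocks are often built: self-attention over spatial positions and self-attention over channels, each with a residual connection. The function names, the `gamma` and `beta` scale parameters, the absence of learned projections, and the element-wise sum that fuses the two branches are all assumptions made for this example; they are not taken from the DACG paper or its repository.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    # Plain matrix product of two list-of-lists matrices.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def position_attention(feat, gamma=1.0):
    # feat is C x N: C channels, N flattened spatial positions.
    # Hypothetical block: every position attends to every other position.
    energy = matmul(transpose(feat), feat)      # N x N position similarities
    attn = [softmax(r) for r in energy]         # each row sums to 1
    mixed = matmul(feat, transpose(attn))       # C x N, positions re-weighted
    return [[gamma * m + f for m, f in zip(mr, fr)] for mr, fr in zip(mixed, feat)]

def channel_attention(feat, beta=1.0):
    # Hypothetical block: every channel attends to every other channel.
    energy = matmul(feat, transpose(feat))      # C x C channel similarities
    attn = [softmax(r) for r in energy]         # each row sums to 1
    mixed = matmul(attn, feat)                  # C x N, channels re-weighted
    return [[beta * m + f for m, f in zip(mr, fr)] for mr, fr in zip(mixed, feat)]

def dual_attention(feat, gamma=1.0, beta=1.0):
    # Assumed fusion: element-wise sum of the two branches.
    pos = position_attention(feat, gamma)
    chn = channel_attention(feat, beta)
    return [[p + c for p, c in zip(pr, cr)] for pr, cr in zip(pos, chn)]

if __name__ == "__main__":
    # Toy feature map with 2 channels and 3 spatial positions.
    features = [[1.0, 0.0, 2.0],
                [0.5, 1.5, -1.0]]
    for row in dual_attention(features):
        print(row)
```

Setting `gamma` or `beta` to zero reduces the corresponding branch to the identity, which is a common way to let such a block start as a no-op and learn how much attention to mix in during training.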
Updated: 2024-10-23