Breaking through clouds: A hierarchical fusion network empowered by dual-domain cross-modality interactive attention for cloud-free image reconstruction
Information Fusion (IF 14.7), Pub Date: 2024-08-22, DOI: 10.1016/j.inffus.2024.102649
Congyu Li, Shutao Li, Xinxin Liu

Cloud obscuration undermines the availability of optical images for continuous monitoring in Earth observation. Fusing features from synthetic aperture radar (SAR) has been recognized as a feasible strategy for guiding the reconstruction of corrupted signals in cloud-contaminated regions. However, owing to their different imaging mechanisms and reflection characteristics, the substantial domain gap between SAR and optical images makes effective cross-modality feature fusion a challenging problem. Although several SAR-assisted cloud removal methods have been proposed, most fail to achieve adequate information interaction between modalities, which greatly limits the effectiveness and reliability of cross-modality fusion. In this paper, we propose a novel hierarchical framework for cloud-free multispectral image reconstruction that effectively integrates SAR and optical data through a dual-domain interactive attention mechanism. The overall encoder–decoder network is a W-shaped asymmetric structure with a two-branch encoder and a single decoder. The encoder branches extract features from SAR and optical images separately, while the decoder exploits multiscale residual block groups to expand the receptive field and a multi-output strategy to reduce training difficulty. The core cross-modality feature fusion module at the bottleneck adopts a dual-domain interactive attention (DDIA) mechanism, which enhances the reciprocal infusion of SAR and optical features to encourage the reconstruction of spectral and structural information. Furthermore, features in the spatial and frequency domains are integrated to improve the effectiveness of the fusion process. To echo the overall network structure, the loss function is designed as a multiscale loss in dual domains. The proposed method realizes sufficient information communication and effective cross-modality fusion between SAR and optical features. Extensive experiments on the SMILE-CR and SEN12MS-CR datasets demonstrate that the proposed method outperforms seven representative deep-learning methods in both visual quality and quantitative accuracy.
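As an illustration of how such a bottleneck fusion module might look, the following is a minimal PyTorch sketch of dual-domain cross-modality interactive attention, based solely on the abstract's description: reciprocal spatial cross-attention between the SAR and optical branches, combined with a frequency-domain interaction over their FFT spectra. The class name DualDomainInteractiveAttention and all design details (head count, the 1x1-convolution spectrum mixer, the concatenation scheme) are assumptions for illustration, not the paper's actual DDIA implementation.

import torch
import torch.nn as nn


class DualDomainInteractiveAttention(nn.Module):
    """Hypothetical sketch: fuses SAR and optical feature maps in the
    spatial and frequency domains, per the abstract's description."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        # Spatial-domain cross-attention in both directions:
        # optical queries attend to SAR features, and vice versa.
        self.opt_from_sar = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.sar_from_opt = nn.MultiheadAttention(channels, heads, batch_first=True)
        # Frequency-domain interaction: a 1x1 conv mixes the two modalities'
        # real/imaginary spectra (an assumption; the paper's operator may differ).
        self.freq_mix = nn.Conv2d(4 * channels, 2 * channels, kernel_size=1)
        self.out_proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, opt: torch.Tensor, sar: torch.Tensor) -> torch.Tensor:
        b, c, h, w = opt.shape
        # --- Spatial domain: reciprocal cross-attention ---
        opt_seq = opt.flatten(2).transpose(1, 2)  # (B, HW, C)
        sar_seq = sar.flatten(2).transpose(1, 2)
        opt_att, _ = self.opt_from_sar(opt_seq, sar_seq, sar_seq)
        sar_att, _ = self.sar_from_opt(sar_seq, opt_seq, opt_seq)
        spatial = (opt_att + sar_att).transpose(1, 2).reshape(b, c, h, w)
        # --- Frequency domain: mix the two modalities' spectra ---
        opt_f = torch.fft.rfft2(opt, norm="ortho")
        sar_f = torch.fft.rfft2(sar, norm="ortho")
        spec = torch.cat([opt_f.real, opt_f.imag, sar_f.real, sar_f.imag], dim=1)
        real, imag = self.freq_mix(spec).chunk(2, dim=1)
        freq = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
        # Integrate both domains into a single fused feature map.
        return self.out_proj(torch.cat([spatial, freq], dim=1))


if __name__ == "__main__":
    fuse = DualDomainInteractiveAttention(channels=64)
    opt = torch.randn(1, 64, 32, 32)   # optical bottleneck features
    sar = torch.randn(1, 64, 32, 32)   # SAR bottleneck features
    print(fuse(opt, sar).shape)        # torch.Size([1, 64, 32, 32])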
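Likewise, the multiscale dual-domain loss can be read as a weighted sum of per-scale spatial and frequency-domain terms over the decoder's multiple outputs. The sketch below assumes L1 terms, bilinear downsampling of the target, and a balance factor alpha; none of these specifics are given in the abstract.

import torch
import torch.nn.functional as F


def multiscale_dual_domain_loss(preds, target, weights=(1.0, 0.5, 0.25), alpha=0.1):
    """preds: reconstructions at full, 1/2, 1/4 resolution (multi-output decoder).
    target: ground-truth cloud-free image at full resolution.
    Hypothetical formulation matching the abstract's description."""
    loss = torch.zeros((), device=target.device)
    for pred, w in zip(preds, weights):
        # Downsample the target to the prediction's scale.
        gt = F.interpolate(target, size=pred.shape[-2:], mode="bilinear",
                           align_corners=False)
        spatial = F.l1_loss(pred, gt)                  # spatial-domain term
        freq = F.l1_loss(torch.fft.rfft2(pred).abs(),  # frequency-domain term
                         torch.fft.rfft2(gt).abs())    # on amplitude spectra
        loss = loss + w * (spatial + alpha * freq)
    return loss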
