Multilevel Attention Dynamic-Scale Network for HSI and LiDAR Data Fusion Classification
IEEE Transactions on Geoscience and Remote Sensing (IF 7.5), Pub Date: 2024-09-09, DOI: 10.1109/tgrs.2024.3456754
Yufei He 1, Bobo Xi 2, Guocheng Li 1, Tie Zheng 1, Yunsong Li 3, Changbin Xue 1, Jocelyn Chanussot 4
Land use/land cover classification with multimodal data has attracted increasing attention. Combining hyperspectral images (HSIs) with light detection and ranging (LiDAR) data can make classification more accurate and robust. However, effectively exploiting the respective strengths of the two modalities and integrating them into the classification task remains a challenging problem. In this article, a multilevel attention dynamic-scale network (MADNet) is proposed. First, in the feature extraction stage, the two modalities are divided into two branches with different scales, which are then fed into convolutional neural networks (CNNs) to learn shallow features. Then, considering the characteristics of the HSI, a spectral angle attention module (SAAM) with low-level attention is designed to highlight surrounding pixels whose spectra are similar to that of the central pixel of the patch. After that, a dynamic-scale selection module (DSSM) is proposed to select an appropriate scale for the patches through pixel similarity analysis. Next, combining the Transformer and the CNN, a global-local cross-attention module (GLCAM) is devised to explore the fused deep-level multimodal features. Unlike the vanilla Transformer, the GLCAM deploys a distance-weight operator to decrease redundancy at long distances and effectively reduce misclassifications. Extensive experiments on three paired HSI and LiDAR datasets demonstrate that the proposed MADNet has certain advantages over existing methods.
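To make the idea behind spectral-angle weighting concrete, the following is a minimal NumPy sketch of how surrounding pixels might be weighted by their spectral angle to the central pixel of a patch. It is an illustration only, not the SAAM described in the article: the function name `spectral_angle_weights`, the `(H, W, B)` patch layout, and the linear mapping from angle to weight are all assumptions made here.

```python
import numpy as np


def spectral_angle_weights(patch):
    """Weight each pixel of an HSI patch by spectral similarity to the center.

    Hypothetical helper for illustration; not the article's SAAM.
    patch: array of shape (H, W, B) with odd H and W, B spectral bands.
    Returns an (H, W) array of weights in [0, 1]; 1 means identical direction.
    """
    h, w, _ = patch.shape
    center = patch[h // 2, w // 2]                      # central spectrum, shape (B,)

    # Cosine of the angle between every pixel spectrum and the central spectrum.
    dots = patch @ center                               # shape (H, W)
    norms = np.linalg.norm(patch, axis=-1) * np.linalg.norm(center)
    cos = np.clip(dots / np.maximum(norms, 1e-12), -1.0, 1.0)

    # Spectral angle in [0, pi]; smaller angle means more similar spectra.
    angles = np.arccos(cos)

    # Assumed mapping: linearly rescale the angle so that 0 -> 1 and pi -> 0.
    return 1.0 - angles / np.pi


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_patch = rng.random((5, 5, 30))                 # toy 5x5 patch, 30 bands
    weights = spectral_angle_weights(demo_patch)
    print(weights.shape)                                # (5, 5)
```

Multiplying a patch by such weights (broadcast over the band axis) would emphasize neighbors that are spectrally close to the center; the actual module may compute and apply its attention differently.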
Updated: 2024-09-09