STRD-Net: A Dual-Encoder Semantic Segmentation Network for Urban Green Space Extraction
IEEE Transactions on Geoscience and Remote Sensing (IF 7.5), Pub Date: 2024-09-10, DOI: 10.1109/tgrs.2024.3456898
Authors: Mouzhe Yu, Liheng He, Zhehui Shen, Meng Lv
Urban green spaces significantly influence how people work and live. Deep learning methods that use a convolutional neural network (CNN) as the encoder have weak global feature extraction capabilities and often miss individual trees or small areas of low vegetation. Transformer-based models have weak local feature extraction capabilities and perform poorly in distinguishing between small categories such as trees and low vegetation. Therefore, we propose a novel dual-encoder semantic segmentation model, the Swin Transformer and ResNet50 dual-encoder net (STRD-Net), which runs a Swin Transformer (ST) framework and a CNN framework in parallel. The model accepts two images with different channel ratios as input, which enables it to capture both global and local features. In the ST encoder, a convolutional block attention module (CBAM) is added to the head to overcome the "salt-and-pepper" noise effect in extraction results. A new patch merging (NPM) module is added after each ST module to further enhance the local feature extraction capabilities of the ST encoder for urban green spaces. In the CNN encoder, an enhanced atrous spatial pyramid pooling (EASPP) module is added after the ResNet50 backbone to expand the receptive field of the CNN encoder and enhance its global feature extraction capabilities for urban green spaces. The model includes a single skip connection to maintain extraction accuracy while saving computational resources. Results on the Vaihingen and Potsdam datasets indicate that STRD-Net improves both local and global feature extraction in urban green space extraction. The code will be available at https://github.com/learn-zhezhe/STRD-Net.
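The abstract states that the EASPP module expands the receptive field of the CNN encoder. The general mechanism behind atrous (dilated) convolution can be illustrated with a short arithmetic sketch: a kernel of size k with dilation rate d spans k + (k - 1)(d - 1) pixels along each axis while keeping the same number of weights. The function names and the dilation rates below are assumptions for illustration only (the rates are those commonly used in DeepLab-style ASPP); the listing does not give the rates or the internal design of EASPP.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span in pixels, along one axis, of a single dilated (atrous) kernel.

    Standard relation: k_eff = k + (k - 1) * (d - 1).
    """
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# ASSUMPTION: illustrative rates in the style of DeepLab ASPP,
# not the rates used by STRD-Net's EASPP module.
ASSUMED_RATES = (1, 6, 12, 18)


def aspp_branch_spans(kernel_size: int = 3, rates=ASSUMED_RATES) -> dict:
    """Map each parallel branch's dilation rate to its effective kernel span."""
    return {rate: effective_kernel_size(kernel_size, rate) for rate in rates}


if __name__ == "__main__":
    for rate, span in aspp_branch_spans().items():
        print(f"dilation {rate:>2}: 3x3 kernel spans {span}x{span} pixels")
```

Each branch still uses nine weights for a 3x3 kernel, so a pyramid of parallel branches with increasing rates widens the context seen by the encoder without a matching growth in parameters.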
Updated: 2024-09-10