Point cloud semantic segmentation with adaptive spatial structure graph transformer
International Journal of Applied Earth Observation and Geoinformation ( IF 7.6 ) Pub Date : 2024-09-07 , DOI: 10.1016/j.jag.2024.104105
Ting Han, Yiping Chen, Jin Ma, Xiaoxue Liu, Wuming Zhang, Xinchang Zhang, Huajuan Wang

With the rapid development of LiDAR and artificial intelligence technologies, 3D point cloud semantic segmentation has become a prominent research topic. This technology can significantly enhance the capabilities of building information modeling, navigation, and environmental perception. However, current deep learning-based methods rely primarily on voxelization or multi-layer convolution for feature extraction, and they often struggle to differentiate between homogeneous objects or structurally adherent targets in complex real-world scenes. To this end, we propose a Graph Transformer point cloud semantic segmentation network (ASGFormer) tailored for structurally adherent objects. Firstly, ASGFormer combines Graph and Transformer to promote global correlation understanding in the graph. Secondly, a spatial index and position embedding are constructed from distance relationships and feature differences; through a learnable mechanism, the structural weights between points are dynamically adjusted, achieving an adaptive spatial structure within the graph. Finally, dummy nodes are introduced to facilitate global information storage and transmission between layers, effectively addressing information loss at the terminal nodes of the graph. Comprehensive experiments are conducted on various real-world 3D point cloud datasets, analyzing the effectiveness of the proposed ASGFormer through qualitative and quantitative evaluations. ASGFormer outperforms existing approaches with 91.3% OA, 78.0% mAcc, and 72.3% mIoU on the S3DIS dataset. Moreover, ASGFormer achieves 72.8%, 45.5%, 81.6%, and 70.1% mIoU on the ScanNet, City-Facade, Toronto 3D, and Semantic KITTI datasets, respectively. Notably, the proposed method effectively differentiates homogeneous, structurally adherent objects, further contributing to the intelligent perception and modeling of complex scenes.
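The abstract's three ingredients — neighborhood graph attention, a position embedding built from distance relationships and feature differences, and a dummy node carrying global context — can be illustrated on a toy point cloud. This is a hedged sketch, not the authors' implementation: the single attention head, layer sizes, and the fixed random projection standing in for the paper's learnable mechanism are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, K = 32, 8, 4                       # points, feature channels, neighbors

xyz = rng.normal(size=(N, 3))            # toy point coordinates
feat = rng.normal(size=(N, C))           # toy per-point features

# 1) Spatial index: k nearest neighbors by Euclidean distance.
d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
knn = np.argsort(d2, axis=1)[:, 1:K + 1]              # (N, K), self excluded

# 2) Position embedding from coordinate and feature differences
#    (a fixed random projection stands in for the learnable mechanism).
W_pos = rng.normal(size=(3 + C, C)) / np.sqrt(3 + C)
rel = np.concatenate([xyz[knn] - xyz[:, None, :],     # coordinate differences
                      feat[knn] - feat[:, None, :]],  # feature differences
                     axis=-1)                          # (N, K, 3 + C)
pos_emb = rel @ W_pos                                  # (N, K, C)

# 3) Single-head attention over each neighborhood, biased by pos_emb, so the
#    structural weights between points depend on both geometry and features.
q = feat[:, None, :]                                   # (N, 1, C) queries
k = feat[knn] + pos_emb                                # keys carry structure
logits = (q * k).sum(-1) / np.sqrt(C)                  # (N, K)
attn = np.exp(logits - logits.max(1, keepdims=True))
attn /= attn.sum(1, keepdims=True)                     # softmax per point
out = (attn[..., None] * (feat[knn] + pos_emb)).sum(1)  # (N, C)

# 4) Dummy node: a global token averaged over all points and mixed back in,
#    letting information flow between otherwise distant parts of the graph.
dummy = feat.mean(0)                                   # (C,)
out = out + dummy                                      # broadcast over (N, C)
print(out.shape)                                       # (32, 8)
```

In the real network this block would be stacked per layer with learned projections and residual connections; the sketch only shows how geometry-aware attention weights and a global dummy token fit together.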

Updated: 2024-09-07