P2PFormer: A Primitive-to-Polygon Method for Regular Building Contour Extraction From Remote Sensing Images
IEEE Transactions on Geoscience and Remote Sensing (IF 7.5), Pub Date: 2024-09-12, DOI: 10.1109/tgrs.2024.3459011
Tao Zhang 1, Shiqing Wei 2, Yikang Zhou 1, Muying Luo 1, Wenling Yu 3, Shunping Ji 1
Extracting building contours from remote sensing imagery is a significant challenge due to buildings’ complex and diverse shapes, occlusions, and noise. Existing methods often struggle with irregular contours, rounded corners, and redundant points, and therefore need extensive postprocessing to produce regular polygonal building contours. To address these challenges, we introduce a novel, streamlined pipeline that generates regular building contours without postprocessing. Our approach begins with the segmentation of generic geometric primitives (which can include vertices, lines, and corners), followed by the prediction of their sequence. This allows regular building contours to be constructed directly by connecting the segmented primitives in the predicted order. Building on this pipeline, we developed primitive-to-polygon using transformer (P2PFormer), which uses a transformer-based architecture to segment geometric primitives and predict their order. To enhance the segmentation of primitives, we introduce a unique representation called group queries. This representation comprises a set of queries and a single query position, which together improve the focus on multiple midpoints of primitives and their efficient linkage. Furthermore, we propose an innovative implicit update strategy for the query position embedding, aimed at sharpening the focus of queries on the correct positions and, consequently, enhancing the quality of primitive segmentation. Our experiments demonstrate that P2PFormer achieves new state-of-the-art (SOTA) performance on the WHU, CrowdAI, and WHU-Mix datasets, surpassing the previous SOTA method, PolyWorld, by 2.7 AP and 6.5 AP75 on CrowdAI, the largest of the three datasets. We intend to make the code and trained weights publicly available to promote their use and facilitate further research.
Updated: 2024-09-12