International Journal of Computer Vision (IF 11.6), Pub Date: 2024-10-06, DOI: 10.1007/s11263-024-02235-z
Bencheng Liao, Shaoyu Chen, Yunchi Zhang, Bo Jiang, Qian Zhang, Wenyu Liu, Chang Huang, Xinggang Wang
High-definition (HD) maps provide abundant and precise static environmental information about the driving scene, serving as a fundamental and indispensable component for planning in autonomous driving systems. In this paper, we present Map TRansformer, an end-to-end framework for online vectorized HD map construction. We propose a unified permutation-equivalent modeling approach, i.e., modeling each map element as a point set with a group of equivalent permutations, which accurately describes the shape of the map element and stabilizes the learning process. We design a hierarchical query embedding scheme to flexibly encode structured map information and perform hierarchical bipartite matching for map element learning. To speed up convergence, we further introduce auxiliary one-to-many matching and dense supervision. The proposed method copes well with map elements of arbitrary shape. It runs at real-time inference speed and achieves state-of-the-art performance on both the nuScenes and Argoverse2 datasets. Abundant qualitative results show stable and robust map construction quality in complex and varied driving scenes. Code and more demos are available at https://github.com/hustvl/MapTR to facilitate further studies and applications.
MapTRv2: An End-to-End Framework for Online Vectorized HD Map Construction
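The permutation-equivalent modeling mentioned in the abstract can be illustrated with a small NumPy sketch. It is not taken from the MapTR repository. It assumes that an open polyline has two equivalent point orderings (forward and reversed) and that a closed polygon with N points has 2N (any start vertex, either direction). The function names `equivalent_permutations` and `permutation_invariant_l1` are hypothetical, and the mean L1 distance is chosen here only for illustration.

```python
import numpy as np


def equivalent_permutations(num_points: int, closed: bool) -> np.ndarray:
    """Return index arrays for every equivalent ordering of a point set.

    Assumption (not stated in the abstract): an open polyline has 2 orderings,
    a closed polygon has 2 * num_points orderings.
    """
    base = np.arange(num_points)
    if not closed:
        # Forward and reversed traversal describe the same polyline.
        return np.stack([base, base[::-1]])
    # Any vertex can be the start point, in either traversal direction.
    forward = [np.roll(base, -s) for s in range(num_points)]
    backward = [np.roll(base[::-1], -s) for s in range(num_points)]
    return np.stack(forward + backward)


def permutation_invariant_l1(pred: np.ndarray, gt: np.ndarray, closed: bool):
    """Smallest mean L1 distance between pred and any equivalent ordering of gt."""
    perms = equivalent_permutations(len(gt), closed)  # shape (P, N)
    candidates = gt[perms]                            # shape (P, N, 2)
    # Per-point L1 distance, averaged over the N points of each candidate.
    costs = np.abs(candidates - pred[None]).sum(axis=-1).mean(axis=-1)
    best = int(np.argmin(costs))
    return float(costs[best]), perms[best]


# A unit square, and the same square listed from another vertex in reverse.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
pred = np.roll(square, -2, axis=0)[::-1]

fixed_order_cost = float(np.abs(pred - square).sum(axis=-1).mean())
cost, order = permutation_invariant_l1(pred, square, closed=True)
print(fixed_order_cost, cost, order)
```

With a fixed point order, the prediction above is penalized even though it traces the same square. Taking the minimum over the equivalent orderings removes that ambiguity, which is the intuition behind the claim that this modeling stabilizes learning.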