MA-ST3D: Motion Associated Self-Training for Unsupervised Domain Adaptation on 3D Object Detection
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2024-10-24, DOI: 10.1109/tip.2024.3482976
Chi Zhang, Wenbo Chen, Wei Wang, Zhaoxiang Zhang
Unsupervised domain adaptation (UDA) for 3D object detectors has attracted growing attention as a way to avoid the prohibitive cost of producing the extensive 3D annotations that effective model training requires. Self-training (ST) has emerged as a simple and effective technique for UDA. The main challenge in ST-UDA for 3D object detection is refining the imprecise predictions caused by domain shift so that accurate pseudo-labels can serve as supervisory signals. This study presents a novel ST-UDA framework, motion-associated self-training for 3D object detection (MA-ST3D), which generates high-quality pseudo-labels by associating predictions across 3D point cloud sequences captured during ego-motion according to their spatial and temporal consistency. MA-ST3D maintains a global-local pathway (GLP) architecture that generates high-quality pseudo-labels by leveraging both intra-frame and inter-frame consistency along the spatial dimension of the LiDAR's ego-motion. Each pathway is equipped with its own memory module (a global memory and a local memory) to suppress the temporal fluctuation of pseudo-labels across self-training iterations. In addition, a motion-aware loss imposes differentiated constraints on pseudo-labels with different motion statuses, which mitigates the harmful spread of false-positive pseudo-labels. The method is evaluated on three representative domain adaptation tasks across established 3D benchmark datasets (Waymo, KITTI, and nuScenes). MA-ST3D achieves state-of-the-art (SOTA) performance in all evaluated UDA settings and even surpasses weakly supervised DA methods on the KITTI and nuScenes object detection benchmarks.
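To make the idea of associating predictions across frames during ego-motion more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the association rule, so the 2D pose format `(x, y, yaw)`, the greedy nearest-center matching, the function names, and the thresholds `max_dist` and `static_thresh` are all assumptions chosen for illustration. The sketch maps predicted box centers into a shared global frame using the ego pose, matches them between two consecutive frames, and labels each matched object as static or moving by its displacement.

```python
import math


def to_global(center, ego_pose):
    """Map a box center (x, y) from the sensor frame to the global frame.

    ego_pose is a hypothetical (x, y, yaw) pose of the sensor in the global frame.
    """
    cx, cy = center
    ex, ey, yaw = ego_pose
    gx = ex + cx * math.cos(yaw) - cy * math.sin(yaw)
    gy = ey + cx * math.sin(yaw) + cy * math.cos(yaw)
    return (gx, gy)


def associate(prev_centers, curr_centers, max_dist=2.0):
    """Greedily match global-frame centers of two frames, closest pairs first.

    Returns a list of (prev_index, curr_index, distance) tuples.
    max_dist is an assumed gating threshold in metres.
    """
    candidates = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(prev_centers)
        for j, c in enumerate(curr_centers)
    )
    used_prev, used_curr, matches = set(), set(), []
    for dist, i, j in candidates:
        if dist > max_dist:
            break  # all remaining candidate pairs are even farther apart
        if i in used_prev or j in used_curr:
            continue  # each prediction is matched at most once
        used_prev.add(i)
        used_curr.add(j)
        matches.append((i, j, dist))
    return matches


def motion_status(distance, static_thresh=0.5):
    """Label a matched object by its global displacement (assumed threshold)."""
    return "static" if distance < static_thresh else "moving"


# Two consecutive frames; the ego vehicle advances 2 m along x between them.
pose_0, pose_1 = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
frame_0 = [to_global(c, pose_0) for c in [(10.0, 0.0), (5.0, 5.0)]]
frame_1 = [to_global(c, pose_1) for c in [(8.0, 0.0), (4.5, 5.0)]]

for i, j, dist in associate(frame_0, frame_1):
    print(i, j, round(dist, 2), motion_status(dist))
```

In this toy example the first object keeps the same global position despite the ego-motion and is labeled static, while the second is displaced by 1.5 m and is labeled moving. A status of this kind is what a motion-aware loss could use to treat pseudo-labels differently.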
Updated: 2024-10-24