International Journal of Computer Vision (IF 11.6), Pub Date: 2024-09-19, DOI: 10.1007/s11263-024-02166-9. Authors: Anirudh S. Chakravarthy, Meghana Reddy Ganesina, Peiyun Hu, Laura Leal-Taixé, Shu Kong, Deva Ramanan, Aljosa Osep
Addressing Lidar Panoptic Segmentation (LPS) is crucial for the safe deployment of autonomous vehicles. LPS aims to recognize and segment lidar points w.r.t. a pre-defined vocabulary of semantic classes, including thing classes of countable objects (e.g., pedestrians and vehicles) and stuff classes of amorphous regions (e.g., vegetation and road). Importantly, LPS requires segmenting individual thing instances (e.g., every single vehicle). Current LPS methods make the unrealistic assumption that the semantic class vocabulary is fixed in the real open world. In fact, class ontologies usually evolve over time as robots encounter instances of novel classes that are considered unknowns w.r.t. the pre-defined class vocabulary. To address this unrealistic assumption, we study LPS in the Open World (LiPSOW): we train models on a dataset with a pre-defined semantic class vocabulary and study their generalization to a larger dataset where novel instances of thing and stuff classes can appear. This experimental setting leads to interesting conclusions. While prior art trains class-specific instance segmentation methods and obtains state-of-the-art results on known classes, methods based on class-agnostic bottom-up grouping perform favorably on classes outside of the initial class vocabulary (i.e., unknown classes). Unfortunately, these methods do not perform on par with fully data-driven methods on known classes. Our work suggests a middle ground: we perform class-agnostic point clustering and over-segment the input cloud in a hierarchical fashion, followed by binary point segment classification, akin to a Region Proposal Network (Ren et al., NeurIPS 2015). We obtain the final point cloud segmentation by computing a cut in the weighted hierarchical tree of point segments, independently of semantic classification. Remarkably, this unified approach leads to strong performance on both known and unknown classes.
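To make the "cut in the weighted hierarchical tree of point segments" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each tree node already carries an objectness score from a binary segment classifier, and that a cut is scored by its lowest-scoring segment (a worst-case criterion chosen here only for illustration; the abstract does not state the objective). The `Segment` class, the `best_cut` function, and all scores in the toy tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Segment:
    """Node in a hierarchy of point segments (hypothetical structure)."""
    point_ids: List[int]
    objectness: float  # assumed output of a binary segment classifier, in [0, 1]
    children: List["Segment"] = field(default_factory=list)


def best_cut(node: Segment) -> Tuple[float, List[Segment]]:
    """Return (worst-case objectness, segments) for the best cut at or below node.

    Assumed criterion: a cut is scored by its lowest-scoring segment. At each
    node we either keep the node as one segment or split it into the best
    cuts of its children, whichever yields the higher score.
    """
    if not node.children:
        return node.objectness, [node]
    child_cuts = [best_cut(child) for child in node.children]
    split_score = min(score for score, _ in child_cuts)
    if node.objectness >= split_score:
        return node.objectness, [node]
    split_segments = [seg for _, segs in child_cuts for seg in segs]
    return split_score, split_segments


# Toy hierarchy with made-up scores: the root over-merges everything,
# one child is a clean object, the other merges two nearby objects.
car = Segment([0, 1, 2], 0.9)
ped_a = Segment([3, 4], 0.8)
ped_b = Segment([5, 6], 0.7)
merged_peds = Segment([3, 4, 5, 6], 0.3, [ped_a, ped_b])
root = Segment([0, 1, 2, 3, 4, 5, 6], 0.1, [car, merged_peds])

score, segments = best_cut(root)
instances = [sorted(seg.point_ids) for seg in segments]
print(score, instances)
```

Note that the cut uses only the class-agnostic objectness scores. Semantic labels are never consulted, which mirrors the abstract's statement that the segmentation is obtained independently of semantic classification.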
Title: Lidar Panoptic Segmentation in the Open World