PIG: Prompt Images Guidance for Night-Time Scene Parsing
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2024-06-24, DOI: 10.1109/tip.2024.3415963
Zhifeng Xie, Rui Qiu, Sen Wang, Xin Tan, Yuan Xie, Lizhuang Ma

Night-time scene parsing aims to extract pixel-level semantic information from night images, helping downstream tasks understand how objects are distributed in a scene. Because labeled night image datasets are limited, unsupervised domain adaptation (UDA) has become the predominant method for studying night scenes. UDA typically relies on paired day-night images to guide adaptation, but this requirement complicates dataset construction and restricts generalization to night scenes in other datasets. Moreover, UDA methods, which focus on network architecture and training strategies, struggle with classes that share few similarities across domains. In this paper, we leverage Prompt Images Guidance (PIG) to enhance UDA with supplementary night knowledge. We propose a Night-Focused Network (NFNet) to learn night-specific features from both target domain images and prompt images. To generate high-quality pseudo-labels, we propose Pseudo-label Fusion via Domain Similarity Guidance (FDSG). Classes with fewer domain similarities are predicted by NFNet, which excels at parsing night features, while classes with more domain similarities are predicted by UDA, which has rich labeled semantics. Additionally, we propose two data augmentation strategies, the Prompt Mixture Strategy (PMS) and the Alternate Mask Strategy (AMS), to mitigate overfitting of NFNet to the small number of prompt images. We conduct extensive experiments on four night-time datasets: NightCity, NightCity+, Dark Zurich, and ACDC. The results indicate that utilizing PIG can enhance the parsing accuracy of UDA. The code is available at https://github.com/qiurui4shu/PIG.
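The class-wise routing described for FDSG (low-similarity classes taken from NFNet, the rest from UDA) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how domain similarity is measured or how conflicts between the two predictions are resolved. The function name `fuse_pseudo_labels`, the `low_similarity_classes` set, the per-pixel rule, and the toy class ids are all assumptions made for illustration only.

```python
from typing import List, Set

LabelMap = List[List[int]]  # H x W grid of class ids


def fuse_pseudo_labels(
    uda_pred: LabelMap,
    nfnet_pred: LabelMap,
    low_similarity_classes: Set[int],
) -> LabelMap:
    """Class-wise fusion of two per-pixel predictions.

    Assumed rule (not taken from the paper): a pixel takes the NFNet label
    when NFNet assigns it a class listed in low_similarity_classes;
    otherwise it keeps the UDA label.
    """
    if len(uda_pred) != len(nfnet_pred):
        raise ValueError("prediction maps must have the same height")
    fused: LabelMap = []
    for uda_row, nf_row in zip(uda_pred, nfnet_pred):
        if len(uda_row) != len(nf_row):
            raise ValueError("prediction maps must have the same width")
        fused.append([
            nf if nf in low_similarity_classes else uda
            for uda, nf in zip(uda_row, nf_row)
        ])
    return fused


# Toy 2x3 example with hypothetical class ids: 0 = road, 1 = sky, 2 = pole.
uda = [[0, 0, 1],
       [0, 1, 1]]
nfnet = [[0, 2, 1],
         [2, 2, 0]]
# Class 2 is treated here as a low-similarity class (an assumption).
fused = fuse_pseudo_labels(uda, nfnet, low_similarity_classes={2})
print(fused)  # [[0, 2, 1], [2, 2, 1]]
```

In this toy case, NFNet overrides UDA only where it predicts class 2. The bottom-right pixel keeps the UDA label because NFNet's prediction there (class 0) is not in the low-similarity set. For the actual fusion procedure, refer to the paper and the released code.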


