Transportation Research Part C: Emerging Technologies (IF 7.6), Pub Date: 2024-01-15, DOI: 10.1016/j.trc.2023.104465
Chaoqun Ma, Jia Zeng, Penghui Shao, Anyong Qing, Yang Wang
The objective of unlabeled scene adaptive crowd counting (USACC) is to adapt a crowd counting model to a particular scene using only a handful of unlabeled images from that scene, rather than trying to cover all the diverse scenarios that may occur in an unknown environment at once. Solving this problem enables fast, widespread deployment of crowd counting models and mitigates the performance deterioration caused by domain shift. To tackle the USACC problem, we propose a novel method called meta-ensemble learning, which incorporates ensemble learning into the meta-learning paradigm. Specifically, we pass the input data through the stochastic network multiple times, implicitly creating an ensemble of models, and average the resulting distinct outputs into pseudo labels used to adapt the model. In each iteration of offline training, the scene-specific parameters are learned by minimizing the consistency loss between the actual predictions and the pseudo labels generated from a few unlabeled images of that scene. We then optimize the model using the remaining labeled images from the same scene to alleviate the error accumulation caused by pseudo labels, which improves the accuracy of the pseudo labels in subsequent iterations. The training process thus explicitly simulates adaptation to a particular scene at test time, so the model can adapt to the target scene at test time by minimizing the consistency loss on a handful of unlabeled images. Extensive experiments on several benchmarks demonstrate that our method surpasses both baselines and state-of-the-art methods.
Unlabeled scene adaptive crowd counting via meta-ensemble learning
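The test-time adaptation idea in the abstract (average several stochastic passes into a pseudo label, then take a gradient step on a consistency loss) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the "counter" is a toy linear model, the stochasticity is input dropout, the consistency loss is a squared error with the pseudo label held constant, and the names `stochastic_forward`, `pseudo_label`, and `adapt_step` are hypothetical. The offline meta-training loop with labeled images is not shown.

```python
import random

def stochastic_forward(weights, features, drop_prob, rng):
    """One noisy pass: inverted dropout on the inputs of a toy linear counter."""
    keep = 1.0 - drop_prob
    masked = [x / keep if rng.random() < keep else 0.0 for x in features]
    count = sum(w * x for w, x in zip(weights, masked))
    return count, masked

def pseudo_label(weights, features, drop_prob, passes, rng):
    """Average several stochastic passes (an implicit ensemble) into one target."""
    outputs = [stochastic_forward(weights, features, drop_prob, rng)[0]
               for _ in range(passes)]
    return sum(outputs) / passes

def adapt_step(weights, features, target, drop_prob, lr, rng):
    """One gradient step on the squared consistency loss (target is a constant)."""
    pred, masked = stochastic_forward(weights, features, drop_prob, rng)
    error = pred - target
    # d/dw (pred - target)^2 = 2 * error * masked_input for a linear model.
    new_weights = [w - lr * 2.0 * error * x for w, x in zip(weights, masked)]
    # Re-evaluate on the same dropout mask to compare the loss before and after.
    new_pred = sum(w * x for w, x in zip(new_weights, masked))
    return new_weights, error ** 2, (new_pred - target) ** 2

rng = random.Random(0)             # fixed seed so the sketch is repeatable
weights = [0.5, 1.0, -0.3, 0.8]    # assumed toy parameters
features = [0.9, 0.2, 0.4, 0.7]    # stand-in for one unlabeled image
target = pseudo_label(weights, features, drop_prob=0.5, passes=8, rng=rng)
weights, loss_before, loss_after = adapt_step(
    weights, features, target, drop_prob=0.5, lr=0.01, rng=rng)
```

With a small enough learning rate, the step cannot increase the consistency loss on the dropout mask it was computed for. A real crowd counter would predict density maps with a deep network, so the loss and gradient would come from an autodiff framework rather than the hand-written update above.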