A Memory-Assisted Knowledge Transfer Framework with Curriculum Anticipation for Weakly Supervised Online Activity Detection
International Journal of Computer Vision (IF 11.6) Pub Date: 2024-10-28, DOI: 10.1007/s11263-024-02279-1
Tianshan Liu, Kin-Man Lam, Bing-Kun Bao
As a crucial topic in high-level video understanding, weakly supervised online activity detection (WS-OAD) involves identifying ongoing behaviors moment by moment in streaming videos, trained solely with cheap video-level annotations. It is an inherently challenging task, as it requires jointly addressing the entangled issues of weakly supervised settings and online constraints. In this paper, we tackle the WS-OAD task from a knowledge-distillation (KD) perspective, training an online student detector to distill dual-level knowledge from a weakly supervised offline teacher model. To guarantee the completeness of knowledge transfer, we improve the vanilla KD framework in two respects. First, we introduce an external memory bank that maintains long-term activity prototypes and serves as a bridge to align the activity semantics learned by the offline teacher and the online student. Second, to compensate for the missing context of the unseen near future, we leverage a curriculum-learning paradigm that gradually trains the online student detector to anticipate future activity semantics. By dynamically scheduling the auxiliary future states provided to it, the online detector progressively distills contextual information from the offline model in an easy-to-hard course. Extensive experimental results on three public datasets demonstrate the superiority of our proposed method over competing methods.
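To make the dual-level distillation concrete, the following PyTorch sketch shows one plausible reading of the framework: the student mimics the teacher's per-frame activity scores (prediction level), while a shared external memory of class prototypes aligns the two models' feature semantics (semantic level). All module names, update rules, and shapes here are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of memory-assisted dual-level distillation (assumed design).
import torch
import torch.nn.functional as F

class PrototypeMemory(torch.nn.Module):
    """External memory bank holding one long-term prototype per activity class."""
    def __init__(self, num_classes: int, dim: int, momentum: float = 0.99):
        super().__init__()
        self.register_buffer(
            "prototypes", F.normalize(torch.randn(num_classes, dim), dim=1))
        self.momentum = momentum

    @torch.no_grad()
    def update(self, feats: torch.Tensor, labels: torch.Tensor) -> None:
        # EMA update of class prototypes from teacher features (assumed scheme);
        # feats: (N, dim), labels: (N,) frame-level pseudo-labels.
        for c in labels.unique():
            mean_feat = F.normalize(feats[labels == c].mean(0), dim=0)
            self.prototypes[c] = F.normalize(
                self.momentum * self.prototypes[c]
                + (1 - self.momentum) * mean_feat, dim=0)

    def affinity(self, feats: torch.Tensor) -> torch.Tensor:
        # Cosine similarity of frame features to every activity prototype.
        return F.normalize(feats, dim=1) @ self.prototypes.t()

def dual_level_kd_loss(student_logits, teacher_logits,
                       student_feats, teacher_feats,
                       memory: PrototypeMemory, tau: float = 2.0):
    # Level 1: prediction-level distillation on per-frame class scores.
    pred_kd = F.kl_div(F.log_softmax(student_logits / tau, dim=-1),
                       F.softmax(teacher_logits / tau, dim=-1),
                       reduction="batchmean") * tau ** 2
    # Level 2: semantic alignment through the shared prototype memory, which
    # bridges offline (full-video) and online (causal) feature spaces.
    sem_kd = F.mse_loss(memory.affinity(student_feats),
                        memory.affinity(teacher_feats).detach())
    return pred_kd + sem_kd
```

The memory bank decouples the alignment target from either backbone: both models are compared against the same slowly updated prototypes, so the online student can match the teacher's semantics without seeing the teacher's full-video context directly.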
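The curriculum anticipation component can likewise be sketched. Under the assumed schedule below, the student initially receives genuine future features as auxiliary states and, as training proceeds, an increasing fraction of them is replaced by its own anticipated features, yielding an easy-to-hard course; the linear schedule and the mixing scheme are hypothetical, not taken from the paper.

```python
# Hedged sketch of curriculum-scheduled auxiliary future states.
import torch

def future_keep_ratio(epoch: int, total_epochs: int) -> float:
    """Fraction of real future states exposed to the student, decaying 1.0 -> 0.0."""
    return max(0.0, 1.0 - epoch / total_epochs)

def mix_future_states(real_future: torch.Tensor,
                      anticipated: torch.Tensor,
                      keep_ratio: float) -> torch.Tensor:
    # real_future / anticipated: (T_future, dim) features for steps t+1..t+T.
    # Randomly replace (1 - keep_ratio) of the real future steps with the
    # student's own anticipated features, easing from easy to hard.
    mask = (torch.rand(real_future.size(0), 1) < keep_ratio).float()
    return mask * real_future + (1.0 - mask) * anticipated

# Example usage inside a training loop (names hypothetical):
#   keep = future_keep_ratio(epoch, num_epochs)
#   future = mix_future_states(teacher_future_feats, student_anticipated, keep)
#   # The mixed future states then feed the contextual distillation loss.
```

At deployment, no real future is available, so a schedule that ends at a keep ratio of zero leaves the student relying entirely on its own anticipation, matching the online constraint the abstract describes.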