

Self-Supervised Sub-Action Parsing Network for Semi-Supervised Action Quality Assessment
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2024-10-07, DOI: 10.1109/tip.2024.3468870
Kumie Gedamu, Yanli Ji, Yang Yang, Jie Shao, Heng Tao Shen

Semi-supervised Action Quality Assessment (AQA), which uses limited labeled samples and massive unlabeled samples to achieve high-quality assessment, is an attractive but challenging task. The main challenge lies in how to exploit solid and consistent representations of action sequences to bridge labeled and unlabeled samples in semi-supervised AQA. To address this issue, we propose a Self-supervised sub-Action Parsing Network (SAP-Net) that employs a teacher-student network structure to learn consistent semantic representations between labeled and unlabeled samples for semi-supervised AQA. The teacher branch performs actor-centric region detection, generates high-quality pseudo-labels, and assists the student branch in learning discriminative action features. We further design a self-supervised sub-action parsing solution to locate and parse fine-grained sub-action sequences. Then, we present group contrastive learning with pseudo-labels to capture consistent motion-oriented action features in the two branches. We evaluate the proposed SAP-Net on four public datasets: MTL-AQA, FineDiving, Rhythmic Gymnastics, and FineFS. The experimental results show that our approach outperforms state-of-the-art semi-supervised methods by a significant margin.
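
The abstract gives no implementation details, so the following is only a rough Python sketch of the general ideas it names: a teacher-student pair, pseudo-labels, and a contrastive loss over groups. The EMA teacher update, the equal-width binning of scores into groups, the cosine-similarity InfoNCE-style loss, and every function name and parameter value here are assumptions made for illustration; none of them is taken from SAP-Net itself.

```python
import math


def ema_update(teacher, student, momentum=0.99):
    """Assumed update rule: teacher weights track the student via an EMA."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def score_to_group(score, num_groups=4, low=0.0, high=100.0):
    """Assumed grouping: bin a (pseudo-)score into equal-width groups."""
    frac = (score - low) / (high - low)
    return min(num_groups - 1, max(0, int(frac * num_groups)))


def cosine(u, v):
    """Cosine similarity of two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def group_contrastive_loss(student_feats, teacher_feats, groups, temperature=0.1):
    """InfoNCE-style loss where teacher features of the same group are positives.

    student_feats[i] and teacher_feats[i] describe the same sample, so every
    anchor has at least one positive (its own teacher feature).
    """
    total = 0.0
    for i, anchor in enumerate(student_feats):
        logits = [cosine(anchor, t) / temperature for t in teacher_feats]
        log_denom = math.log(sum(math.exp(x) for x in logits))
        positives = [j for j, g in enumerate(groups) if g == groups[i]]
        # Average negative log-probability over all positives of this anchor.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
    return total / len(student_feats)


# Toy usage: four samples whose pseudo-scores fall into two groups.
pseudo_scores = [12.0, 18.0, 80.0, 95.0]
groups = [score_to_group(s) for s in pseudo_scores]
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss = group_contrastive_loss(features, features, groups)
```

In this toy setup the loss is small when samples in the same score group have similar features, and it grows if the group assignment is scrambled, which is the behavior a group-level contrastive objective is meant to reward.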

Updated: 2024-10-07
