SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data
ACM Transactions on Graphics (IF 7.8), Pub Date: 2023-10-31, DOI: 10.1145/3625264
Jose Luis Ponton 1, Haoran Yun 1, Andreas Aristidou 2, Carlos Andujar 1, Nuria Pelechano 1

Accurate and reliable human motion reconstruction is crucial for creating natural interactions of full-body avatars in Virtual Reality (VR) and entertainment applications. As the Metaverse and social applications gain popularity, users are seeking cost-effective solutions to create full-body animations that are comparable in quality to those produced by commercial motion capture systems. To keep such solutions affordable, however, it is important to minimize the number of sensors attached to the subject’s body. Unfortunately, reconstructing the full-body pose from sparse data is a heavily under-determined problem. Some studies that use IMU sensors struggle to reconstruct the pose because of positional drift and pose ambiguity. In recent years, some mainstream VR systems have released 6-degree-of-freedom (6-DoF) tracking devices that provide positional and rotational information. Nevertheless, most solutions for reconstructing full-body poses rely on traditional inverse kinematics (IK), which often produces non-continuous and unnatural poses. In this article, we introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from a reduced set of six tracking devices. Our system incorporates a convolution-based autoencoder that synthesizes high-quality continuous human poses by learning the human motion manifold from motion capture data. Then, we employ a learned IK component, made of multiple lightweight feed-forward neural networks, to adjust the hands and feet toward the corresponding trackers. We extensively evaluate our method on publicly available motion capture datasets and with real-time live demos. We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices, and can be used for users with different body dimensions and proportions.
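
The two-stage data flow described in the abstract (a convolutional stage that produces a full-body pose from a window of tracker data, followed by small per-limb feed-forward networks that adjust the limbs toward their trackers) can be illustrated with a toy sketch. This is not the authors' architecture: the skeleton size, window length, hidden width, limb layout, tracker ordering, and all function names below are assumptions made for illustration, and the weights are random rather than trained. Only the six trackers and their 6-DoF readings come from the abstract.

```python
import random

NUM_TRACKERS = 6          # from the abstract: six tracking devices
DOF_PER_TRACKER = 6       # 6-DoF: 3 positional + 3 rotational values
NUM_JOINTS = 22           # assumption: skeleton size is not given in the abstract
WINDOW, KERNEL, HIDDEN = 8, 3, 16   # assumptions: window, kernel width, hidden size
IN_DIM = NUM_TRACKERS * DOF_PER_TRACKER   # 36 measured values per frame
POSE_DIM = NUM_JOINTS * 3                 # 66 joint-rotation values to predict

# Hypothetical limb layout: (joint indices of the limb, index of its tracker).
LIMBS = {"left_arm": ([5, 6, 7], 1), "right_arm": ([8, 9, 10], 2),
         "left_leg": ([11, 12, 13], 3), "right_leg": ([14, 15, 16], 4)}

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random (untrained) weights, standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def conv1d(frames, kernels):
    """Valid 1D convolution over time with ReLU; one weight matrix per tap."""
    out = []
    for t in range(len(frames) - len(kernels) + 1):
        acc = [0.0] * len(kernels[0])
        for k, w in enumerate(kernels):
            acc = [a + b for a, b in zip(acc, matvec(w, frames[t + k]))]
        out.append([max(0.0, a) for a in acc])
    return out

CONV = [rand_matrix(HIDDEN, IN_DIM) for _ in range(KERNEL)]
DECODE = rand_matrix(POSE_DIM, HIDDEN)
IK_NETS = {name: rand_matrix(len(j) * 3, len(j) * 3 + DOF_PER_TRACKER)
           for name, (j, _) in LIMBS.items()}

def generate_pose(window):
    """Stage 1: convolve over the tracker window, decode the last feature."""
    return matvec(DECODE, conv1d(window, CONV)[-1])

def refine_limbs(pose, trackers):
    """Stage 2: one small network per limb adds a correction to that limb only."""
    refined = list(pose)
    for name, (joints, tracker_id) in LIMBS.items():
        idx = [3 * j + axis for j in joints for axis in range(3)]
        start = tracker_id * DOF_PER_TRACKER
        target = trackers[start:start + DOF_PER_TRACKER]
        delta = matvec(IK_NETS[name], [pose[i] for i in idx] + target)
        for i, d in zip(idx, delta):
            refined[i] += d
    return refined

# Synthetic tracker readings for a short window of frames.
window = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(WINDOW)]
coarse = generate_pose(window)
final = refine_limbs(coarse, window[-1])
print(len(final), "pose values from", IN_DIM, "measured values per frame")
```

Even in this toy setting the under-determination is visible: each frame supplies 36 measured values while the assumed skeleton needs 66, which is why the first stage relies on a learned motion prior and a temporal window rather than on a single frame.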



