TWLip: Exploring Through-Wall Word-Level Lip Reading Based on Coherent SISO Radar, IEEE Internet of Things Journal

IEEE Internet of Things Journal (IF 8.2), Pub Date: 7-12-2024, DOI: 10.1109/jiot.2024.3427329
Dongsheng Zhu, Chong Han, Jian Guo, Lijuan Sun

Recently emerged radio-frequency (RF) lip-reading technologies exploit their independence from lighting and their ability to penetrate obstacles to broaden the applications of lip reading. Unlike vision-based lip reading, RF sensing can detect lip movements through barriers such as masks, glass, and wood. However, previous studies that use devices such as millimeter-wave radar are limited by operating frequency and power consumption, which restrict the materials that can be penetrated and the scenarios in which the technology applies. Low-frequency through-wall radar penetrates far better, but it is less sensitive to small movements, which makes lip reading challenging. Moreover, the high cost of radar equipment and the scarcity of commercially available devices hinder the technology's development. To address these challenges, we propose TWLip, a word-level lip-reading system that uses coherent single-input, single-output (SISO) through-wall radar to detect tiny lip movements behind walls. The network input is the I/Q 3D curve derived from the radar signal corresponding to each lip movement; these curves reflect the amplitude, frequency, and rotational characteristics of lip movements in the complex plane. We design IQResNet, built from 1D preactivation residual bottleneck units, to extract and classify lip-movement features from the I/Q 3D curves, and we propose a data augmentation method for radar lip reading to improve model efficacy and generalizability. We created a through-wall radar lip-reading dataset containing 20 words from 8 volunteers, totaling 9583 samples. TWLip recognized these words through a 24 cm brick wall from two meters away with 88.51% accuracy, and detailed comparative studies support the algorithm's superiority.
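
The abstract does not spell out how an I/Q 3D curve is built, so the following is only a minimal sketch of one plausible reading: each complex baseband sample becomes a point (time, I, Q), and the amplitude and unwrapped phase of the samples give the amplitude and rotation in the complex plane. The function names, the sampling rate, and the synthetic echo are all hypothetical illustrations, not the authors' preprocessing pipeline.

```python
import cmath
import math


def iq_3d_curve(samples, sample_rate_hz):
    """Map complex baseband samples to (time, I, Q) points (assumed layout)."""
    return [(n / sample_rate_hz, s.real, s.imag) for n, s in enumerate(samples)]


def amplitude_and_phase(samples):
    """Return per-sample amplitude and unwrapped phase in radians."""
    amplitudes = [abs(s) for s in samples]
    phases = []
    offset = 0.0
    previous = None
    for s in samples:
        raw = cmath.phase(s)  # wrapped to [-pi, pi]
        if previous is not None:
            step = raw - previous
            # Remove 2*pi jumps so rotation in the complex plane is continuous.
            if step > math.pi:
                offset -= 2 * math.pi
            elif step < -math.pi:
                offset += 2 * math.pi
        previous = raw
        phases.append(raw + offset)
    return amplitudes, phases


def synthetic_lip_echo(num_samples, sample_rate_hz, motion_hz=2.0, phase_swing=0.5):
    """Hypothetical toy echo: unit amplitude, phase driven by a small sinusoidal motion."""
    return [
        cmath.exp(1j * phase_swing * math.sin(2 * math.pi * motion_hz * n / sample_rate_hz))
        for n in range(num_samples)
    ]


# Toy data only; real input would be the radar's coherent I/Q output.
samples = synthetic_lip_echo(num_samples=100, sample_rate_hz=100.0)
curve = iq_3d_curve(samples, sample_rate_hz=100.0)
amplitudes, phases = amplitude_and_phase(samples)
```

In this toy case the curve traces a short arc on the unit circle as time advances, which is the kind of rotational behavior the abstract attributes to lip movements; a sequence of such points could then be fed to a 1D network such as the IQResNet the authors describe.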


Updated: 2024-08-22
