Multimedia Tools and Applications (IF 3.0), Pub Date: 2020-09-18, DOI: 10.1007/s11042-020-09829-y
Abul Abbas Barbhuiya, Ram Kumar Karsh, Rahul Jain
Hand gestures have been one of the most prominent means of communication since the beginning of the human era. Hand gesture recognition makes human-computer interaction (HCI) more convenient and flexible, so identifying each character correctly is important for smooth, error-free HCI. A literature survey reveals that most existing hand gesture recognition (HGR) systems consider only a few simple, easily discriminated gestures when reporting recognition performance. This paper applies deep learning-based convolutional neural networks (CNNs) to the robust modeling of static signs in the context of sign language recognition. CNNs are employed for HGR with both the alphabets and the numerals of American Sign Language (ASL) considered simultaneously. The pros and cons of CNNs used for HGR are also highlighted. The CNN architectures used for classification are based on modified AlexNet and modified VGG16 models. Modified pre-trained AlexNet and modified pre-trained VGG16 architectures are used for feature extraction, followed by a multiclass support vector machine (SVM) classifier. The results are evaluated using features from different layers to find the best recognition performance. To examine the accuracy of the HGR schemes, both leave-one-subject-out and random 70–30 cross-validation were adopted. The work also reports the recognition accuracy of each character and its similarity to closely resembling gestures. The experiments are performed on a simple CPU system rather than a high-end GPU system to demonstrate the cost-effectiveness of this work. The proposed system achieves a recognition accuracy of 99.82%, which is better than some state-of-the-art methods.
Title (translated from the Chinese): CNN based feature extraction and classification for sign language
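The two evaluation protocols named in the abstract, leave-one-subject-out and a random 70–30 split, can be illustrated with a minimal sketch. The abstract does not say how the splits were implemented, so the function names, the fixed seed, and the toy subject labels below are all assumptions made for illustration, not the authors' code. The sketch covers only the index bookkeeping; feature extraction and the SVM are left out.

```python
import random
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        # Train on every sample that does not belong to the held-out subject.
        train_idx = sorted(
            i for s, idxs in by_subject.items() if s != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def random_70_30_split(n_samples, seed=0):
    """Shuffle sample indices and split them 70% train / 30% test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = (7 * n_samples) // 10
    return indices[:cut], indices[cut:]


# Hypothetical toy data: 5 subjects with 4 images each (not the paper's dataset).
subject_ids = [s for s in range(5) for _ in range(4)]

folds = list(leave_one_subject_out(subject_ids))
train_idx, test_idx = random_70_30_split(len(subject_ids))
```

Leave-one-subject-out tests on a signer the classifier has never seen, so it is usually the stricter of the two protocols. A random 70–30 split can place images of the same signer in both the training and the test set.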