Multimedia Tools and Applications (IF 3.0), Pub Date: 2024-03-08, DOI: 10.1007/s11042-024-18691-1, Upendra Kumar
This work demonstrates that classification over a large number of classes can be handled with methods inspired by human cognition. Cognition-based techniques were employed for both feature extraction (a self-similarity feature, the Intensity Level Multi Fractal Dimension, ILMFD) and classification (a decision tree clustering based multi-level artificial neural network classifier, MLANN-DTC) to implement a facial-recognition-based object detection system. The DTC-based approach reduces the time spent searching the class space and leaves each classifier with only a small subset of the full set of classes to handle. It also mimics the fast recognition capability of humans. Two databases were used in the experiments: our own collection of facial images taken from rotation-based video clips (117 persons, 40 facial images per person), named the NS database, and the standard ORL database (40 persons, 10 facial images per person). In the pre-processing step, the facial region of each image was segmented using a model based on the context window based texture of pixels (CWTP) and a back-propagation neural network (BPNN), and a scale- and rotation-independent ILMFD feature was then computed from each segmented image. A combination of K-means and hierarchical clustering was used to build super classes. The data of all classes were distributed among 6 super classes (heuristically chosen) for the NS database and 3 for the ORL database, according to their similarity in ILMFD features. Multi-level ANN models were employed for all super classes, and their classification results were then fed into a decision clustering based model to obtain fine-tuned results, which showed a significant improvement in classification efficiency.
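The super-class idea described above can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not say how K-means and hierarchical clustering are combined, so the sketch assumes plain centroid-linkage agglomerative merging of per-class mean feature vectors, and it uses nearest-centroid lookups as a stand-in for the ANN classifiers at both levels. All function names and the two-dimensional toy features are hypothetical.

```python
import math

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def build_super_classes(class_means, n_super):
    """Merge the two closest groups of classes until n_super groups remain.

    class_means maps a class label to its mean feature vector.
    Assumption: centroid linkage stands in for the K-means plus
    hierarchical clustering combination mentioned in the abstract.
    """
    if n_super < 1:
        raise ValueError("n_super must be at least 1")
    groups = [[label] for label in sorted(class_means)]
    while len(groups) > n_super:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                ci = mean([class_means[l] for l in groups[i]])
                cj = mean([class_means[l] for l in groups[j]])
                d = math.dist(ci, cj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups

def classify(x, groups, class_means):
    """Two-level decision: nearest super class first, then nearest class in it.

    Assumption: nearest-centroid lookups replace the ANN at each level.
    """
    centroids = [mean([class_means[l] for l in g]) for g in groups]
    g = min(range(len(groups)), key=lambda k: math.dist(x, centroids[k]))
    return min(groups[g], key=lambda l: math.dist(x, class_means[l]))

# Hypothetical toy data: six classes described by 2-D mean features.
class_means = {
    "A": [0.0, 0.0], "B": [0.2, 0.1],
    "C": [5.0, 5.0], "D": [5.1, 4.8],
    "E": [10.0, 0.0], "F": [9.8, 0.3],
}
groups = build_super_classes(class_means, n_super=3)
print(groups)                                        # [['A', 'B'], ['C', 'D'], ['E', 'F']]
print(classify([5.08, 4.85], groups, class_means))   # D
```

In this toy run the second-level lookup compares the sample against only the two classes in its super class instead of all six, which is the search-space reduction the abstract attributes to the DTC approach.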
This approach relies on the central tendency of the largest cluster to infer the actual class from the multiple decisions obtained for multiple input samples of the same class. The proposed MLANN-DTC model achieved classification efficiencies (± standard deviation) of 89.542 ± 1.167% and 87.098 ± 2.066% for single-input decisions, and 95.042 ± 0.719% and 89 ± 2.549% for group-based decisions (decision clustering), on the NS and ORL databases respectively. These improved classification results motivate its application to other object recognition and classification problems. The basic idea of this work also supports better handling of classification problems that involve a large number of classes.
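The group-based decision can be sketched as follows, under the assumption that "the largest cluster" of decisions means the most frequent predicted label among several images of the same person. The abstract does not give the exact clustering rule, so this is only a majority-vote approximation; the function name and the labels in the example are hypothetical.

```python
from collections import Counter

def group_decision(predictions):
    """Pick the label forming the largest cluster among per-image predictions.

    Returns the winning label and the fraction of predictions that agree
    with it. Ties are resolved in favour of the label seen first.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)

# Five single-image decisions for the same (hypothetical) person.
print(group_decision(["p17", "p17", "p03", "p17", "p42"]))  # ('p17', 0.6)
```

Two of the five single-image decisions in the example are wrong, yet the group decision is still correct. This is the kind of effect that would explain a group-based efficiency above the single-input efficiency, as the figures above report.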
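Title: Object recognition using cognition-based decision tree clustering in a multi-level artificial neural network classifier, with self-similarity as the feature criterion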