Journal of the Korean Statistical Society (IF 0.6), Pub Date: 2023-12-12, DOI: 10.1007/s42952-023-00246-z. Authors: Edward Kanuti Ngailo, Saralees Nadarajah
This paper introduces a novel approach for approximating the misclassification probabilities of the Euclidean distance classifier when the group means exhibit a bilinear structure, as in the growth curve model first proposed by Potthoff and Roy (Biometrika 51:313–326, 1964). First, by leveraging certain statistical relationships, we establish two general results for the improved Euclidean discriminant function under both weighted and unweighted growth curve mean structures. We then derive approximations for the expected misclassification probabilities with respect to the distribution of the improved Euclidean discriminant function. Additionally, we compare the misclassification probabilities of the improved Euclidean discriminant function, the standard Euclidean discriminant function, and the linear discriminant function. Notably, when the mean structure is weighted, the improved and the standard Euclidean discriminant functions classify better as the number of repeated measurements increases, since more information is acquired, whereas the linear discriminant function performs well with a smaller number of repeated measurements. Finally, we evaluate the accuracy of the suggested approximations by Monte Carlo simulations.
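As a rough illustration of the quantities named in the abstract, the sketch below classifies a vector of repeated measurements by Euclidean distance to two group means and estimates one misclassification probability by Monte Carlo simulation. It is not the paper's improved (bias-corrected) discriminant function or its approximation. Everything specific here is an assumption made for illustration: the group means are treated as known rather than estimated, each mean follows a linear growth curve (intercept plus slope times time), the covariance is the identity, and the function names and numeric values are hypothetical. Under those assumptions the error probability has the closed form Φ(−Δ/2), where Δ is the Euclidean distance between the two means, which gives a check on the simulation.

```python
import math
import random


def growth_mean(times, intercept, slope):
    """Mean vector of a linear growth curve: intercept + slope * t per time point."""
    return [intercept + slope * t for t in times]


def euclidean_rule(x, mean1, mean2):
    """Return 1 if x is at least as close (Euclidean distance) to mean1, else 2."""
    d1 = sum((xi - m) ** 2 for xi, m in zip(x, mean1))
    d2 = sum((xi - m) ** 2 for xi, m in zip(x, mean2))
    return 1 if d1 <= d2 else 2


def mc_misclassification(mean1, mean2, n_draws, rng):
    """Monte Carlo estimate of P(assigned to group 2 | x drawn from group 1)."""
    errors = 0
    for _ in range(n_draws):
        # Independent unit-variance normal errors (identity covariance, an assumption).
        x = [rng.gauss(m, 1.0) for m in mean1]
        if euclidean_rule(x, mean1, mean2) == 2:
            errors += 1
    return errors / n_draws


def exact_misclassification(mean1, mean2):
    """Phi(-Delta / 2), valid only for known means and identity covariance."""
    delta = math.sqrt(sum((a - b) ** 2 for a, b in zip(mean1, mean2)))
    return 0.5 * math.erfc(delta / (2.0 * math.sqrt(2.0)))


# Hypothetical setup: four repeated measurements, two linear growth profiles.
times = [1, 2, 3, 4]
mean1 = growth_mean(times, 0.0, 0.5)
mean2 = growth_mean(times, 0.5, 0.7)

rng = random.Random(0)  # fixed seed so the run is reproducible
estimate = mc_misclassification(mean1, mean2, 20000, rng)
exact = exact_misclassification(mean1, mean2)
print(f"Monte Carlo: {estimate:.4f}, closed form: {exact:.4f}")
```

Extending `times` in this toy setup increases the distance between the two mean vectors and so lowers the error probability, which loosely mirrors the abstract's remark that more repeated measurements help the Euclidean rules. The paper's actual setting, with estimated parameters and weighted mean structures, is not reproduced here.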
Title: Classification of repeated measurements using bias-corrected Euclidean distance discriminant function