Clustering methods: To optimize or to not optimize?



Psychological Methods (IF 7.6), Pub Date: 2024-09-12, DOI: 10.1037/met0000688
Michael Brusco, Douglas Steinley, Ashley L Watts


Many clustering problems are associated with a particular objective criterion that is sought to be optimized. There are often several methods that can be used to tackle the optimization problem, and one or more of them might guarantee a globally optimal solution. However, it is quite possible that, relative to one or more suboptimal solutions, a globally optimal solution might be less interpretable from the standpoint of psychological theory or be less in accordance with some known (i.e., true) cluster structure. For example, in simulation experiments, it has sometimes been observed that there is not a perfect correspondence between the optimized clustering criterion and recovery of the underlying known cluster structure. This can lead to the misconception that clustering methods with a tendency to produce suboptimal solutions might, in some instances, be preferable to superior methods that provide globally optimal (or at least better locally optimal) solutions. In this article, we present results from simulation studies in the context of K-median clustering where departure from global optimality was carefully controlled. Although the results showed that suboptimal solutions sometimes produced marginally better recovery for experimental cells where the known cluster structure was less well-defined, capriciously accepting inferior solutions is an unwise practice. However, there are instances in which some sacrifice in the optimization criterion value to meet certain desirable constraints or to improve the value of one or more other relevant criteria is principled. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
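The distinction the abstract draws between a globally optimal solution and a suboptimal one can be made concrete with a toy example. The sketch below is not the authors' simulation design. It assumes an exemplar-based reading of K-median clustering (each cluster center is one of the observed objects) and Manhattan distance, neither of which is specified in the abstract, and the data and function names are hypothetical. Exhaustive enumeration guarantees the global optimum here only because the dataset is tiny.

```python
from itertools import combinations

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]


def manhattan(a, b):
    """Manhattan (city-block) distance between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmedian_cost(points, exemplars):
    """Criterion value: sum of distances from each point to its nearest exemplar.

    `exemplars` is a tuple of indices into `points` (an assumption of this sketch).
    """
    return sum(min(manhattan(p, points[e]) for e in exemplars) for p in points)


def assign(points, exemplars):
    """Label each point with the index of its nearest exemplar."""
    return [min(exemplars, key=lambda e: manhattan(p, points[e])) for p in points]


def brute_force_kmedian(points, k):
    """Enumerate every set of k exemplars and return the one with the lowest cost."""
    candidates = combinations(range(len(points)), k)
    best = min(candidates, key=lambda ex: kmedian_cost(points, ex))
    return best, kmedian_cost(points, best)


best, best_cost = brute_force_kmedian(points, k=2)
suboptimal = (0, 1)  # both exemplars drawn from the same group
suboptimal_cost = kmedian_cost(points, suboptimal)

print("optimal exemplars:", best, "cost:", best_cost)
print("suboptimal exemplars:", suboptimal, "cost:", suboptimal_cost)
print("optimal labels:", assign(points, best))
```

In this well-separated case the lowest criterion value also recovers the two groups exactly. The abstract's point concerns less well-defined structures, where a slightly worse criterion value can occasionally coincide with marginally better recovery.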


Updated: 2024-09-12
