Assessing and Quantifying Clusteredness: The OPTICS Cordillera
Authors: Thomas Rusch, Kurt Hornik, Patrick Mair
Affiliation: 1. Competence Center for Empirical Research Methods, WU (Vienna University of Economics and Business), Vienna, Austria; 2. Institute for Statistics and Mathematics, WU (Vienna University of Economics and Business), Vienna, Austria; 3. Department of Psychology, Harvard University, Cambridge, MA
Abstract: This article provides a framework for assessing and quantifying “clusteredness” of a data representation. Clusteredness is a global univariate property defined as a layout diverging from equidistance of points to the closest neighboring point set. The OPTICS algorithm encodes the global clusteredness as a pair of clusteredness-representative distances and an algorithmic ordering. We use this to construct an index for quantification of clusteredness, coined the OPTICS Cordillera, as the norm of subsequent differences over the pair. We provide lower and upper bounds and a normalization for the index. We show that the index simultaneously captures important aspects of clusteredness such as cluster compactness, cluster separation, and number of clusters. The index can be used as a goodness-of-clusteredness statistic, as a function over a grid, or to compare different representations. For illustration, we apply our suggestion to dimensionality-reduced 2D representations of Californian counties with respect to 48 climate-change-related variables. Online supplementary material is available (including an R package, the data, and additional mathematical details).
Keywords: Cluster analysis; Dimensionality reduction; Index; Perception
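
The abstract describes the index as a norm of subsequent differences over the distances that OPTICS produces in its ordering. The following is a minimal Python sketch of that idea only, not the authors' R package or their exact definition: the function name `raw_cordillera`, the parameters `q` and `d_max`, and the capping of undefined (infinite) distances at `d_max` are all assumptions made for illustration, and the normalization and bounds mentioned in the abstract are not reproduced here.

```python
import math


def raw_cordillera(reachability, q=1.0, d_max=None):
    """Unnormalized q-norm of successive differences (hypothetical helper).

    `reachability` is assumed to be a list of distances already arranged
    in the OPTICS ordering; entries may be math.inf where undefined.
    """
    finite = [r for r in reachability if not math.isinf(r)]
    if not finite:
        raise ValueError("need at least one finite distance")
    if d_max is None:
        # Assumption: cap undefined distances at the largest finite one.
        d_max = max(finite)
    capped = [min(r, d_max) for r in reachability]
    # Absolute differences between neighbours in the ordering.
    diffs = [abs(b - a) for a, b in zip(capped, capped[1:])]
    return sum(d ** q for d in diffs) ** (1.0 / q)


# A flat profile (every point equally far from its neighbours) gives zero.
flat = raw_cordillera([math.inf, 1.0, 1.0, 1.0, 1.0])

# A profile with low valleys and a high peak between them gives a larger value.
peaked = raw_cordillera([math.inf, 0.2, 0.2, 2.0, 0.2, 0.2])
```

Under these assumptions, an equidistant layout yields a value of zero, while compact groups separated by large gaps produce large jumps in the profile and hence a larger value, which is consistent with the abstract's description of clusteredness as divergence from equidistance.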