A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average |
| |
Authors: | Osamu Komori, Shinto Eguchi |
| |
Institution: | 1. Department of Computer and Information Science, Seikei University, 3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan; 2. The Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan |
| |
Abstract: | Clustering is a major unsupervised learning technique that is widely applied in data mining and statistical data analysis. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized as hard, soft, and model-based clustering, respectively. We propose a new clustering method, called Pareto clustering, based on the Kolmogorov–Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all the aforementioned clustering methods as well as maximum-entropy clustering. We introduce a probabilistic framework for the proposed method and discuss the underlying distribution that gives consistency. We build a minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare its performance with that of existing methods in simulation studies and benchmark dataset analyses to demonstrate its practical utility. |
| |
Keywords: | k-means; fuzzy c-means; Gaussian mixture model; Kolmogorov–Nagumo average; generalized energy function; Pareto distribution |