1.
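Constrained hierarchical clustering of two-dimensional ordered samples  Total citations: 4, self-citations: 0, citations by others: 4
Clustering two-dimensional ordered samples must satisfy two requirements: (1) similarity among the units within a class and dissimilarity between classes; (2) positional ordering of the units and connectivity within each class. To meet these requirements, the distance matrix between the units' observed indicators is used as the indicator matrix for clustering, and the matrix of locational links between units is used as the constraint matrix. Under the constraints given by the constraint matrix, with the maximum distance between the indicators of units in two classes taken as the inter-class similarity measure, all units are merged step by step in the indicator matrix, which yields a classification of the samples that satisfies both requirements.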
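The merging procedure described in this abstract can be illustrated with a minimal sketch. It assumes a precomputed distance matrix `dist` (the indicator matrix) and a boolean adjacency matrix `adjacent` (the constraint matrix); the function name and the rule that two classes may merge only if at least one pair of their units is adjacent are illustrative assumptions, not details taken from the paper.

```python
from itertools import combinations


def constrained_complete_linkage(dist, adjacent, n_clusters):
    """Agglomerate units until n_clusters remain, merging only connected classes.

    dist[i][j]     -- distance between the indicators of units i and j
    adjacent[i][j] -- True if units i and j are spatial neighbours (assumed input)
    """
    clusters = [{i} for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Constraint: the two classes must share at least one adjacent pair.
            if not any(adjacent[i][j] for i in clusters[a] for j in clusters[b]):
                continue
            # Complete linkage: inter-class distance is the maximum pairwise distance.
            d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        if best is None:
            break  # no admissible merge is left
        _, a, b = best
        clusters[a] |= clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)
```

For four units in a chain with indicator values 0, 4, 1, 9, asking for three classes gives `[[0], [1, 2], [3]]`, whereas dropping the constraint would first merge the non-adjacent units 0 and 2.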
2.
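A grey possibility-degree clustering model based on indicator information content  Total citations: 1, self-citations: 0, citations by others: 1
《数学的实践与认识》(Mathematics in Practice and Theory), 2020, (14)
To address the problem of determining indicator weights in the grey possibility-degree clustering method, an improved CRITIC method is used to determine the weights, and an improved grey possibility-degree clustering model is constructed. The standard deviation and correlation coefficient of the traditional CRITIC method are replaced by the coefficient of variation and the point-distance difference, which represent the contrast intensity and the conflict between indicators; a new CRITIC method is proposed on this basis. This resolves the problem that the standard deviation cannot effectively reflect the degree of variation of an indicator, and avoids the large fluctuation of the correlation coefficient when the number of indicators is small. The improved grey clustering model determines objective indicator weights from the differences in information content between indicators, which reflects the importance of indicator information and makes the weighting more scientific. Finally, a numerical example verifies the effectiveness of the improved grey clustering model.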
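For orientation, the classical CRITIC weighting that this abstract takes as its starting point can be sketched as follows. This is the traditional form (standard deviation times summed correlation conflict), not the paper's improved variant: the abstract does not define the point-distance difference, so it is not reproduced here. The function names are hypothetical, and the columns are assumed to be already normalized and non-constant.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def critic_weights(columns):
    """Classical CRITIC weights for a list of indicator columns.

    Each column holds one indicator's values over all samples, assumed to be
    normalized to a common scale and not constant.
    """
    info = []
    for j, col in enumerate(columns):
        contrast = pstdev(col)  # contrast intensity of indicator j
        conflict = sum(1 - pearson(col, other)  # conflict with the other indicators
                       for k, other in enumerate(columns) if k != j)
        info.append(contrast * conflict)  # information content of indicator j
    total = sum(info)
    return [c / total for c in info]
```

An indicator receives a larger weight when it varies more across samples and is less correlated with the other indicators.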
3.
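A clustering-based method for determining expert weights in multi-attribute group decision making  Total citations: 1, self-citations: 0, citations by others: 1
For the problem of determining expert weights in multi-attribute group decision making, this paper proposes a clustering-based method that splits each expert's weight into a between-class weight and a within-class weight, and improves both the expert clustering procedure and the calculation of the between-class weights. A compatibility matrix is built from the judgment matrices supplied by the experts and, following the principle of hierarchical clustering, is clustered to obtain a maximum-compatibility dendrogram. Comparing the distances between maximum compatibilities with a given threshold classifies the experts appropriately, which avoids the shortcoming of existing procedures that can only divide the experts into two classes. In addition, when determining between-class weights, larger classes still receive larger between-class weight coefficients, but the consistency of the attribute weights derived from the experts' judgment matrices is also introduced to reflect differences between classes. This avoids assigning identical between-class weight coefficients to classes that contain the same number of experts. The proposed method is clearly structured and simple to compute, and it yields more reasonable and accurate expert weights. Finally, a numerical example is used to verify, by comparison, the feasibility and effectiveness of the method.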
4.
5.
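Comparative analysis of clustering results with inter-class distance factors  Total citations: 4, self-citations: 0, citations by others: 4
To compare clustering results that involve inter-class distance factors, this paper proposes a spatial description of class structure and a similarity measure for the comparison, the cosine of the included angle, and derives some of its properties. Finally, Monte Carlo simulation results show that the cosine is a reasonable similarity measure for clustering results.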
6.
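This paper combines multiway normalized-cut spectral clustering, univariate GARCH models, and Granger causality tests to study, stage by stage, the clustering characteristics of volatility in the world's major stock markets over 1994-2014. First, univariate GARCH models are used to extract the volatility of each major stock market. Second, the special properties of multiway normalized-cut spectral clustering are used to characterize the number of clusters, the clustering quality, and the stability of the clustering results. Finally, Granger causality tests are used to analyze volatility spillovers between the representative markets of different clusters and between markets within the same cluster. The empirical results show that, compared with non-crisis periods, during financial crises the volatility of the major stock markets forms more clusters, the clustering quality is higher, the clustering results are relatively stable, and volatility spillovers among the major stock markets are stronger.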
7.
8.
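Combining hard clustering and fuzzy clustering: a fast two-layer FCM algorithm  Total citations: 3, self-citations: 0, citations by others: 3
The fuzzy c-means (FCM) clustering algorithm is widely used in pattern recognition, but on large data sets it requires a great deal of CPU time, which is inconvenient for users, so speeding up the algorithm is an urgent problem. The two-layer FCM clustering algorithm proposed in this paper is a fast algorithm that combines hard and fuzzy clustering: the result of hard clustering guides the initial values of the fuzzy clustering, which markedly shortens the iteration process. The two-layer FCM algorithm uses only one thirteenth of the CPU time of the FCM algorithm and is therefore of considerable practical value.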
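One plausible reading of the two-layer idea in this abstract is to run a hard c-means (k-means) pass first and hand its centers to the standard FCM updates. The sketch below follows that reading; it is an assumption about the general scheme rather than the paper's actual algorithm, and the function names, the fuzzifier `m = 2`, and the stopping tolerance are illustrative choices.

```python
import math


def kmeans(points, centers, iters=20):
    """Plain hard c-means (k-means) starting from the given centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[idx].append(p)
        # An empty group keeps its previous center.
        centers = [tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers


def fcm(points, centers, m=2.0, tol=1e-6, max_iter=100):
    """Standard fuzzy c-means updates; returns (centers, memberships, iterations)."""
    c, dims = len(centers), len(points[0])
    for it in range(1, max_iter + 1):
        u = []
        for p in points:
            d = [max(math.dist(p, v), 1e-12) for v in centers]
            # Membership of p in class i: 1 / sum_k (d_i / d_k)^(2 / (m - 1))
            u.append([1.0 / sum((d[i] / d[k]) ** (2 / (m - 1)) for k in range(c))
                      for i in range(c)])
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wj * p[k] for wj, p in zip(w, points)) / total
                for k in range(dims)))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u, it


def two_layer_fcm(points, initial_centers):
    """Hard clustering first, then FCM initialized with the hard-clustering centers."""
    return fcm(points, kmeans(points, initial_centers))
```

Because FCM starts near a good solution, it typically needs fewer iterations than it would from arbitrary starting centers; the size of the saving depends on the data.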
9.
10.
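A hybrid model of FCM and PCM can overcome the drawbacks that each has when used alone and greatly improves clustering results, but for samples whose features are not distinct the hybrid model does not cluster well. To overcome this drawback, this paper introduces Mercer kernels and proposes a new kernel-based hybrid c-means clustering model (KIPCM), in which the kernel function makes data points that are inseparable in the original space separable in the kernel space. Numerical experiments yield reasonable center values and a high correct-classification rate, confirming the feasibility and effectiveness of the algorithm.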
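Kernel-based c-means variants generally replace the Euclidean distance with a distance measured in the kernel-induced feature space, which can be computed from kernel values alone. The sketch below shows this generic kernel trick for a Gaussian kernel; it is not the KIPCM model itself, and the function names and the bandwidth `sigma` are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel, a Mercer kernel with K(x, x) = 1."""
    return math.exp(-math.dist(x, v) ** 2 / (2 * sigma ** 2))


def kernel_distance_sq(x, v, sigma=1.0):
    """Squared distance between the images of x and v in the feature space.

    ||phi(x) - phi(v)||^2 = K(x, x) - 2 K(x, v) + K(v, v) = 2 (1 - K(x, v))
    for the Gaussian kernel, since K(x, x) = K(v, v) = 1.
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))
```

This distance is bounded above by 2, so far-away points have limited influence, and its sensitivity to `sigma` is one reason kernel parameters need careful selection.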
11.
In this study, we present a comprehensive comparative analysis of kernel-based fuzzy clustering and fuzzy clustering. Kernel-based clustering has emerged as an interesting and quite visible alternative in fuzzy clustering; however, the effectiveness of this extension vis-à-vis some generic methods of fuzzy clustering has not been discussed in a complete manner, nor has the performance of clustering been quantified through a convincing comparative analysis. Our focal objective is to understand the performance gains and the importance of parameter selection for kernelized fuzzy clustering. Generic Fuzzy C-Means (FCM) and Gustafson–Kessel (GK) FCM are compared with two typical generalizations of kernel-based fuzzy clustering: one with prototypes located in the feature space (KFCM-F) and the other where the prototypes are distributed in the kernel space (KFCM-K). Both generalizations are studied when dealing with the Gaussian kernel while KFCM-K is also studied with the polynomial kernel. Two criteria are used in evaluating the performance of the clustering method and the resulting clusters, namely classification rate and reconstruction error. Through carefully selected experiments involving synthetic and Machine Learning repository (http://archive.ics.uci.edu/beta/) data sets, we demonstrate that the kernel-based FCM algorithms produce a marginal improvement over standard FCM and GK for most of the analyzed data sets. It has been observed that the kernel-based FCM algorithms are in a number of cases highly sensitive to the selection of specific values of the kernel parameters.
12.
Clustering algorithms divide up a dataset into a set of classes/clusters, where similar data objects are assigned to the same
cluster. When the boundary between clusters is ill defined, which yields situations where the same data object belongs to
more than one class, the notion of fuzzy clustering becomes relevant. In this case, each datum belongs to a given class
with some membership grade, between 0 and 1. The most prominent fuzzy clustering algorithm is the fuzzy c-means introduced
by Bezdek (Pattern recognition with fuzzy objective function algorithms, 1981), a fuzzification of the k-means or ISODATA
algorithm. On the other hand, several research issues have been raised regarding both the objective function to be minimized
and the optimization constraints, which help to identify proper cluster shape (Jain et al., ACM Computing Survey 31(3):264–323,
1999). This paper addresses the issue of clustering by evaluating the distance of fuzzy sets in a feature space. Especially,
the fuzzy clustering optimization problem is reformulated when the distance is rather given in terms of divergence distance,
which builds a bridge to the notion of probabilistic distance. This leads to a modified fuzzy clustering, which implicitly
involves the variance–covariance of input terms. The solution of the underlying optimization problem in terms of optimal solution
is determined while the existence and uniqueness of the solution are demonstrated. The performances of the algorithm are assessed
through two numerical applications. The former involves clustering of Gaussian membership functions and the latter tackles
the well-known Iris dataset. Comparisons with standard fuzzy c-means (FCM) are evaluated and discussed.
13.
In this paper, we propose a new optimization framework for improving feature selection in medical data classification. We
call this framework Support Feature Machine (SFM). The use of SFM in feature selection is to find the optimal group of features
that show strong separability between two classes. The separability is measured in terms of inter-class and intra-class distances.
The objective of SFM optimization model is to maximize the correctly classified data samples in the training set, whose intra-class
distances are smaller than inter-class distances. This concept can be incorporated with the modified nearest neighbor rule
for unbalanced data. In addition, a variation of SFM that provides the feature weights (prioritization) is also presented.
The proposed SFM framework and its extensions were tested on 5 real medical datasets that are related to the diagnosis of
epilepsy, breast cancer, heart disease, diabetes, and liver disorders. The classification performance of SFM is compared with
those of support vector machine (SVM) classification and Logical Analysis of Data (LAD), which is also an optimization-based
feature selection technique. SFM gives very good classification results, yet uses far fewer features to make the decision
than SVM and LAD. This result provides a very significant implication in diagnostic practice. The outcome of this study suggests
that the SFM framework can be used as a quick decision-making tool in real clinical settings.
14.
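Two effect indicators are selected, and the fuzzy C-means algorithm is combined with a combination weighting method to classify hard-to-recover reserves. First, based on the effect indicators, the fuzzy C-means algorithm is used to search automatically for the best classes of the reserves. Then, following the idea of minimizing the deviation between subjective and objective weights, a combination weighting model is built to determine the weights of the attribute indicators and to compute a benefit index value for the reserves, which is combined with the fuzzy C-means result to identify the class of each hard-to-recover reserve. Finally, taking an oilfield in Daqing as an example, its hard-to-recover reserves are classified, which effectively guides decisions on their rolling development.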
15.
16.
An new initialization method for fuzzy c-means algorithm  Total citations: 1, self-citations: 0, citations by others: 1
In this paper, an initialization method for the fuzzy c-means (FCM) algorithm is proposed to address two problems: clustering
performance that depends on the initial cluster centers, and the low computation speed of FCM. Grid and density information is
used to extract approximate cluster centers from the sample space. The number of approximate cluster centers then initializes
the number of classes, and the approximate centers themselves initialize the cluster centers. Experiments show that this method
improves the clustering result and shortens the clustering time.
17.
18.
Luca Scrucca 《Advances in Data Analysis and Classification》2014,8(2):147-165
The paper introduces a methodology for visualizing on a dimension reduced subspace the classification structure and the geometric characteristics induced by an estimated Gaussian mixture model for discriminant analysis. In particular, we consider the case of mixture of mixture models with varying parametrization which allow for parsimonious models. The approach is an extension of an existing work on reducing dimensionality for model-based clustering based on Gaussian mixtures. Information on the dimension reduction subspace is provided by the variation on class locations and, depending on the estimated mixture model, on the variation on class dispersions. Projections along the estimated directions provide summary plots which help to visualize the structure of the classes and their characteristics. A suitable modification of the method allows us to recover the most discriminant directions, i.e., those that show maximal separation among classes. The approach is illustrated using simulated and real data.
19.
Issam Dagher 《Fuzzy Optimization and Decision Making》2018,17(2):159-176
In this paper, we propose a new kernel-based fuzzy clustering algorithm which tries to find the best clustering results using optimal parameters of each kernel in each cluster. It is known that data with nonlinear relationships can be separated using one of the kernel-based fuzzy clustering methods. Two common fuzzy clustering approaches are: clustering with a single kernel and clustering with multiple kernels. While clustering with a single kernel doesn’t work well with “multiple-density” clusters, multiple kernel-based fuzzy clustering tries to find an optimal linear weighted combination of kernels with initial fixed (not necessarily the best) parameters. Our algorithm is an extension of the single kernel-based fuzzy c-means and the multiple kernel-based fuzzy clustering algorithms. In this algorithm, there is no need to give “good” parameters of each kernel and no need to give an initial “good” number of kernels. Every cluster will be characterized by a Gaussian kernel with optimal parameters. In order to show its effective clustering performance, we have compared it to other similar clustering algorithms using different databases and different clustering validity measures.
20.
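This paper defines a distance between samples and uses the centroid method to determine the distance between classes. A computational method for dynamic clustering is given. The inter-class distance is used to define a relative scale for assessing atmospheric particulate pollution, the relative pollution rate. From this, the relative distribution of the contents of several elements in atmospheric particulate matter in urban Harbin is obtained, together with some meaningful conclusions.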