Similar literature
20 similar documents found.
1.
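Constrained hierarchical clustering of two-dimensional ordered samples   Total citations: 4, self-citations: 0, citations by others: 4
Clustering two-dimensional ordered samples must satisfy two requirements: (1) similarity among the units within a class and dissimilarity between classes; (2) ordering of the units by position and connectivity within each class. To meet these requirements, the distance matrix between the observed indicators of the units serves as the indicator matrix for clustering, and the matrix of locational links between units serves as the constraint matrix. Subject to the constraints given by the constraint matrix, and taking the maximum distance between the indicators of units in two classes as the between-class similarity measure, the units are merged step by step in the indicator matrix until all of them are assigned to classes, which yields a classification that satisfies both requirements.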
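The merging procedure in the abstract above can be illustrated with a minimal sketch. It assumes that two classes may be merged only if at least one pair of their units is linked in the constraint matrix, and it uses the maximum pairwise distance (complete linkage) as the between-class measure. The function name `constrained_complete_linkage`, its parameters, and the toy data are hypothetical; the paper's exact procedure may differ.

```python
def constrained_complete_linkage(dist, adjacent, n_clusters):
    """Merge adjacent clusters, smallest complete-linkage distance first.

    dist      -- symmetric matrix (list of lists) of distances between units
    adjacent  -- symmetric 0/1 matrix; 1 means two units are neighbours
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Constraint (assumed form): some pair of units must be neighbours.
                if not any(adjacent[i][j] for i in clusters[a] for j in clusters[b]):
                    continue
                # Complete linkage: largest distance between members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best is None:  # no adjacent pair of clusters is left
            break
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Toy example: four units on a line (0-1-2-3) with one indicator each.
values = [0, 10, 2, 11]
dist = [[abs(p - q) for q in values] for p in values]
chain = [[1 if abs(i - j) == 1 else 0 for j in range(4)] for i in range(4)]
print(constrained_complete_linkage(dist, chain, 2))
```

In the toy data, units 1 and 3 have the closest indicator values but are not neighbours, so the constraint prevents them from being merged directly.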

2.
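A grey possibility-degree clustering model based on indicator information content   Total citations: 1, self-citations: 0, citations by others: 1
To address the problem of determining indicator weights in the grey possibility-degree clustering method, an improved CRITIC method is used to determine the weights, and an improved grey possibility-degree clustering model is constructed. The coefficient of variation and the point-distance difference replace the standard deviation and the correlation coefficient of the traditional CRITIC method as measures of the contrast intensity and the conflict between indicators, which gives a new CRITIC method. This addresses the standard deviation's failure to reflect the degree of variation of an indicator effectively, and avoids the large fluctuation of the correlation coefficient when the number of indicators is small. The improved grey clustering model determines objective indicator weights from the differences in information content among indicators, which reflects the importance of indicator information and makes weight determination more scientific. Finally, a numerical example verifies the effectiveness of the improved grey clustering model.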

3.
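A clustering-based method for determining expert weights in multi-attribute group decision making   Total citations: 1, self-citations: 0, citations by others: 1
For the problem of determining expert weights in multi-attribute group decision making, this paper proposes a clustering-based method that splits expert weights into between-class and within-class weights, and improves both the expert clustering procedure and the calculation of between-class weights. A compatibility matrix is built from the judgment matrices supplied by the experts and is clustered using the principle of hierarchical clustering, which gives a maximum-compatibility dendrogram. Comparing the distances between maximum compatibilities with a given threshold divides the experts into suitable classes, which avoids the limitation of existing procedures that can only split experts into two classes. In addition, when between-class weights are determined, larger classes still receive larger between-class weight coefficients, but the consistency of the attribute weights derived from the experts' judgment matrices is also introduced to reflect differences between classes. This avoids assigning the same between-class weight coefficient to several classes that happen to contain the same number of experts. The proposed method has a clear structure and is simple to compute, and it makes the resulting expert weights more reasonable and accurate. Finally, a numerical example is used for comparison and verifies the feasibility and effectiveness of the method.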

4.
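Based on an orthogonal function system and the FCM algorithm, a new method for clustering time series is proposed. The method first maps a time series of length n into the L_2 space through a nonlinear mapping, and then obtains the similarity between time series by computing the distance between the corresponding functions. On this basis, the time series are clustered with the FCM algorithm. The method overcomes the computational difficulty that the high dimensionality of time series brings to clustering. Experimental results show that, for high-dimensional time series, the method still clusters well at a compression rate of 80%.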

5.
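Comparative analysis of clustering results with between-class distance factors   Total citations: 4, self-citations: 0, citations by others: 4
For comparing clustering results that involve between-class distance factors, this paper proposes a spatial description of the class structure and a measure of similarity, the cosine of the included angle, and derives some of its properties. Finally, Monte Carlo simulation results show that the cosine of the included angle is a reasonable similarity measure for clustering results.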

6.
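苏木亚. 《运筹与管理》 (Operations Research and Management Science), 2017, 26(11): 134-144
This paper combines multiway normalized-cut spectral clustering, univariate GARCH models, and Granger causality tests to study, stage by stage, the clustering characteristics of the volatility of major global stock markets over 1994-2014. First, univariate GARCH models are used to extract the volatility of each major stock market. Second, special properties of the multiway normalized-cut spectral clustering method are used to characterize the number of clusters, the clustering quality, and the stability of the clustering results for these volatilities. Finally, Granger causality tests are used to analyze volatility spillover effects between the representative markets of different clusters and between markets within the same cluster. The empirical results show that, compared with non-crisis periods, during financial crises the volatilities of major global stock markets form more clusters, with higher clustering quality and relatively stable clustering results, and the volatility spillover effects among these markets are stronger.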

7.
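Fuzzy clustering analysis based on AFS topology and AFCM   Total citations: 1, self-citations: 0, citations by others: 1
Based on an analysis of the AFS method and the AFCM algorithm, a new fuzzy clustering algorithm is designed. It first applies AFS topology theory to compute relative distances between data points, and then uses these relative distances in an improved AFCM algorithm; clustering experiments were carried out. The experimental results show that this algorithm outperforms the traditional HCM and FCM clustering algorithms, and that the method can be applied to clustering analysis involving Boolean values or fuzzy concepts.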

8.
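Combining hard clustering and fuzzy clustering: a fast two-layer FCM algorithm   Total citations: 3, self-citations: 0, citations by others: 3
The fuzzy c-means (FCM) clustering algorithm is widely used in pattern recognition, but on large data sets it requires a great deal of CPU time, which is inconvenient for users, so speeding up the algorithm is an urgent problem. The two-layer FCM clustering algorithm proposed in this paper is a fast algorithm that combines hard clustering and fuzzy clustering: the result of hard clustering guides the initial values of fuzzy clustering, which markedly shortens the iteration process. The two-layer FCM algorithm uses only one thirteenth of the CPU time of the FCM algorithm and is therefore of great practical value.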
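The two-layer idea in the abstract above, where a hard clustering pass supplies the starting centers for FCM, might look roughly like the sketch below. The abstract does not say which hard clustering method is used; plain k-means on scalar data is an assumption here, and the names `kmeans_1d` and `fcm_1d` are hypothetical.

```python
def kmeans_1d(xs, centers, iters=20):
    """Hard clustering layer (assumed to be plain k-means on scalars)."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        # Keep the old centre if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def fcm_1d(xs, centers, m=2.0, tol=1e-6, max_iter=200):
    """Fuzzy layer: standard FCM updates; returns (centers, iterations used)."""
    c = len(centers)
    for it in range(1, max_iter + 1):
        u = []
        for x in xs:
            # Floor the distance so a point sitting on a centre does not divide by zero.
            d = [max(abs(x - v), 1e-12) for v in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                      for i in range(c)])
        new = [sum(u[k][i] ** m * xs[k] for k in range(len(xs))) /
               sum(u[k][i] ** m for k in range(len(xs)))
               for i in range(c)]
        shift = max(abs(a - b) for a, b in zip(new, centers))
        centers = new
        if shift < tol:
            break
    return centers, it


xs = [0.0, 0.2, 0.4, 9.6, 9.8, 10.0]
start = kmeans_1d(xs, [0.0, 1.0])      # layer 1: hard clustering
centers, used = fcm_1d(xs, start)      # layer 2: FCM from the hard result
print(centers, used)
```

Because FCM starts close to its fixed point, it needs few iterations here; how large the saving is on real data depends on the data set and is not something this sketch measures.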

9.
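《数理统计与管理》 (Journal of Applied Statistics and Management), 2019, (3): 450-459
Clustering time series data means grouping panel data or multivariate time series according to the similarity of the series. Time series in the same group have similar model parameters, and clustering gives more accurate parameter estimates, especially when the series are short. The distance measures of most existing time series clustering methods rest on the assumption that the series are linear, but real time series are usually nonlinear. This paper proposes a clustering method for nonlinear time series based on a Copula distance measure, which uses the Copula function to capture the nonlinear dependence structure of the series. As a nonparametric distance measure, the Copula-based distance can identify similarity in dynamic dependence structures. Extensive simulation experiments and empirical studies verify the effectiveness of the proposed method.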

10.
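A hybrid model of FCM and PCM can overcome the shortcomings each shows when used alone and clusters considerably better, but for samples whose features are not distinct the hybrid model still performs poorly. To overcome this shortcoming, this paper introduces the Mercer kernel and proposes a new kernel-based hybrid c-means clustering model (KIPCM), in which the kernel function makes data points that are not separable in the original space separable in the kernel space. Numerical experiments give reasonable center values and a high correct classification rate, which confirms the feasibility and effectiveness of the proposed algorithm.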

11.
In this study, we present a comprehensive comparative analysis of kernel-based fuzzy clustering and fuzzy clustering. Kernel-based clustering has emerged as an interesting and quite visible alternative in fuzzy clustering; however, the effectiveness of this extension vis-à-vis some generic methods of fuzzy clustering has not been discussed in a complete manner, nor has the performance of clustering been quantified through a convincing comparative analysis. Our focal objective is to understand the performance gains and the importance of parameter selection for kernelized fuzzy clustering. Generic Fuzzy C-Means (FCM) and Gustafson–Kessel (GK) FCM are compared with two typical generalizations of kernel-based fuzzy clustering: one with prototypes located in the feature space (KFCM-F) and the other where the prototypes are distributed in the kernel space (KFCM-K). Both generalizations are studied with the Gaussian kernel, while KFCM-K is also studied with the polynomial kernel. Two criteria are used in evaluating the performance of the clustering method and the resulting clusters, namely classification rate and reconstruction error. Through carefully selected experiments involving synthetic and Machine Learning repository (http://archive.ics.uci.edu/beta/) data sets, we demonstrate that the kernel-based FCM algorithms produce a marginal improvement over standard FCM and GK for most of the analyzed data sets. It has been observed that the kernel-based FCM algorithms are in a number of cases highly sensitive to the selection of specific values of the kernel parameters.
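As an illustration of the first variant, here is a minimal sketch of Gaussian-kernel fuzzy c-means in which the prototypes are ordinary points of the data space. It assumes that "feature space" in the abstract means the original input space, and it uses the commonly cited update rules based on the kernel-induced distance 1 - K(x, v). The function name `kfcm_f_1d`, the scalar data, and the parameter defaults are hypothetical; the study's own implementation may differ.

```python
import math


def kfcm_f_1d(xs, centers, sigma=2.0, m=2.0, iters=100):
    """Gaussian-kernel FCM on scalars with prototypes in the data space."""
    def kern(x, v):
        return math.exp(-((x - v) ** 2) / sigma ** 2)

    u = []
    for _ in range(iters):
        u = []
        for x in xs:
            # Kernel-induced distance 1 - K(x, v); the floor avoids division by zero.
            w = [max(1.0 - kern(x, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
            total = sum(w)
            u.append([wi / total for wi in w])
        # Prototype update: data points weighted by u^m * K(x, v).
        centers = [
            sum(u[k][i] ** m * kern(xs[k], v) * xs[k] for k in range(len(xs))) /
            sum(u[k][i] ** m * kern(xs[k], v) for k in range(len(xs)))
            for i, v in enumerate(centers)
        ]
    # u holds the memberships computed in the final iteration.
    return centers, u


xs = [0.0, 0.2, 0.4, 9.6, 9.8, 10.0]
centers, u = kfcm_f_1d(xs, [1.0, 9.0])
print(centers)
```

Changing `sigma` by an order of magnitude in either direction is a quick way to see the parameter sensitivity that the abstract reports.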

12.
Clustering algorithms divide up a dataset into a set of classes/clusters, where similar data objects are assigned to the same cluster. When the boundary between clusters is ill defined, which yields situations where the same data object belongs to more than one class, the notion of fuzzy clustering becomes relevant. In this case, each datum belongs to a given class with some membership grade, between 0 and 1. The most prominent fuzzy clustering algorithm is the fuzzy c-means introduced by Bezdek (Pattern recognition with fuzzy objective function algorithms, 1981), a fuzzification of the k-means or ISODATA algorithm. On the other hand, several research issues have been raised regarding both the objective function to be minimized and the optimization constraints, which help to identify proper cluster shape (Jain et al., ACM Computing Surveys 31(3):264–323, 1999). This paper addresses the issue of clustering by evaluating the distance of fuzzy sets in a feature space. In particular, the fuzzy clustering optimization problem is reformulated for the case where the distance is given in terms of a divergence distance, which builds a bridge to the notion of probabilistic distance. This leads to a modified fuzzy clustering, which implicitly involves the variance–covariance of input terms. The optimal solution of the underlying optimization problem is determined, and its existence and uniqueness are demonstrated. The performance of the algorithm is assessed through two numerical applications. The former involves clustering of Gaussian membership functions and the latter tackles the well-known Iris dataset. Comparisons with standard fuzzy c-means (FCM) are evaluated and discussed.
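For reference, the standard fuzzy c-means formulation that the abstract above takes as its baseline can be written as follows. The symbols are the usual textbook ones and are not taken from the paper: $x_k$ are the $n$ data points, $v_i$ the $c$ cluster centers, $u_{ik}$ the membership of $x_k$ in cluster $i$, and $m > 1$ the fuzzifier.

```latex
% Objective and constraints
J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, \lVert x_k - v_i \rVert^{2},
\qquad \sum_{i=1}^{c} u_{ik} = 1, \quad u_{ik} \in [0, 1].

% Alternating update rules (for x_k \neq v_j for all j)
u_{ik} = \left[ \sum_{j=1}^{c}
  \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{\frac{2}{m-1}}
\right]^{-1},
\qquad
v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} \, x_k}{\sum_{k=1}^{n} u_{ik}^{m}}.
```

The modification described in the abstract replaces the Euclidean norm in $J_m$ with a divergence-based distance; its exact form is not given there, so it is not reproduced here.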

13.
In this paper, we propose a new optimization framework for improving feature selection in medical data classification. We call this framework Support Feature Machine (SFM). SFM is used in feature selection to find the optimal group of features that show strong separability between two classes. The separability is measured in terms of inter-class and intra-class distances. The objective of the SFM optimization model is to maximize the number of correctly classified data samples in the training set, that is, samples whose intra-class distances are smaller than their inter-class distances. This concept can be combined with the modified nearest neighbor rule for unbalanced data. In addition, a variation of SFM that provides feature weights (prioritization) is also presented. The proposed SFM framework and its extensions were tested on 5 real medical datasets that are related to the diagnosis of epilepsy, breast cancer, heart disease, diabetes, and liver disorders. The classification performance of SFM is compared with that of support vector machine (SVM) classification and Logical Data Analysis (LAD), which is also an optimization-based feature selection technique. SFM gives very good classification results, yet uses far fewer features to make the decision than SVM and LAD. This result has significant implications for diagnostic practice. The outcome of this study suggests that the SFM framework can be used as a quick decision-making tool in real clinical settings.

14.
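Two effect indicators are selected, and the fuzzy C-means algorithm is combined with a combination weighting method to classify hard-to-recover reserves. First, based on the effect indicators, the fuzzy C-means algorithm automatically searches for the best number of reserve classes. Then, following the idea of minimizing the deviation between subjective and objective weights, a combination weighting model is built to determine the weights of the attribute indicators, and the benefit indicator value of each reserve is computed; together with the fuzzy C-means result, this determines the class of each hard-to-recover reserve. Finally, an oilfield in Daqing is taken as a case study and its hard-to-recover reserves are classified, which effectively guides decisions on the progressive development of these reserves.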

15.
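Improved estimation of the covariance matrix in Mahalanobis-distance cluster analysis   Total citations: 1, self-citations: 0, citations by others: 1
Taking into account the influence of variable weights and sample classes, this paper establishes an iterative method for estimating the covariance matrix during Mahalanobis-distance clustering. Using Fisher's iris data as the sample, it carries out an empirical comparison of ordinary Euclidean-distance clustering, principal component clustering, and Mahalanobis-distance clustering before and after the improvement. The results show that the proposed method raises accuracy by at least 6.63%. Finally, the method is applied to cluster the relevant indicator data of 35 countries and to determine the level of health care in each country.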

16.
A new initialization method for fuzzy c-means algorithm   Total citations: 1, self-citations: 0, citations by others: 1
In this paper, an initialization method for the fuzzy c-means (FCM) algorithm is proposed to address two problems of FCM: clustering performance that depends on the initial cluster centers, and low computation speed. A grid and density information are used to extract approximate cluster centers from the sample space. The number of approximate cluster centers then initializes the number of classes, and the approximate cluster centers themselves serve as the initial cluster centers. Experiments show that this method improves the clustering result and effectively shortens the clustering time.

17.
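曾倩, 张锦. 《运筹与管理》 (Operations Research and Management Science), 2017, 26(6): 10-15
For the resource allocation problem, this paper proposes a classification-based decision method that balances efficiency and fairness. First, it studies the efficiency-optimal allocation and the fully equal allocation, explains how marginal utility affects the allocation result, and points out that the smaller the differences in marginal utility among individuals, the closer the results of the two allocations. Second, a 0-1 integer programming model is built to obtain the classification, with the objective of making the differences in marginal utility between classes as small as possible. Resources are then allocated by class: fully equal allocation is used between classes and efficiency-optimal allocation within each class. Different degrees of fairness can be achieved by choosing the number of classes. Finally, a numerical example verifies the effectiveness of the method and illustrates how the number of classes affects efficiency and fairness.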

18.
The paper introduces a methodology for visualizing, on a dimension-reduced subspace, the classification structure and the geometric characteristics induced by an estimated Gaussian mixture model for discriminant analysis. In particular, we consider the case of mixture of mixture models with varying parametrization, which allows for parsimonious models. The approach is an extension of existing work on reducing dimensionality for model-based clustering based on Gaussian mixtures. Information on the dimension reduction subspace is provided by the variation in class locations and, depending on the estimated mixture model, by the variation in class dispersions. Projections along the estimated directions provide summary plots which help to visualize the structure of the classes and their characteristics. A suitable modification of the method allows us to recover the most discriminant directions, i.e., those that show maximal separation among classes. The approach is illustrated using simulated and real data.

19.
In this paper, we propose a new kernel-based fuzzy clustering algorithm which tries to find the best clustering results using optimal parameters of each kernel in each cluster. It is known that data with nonlinear relationships can be separated using one of the kernel-based fuzzy clustering methods. Two common fuzzy clustering approaches are clustering with a single kernel and clustering with multiple kernels. While clustering with a single kernel does not work well with "multiple-density" clusters, multiple kernel-based fuzzy clustering tries to find an optimal linear weighted combination of kernels with initial fixed (not necessarily the best) parameters. Our algorithm is an extension of the single kernel-based fuzzy c-means and the multiple kernel-based fuzzy clustering algorithms. In this algorithm, there is no need to supply "good" parameters for each kernel or a "good" initial number of kernels. Every cluster is characterized by a Gaussian kernel with optimal parameters. To show its clustering performance, we have compared it with other similar clustering algorithms using different databases and different clustering validity measures.

20.
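This paper defines the distance between samples and uses the centroid method to determine the distance between classes. A computational method for dynamic clustering is given. Using the distance between classes, a relative scale for assessing atmospheric particulate pollution, the relative pollution rate, is defined. From this, the relative distribution of the content of several elements in atmospheric particulate matter in the urban area of Harbin is obtained, along with some meaningful conclusions.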
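The centroid method mentioned in the abstract above measures the distance between two classes as the distance between their mean points. A minimal sketch follows; the abstract does not give the paper's own sample distance, so Euclidean distance is an assumption here, and the function names `centroid` and `centroid_distance` are hypothetical.

```python
def centroid(points):
    """Mean point of a class given as a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def centroid_distance(class_a, class_b):
    """Centroid-method distance: Euclidean distance between class centroids."""
    ca, cb = centroid(class_a), centroid(class_b)
    return sum((a - b) ** 2 for a, b in zip(ca, cb)) ** 0.5


class_a = [(0, 0), (2, 0)]   # centroid (1, 0)
class_b = [(1, 3), (1, 5)]   # centroid (1, 4)
print(centroid_distance(class_a, class_b))  # 4.0
```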
