Similar documents
1.
Traditional c-means clustering partitions a group of objects into a number of non-overlapping sets. For a given dataset, rough sets provide a more flexible and objective representation than classical sets with a hard partition or fuzzy sets with a subjective membership function. Rough c-means clustering and its extensions have been introduced and successfully applied in many real-life applications in recent years. Each cluster is represented by a pair of lower and upper approximations. However, most available algorithms pay no attention to the influence of an imbalanced spatial distribution within a cluster. The limitation of the iterative mean calculation, which gives the same weight to all data objects in a lower or upper approximation, is analyzed. A hybrid imbalance measure of distance and density for rough c-means clustering is defined, and a modified rough c-means clustering algorithm is presented in this paper. To evaluate the proposed algorithm, it has been applied to several real-world data sets from UCI. The validity of the algorithm is demonstrated by the results of comparative experiments.

2.
A modified approach has been developed in this study by combining two well-known clustering algorithms, namely the fuzzy c-means algorithm and the entropy-based algorithm. The fuzzy c-means algorithm is one of the most popular algorithms for fuzzy clustering. It can yield compact clusters but might not be able to generate distinct clusters. The entropy-based algorithm, on the other hand, can obtain distinct clusters, which might not be compact. However, the clusters need to be both distinct and compact. The present paper proposes a modified clustering approach that combines the two algorithms. A genetic algorithm was used to tune all three clustering algorithms separately. The proposed approach was found to yield clusters that are both distinct and compact on two data sets.

3.
The field of cluster analysis is primarily concerned with the partitioning of data points into different clusters so as to optimize a certain criterion. Rapid advances in technology have made it possible to address clustering problems via optimization theory. In this paper, we present a global optimization algorithm to solve the fuzzy clustering problem, where each data point is to be assigned to (possibly) several clusters, with a membership grade assigned to each data point that reflects the likelihood of the data point belonging to that cluster. The fuzzy clustering problem is formulated as a nonlinear program, for which a tight linear programming relaxation is constructed via the Reformulation-Linearization Technique (RLT) in concert with additional valid inequalities. This construct is embedded within a specialized branch-and-bound (B&B) algorithm to solve the problem to global optimality. Computational experience is reported using several standard data sets from the literature as well as using synthetically generated larger problem instances. The results validate the robustness of the proposed algorithmic procedure and exhibit its dominance over the popular fuzzy c-means algorithmic technique and the commercial global optimizer BARON.

4.
In this paper, we propose a grayscale image segmentation method based on a multiobjective optimization approach that optimizes two complementary criteria (region based and edge based). The region-based fitness is the improved spatial fuzzy c-means clustering measure, which has been shown to perform better than the standard fuzzy c-means (FCM) measure. The edge-based fitness is based on the contour statistics and the number of connected components in the image segmentation result. The optimization algorithm is multiobjective particle swarm optimization (MOPSO), which is well suited to continuous-variable problems such as FCM clustering. In our case, each particle of the swarm encodes the cluster centers. The result of the multiobjective optimization technique is a set of Pareto-optimal solutions, where each solution represents a segmentation result. Instead of selecting one solution from the Pareto front, we propose a method that combines all solutions to get a better segmentation. The combination method takes place in two steps. The first step is the detection of high-confidence points by exploiting the similarity between the results and the membership degrees. The second step is the classification of the remaining points by using the extracted high-confidence points. The proposed method was evaluated on three types of images: synthetic images, simulated MRI brain images and real-world MRI brain images. It was compared to the most widely used FCM-based algorithms in the literature. The results demonstrate the effectiveness of the proposed technique.

5.
In this paper, we propose a new kernel-based fuzzy clustering algorithm which tries to find the best clustering results using optimal parameters of each kernel in each cluster. It is known that data with nonlinear relationships can be separated using one of the kernel-based fuzzy clustering methods. Two common fuzzy clustering approaches are: clustering with a single kernel and clustering with multiple kernels. While clustering with a single kernel doesn’t work well with “multiple-density” clusters, multiple kernel-based fuzzy clustering tries to find an optimal linear weighted combination of kernels with initial fixed (not necessarily the best) parameters. Our algorithm is an extension of the single kernel-based fuzzy c-means and the multiple kernel-based fuzzy clustering algorithms. In this algorithm, there is no need to give “good” parameters of each kernel and no need to give an initial “good” number of kernels. Every cluster will be characterized by a Gaussian kernel with optimal parameters. In order to show its effective clustering performance, we have compared it to other similar clustering algorithms using different databases and different clustering validity measures.

6.
The paper advocates the use of a new fuzzy-based clustering algorithm for document categorization. Each document/datum is represented as a fuzzy set, so the fuzzy clustering algorithm is additionally constrained in order to cluster fuzzy sets. One then needs a metric to measure the overlap between documents and the cluster prototype (category). For this purpose, we use one of the interclass probabilistic separability measures, the Bhattacharyya distance, which is incorporated into the general scheme of the fuzzy c-means algorithm to measure the overlap between fuzzy sets. This introduces fuzziness into document clustering in the sense that a single document is allowed to belong to more than one category. This is in line with the multiple semantic interpretations conveyed by single words, which support membership in several classes. The performance of the algorithms is illustrated using a case study from the construction sector.
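The Bhattacharyya distance mentioned in this abstract can be illustrated with a minimal sketch for two discrete probability distributions. This uses the textbook definition (the negative logarithm of the Bhattacharyya coefficient) and is not taken from the paper; the function name and the assumption that both inputs are already normalized are illustrative choices, and how the paper adapts the measure to fuzzy sets is not stated in the abstract.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions.

    Assumes p and q are sequences of non-negative numbers of equal
    length that each sum to 1 (an assumption of this sketch).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Bhattacharyya coefficient: overlap between the two distributions.
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Guard against rounding pushing the coefficient slightly above 1.
    coefficient = min(coefficient, 1.0)
    if coefficient == 0.0:
        # No overlap at all: the distance is unbounded.
        return math.inf
    return -math.log(coefficient)
```

Identical distributions give a distance of zero, and distributions with disjoint support give an infinite distance, so smaller values indicate stronger overlap.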

7.
The fuzzy c-means clustering algorithm (FCM) provides a non-parametric and unsupervised approach to the cluster analysis of data. Several efforts in fuzzy clustering have been undertaken by Bezdek and other researchers. Earlier studies in this field reported problems with setting the optimal initial condition, with the cluster validity measure, and with high computational load. More recently, fuzzy clustering has benefited from a synergistic approach with Genetic Algorithms (GA), which serve as a useful optimization technique that helps to tolerate some classical drawbacks, such as sensitivity to initialization, noise and outliers, and susceptibility to local minima. We propose a genetic-level clustering methodology able to cluster objects represented in R^p spaces. The unsupervised clustering algorithm, called SFCM (Spatial Fuzzy c-Means), is based on a fuzzy c-means clustering method that searches for the best fuzzy partition of the universe, assuming that the evaluation of each object with respect to some features is unknown but that it belongs to circular regions of R^2 space. Next we present a Java implementation of the algorithm, which provides complete and efficient visual interaction for setting the parameters involved in the system. To demonstrate the applications of SFCM, we discuss a case study that shows the generality of our model by treating a simple 3-way data fuzzy clustering as an example of a multicriteria optimization problem.

8.
Based on inter-cluster separation clustering (ICSC), fuzzy inter-cluster separation clustering (FICSC) deals with all the distances between the cluster centers, maximizes these distances, and obtains better clustering performance. However, FICSC is sensitive to noise, as is fuzzy c-means (FCM) clustering. A possibilistic type of FICSC is proposed to combine FICSC and possibilistic c-means (PCM) clustering. Mixed fuzzy inter-cluster separation clustering (MFICSC) is presented to extend the possibilistic type of FICSC, because the possibilistic type of FICSC is sensitive to initial cluster centers and always generates coincident clusters. MFICSC can produce both fuzzy membership values and typicality values simultaneously. MFICSC performs well in dealing with noisy data and overcoming the problem of coincident clusters. The experimental results on data sets show that the proposed MFICSC achieves better clustering accuracy, short clustering time, and accurate cluster centers.

9.
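Soil is a continuum with many properties, and fuzzy cluster analysis is the preferred method for classifying it. However, the two existing fuzzy clustering methods, dynamic clustering based on fuzzy equivalence relations and fuzzy c-means, each have advantages and disadvantages, so using either one alone inevitably has shortcomings. To combine the strengths of both methods and avoid their weaknesses, an integrated algorithm is proposed: dynamic clustering based on fuzzy equivalence relations, together with analysis of variance, determines the number of clusters and the initial cluster centers, and fuzzy c-means then determines the final classification. The algorithm is applied to soil classification in the Songhua River basin and yields classification results that agree reasonably well with reality.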

10.
Data clustering, also called unsupervised learning, is a fundamental issue in data mining that is used to understand and mine the structure of an untagged assemblage of data by separating it into groups based on similarity. Recent studies have shown that clustering techniques that optimize a single objective may not provide satisfactory results because no single validity measure works well on different kinds of data sets. Moreover, the performance of clustering algorithms degrades as the overlap among clusters in a data set increases. These facts have motivated us to develop a fuzzy multi-objective particle swarm optimization framework for data clustering, termed FMOPSO, which is able to deliver more effective results than state-of-the-art clustering algorithms. The key challenge in designing the FMOPSO framework is how to resolve confusion in cluster assignments for points in the data set that belong significantly to more than one cluster. The proposed framework addresses this problem by identifying points that have significant membership in multiple classes, excluding them, and re-classifying them into single class assignments. To ascertain the superiority of the proposed algorithm, statistical tests have been performed on a variety of numerical and categorical real-life data sets. Our empirical study shows that the proposed framework significantly outperforms state-of-the-art data clustering algorithms in terms of both efficiency and effectiveness.

11.
Many data clustering techniques are available to extract meaningful information from real-world data, but the quality of their clustering results and their running time on real-world data are highly important. This work argues that fuzzy clustering is well suited to finding meaningful information and appropriate groups in real-world datasets. In fuzzy clustering, the objective function controls the groups or clusters and the computational parts of the clustering. Hence researchers working on fuzzy clustering algorithms aim to minimize an objective function that usually involves several computational parts, such as calculating cluster prototypes, computing degrees of membership for objects, and the update and stopping steps. This paper introduces some new fuzzy objective functions with effective fuzzy parameters that can help to minimize the running time and to obtain meaningful information or clusters from real-world datasets. It also introduces a new way of predicting memberships and centres by minimizing the proposed fuzzy objective functions. Experimental results for the proposed algorithms are given to illustrate the effectiveness of the proposed methods.

12.
Many data clustering algorithms exist, but they cannot adequately handle the number of clusters or cluster shapes, and their performance depends mainly on the choice of algorithm parameters. Our approach to data clustering does not require a choice of parameters; it can be treated as a natural adaptation to the existing structure of distances between data points. The outlier factor introduced by the author specifies the degree to which each data point is an outlier. The notion of the outlier factor is based on the difference between the frequency distribution of interpoint distances in a given dataset and the corresponding distribution for uniformly distributed points. Data clusters can then be determined by maximizing the outlier factor function. The data points in the dataset are divided into clusters according to the attractor regions of local optima. An experimental evaluation shows that the proposed method can identify complex cluster shapes. Key advantages of the approach are good clustering properties for datasets with a comparatively large amount of noise (additional data points) and the absence of important parameters whose adequate choice determines the quality of the results.

13.
Clustering is one of the most widely used approaches in data mining, with real-life applications in virtually any domain. The huge interest in clustering has led to a possibly three-digit number of algorithms, with the k-means family probably the most widely used group of methods. Besides classic bivalent approaches, clustering algorithms belonging to the domain of soft computing have been proposed and successfully applied in the past four decades. Bezdek’s fuzzy c-means is a prominent example of such soft computing cluster algorithms, with many effective real-life applications. More recently, Lingras and West enriched this area by introducing rough k-means. In this article we compare k-means to fuzzy c-means and rough k-means as important representatives of soft clustering. On the basis of this comparison, we then survey important extensions and derivatives of these algorithms; our particular interest here is in hybrid clustering, which merges fuzzy and rough concepts. We also give some examples where k-means, rough k-means, and fuzzy c-means have been used in studies.
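Since fuzzy c-means recurs throughout these abstracts, a minimal sketch of the standard algorithm may help as a reference point. It alternates the usual center update (a mean weighted by memberships raised to the fuzzifier m) with the usual membership update. The function name, the random initialization, and the default values for `m`, `max_iter`, and `tol` are assumptions of this sketch and are not taken from any of the listed papers.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length tuples.

    Returns (centers, memberships); memberships[i][k] is the degree
    to which point i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            weights = [u[i][k] ** m for i in range(n)]
            total = sum(weights)
            centers.append([
                sum(weights[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ])
        # Membership update from distances to the new centers.
        new_u = []
        for x in points:
            dist = [math.dist(x, v) for v in centers]
            if any(d == 0.0 for d in dist):
                # Point coincides with a center: split membership
                # among the coinciding centers only.
                hits = [1.0 if d == 0.0 else 0.0 for d in dist]
                total = sum(hits)
                new_u.append([h / total for h in hits])
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                new_u.append([value / total for value in inv])
        change = max(
            abs(new_u[i][k] - u[i][k]) for i in range(n) for k in range(c)
        )
        u = new_u
        if change < tol:
            break
    return centers, u
```

Unlike k-means, every point keeps a graded membership in every cluster; taking the largest membership per point recovers a hard partition when one is needed.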

14.
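Combining hard clustering and fuzzy clustering: a fast two-layer FCM algorithm   Total citations: 3, self-citations: 0, citations by others: 3
The fuzzy c-means (FCM) clustering algorithm is widely used in pattern recognition, but on large data sets it requires a great deal of CPU time, which is very inconvenient for users, so speeding up the algorithm is a pressing problem. The two-layer FCM clustering algorithm proposed in this paper is a fast algorithm that combines hard clustering and fuzzy clustering: the result of hard clustering guides the initial values of fuzzy clustering, which markedly shortens the iterative process. The two-layer FCM algorithm uses only one thirteenth of the CPU time of the FCM algorithm and therefore has strong practical value.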

15.
We propose a new technique to perform unsupervised data classification (clustering) based on a density-induced metric and non-smooth optimization. Our goal is to automatically recognize multidimensional clusters of non-convex shape. We present a modification of the fuzzy c-means algorithm that uses the data-induced metric, defined with the help of Delaunay triangulation. We detail the computation of distances in such a metric using graph algorithms. To find optimal positions of cluster prototypes, we employ the discrete gradient method of non-smooth optimization. The new clustering method is capable of identifying non-convex, overlapping d-dimensional clusters.

16.
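In a given metric space, the unit clustering problem is to find the minimum number of unit balls that cover all the given points. This is a well-known combinatorial optimization problem. Its online version is as follows: given a metric space, n points arrive one by one at arbitrary locations; each point must be assigned to a unit cluster when it arrives, with no information about future points; the goal is to minimize the number of unit clusters used in the end. This paper considers a class of one-dimensional online unit clustering problems under the following assumption: in the optimal solution of the corresponding offline problem, the distance between any two adjacent clusters is greater than 0.5. The paper first gives two online algorithms and some lemmas, then obtains a combined randomized algorithm by running each of the two online algorithms with probability 0.5, and finally proves that the expected competitive ratio of this combined randomized algorithm is at most 1.5.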
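The competitive ratio in this abstract compares an online algorithm against the offline optimum. In one dimension that optimum can be found with a simple greedy sweep, sketched below. The sketch assumes each cluster must fit inside an interval of length `length` (defaulting to 1, an assumption about how the unit ball is measured), and the function name is illustrative; the two online algorithms from the paper are not described in the abstract and are not reproduced here.

```python
def min_unit_clusters(points, length=1.0):
    """Offline optimum for one-dimensional unit clustering.

    Sweeps the points from left to right and opens a new cluster
    whenever the next point no longer fits in an interval of the
    given length that starts at the current cluster's leftmost point.
    Returns the clusters as lists of points.
    """
    clusters = []
    for x in sorted(points):
        if clusters and x - clusters[-1][0] <= length:
            # The point fits in the interval anchored at the
            # leftmost point of the current cluster.
            clusters[-1].append(x)
        else:
            clusters.append([x])
    return clusters
```

An online algorithm cannot sort the points first, which is why it may end up using more clusters than this sweep and why its quality is expressed as a ratio to the offline count.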

18.
An new initialization method for fuzzy c-means algorithm   Total citations: 1, self-citations: 0, citations by others: 1
In this paper an initialization method for the fuzzy c-means (FCM) algorithm is proposed in order to solve two problems of FCM: clustering performance that depends on the initial cluster centers, and low computation speed. Grid and density are used to extract approximate cluster centers from the sample space. The initialization method then uses the number of approximate cluster centers to initialize the number of classes, and the approximate cluster centers themselves to initialize the cluster centers. Experiments show that this method can improve the clustering result and effectively shorten the clustering time.

19.
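FCM image segmentation algorithm based on differential evolution   Total citations: 1, self-citations: 1, citations by others: 0
To increase the degree of automation of the fuzzy c-means (FCM) algorithm, an FCM image segmentation algorithm based on differential evolution (DEFCM) is proposed. It exploits the global search capability and robustness of differential evolution to determine the number of classes and the initial cluster centers automatically, and then uses them as the initial cluster centers for fuzzy c-means clustering, compensating for the shortcomings of FCM. Experiments show that the algorithm not only classifies images correctly but also achieves good segmentation results and quality.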

20.
Mixed-integer optimization models for chemical process planning typically assume that model parameters can be accurately predicted. As precise forecasts are difficult to obtain, process planning usually involves uncertainty and ambiguity in the data. This paper presents an application of fuzzy programming to process planning. The forecast parameters are assumed to be fuzzy with a linear or triangular membership function. The process planning problem is then formulated in terms of decision making in a fuzzy environment with fuzzy constraints and fuzzy net present value goals. The model is transformed to a deterministic mixed-integer linear program or mixed-integer nonlinear program depending on the type of uncertainty involved in the problem. For the nonlinear case, a global optimization algorithm is developed for its solution. This algorithm is applicable to general possibilistic programs and can be used as an alternative to the commonly used bisection method. Examples and computational results for a petrochemical complex with 38 processes and 24 products demonstrate the applicability of the developed models and algorithms.
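The triangular membership function mentioned in this abstract has a standard textbook form, sketched below. The parameter names `a`, `b`, and `c` (left foot, peak, right foot) and the function name are illustrative assumptions; the abstract does not say how the paper parameterizes its fuzzy forecasts.

```python
def triangular_membership(x, a, b, c):
    """Degree of membership of x in a triangular fuzzy number.

    The membership is 0 outside (a, c), rises linearly from a to
    the peak at b, and falls linearly from b to c.
    """
    if not a < b < c:
        raise ValueError("expected a < b < c")
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)
```

For example, a demand forecast described as "about 5, certainly between 0 and 10" would give the value 5 full membership and the values 2.5 and 7.5 a membership of one half.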
