Similar literature
 Found 20 similar documents; search took 62 ms
1.
In this study, we present a comprehensive comparative analysis of kernel-based fuzzy clustering and generic fuzzy clustering. Kernel-based clustering has emerged as an interesting and quite visible alternative in fuzzy clustering. However, the effectiveness of this extension vis-à-vis some generic methods of fuzzy clustering has not been discussed in a complete manner, nor has its clustering performance been quantified through a convincing comparative analysis. Our focal objective is to understand the performance gains and the importance of parameter selection for kernelized fuzzy clustering. Generic Fuzzy C-Means (FCM) and Gustafson–Kessel (GK) FCM are compared with two typical generalizations of kernel-based fuzzy clustering: one with prototypes located in the feature space (KFCM-F) and the other where the prototypes are distributed in the kernel space (KFCM-K). Both generalizations are studied with the Gaussian kernel, while KFCM-K is also studied with the polynomial kernel. Two criteria are used in evaluating the performance of the clustering method and the resulting clusters, namely classification rate and reconstruction error. Through carefully selected experiments involving synthetic and Machine Learning repository (http://archive.ics.uci.edu/beta/) data sets, we demonstrate that the kernel-based FCM algorithms produce a marginal improvement over standard FCM and GK for most of the analyzed data sets. We also observe that the kernel-based FCM algorithms are in a number of cases highly sensitive to the specific values chosen for the kernel parameters.
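
The abstract above does not give formulas, so the following is only a minimal sketch of how a Gaussian kernel can enter the FCM membership computation when the prototypes stay in the original data space. The kernel parametrization exp(-||x - v||^2 / sigma^2), the function names, and the default fuzzifier m = 2 are assumptions made for illustration, not details taken from the paper.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma ** 2)


def kernel_distance_sq(x, v, sigma):
    """Squared distance between the kernel-space images of x and v.

    For a Gaussian kernel K(x, x) = 1, so
    ||phi(x) - phi(v)||^2 = K(x, x) - 2 K(x, v) + K(v, v) = 2 (1 - K(x, v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


def memberships(x, prototypes, sigma, m=2.0):
    """Fuzzy memberships of point x in each cluster under the kernel distance."""
    d = [kernel_distance_sq(x, v, sigma) for v in prototypes]
    # A point that coincides with a prototype belongs fully to that cluster.
    zeros = [i for i, di in enumerate(d) if di == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    inv = [di ** (-1.0 / (m - 1.0)) for di in d]
    total = sum(inv)
    return [w / total for w in inv]
```

Because the kernel distance saturates at 2 for points far from a prototype, the choice of sigma directly shapes the memberships, which is one plausible reading of the parameter sensitivity the abstract reports.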

2.
Traditional means of studying environmental economics and management problems consist of optimal control and dynamic game models that are solved for optimal or equilibrium strategies. Notwithstanding the possibility of multiple equilibria, the models’ users (managers or planners) will usually be provided with a single optimal or equilibrium strategy no matter how reliable, or unreliable, the underlying models and their parameters are. In this paper we follow an alternative approach to policy making that is based on viability theory. It establishes “satisficing” (in the sense of Simon), or viable, policies that keep the dynamic system in a constraint set and are, generically, multiple and amenable to each manager’s own prioritisation. Moreover, they can depend on fewer parameters than the optimal or equilibrium strategies and hence be more robust. For the determination of these (viable) policies, computation of “viability kernels” is crucial. We introduce a MATLAB application, under the name of VIKAASA, which allows us to compute approximations to viability kernels. We discuss two algorithms implemented in VIKAASA. One approximates the viability kernel by the locus of state space positions for which solutions to an auxiliary cost-minimising optimal control problem can be found. The lack of any solution implies an infinite value function and indicates an evolution which leaves the constraint set in finite time, so the point from which the evolution originates belongs to the kernel’s complement. The other algorithm accepts a point as viable if the system’s dynamics can be stabilised from this point. We comment on the pros and cons of each algorithm. We apply viability theory and the VIKAASA software to a problem of by-catch fisheries exploited by one or two fleets and provide rules concerning the proportion of fish biomass and the fishing effort that a sustainable fishery’s exploitation should follow.

3.
One of the most significant discussions in the field of machine learning today is on the clustering ensemble. The clustering ensemble combines multiple partitions generated by different clustering algorithms into a single clustering solution. Genetic algorithms are known for their high ability to solve optimization problems, especially the problem of the clustering ensemble. To date, despite the major contributions to finding consensus cluster partitions with genetic algorithms, there has been little discussion of population initialization through generative mechanisms in genetic-based clustering ensemble algorithms, or of producing cluster partitions with favorable fitness values in the first phase of clustering ensembles. In this paper, a threshold fuzzy C-means algorithm, named TFCM, is proposed to solve the problem of diversity of clustering, one of the most common problems in clustering ensembles. Moreover, TFCM is able to increase the fitness of cluster partitions, such that it improves the performance of genetic-based clustering ensemble algorithms. The average fitness of the cluster partitions generated by TFCM is evaluated by three different objective functions and compared against other clustering algorithms. In addition, a simple genetic-based clustering ensemble algorithm, named SGCE, is proposed, in which cluster partitions generated by TFCM and other clustering algorithms serve as the initial population. The performance of the SGCE is evaluated and compared across the different initial populations used. The experimental results on eleven real-world datasets demonstrate that TFCM improves the fitness of cluster partitions and that the performance of the SGCE is enhanced by initial populations generated by TFCM.

4.
Many data clustering techniques are available to extract meaningful information from real-world data, but the quality of their clustering results and their running time on real-world data are highly important. This work takes the view that fuzzy clustering is a suitable technique for finding meaningful information and appropriate groups in real-world datasets. In fuzzy clustering the objective function controls the clusters and the computational parts of clustering. Hence researchers in fuzzy clustering aim to minimize an objective function that usually has a number of computational parts, such as the calculation of cluster prototypes, the degrees of membership of objects, and the updating and stopping steps of the algorithm. This paper introduces some new effective fuzzy objective functions with effective fuzzy parameters that can help to minimize the running time and to obtain meaningful information or clusters in real-world datasets. Further, this paper introduces a new way of predicting memberships and centres by minimizing the proposed fuzzy objective functions. Experimental results of the proposed algorithms are given to illustrate the effectiveness of the proposed methods.

5.
The paper advocates the use of a new fuzzy-based clustering algorithm for document categorization. Each document/datum is represented as a fuzzy set, so the fuzzy clustering algorithm is additionally constrained to cluster fuzzy sets. One then needs a metric to detect the overlap between documents and the cluster prototype (category). For this we use one of the interclass probabilistic separability measures, known as the Bhattacharyya distance, which is incorporated in the general scheme of the fuzzy c-means algorithm to measure the overlap between fuzzy sets. This introduces fuzziness into document clustering in the sense that a single document may belong to more than one category. This is in line with the multiple semantic interpretations conveyed by single words, which support membership of several classes. The performance of the algorithms is illustrated using a case study from the construction sector.
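
As a rough illustration of the overlap measure named above, the sketch below computes the Bhattacharyya distance between two discrete weight vectors. Treating a document's normalised term weights as a probability distribution is an assumption made here for the example; the abstract does not say how the fuzzy sets are built, and the function name is hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions p and q.

    Both inputs are sequences of non-negative weights over the same terms;
    they are normalised to sum to 1 before the comparison.
    """
    sum_p, sum_q = sum(p), sum(q)
    p = [a / sum_p for a in p]
    q = [b / sum_q for b in q]
    # Bhattacharyya coefficient: 1 for identical distributions, 0 for disjoint ones.
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    bc = min(bc, 1.0)  # guard against rounding slightly above 1
    return math.inf if bc == 0.0 else -math.log(bc)
```

A small distance means a document and a category prototype put weight on the same terms; a document with moderate distance to several prototypes is the case where multiple membership becomes useful.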

6.
We study a class of partial differential equations (PDEs) in the family of the so‐called Euler–Poincaré differential systems, with the aim of developing a foundation for numerical algorithms of their solutions. This requires particular attention to the mathematical properties of this system when the associated class of elliptic operators possesses nonsmooth kernels. By casting the system in its Lagrangian (or characteristics) form, we first formulate a particle system algorithm in free space with homogeneous Dirichlet boundary conditions for the evolving fields. We next examine the deformation of the system when nonhomogeneous “constant stream” boundary conditions are assumed. We show how this simple change at the boundary deeply affects the nature of the evolution, from hyperbolic‐like to dispersive with a nontrivial dispersion relation, and examine the potentially regularizing properties of singular kernels offered by this deformation. From the particle algorithm viewpoint, kernel singularities affect the existence and uniqueness of solutions to the corresponding ordinary differential equations systems. We illustrate this with the case when the operator kernel assumes a conical shape over the spatial variables, and examine in detail two‐particle dynamics under the resulting lack of Lipschitz continuity. Curiously, we find that for the conically shaped kernels the motion of the related two‐dimensional waves can become completely integrable under appropriate initial data. This reduction projects the two‐dimensional system to the one‐dimensional completely integrable Shallow‐Water equation [1], while retaining the full dependence on two spatial dimensions for the single channel solutions. Finally, by comparing with an operator‐splitting pseudospectral method we illustrate the performance of the particle algorithms with respect to their Eulerian counterpart for this class of nonsmooth kernels.

7.
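We examine cluster analysis methods for gene expression data and, in combination with a criterion for evaluating clustering results, apply them to fetal cerebellum gene expression data, obtaining an optimal clustering and giving a biological interpretation. Simulations were carried out in Matlab: the optimal number of clusters was obtained with the Xie-Beni index for fuzzy clustering, the gene labels belonging to each cluster were written to a txt file, and a biological interpretation was then given. The optimal number of clusters for the cerebellum genes is 3, which agrees fairly well with the biological meaning, and the genes within each cluster have similar functions. Fuzzy clustering of genes based on the FCM algorithm is effective; the results are biologically meaningful to some extent and can offer some guidance for the clustering of genes in biology.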
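
The abstract above selects the number of clusters with the Xie-Beni index. The following sketch shows one common form of that index (compactness divided by the number of points times the smallest squared distance between centres); lower values indicate a better partition. The data layout, with `u[i][k]` as the membership of point k in cluster i, and the function names are assumptions for illustration and are not taken from the paper's Matlab code.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def xie_beni(data, centres, u, m=2.0):
    """Xie-Beni validity index for a fuzzy partition (lower is better).

    data    -- list of points
    centres -- list of cluster centres (at least two)
    u       -- u[i][k] is the membership of data[k] in cluster i
    m       -- fuzzifier used as the exponent on the memberships
    """
    c, n = len(centres), len(data)
    if c < 2:
        raise ValueError("the index needs at least two clusters")
    # Compactness: membership-weighted squared distances to the centres.
    compactness = sum(
        u[i][k] ** m * sq_dist(data[k], centres[i])
        for i in range(c)
        for k in range(n)
    )
    # Separation: smallest squared distance between two distinct centres.
    separation = min(
        sq_dist(centres[i], centres[j])
        for i in range(c)
        for j in range(i + 1, c)
    )
    return compactness / (n * separation)
```

In practice one would run FCM for each candidate number of clusters, evaluate the index on each result, and keep the number with the smallest value.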

8.
We propose a new technique to perform unsupervised data classification (clustering) based on a density-induced metric and non-smooth optimization. Our goal is to automatically recognize multidimensional clusters of non-convex shape. We present a modification of the fuzzy c-means algorithm, which uses the data-induced metric, defined with the help of Delaunay triangulation. We detail the computation of distances in such a metric using graph algorithms. To find optimal positions of cluster prototypes we employ the discrete gradient method of non-smooth optimization. The new clustering method is capable of identifying overlapping non-convex d-dimensional clusters.

9.
The fuzzy c-means clustering algorithm (FCM) provides a non-parametric and unsupervised approach to the cluster analysis of data. Several efforts on fuzzy clustering have been undertaken by Bezdek and other researchers. Earlier studies in this field have reported problems due to the setting of the optimum initial condition, the cluster validity measure, and high computational load. More recently, fuzzy clustering has benefited from a synergistic approach with Genetic Algorithms (GA), which play the role of a useful optimization technique that helps to better tolerate some classical drawbacks, such as sensitivity to initialization, noise and outliers, and susceptibility to local minima. We propose a genetic-level clustering methodology able to cluster objects represented by R^p spaces. The unsupervised cluster algorithm, called SFCM (Spatial Fuzzy c-Means), is based on a fuzzy c-means clustering method that searches for the best fuzzy partition of the universe, assuming that the evaluation of each object with respect to some features is unknown but that it belongs to circular regions of R^2 space. Next we present a Java implementation of the algorithm, which provides a complete and efficient visual interaction for setting the parameters involved in the system. To demonstrate the applications of SFCM, we discuss a case study that shows the generality of our model by treating a simple 3-way data fuzzy clustering as an example of a multicriteria optimization problem.

10.
Clustering algorithms divide a dataset into a set of classes/clusters, where similar data objects are assigned to the same cluster. When the boundary between clusters is ill defined, which yields situations where the same data object belongs to more than one class, the notion of fuzzy clustering becomes relevant. In this setting, each datum belongs to a given class with some membership grade between 0 and 1. The most prominent fuzzy clustering algorithm is the fuzzy c-means introduced by Bezdek (Pattern recognition with fuzzy objective function algorithms, 1981), a fuzzification of the k-means or ISODATA algorithm. On the other hand, several research issues have been raised regarding both the objective function to be minimized and the optimization constraints, which help to identify proper cluster shape (Jain et al., ACM Computing Surveys 31(3):264–323, 1999). This paper addresses the issue of clustering by evaluating the distance of fuzzy sets in a feature space. In particular, the fuzzy clustering optimization problem is reformulated for the case where the distance is given as a divergence distance, which builds a bridge to the notion of probabilistic distance. This leads to a modified fuzzy clustering, which implicitly involves the variance–covariance of input terms. The optimal solution of the underlying optimization problem is determined, and its existence and uniqueness are demonstrated. The performance of the algorithm is assessed through two numerical applications. The former involves clustering of Gaussian membership functions and the latter tackles the well-known Iris dataset. Comparisons with standard fuzzy c-means (FCM) are evaluated and discussed.
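
For readers unfamiliar with the standard fuzzy c-means that this abstract uses as its baseline, here is a minimal sketch of its alternating updates with the squared Euclidean distance. It is the textbook scheme rather than the divergence-based variant the paper proposes; the function names, the fixed iteration count, and the requirement that the caller supplies initial centres are choices made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_memberships(data, centres, m=2.0):
    """Return u with u[i][k] = 1 / sum_j (d_ik^2 / d_jk^2)^(1 / (m - 1))."""
    u = [[0.0] * len(data) for _ in centres]
    for k, x in enumerate(data):
        d = [sq_dist(x, v) for v in centres]
        # A point sitting exactly on one or more centres is shared among them.
        hits = [i for i, di in enumerate(d) if di == 0.0]
        for i in range(len(centres)):
            if hits:
                u[i][k] = 1.0 / len(hits) if i in hits else 0.0
            else:
                u[i][k] = 1.0 / sum((d[i] / dj) ** (1.0 / (m - 1.0)) for dj in d)
    return u


def update_centres(data, u, m=2.0):
    """Each centre is the mean of the data weighted by u[i][k]^m."""
    dim = len(data[0])
    centres = []
    for row in u:
        w = [uik ** m for uik in row]
        total = sum(w)  # assumed non-zero: every cluster keeps some membership
        centres.append(
            [sum(wk * x[j] for wk, x in zip(w, data)) / total for j in range(dim)]
        )
    return centres


def fcm(data, centres, m=2.0, iters=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iters):
        u = update_memberships(data, centres, m)
        centres = update_centres(data, u, m)
    return centres, update_memberships(data, centres, m)
```

Each column of the returned membership matrix sums to 1, which is the constraint that distinguishes this fuzzy partition from the hard assignment of k-means.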

11.
The problem of clustering a group of observations according to some objective function (e.g., K-means clustering, variable selection) or a density (e.g., posterior from a Dirichlet process mixture model prior) can be cast in the framework of Monte Carlo sampling for cluster indicators. We propose a new method called the evolutionary Monte Carlo clustering (EMCC) algorithm, in which three new “crossover moves,” based on swapping and reshuffling subcluster intersections, are proposed. We apply the EMCC algorithm to several clustering problems including Bernoulli clustering, biological sequence motif clustering, BIC-based variable selection, and mixture of normals clustering. We compare EMCC's performance both as a sampler and as a stochastic optimizer with Gibbs sampling, “split-merge” Metropolis–Hastings algorithms, K-means clustering, and the MCLUST algorithm.

12.
We introduce a class of analytic positive definite multivariate kernels which includes infinite dot product kernels as sometimes used in machine learning, certain new nonlinearly factorizable kernels, and a kernel which is closely related to the Gaussian. Each such kernel reproduces in a certain “native” Hilbert space of multivariate analytic functions. If functions from this space are interpolated in scattered locations by translates of the kernel, we prove spectral convergence rates of the interpolants and all derivatives. By truncation of the power series of the kernel-based interpolants, we constructively generalize the classical Bernstein theorem concerning polynomial approximation of analytic functions to the multivariate case. An application to machine learning algorithms is presented.

13.
An new initialization method for fuzzy c-means algorithm   Cited by: 1 (self-citations: 0, citations by others: 1)
In this paper an initialization method for the fuzzy c-means (FCM) algorithm is proposed to address two problems of FCM: clustering performance that depends on the initial cluster centers, and low computation speed. Grid and density information is used to extract approximate cluster centers from the sample space. The number of approximate cluster centers then initializes the number of classes, and the approximate cluster centers themselves serve as the initial cluster centers. Experiments show that this method can improve the clustering result and effectively shorten the clustering time.

14.
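A general gradient-based algorithm for selecting SVM kernel parameters   Cited by: 1 (self-citations: 0, citations by others: 1)
Selecting the kernel function is the central problem in choosing an SVM classifier. Automatic kernel selection can both improve classifier performance and reduce manual intervention. How to select the kernel automatically has therefore become a hot topic in SVM research, but the problem has not yet been solved satisfactorily. In recent years, research on the automatic selection of kernel parameters, especially on gradient-based optimization algorithms, has made some progress. A general gradient-based algorithm for kernel selection is proposed and tested experimentally.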

15.
Two kernel-based approaches to discriminant analysis are considered: the traditional one where kernels are used to estimate the distribution of the predictor variables given the group and a direct kernel method where kernels are used to estimate the a posteriori probabilities directly. For both approaches cross-validatory choice of smoothing parameters is based on various loss functions which are directly connected to the separation of groups. Comparison with parametric models shows the improvement gained by the more flexible kernel approaches.

16.
Support vector regression (SVR) is a supervised machine learning technique that has been successfully employed to forecast financial volatility. As SVR is a kernel-based technique, the choice of the kernel has a great impact on its forecasting accuracy. Empirical results show that SVRs with hybrid kernels tend to beat single-kernel models in terms of forecasting accuracy. Nevertheless, no application of hybrid kernel SVR to financial volatility forecasting has been reported in previous research. Given the empirical evidence that the stock market oscillates between several possible regimes, so that the overall distribution of returns is a mixture of normals, we attempt to find the optimal number of Gaussian kernels in a mixture that improves the one-period-ahead volatility forecasting of SVR based on GARCH(1,1). The forecast performance of mixtures of one, two, three and four Gaussian kernels is evaluated on the daily returns of the Nikkei and Ibovespa indexes and compared with SVR–GARCH with a Morlet wavelet kernel, standard GARCH, Glosten–Jagannathan–Runkle (GJR) and nonlinear EGARCH models with normal, student-t, skew-student-t and generalized error distribution (GED) innovations, using mean absolute error (MAE), root mean squared error (RMSE) and the robust Diebold–Mariano test. The results of the out-of-sample forecasts suggest that SVR–GARCH with a mixture of Gaussian kernels can improve the volatility forecasts and capture the regime-switching behavior.
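
The abstract does not spell out how the Gaussian kernels are combined, so the sketch below assumes the simplest possibility: a weighted sum of Gaussian kernels with different widths. The function name, the `gammas` and `weights` parameters, and the form exp(-gamma * ||x - y||^2) are all assumptions made for illustration. A sum of positive semi-definite kernels with non-negative weights is again a valid kernel, which is what makes this kind of combination usable inside an SVR.

```python
import math


def mixture_gaussian_kernel(x, y, gammas, weights):
    """Weighted sum of Gaussian kernels: sum_i w_i * exp(-gamma_i * ||x - y||^2).

    gammas  -- one positive width parameter per component (assumed form)
    weights -- one non-negative weight per component
    """
    if len(gammas) != len(weights):
        raise ValueError("gammas and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep the kernel valid")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return sum(w * math.exp(-g * sq_dist) for w, g in zip(weights, gammas))
```

With a single component this reduces to the ordinary Gaussian kernel, so the one-kernel model in the comparison is a special case of the same function.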

17.
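The tunable parameters of a hybrid-kernel support vector machine (SVM) are usually set from experience or by manual trial and error, which cannot guarantee that they are optimal. To address this limitation, a parallel hybrid optimization algorithm combining particle swarm optimization and artificial bee colony (ABC-PSO) is proposed to optimize the hybrid-kernel SVM parameters and find the optimal parameter combination that satisfies the conditions. The resulting SVM model is applied to speech recognition. Simulation experiments on speech databases in three different languages verify that the SVM model optimized by the hybrid algorithm has better generalization and speech recognition ability than the model obtained by optimizing the SVM with PSO alone.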

18.
Fitting semiparametric clustering models to dissimilarity data   Cited by: 1 (self-citations: 0, citations by others: 1)
The cluster analysis problem of partitioning a set of objects from dissimilarity data is here handled with the statistical model-based approach of fitting the “closest” classification matrix to the observed dissimilarities. A classification matrix represents a clustering structure expressed in terms of dissimilarities. In cluster analysis there is a lack of methodologies widely used to directly partition a set of objects from dissimilarity data. In real applications, a hierarchical clustering algorithm is applied on dissimilarities and subsequently a partition is chosen by visual inspection of the dendrogram. Alternatively, a “tandem analysis” is used by first applying a Multidimensional Scaling (MDS) algorithm and then by using a partitioning algorithm such as k-means applied on the dimensions specified by the MDS. However, neither the hierarchical clustering algorithms nor the tandem analysis is specifically defined to solve the statistical problem of fitting the closest partition to the observed dissimilarities. This lack of appropriate methodologies motivates this paper, in particular, the introduction and the study of three new object partitioning models for dissimilarity data, their estimation via least-squares and the introduction of three new fast algorithms.

19.
《Fuzzy Sets and Systems》2004,141(2):301-317
This paper presents fuzzy clustering algorithms for mixed features of symbolic and fuzzy data. El-Sonbaty and Ismail proposed fuzzy c-means (FCM) clustering for symbolic data and Hathaway et al. proposed FCM for fuzzy data. In this paper we give a modified dissimilarity measure for symbolic and fuzzy data and then give FCM clustering algorithms for these mixed data types. Numerical examples and comparisons are also given. Numerical examples illustrate that the modified dissimilarity gives better results. Finally, the proposed clustering algorithm is applied to real data with mixed feature variables of symbolic and fuzzy data.

20.