Similar literature
 Found 20 similar articles; search took 15 ms
1.
Advances in Data Analysis and Classification - The Ministry of Social Development in Mexico is in charge of creating and assigning social programmes targeting specific needs in the population for...

2.
A nonmetric clustering algorithm is proposed, based on a fuzzy objective function that reflects proximity through a global dissimilarity measure. The globally optimal solution is shown to be difficult to obtain, and an alternative iterative procedure is presented. This procedure is easily implemented and converges to a local optimum. Some validity functionals, which measure how effectively cluster structure has been identified, are compared in relation to the iterative procedures described in the paper.

3.
For clustering objects, we often collect not only continuous variables but binary attributes as well. This paper proposes a model-based clustering approach for mixed binary and continuous variables, in which each binary attribute is generated by a latent continuous variable that is dichotomized at a suitable threshold value, and in which the scores of the latent variables are estimated from the binary data. In economics, such variables are called utility functions, and the assumption is that the binary attributes (the presence or absence of a public service or utility) are determined by low and high values of these functions. In genetics, the latent response is interpreted as the "liability" to develop a qualitative trait or phenotype. The estimated scores of the latent variables, together with the observed continuous ones, allow the use of a multivariate Gaussian mixture model for clustering instead of a mixture of discrete and continuous distributions. After describing the method, this paper presents results on both simulated and real-case data and compares the performance of the multivariate Gaussian mixture model with that of a mixture of joint multivariate and multinomial distributions. Results show that the former model outperforms the mixture model for variables with different scales, in terms of both classification error rate and reproduction of the cluster means.

4.
We consider the following two instances of the projective clustering problem: given a set $S$ of $n$ points in $\mathbb{R}^d$ and an integer $k > 0$, cover $S$ by $k$ slabs (respectively, $d$-cylinders) so that the maximum width of a slab (respectively, the maximum diameter of a $d$-cylinder) is minimized. Let $w^*$ be the smallest value such that $S$ can be covered by $k$ slabs (respectively, $d$-cylinders), each of width (respectively, diameter) at most $w^*$. This paper contains three main results: (i) For $d = 2$, we present a randomized algorithm that computes $O(k \log k)$ strips of width at most $w^*$ that cover $S$. Its expected running time is $O(n k^2 \log^4 n)$ if $k^2 \log k \le n$; for larger values of $k$, the expected running time is $O(n^{2/3} k^{8/3} \log^{14/3} n)$. (ii) For $d = 3$, a cover of $S$ by $O(k \log k)$ slabs of width at most $w^*$ can be computed in expected time $O(n^{3/2} k^{9/4} \operatorname{polylog}(n))$. (iii) We compute a cover of $S \subset \mathbb{R}^d$ by $O(dk \log k)$ $d$-cylinders of diameter at most $8w^*$ in expected time $O(d n k^3 \log^4 n)$. We also present a few extensions of this result.

5.
This study employs genetic algorithms to solve clustering problems. Three models, SICM, STCM, and CSPM, are developed according to different coding/decoding techniques. The effectiveness and efficiency of these models under varying problem sizes are analyzed in comparison with a conventional statistical clustering method (the agglomerative hierarchical clustering method). The results for small-scale problems (10–50 objects) indicate that CSPM is the most effective but least efficient method, STCM is the second most effective and efficient, and SICM is the least effective because of its long chromosome. The results for medium-to-large-scale problems (50–200 objects) indicate that CSPM is still the most effective method. Furthermore, we have applied CSPM to an example p-median problem. The good results demonstrate that CSPM is applicable in practice.

6.
7.
The proportion exponent is introduced as a measure of the validity of the clustering obtained for a data set using a fuzzy clustering algorithm. It is assumed that the output of an algorithm includes a fuzzy membership function for each data point. We show how to compute the proportion of possible memberships whose maximum entry exceeds the maximum entry of a given membership function, and use these proportions to define the proportion exponent. Its use as a validity functional is illustrated with four numerical examples, and its effectiveness is compared with that of other validity functionals, namely classification entropy and partition coefficient.
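The two comparison functionals named in this abstract have widely used textbook definitions, which the following minimal sketch computes from a membership matrix. The function names and the toy matrices are illustrative assumptions, and the proportion exponent itself is not implemented because the abstract does not give its formula.

```python
import math

def partition_coefficient(memberships):
    """Mean of squared memberships: 1.0 for a crisp partition, 1/c when fully fuzzy."""
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n

def classification_entropy(memberships):
    """Mean entropy of the membership rows: 0.0 for crisp, log(c) when fully fuzzy."""
    n = len(memberships)
    # Terms with u == 0 contribute nothing (the limit of u*log(u) is 0).
    return -sum(u * math.log(u) for row in memberships for u in row if u > 0.0) / n

# Hypothetical membership matrices: one row per data point, one column per cluster.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]

print(partition_coefficient(crisp), classification_entropy(crisp))
print(partition_coefficient(fuzzy), classification_entropy(fuzzy))
```

A higher partition coefficient and a lower classification entropy both indicate memberships that are closer to a crisp assignment.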

8.
Approximation algorithms for Hamming clustering problems   Cited by: 1 (self-citations: 0, citations by others: 1)
We study Hamming versions of two classical clustering problems. The Hamming radius p-clustering problem (HRC) for a set $S$ of $k$ binary strings, each of length $n$, is to find $p$ binary strings of length $n$ that minimize the maximum Hamming distance between a string in $S$ and the closest of the $p$ strings; this minimum value is termed the p-radius of $S$ and is denoted by $\varrho$. The related Hamming diameter p-clustering problem (HDC) is to split $S$ into $p$ groups so that the maximum of the Hamming group diameters is minimized; this latter value is called the p-diameter of $S$. We provide an integer programming formulation of HRC which yields exact solutions in polynomial time whenever $k$ is constant. We also observe that HDC admits straightforward polynomial-time solutions when $k = O(\log n)$ and $p = O(1)$, or when $p = 2$. Next, by reduction from the corresponding geometric p-clustering problems in the plane under the $L_1$ metric, we show that neither HRC nor HDC can be approximated within any constant factor smaller than two unless P = NP. We also prove that for any $\varepsilon > 0$ it is NP-hard to split $S$ into at most $p k^{1/7 - \varepsilon}$ clusters whose Hamming diameter does not exceed the p-diameter, and that solving HDC exactly is an NP-complete problem already for $p = 3$. Furthermore, we note that by adapting Gonzalez' farthest-point clustering algorithm [T. Gonzalez, Theoret. Comput. Sci. 38 (1985) 293–306], HRC and HDC can be approximated within a factor of two in time $O(pkn)$. Next, we describe a $2^{O(p\varrho/\varepsilon)} k^{O(p/\varepsilon)} n^2$-time $(1+\varepsilon)$-approximation algorithm for HRC. In particular, it runs in polynomial time when $p = O(1)$ and $\varrho = O(\log(k+n))$. Finally, we show how to find in […] time a set $L$ of $O(p \log k)$ strings of length $n$ such that for each string in $S$ there is at least one string in $L$ within distance $(1+\varepsilon)\varrho$, for any constant $0 < \varepsilon < 1$.
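The farthest-point heuristic cited in this abstract can be sketched for binary strings as follows. This is a minimal illustration of the general greedy idea under the assumption that centers are chosen from the input strings; the abstract does not say how the authors adapt it, and the function names and sample strings are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def farthest_point_centers(strings, p):
    """Greedily pick up to p centers, each one farthest from those already chosen."""
    centers = [strings[0]]  # arbitrary first center
    # dist[i] is the distance from strings[i] to its closest chosen center.
    dist = [hamming(s, centers[0]) for s in strings]
    while len(centers) < min(p, len(strings)):
        i = max(range(len(strings)), key=dist.__getitem__)
        centers.append(strings[i])
        dist = [min(d, hamming(s, strings[i])) for d, s in zip(dist, strings)]
    return centers, max(dist)

S = ["0000", "0001", "1111", "1110", "0011"]
centers, radius = farthest_point_centers(S, 2)
print(centers, radius)
```

Each of the p rounds scans all k strings of length n once, which matches the O(pkn) running time quoted in the abstract.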

9.
Fuzzy variables     
The purpose of this study is to explore a possible axiomatic framework from which a rigorous theory of fuzziness may be constructed. The approach we propose is analogous to the sample space concept of probability theory. A fuzzy variable is a mapping from an abstract space (called the pattern space) onto the real line. The membership function is obtained as the extension of a special type of capacity (called a scale) from the pattern space to the real line via the fuzzy variable. In essence, we are proposing an entirely new definition of a fuzzy set on the line as a mapping to the line rather than on the line. The current definition of a transformation of a fuzzy set is obtained as a derived result of our model. In addition, we derive the membership function of sums and products of fuzzy sets and present an example which reinforces the credibility of our approach.

10.
Supervised clustering of variables   Cited by: 1 (self-citations: 0, citations by others: 1)
In predictive modelling, highly correlated predictors lead to unstable models that are often difficult to interpret. The selection of features, or the use of latent components that reduce the complexity among correlated observed variables, are common strategies. Our objective with the new procedure that we advocate here is to achieve both purposes: to highlight the group structure among the variables and to identify the most relevant groups of variables for prediction. The proposed procedure is an iterative adaptation of a method developed for the clustering of variables around latent variables (CLV). Modification of the standard CLV algorithm leads to a supervised procedure, in the sense that the variable to be predicted plays an active role in the clustering. The latent variables associated with the groups of variables, selected for their “proximity” to the variable to be predicted and their “internal homogeneity”, are progressively added in a predictive model. The features of the methodology are illustrated based on a simulation study and a real-world application.

11.
The presence of less relevant or highly correlated features often decreases classification accuracy. Feature selection, in which the most informative variables are selected for model generation, is an important step in data-driven modeling. In feature selection, one often tries to satisfy multiple criteria such as feature discriminating power, model performance, or subset cardinality. Therefore, a multi-objective formulation of the feature selection problem is more appropriate. In this paper, we propose to use fuzzy criteria in feature selection by using a fuzzy decision-making framework. This formulation allows for a more flexible definition of the goals in feature selection, and avoids the problem of weighting different goals in classical multi-objective optimization. The optimization problem is solved using an ant colony optimization algorithm proposed in our previous work. We illustrate the added value of the approach by applying our proposed fuzzy feature selection algorithm to eight benchmark problems.

12.
13.
In this paper we present a comparison among some nonhierarchical and hierarchical clustering algorithms, including the SOM (Self-Organizing Map) neural network and Fuzzy c-means methods. Data were simulated considering correlated and uncorrelated variables, and nonoverlapping and overlapping clusters, with and without outliers. A total of 2530 data sets were simulated. The results showed that Fuzzy c-means performed very well in all cases and was very stable even in the presence of outliers and overlapping. All other clustering algorithms were strongly affected by the amount of overlapping and outliers. The SOM neural network did not perform well in almost all cases and was strongly affected by the number of variables and clusters. The traditional hierarchical clustering and K-means methods presented similar performance.
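For readers unfamiliar with Fuzzy c-means, the sketch below alternates its two standard update steps on a toy data set. The fuzzifier m = 2, the fixed iteration count, the initial centers, and the sample points are all assumptions made for illustration; the abstract does not state the settings used in the study.

```python
import math

def fcm(points, centers, m=2.0, iters=50):
    """Alternate the standard fuzzy c-means membership and center updates."""
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)); rows sum to 1.
        U = []
        for x in points:
            d = [math.dist(x, v) for v in centers]
            if 0.0 in d:
                # The point coincides with a center: share full membership there.
                row = [1.0 if dk == 0.0 else 0.0 for dk in d]
            else:
                row = [dk ** (-2.0 / (m - 1.0)) for dk in d]
            s = sum(row)
            U.append([r / s for r in row])
        # Center update: mean of the points weighted by u_ik^m.
        centers = []
        for k in range(len(U[0])):
            wk = [row[k] ** m for row in U]
            total = sum(wk)
            centers.append(tuple(
                sum(w * x[j] for w, x in zip(wk, points)) / total
                for j in range(len(points[0]))
            ))
    return U, centers

# Hypothetical one-dimensional data with two well-separated groups.
pts = [(0.0,), (0.2,), (0.1,), (5.0,), (5.2,), (5.1,)]
U, C = fcm(pts, [(1.0,), (4.0,)])
print(C)
```

Because every point keeps a nonzero membership in every cluster, a single outlier pulls each center only in proportion to its weighted membership, which is one common explanation for the stability reported above.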

14.
Fuzzy random variables   Cited by: 1 (self-citations: 0, citations by others: 1)

15.
In order to manage the high call density expected of future cellular systems, microcells must be used. A migration to microcells will increase the number of handoffs and require handoff algorithms that make decisions faster. In the case of line-of-sight transmission, it is important that the handoff algorithm detects the cell boundary early enough; otherwise the channel will be dragged into the new cell, increasing the chance of co-channel interference. In the case of non-line-of-sight transmission, a mobile station turning a street corner will experience a phenomenon known as the Manhattan corner effect, which causes the received signal level to drop by 20–30 dB in 20–30 m. This corner effect can lead to a loss of communication if not identified early enough. This paper presents two new handoff techniques using fuzzy logic as possible solutions to microcellular handoff. The first algorithm uses an adaptive fuzzy predictor, while the second uses a fuzzy averaging technique. The results of the simulation show that fuzzy logic is a viable option for microcellular handoff.

16.
Fuzzy Sets and Systems, 2004, 141(2): 281-299
In this paper, we consider the issue of clustering when outliers exist. The outlier set is defined as the complement of the data set. Following this concept, a specially designed fuzzy membership weighted objective function is proposed and the corresponding optimal membership is derived. Unlike the membership of fuzzy c-means, the derived fuzzy membership does not decrease as the number of clusters increases. With a suitable redefinition of the distance metric, we demonstrate that the objective function can be used to extract c spherical shells. A hard clustering algorithm alleviating the prototype under-utilization problem is also derived. Artificially generated data are used for comparisons.

17.
Automatic clustering using genetic algorithms   Cited by: 2 (self-citations: 0, citations by others: 2)
Many clustering methods require the designer to provide the number of clusters as input. Unfortunately, the designer generally has no idea of this number beforehand. In this article, we develop a genetic-algorithm-based clustering method called automatic genetic clustering for unknown K (AGCUK). In the AGCUK algorithm, noising selection and division-absorption mutation are designed to keep a balance between selection pressure and population diversity. In addition, the Davies-Bouldin index is employed to measure the validity of clusters. Experimental results on artificial and real-life data sets are given to illustrate the effectiveness of the AGCUK algorithm in automatically evolving the number of clusters and providing the clustering partition.
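The Davies-Bouldin index mentioned in this abstract can be computed as in the sketch below, which uses the common Euclidean form with the mean distance to the centroid as the scatter of each cluster. The function name and the toy partitions are illustrative assumptions; the abstract does not say which variant AGCUK uses. The sketch assumes at least two clusters with distinct centroids.

```python
import math

def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition given as a list of clusters of points."""
    centroids = [
        tuple(sum(coord) / len(c) for coord in zip(*c)) for c in clusters
    ]
    # Scatter of a cluster: mean distance from its points to its centroid.
    scatter = [
        sum(math.dist(x, v) for x in c) / len(c)
        for c, v in zip(clusters, centroids)
    ]
    k = len(clusters)
    # For each cluster, take the worst (largest) similarity ratio to any other cluster.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k

# The same four hypothetical points, grouped well and grouped badly.
tight = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
loose = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 1.0), (10.0, 1.0)]]
print(davies_bouldin(tight), davies_bouldin(loose))
```

Lower values indicate compact, well-separated clusters, so a search over partitions with different numbers of clusters can use the index as a quantity to minimize.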

18.
In this paper, we consider a generalized mixed equilibrium problem in a real Hilbert space. Using the auxiliary principle, we define a class of resolvent mappings. Further, using fixed point and resolvent methods, we give some iterative algorithms for solving the generalized mixed equilibrium problem. Furthermore, we prove that the sequences generated by the iterative algorithms converge weakly to the solution of the generalized mixed equilibrium problem. These results require monotonicity (θ-pseudo monotonicity) and continuity (Lipschitz continuity) of the mappings.

19.
Preconditioned proximal penalty-duality two- and three-field algorithms for mixed optimality conditions of evolution mixed constrained optimal control problems are considered. Fixed point existence analysis is performed for the corresponding evolution mixed governing variational state systems in reflexive Banach spaces. Further, convergence analysis of the proximal penalty-duality algorithms is established via fixed point characterizations. In both analyses, a resolvent fixed point variational strategy is applied.

20.
Feasible descent algorithms for mixed complementarity problems   Cited by: 6 (self-citations: 0, citations by others: 6)
In this paper we consider a general algorithmic framework for solving nonlinear mixed complementarity problems. The main features of this framework are: (a) it is well-defined for an arbitrary mixed complementarity problem, (b) it generates only feasible iterates, (c) it has a strong global convergence theory, and (d) it is locally fast convergent under standard regularity assumptions. This framework is applied to the PATH solver in order to show viability of the approach. Numerical results for an appropriate modification of the PATH solver indicate that this framework leads to substantial computational improvements. Received April 9, 1998 / Revised version received November 23, 1998 / Published online March 16, 1999
