Similar literature
 20 similar documents found (search time: 15 ms)
1.
2.
The proportion exponent is introduced as a measure of the validity of the clustering obtained for a data set using a fuzzy clustering algorithm. It is assumed that the output of an algorithm includes a fuzzy membership function for each data point. We show how to compute the proportion of possible memberships whose maximum entry exceeds the maximum entry of a given membership function, and use these proportions to define the proportion exponent. Its use as a validity functional is illustrated with four numerical examples, and its effectiveness is compared to that of other validity functionals, namely, classification entropy and partition coefficient.
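
The two comparison functionals named in this abstract, partition coefficient and classification entropy, can be sketched as follows. This is a minimal illustration using their commonly stated definitions (mean squared membership, and mean Shannon entropy of each membership row); the matrix `U` and the function names are hypothetical, and the proportion exponent itself is not implemented because the abstract does not give its formula.

```python
import math


def partition_coefficient(memberships):
    """Mean of squared memberships: 1.0 for a crisp partition, 1/c for a uniform one."""
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n


def classification_entropy(memberships):
    """Mean Shannon entropy of the membership rows: 0.0 for a crisp partition."""
    n = len(memberships)
    # Terms with u == 0 contribute nothing (limit of u*log(u) as u -> 0).
    return -sum(u * math.log(u) for row in memberships for u in row if u > 0) / n


# Hypothetical memberships: 3 data points, 2 clusters; every row sums to 1.
U = [[1.0, 0.0], [0.5, 0.5], [0.9, 0.1]]
pc = partition_coefficient(U)
ce = classification_entropy(U)
```

A partition coefficient close to 1 and a classification entropy close to 0 both indicate a nearly crisp clustering.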

3.
Automatic clustering using genetic algorithms   Total citations: 2, self-citations: 0, citations by others: 2
Many clustering methods require the designer to provide the number of clusters as input. Unfortunately, the designer generally has no idea of this number beforehand. In this article, we develop a genetic algorithm based clustering method called automatic genetic clustering for unknown K (AGCUK). In the AGCUK algorithm, noising selection and division-absorption mutation are designed to keep a balance between selection pressure and population diversity. In addition, the Davies-Bouldin index is employed to measure the validity of clusters. Experimental results on artificial and real-life data sets are given to illustrate the effectiveness of the AGCUK algorithm in automatically evolving the number of clusters and providing the clustering partition.
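
The Davies-Bouldin index mentioned in this abstract can be sketched as below. This is a minimal version using the usual definition (average, over clusters, of the worst ratio of summed within-cluster scatter to centroid distance) with Euclidean distance; the function names and the sample data are hypothetical, and how AGCUK uses the index inside its fitness function is not stated in the abstract.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def davies_bouldin(clusters):
    """Davies-Bouldin index; lower values mean compact, well-separated clusters.

    `clusters` is a list of at least two non-empty lists of points whose
    centroids are pairwise distinct.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter of a cluster: mean distance of its points to its centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    worst_ratios = (
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    )
    return sum(worst_ratios) / k


# Hypothetical data: the same two clusters placed far apart, then close together.
far_apart = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
close_together = [[(0.0, 0.0), (0.0, 2.0)], [(3.0, 0.0), (3.0, 2.0)]]
db_far = davies_bouldin(far_apart)
db_close = davies_bouldin(close_together)
```

Because a lower index is better, a search over the number of clusters would typically minimize it (or maximize its reciprocal).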

4.
We present a clustering method for collections of graphs based on the assumptions that graphs in the same cluster have a similar role structure and that the respective roles can be founded on implicit vertex types. Given a network ensemble (a collection of attributed graphs with some substantive commonality), we start by partitioning the set of all vertices based on attribute similarity. Projection of each graph onto the resulting vertex types yields feature vectors of equal dimensionality, irrespective of the original graph sizes. These feature vectors are then subjected to standard clustering methods. This approach is motivated by social network concepts, and we demonstrate its utility on an ensemble of personal networks of migrants, where we extract structurally similar groups and show their resemblance to predicted acculturation strategies.  相似文献   

5.
Let $X(n)$ be a time series satisfying the following ARUMA($p$, $d$, $q$) model:
$$U(B)\,A(B)\,X(n) = C(B)\,W(n),$$
where $U(B) = 1 + u(1)B + \dots + u(d)B^d$ is a polynomial with all roots on the unit circle, $A(B) = 1 + a(1)B + \dots + a(p)B^p$ is a polynomial with all roots outside the unit circle, $C(B) = 1 + c(1)B + \dots + c(q)B^q$ is a polynomial which is relatively prime with the polynomial $U(B)A(B)$, $B$ is the backshift operator such that $BX(n) = X(n-1)$, and $(W(n), F(n), n \ge 1)$ is a sequence of martingale differences satisfying the following conditions:
$$\lim_{n \to \infty} E\left(W(n)^2 \mid F(n-1)\right) = \sigma^2 \quad \text{a.s.},$$
$$\sup_{n \ge 1} E\,|W(n)|^{\gamma} < \infty \quad \text{for some } \gamma > 2.$$
The purpose of this paper is to provide consistent estimates of the parameters $p$, $d$, $q$, $u(j)$ ($j = 1, 2, \dots, d$), and $a(k)$ ($k = 1, 2, \dots, p$).

6.
This is a summary of the author’s PhD thesis supervised by Edoardo Amaldi and defended on 3 April 2009 at the Politecnico di Milano. The thesis is written in English and is available from the author upon request. In this work, we extensively study two challenging variants of the general problem of clustering a given set of data points with respect to hyperplanes so as to extract collinearity between them. After pointing out the similarities and differences with previous work on related problems, we propose mathematical programming formulations for our problem variants. Since these problems are difficult to handle due to the nonlinear nonconvexity that arises because of the 2-norm in the distance function and a large number of binary assignment variables, we develop column generation algorithms and heuristics to tackle them. The efficiency of the methods developed is demonstrated on realistic randomly generated instances along with applications in piecewise linear model fitting and line segment detection in digital images.  相似文献   

7.

In this paper, an ordinal multilevel latent Markov model based on separate random effects is proposed. In detail, two distinct second-level discrete effects are considered in the model, one affecting the initial probability vector and the other affecting the transition probability matrix of the first-level ordinal latent Markov process. To model these separate effects, we consider a bi-dimensional mixture specification that makes it possible to avoid unverifiable assumptions on the random effect distribution and to derive a two-way clustering of second-level units. Starting from a general model where the two random effects are dependent, we also obtain the independence model as a special case. The proposal is applied to data on the physical health status of a sample of elderly residents grouped into nursing homes. A simulation study assessing the performance of the proposal is also included.


8.
The paper is concerned with a hybrid optimization of fuzzy inference systems based on hierarchical fair competition-based parallel genetic algorithms (HFCGA) and information granulation. The process of information granulation is realized with the aid of C-Means clustering. HFCGA, a multi-population parallel genetic algorithm (PGA), is exploited here to realize structure optimization and carry out parameter estimation of the fuzzy models. HFCGA is helpful in the context of fuzzy models because it limits the premature convergence often encountered in optimization problems. The optimization concerns a set of model parameters including, among others, the number of input variables to be used, a specific subset of input variables, and the number of membership functions. In the hybrid optimization process, two general optimization mechanisms are explored. The structural development of the fuzzy model is realized via HFCGA optimization and C-Means, whereas the parametric optimization is handled with a standard least squares method and the HFCGA technique. A suite of comparative studies demonstrates that the proposed algorithm leads to models whose performance is superior to that of some other constructs commonly used in fuzzy modeling.

9.
Cellular manufacturing is a useful way to improve overall manufacturing performance. Group technology is used to increase the productivity for manufacturing high quality products and improving the flexibility of manufacturing systems. Cell formation is an important step in group technology. It is used in designing good cellular manufacturing systems. The key step in designing any cellular manufacturing system is the identification of part families and machine groups for the creation of cells that uses the similarities between parts in relation to the machines in their manufacture. There are two basic procedures for cell formation in group technology. One is part-family formation and the other is machine–cell formation. In this paper, we apply a fuzzy relational data clustering algorithm to form part families and machine groups. A real data study shows that the proposed approach performs well based on the grouping efficiency proposed by Chandrasekharan and Rajagopalan.  相似文献   

10.
Evolving fuzzy rule based controllers using genetic algorithms   Total citations: 9, self-citations: 0, citations by others: 9
The synthesis of genetics-based machine learning and fuzzy logic is beginning to show promise as a potent tool in solving complex control problems in multi-variate non-linear systems. In this paper an overview of current research applying the genetic algorithm to fuzzy rule based control is presented. A novel approach to genetics-based machine learning of fuzzy controllers, called a Pittsburgh Fuzzy Classifier System # 1 (P-FCS1) is proposed. P-FCS1 is based on the Pittsburgh model of learning classifier systems and employs variable length rule-sets and simultaneously evolves fuzzy set membership functions and relations. A new crossover operator which respects the functional linkage between fuzzy rules with overlapping input fuzzy set membership functions is introduced. Experimental results using P-FCS 1 are reported and compared with other published results. Application of P-FCS1 to a distributed control problem (dynamic routing in computer networks) is also described and experimental results are presented.  相似文献   

11.
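In a given metric space, the unit clustering problem is to find the minimum number of unit balls that cover all of the given points. This is a well-known combinatorial optimization problem. In its online version, a metric space is given and n points arrive one by one at arbitrary locations; each point must be assigned to a unit cluster at the moment it arrives, with no information about future points, and the goal is to minimize the number of unit clusters used in the end. This paper considers a class of one-dimensional online unit clustering problems under the following assumption: in the optimal solution of the corresponding offline problem, the distance between any two adjacent clusters is greater than 0.5. We first present two online algorithms and several lemmas, then obtain a combined randomized algorithm by running each of the two online algorithms with probability 0.5, and finally prove that the expected competitive ratio of this combined randomized algorithm is at most 1.5.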
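
To make the online setting concrete, here is a minimal sketch of a simple grid baseline in one dimension. It is not one of the two algorithms from the abstract, which are not described there. It assumes a unit cluster is an interval of length at most 1, and the class name `GridUnitClustering` is hypothetical. Each point is assigned irrevocably to the cluster of the fixed cell [k, k+1) that contains it.

```python
import math


class GridUnitClustering:
    """Online grid baseline: one cluster per non-empty cell [k, k+1)."""

    def __init__(self):
        self.cluster_of_cell = {}  # grid cell index -> cluster id

    def assign(self, x):
        """Assign an arriving point to a cluster; the decision is never revised."""
        cell = math.floor(x)
        if cell not in self.cluster_of_cell:
            # First point seen in this cell: open a new unit cluster.
            self.cluster_of_cell[cell] = len(self.cluster_of_cell)
        return self.cluster_of_cell[cell]

    @property
    def num_clusters(self):
        return len(self.cluster_of_cell)


# Hypothetical arrival sequence.
algo = GridUnitClustering()
ids = [algo.assign(x) for x in [0.2, 0.9, 1.1, 3.5, 0.7]]
```

On this sequence the baseline opens three clusters, while an offline solution needs only two (one interval covering 0.2 through 1.1, one for 3.5). Since any interval of length at most 1 meets at most two grid cells, this baseline never uses more than twice the optimal number of clusters.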

13.
Deductive reasoning with classical logic is hampered when imprecision is present in the variables, although human reasoning can cope quite adequately with vague concepts. A new approach to reasoning which allows imprecise conclusions to be drawn consistently from imprecise premises was introduced by Baldwin [2]. This method is economical in calculation as it avoids the high dimensionality that fuzzy set representations often involve. This paper briefly reviews the method from an operational viewpoint, isolating the individual processes that are used in the method. A feasible algorithm for computing each process is then presented. It is assumed that the reader is familiar with the concept of, and operations on, fuzzy sets introduced by Zadeh [14].

14.
ESTIMATION OF THE PARAMETERS FOR UNSTABLE AR MODELS. AN HONGZHI (安鸿志) (Institute of Applied Mathematics, the Chinese Academy of Science, Beijing 10...

15.
Summary  Additive models of the type $y = f_1(x_1) + \dots + f_p(x_p) + \varepsilon$, where the $f_j$, $j = 1, \dots, p$, have unspecified functional form, are flexible statistical regression models which can be used to characterize nonlinear regression effects. One way of fitting additive models is the expansion in B-splines combined with penalization, which prevents overfitting. The performance of this penalized B-spline (P-spline) approach depends strongly on the choice of the amount of smoothing used for the components $f_j$. Particularly in higher-dimensional settings, this is a computationally demanding task. In this paper we treat the problem of choosing the smoothing parameters for P-splines by genetic algorithms. In several simulation studies this approach is compared to various alternative methods of fitting additive models. In particular, functions with different spatial variability are considered, and the effect of constant versus locally adaptive smoothing parameters is evaluated.

16.
We present a dual-view mixture model to cluster users based on their features and latent behavioral functions. Every component of the mixture model represents a probability density over a feature view for observed user attributes and a behavior view for latent behavioral functions that are indirectly observed through user actions or behaviors. Our task is to infer the groups of users as well as their latent behavioral functions. We also propose a non-parametric version based on a Dirichlet Process to automatically infer the number of clusters. We test the properties and performance of the model on a synthetic dataset that represents the participation of users in the threads of an online forum. Experiments show that dual-view models outperform single-view ones when one of the views lacks information.  相似文献   

17.
A bootstrap procedure useful in latent class, or more general mixture, models has been developed to determine the sufficient number of latent classes or components required to account for systematic group differences in the data. The procedure is illustrated in the context of a multidimensional scaling latent class model, CLASCAL. Also presented is a bootstrap technique for determining standard errors for estimates of the stimulus coordinates, parameters of the multidimensional scaling model. Real and artificial data are presented. The bootstrap procedure for selecting a sufficient number of classes seems to select the correct number of latent classes at both low and high error levels. At higher error levels it outperforms Hope's (J. Roy. Statist. Soc. Ser B 1968; 30 : 582) procedure. The bootstrap procedures to estimate parameter stability appear to reproduce Monte Carlo results correctly. Copyright © 2002 John Wiley & Sons, Ltd.

18.
Conceptual data modeling has become essential for non-traditional application areas. Some conceptual data models have been proposed as tools for database design and object-oriented database modeling. Information in real-world applications is often vague or ambiguous. Currently, little research is underway on modeling imprecision and uncertainty in conceptual data modeling and the conceptual design of fuzzy databases. The unified modeling language (UML) is a set of object-oriented modeling notations and a standard of the object management group (OMG) with applications to many areas of software engineering and knowledge engineering, increasingly including data modeling. This paper introduces different levels of fuzziness into the class of UML and presents the corresponding graphical representations, with the result that UML class diagrams may model fuzzy information. The fuzzy UML data model is also formally mapped into the fuzzy object-oriented database model.

19.
In this paper, we consider the general growth curve model with multivariate random effects covariance structure and provide a new simple estimator for the parameters of interest. This estimator is not only convenient for testing the hypothesis on the corresponding parameters, but also has higher efficiency than the least-square estimator and the improved two-stage estimator obtained by Rao under certain conditions. Moreover, we obtain the necessary and sufficient condition for the new estimator to be identical to the best linear unbiased estimator. Examples of its application are given.  相似文献   

20.
Summary  Graphical methods for the discovery of structural models from observational data provide interesting tools for applied researchers. A problem often faced in empirical studies is the presence of latent confounders which produce associations between the observed variables. Although causal inference algorithms exist which can cope with latent confounders, empirical applications assessing the performance of such algorithms are largely lacking. In this study, we apply the constraint-based Fast Causal Inference algorithm implemented in the software program TETRAD to a data set containing strategy and performance information about 608 business units. In contrast to the informative and reasonable results for the empirical data, simulation findings reveal problems in recovering some of the structural relations.
