Similar literature
 Found 20 similar documents; search took 15 ms
1.
Deviations from theoretical assumptions, together with the presence of a certain number of outlying observations, are common in many practical statistical applications. This is also the case when applying cluster analysis methods, where such problems can lead to unsatisfactory clustering results. Robust clustering methods aim to avoid these unsatisfactory results. Moreover, there are connections between robust procedures and cluster analysis that make robust clustering an appealing unifying framework. A review of different robust clustering approaches in the literature is presented. Special attention is paid to methods based on trimming, which try to discard the most outlying data when carrying out the clustering process.

2.
Summary: In this paper, a robust fuzzy k-means clustering model for interval-valued data is introduced. The distinguishing feature of the proposed model is its ability to handle anomalous interval-valued data by reducing the effect of such outliers on the clustering model. In the interval case, the concept of anomalous data involves both the center and the width (the radius) of an interval. To show how the model works, the results of a simulation experiment and an application to real interval-valued data are discussed.

3.
In this paper we introduce a new method for the cluster analysis of longitudinal data, focusing on the determination of uncertainty levels for cluster memberships. The method uses the Dirichlet-t distribution, which exploits the robustness of the Student-t distribution within a Bayesian semi-parametric approach and, together with robust clustering of subjects, evaluates the uncertainty level of subjects' memberships in their clusters. We let the number of clusters and the uncertainty levels be unknown while fitting Dirichlet process mixture models. Two simulation studies are conducted to demonstrate the proposed methodology. The method is applied to cluster a real data set taken from gene expression studies.

4.
The following mixture model-based clustering methods are compared in a simulation study with one-dimensional data, fixed number of clusters and a focus on outliers and uniform “noise”: an ML-estimator (MLE) for Gaussian mixtures, an MLE for a mixture of Gaussians and a uniform distribution (interpreted as “noise component” to catch outliers), an MLE for a mixture of Gaussian distributions where a uniform distribution over the range of the data is fixed (Fraley and Raftery in Comput J 41:578–588, 1998), a pseudo-MLE for a Gaussian mixture with improper fixed constant over the real line to catch “noise” (RIMLE; Hennig in Ann Stat 32(4): 1313–1340, 2004), and MLEs for mixtures of t-distributions with and without estimation of the degrees of freedom (McLachlan and Peel in Stat Comput 10(4):339–348, 2000). The RIMLE (using a method to choose the fixed constant first proposed in Coretto, The noise component in model-based clustering. Ph.D thesis, Department of Statistical Science, University College London, 2008) is the best method in some, and acceptable in all, simulation setups, and can therefore be recommended.
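The idea of a uniform "noise component" added to a Gaussian mixture can be illustrated with a small sketch. It is not code from the compared papers: the function names, the parameter layout, and the example values are assumptions chosen for illustration, and no fitting (MLE) step is shown, only the density and the posterior probability that a point belongs to the noise component.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a one-dimensional Gaussian."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def uniform_pdf(x, lo, hi):
    """Density of a uniform distribution on [lo, hi]."""
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0

def mixture_with_noise_pdf(x, components, noise_weight, lo, hi):
    """Gaussian mixture plus a uniform noise component.

    components: list of (weight, mean, sd); the weights and noise_weight
    are assumed to sum to 1 (hypothetical parameter layout).
    """
    gaussian_part = sum(w * normal_pdf(x, m, s) for w, m, s in components)
    return noise_weight * uniform_pdf(x, lo, hi) + gaussian_part

def noise_posterior(x, components, noise_weight, lo, hi):
    """Posterior probability that x was generated by the noise component.

    Intended for x inside [lo, hi], where the mixture density is positive.
    """
    total = mixture_with_noise_pdf(x, components, noise_weight, lo, hi)
    return noise_weight * uniform_pdf(x, lo, hi) / total

# Illustrative values only: two clusters and 10% noise over [-20, 30].
components = [(0.45, 0.0, 1.0), (0.45, 10.0, 1.0)]
print(noise_posterior(0.0, components, 0.1, -20.0, 30.0))    # near a cluster centre
print(noise_posterior(-15.0, components, 0.1, -20.0, 30.0))  # far from both clusters
```

A point close to a cluster mean gets a small noise posterior, while a point far from every Gaussian is attributed almost entirely to the uniform component, which is how such a component can "catch" outliers.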

5.
6.
The aim of this paper is to characterize, from a syntactical and a semantical point of view, the output of a clustering procedure on an object-predicate table. The logical framework of this work is first-order predicate calculus. The main result can be stated simply as follows. The reordering of the rows of the object-predicate table, with the aim of concentrating the ones in the upper-left corner of the table, which can be regarded as a highly informative reorganization of the data, can be performed through a Gödelization of the scheme of formulas, namely models, that we can superimpose on the table. A semantical theorem is stated, after a syntactical characterization of the object-predicate tables.

7.
The impulse influence matrix function is introduced on the basis that decentralized control analysis is analogous to sub-structural analysis in structural mechanics. Static sub-structural analysis is analogous to the usual decentralized control, whereas dynamic sub-structural analysis corresponds to decentralized control theory. The re-

8.
The impulse influence matrix function is introduced on the basis that decentralized control analysis is analogous to sub-structural analysis in structural mechanics. Static sub-structural analysis is analogous to the usual decentralized control, whereas dynamic sub-structural analysis corresponds to decentralized control theory. The reciprocal symmetry of the impulse influence matrix function is proved, and the function is solved by the precise integration method for time-invariant systems, giving results up to computer precision. Based on the impulse influence functions of the subsystems, the combination of subsystems leads to a set of integral equations that can be solved numerically. A numerical example demonstrates the effectiveness of the method.

9.
Severe constraints imposed by the nature of endless sequences of data collected from unstable phenomena have driven the understanding and development of automated analysis strategies, such as data clustering techniques. However, current clustering validation approaches are inadequate for data streams because they do not properly evaluate the representation of behavior changes. This paper proposes a novel function to continuously evaluate data stream clustering, inspired by the Lyapunov energy functions used by techniques such as the Hopfield artificial neural network and the Bidirectional Associative Memory (BAM). The proposed function considers three terms: i) the intra-cluster distance, which evaluates cluster compactness; ii) the inter-cluster distance, which reflects cluster separability; and iii) an entropy estimate of the clustering model, which permits evaluation of the level of uncertainty in data streams. A first set of experiments illustrates the proposed function applied to scenarios of continuous evaluation of data stream clustering. Further experiments were conducted to compare this new function with well-established clustering indices, and the results confirm that our proposal reflects the same information obtained with external clustering indices.
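The three ingredients named in the abstract (intra-cluster distance, inter-cluster distance, and an entropy estimate) can be sketched for a single snapshot of a clustering. The abstract does not say how the paper defines or combines these terms, so the definitions below are assumptions for illustration only: centroid-based Euclidean distances and the Shannon entropy of cluster-size proportions. No combined energy function is attempted.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def clustering_terms(clusters):
    """Return (intra, inter, entropy) for a list of non-empty clusters.

    Assumed definitions (not taken from the paper):
      intra   = mean distance from each point to its cluster centroid
      inter   = mean pairwise distance between cluster centroids
      entropy = Shannon entropy of the cluster-size proportions
    """
    centroids = [centroid(c) for c in clusters]
    total = sum(len(c) for c in clusters)

    intra = sum(
        euclidean(p, centroids[k]) for k, c in enumerate(clusters) for p in c
    ) / total

    pairs = [
        (i, j)
        for i in range(len(centroids))
        for j in range(i + 1, len(centroids))
    ]
    inter = (
        sum(euclidean(centroids[i], centroids[j]) for i, j in pairs) / len(pairs)
        if pairs
        else 0.0
    )

    entropy = -sum(
        (len(c) / total) * math.log(len(c) / total) for c in clusters
    )
    return intra, inter, entropy

# Two compact, well-separated clusters of equal size.
snapshot = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
print(clustering_terms(snapshot))
```

In a streaming setting these terms would be recomputed as the clustering is updated, so that a change in behavior shows up as a change in compactness, separability, or entropy.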

10.
In classical credibility theory we assume that the vector of claims, conditional on the risk parameter, has independent components with identical means. However, this assumption is sometimes unrealistic. To relax this condition, Hachemeister (Hachemeister, C.A., 1975. Credibility for regression models with application to trend. In: Kahn, P. (Ed.), Credibility, Theory and Applications. Academic Press, New York) introduced regressors. The presence of large claims can perturb the estimation of the credibility premium. The lack of robustness of regression credibility estimators, as well as concern for the fairness of tariff evaluation, motivated this paper. Our proposal is to apply robust statistics to regression credibility estimation by using the robust influence function approach of M-estimators.

11.
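Convergence of spectral clustering associated with general similarity functions   Total citations: 1, self-citations: 0, citations by others: 1
Spectral clustering algorithms are generated by the eigenfunctions of the graph Laplacian associated with a similarity function. This paper proves the convergence of spectral clustering algorithms associated with general similarity functions and gives a quantitative estimate of the convergence using the covering number method. When the similarity function is a Lipschitz s > 0 function on a subset of Euclidean space, a convergence rate of the form O(√log(n + 1)/√n) is established. We also show that the growth of the covering numbers of a corresponding function set can be arbitrarily bad.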

12.
In many multiobjective optimization problems, the Pareto fronts and sets contain a large number of solutions, which makes it difficult for the decision maker to identify the preferred ones. A possible way to alleviate this difficulty is to present to the decision maker a small subset of solutions representative of the characteristics of the Pareto front.

13.
To impute the function of a variational inequality and the objective of a convex optimization problem from observations of (nearly) optimal decisions, previous approaches constructed inverse programming methods based on solving a convex optimization problem [17], [7]. However, we show that, in addition to requiring complete observations, these approaches are not robust to measurement errors, while in many applications, the outputs of decision processes are noisy and only partially observable owing to, e.g., limitations in the sensing infrastructure. To deal with noisy and missing data, we formulate our inverse problem as the minimization of a weighted sum of two objectives: 1) a duality gap or Karush–Kuhn–Tucker (KKT) residual, and 2) a distance from the observations robust to measurement errors. In addition, we show that our method encompasses previous ones by generating a sequence of Pareto optimal points (with respect to the two objectives) converging to an optimal solution of previous formulations. To compare duality gaps and KKT residuals, we also derive new sub-optimality results defined by KKT residuals. Finally, an implementation framework is proposed with applications to delay function inference on the road network of Los Angeles, and consumer utility estimation in oligopolies.

14.
15.
This article derives the influence function of the Stahel–Donoho estimator of multivariate location and scatter for elliptical distributions. Local robustness and asymptotic relative efficiency are studied. The expressions obtained for the influence functions coincide with those of one-step reweighted estimators.

16.
Support vector regression (SVR) is one of the most popular nonlinear regression techniques; it aims to approximate a nonlinear system with good generalization capability. However, SVR has a major drawback in that it is sensitive to the presence of outliers. The ramp loss function for robust SVR has been introduced to resolve this problem, but SVR with the ramp loss function has a non-differentiable and non-convex formulation, which is not easy to solve. Consequently, SVR with the ramp loss function requires smoothing and Concave-Convex Procedure techniques, which transform the non-differentiable and non-convex optimization into a differentiable and convex one. We present a robust SVR with a linear-log concave loss function (RSLL), which does not require the transformation technique, where the linear-log concave loss function has a similar effect to the ramp loss function. The zero norm approximation and the difference of convex functions problem are employed for solving the optimization problem. The proposed RSLL approach is used to develop a robust and stable virtual metrology (VM) prediction model, which utilizes the status variables of process equipment to predict the process quality at wafer level in semiconductor manufacturing. We also compare the proposed approach to existing SVR-based methods in terms of the root mean squared error of prediction using both synthetic and real data sets. Our experimental results show that the proposed approach performs better than existing SVR-based methods regardless of the data set and type of outliers (i.e., X-space and Y-space outliers), implying that it can be used as a useful alternative when the regression data contain outliers.
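Why a ramp-type loss limits the influence of outliers can be seen by comparing it with the usual ε-insensitive loss of SVR. The sketch below is an illustration under assumptions: the ramp loss is written as the ε-insensitive loss truncated at a cap, and the parameter names `eps` and `cap` and their default values are hypothetical. The linear-log concave loss of the RSLL method is not defined in the abstract and is therefore not reproduced here.

```python
def eps_insensitive_loss(residual, eps=0.1):
    """Standard SVR loss: zero inside the eps-tube, linear outside it."""
    return max(abs(residual) - eps, 0.0)

def ramp_loss(residual, eps=0.1, cap=1.0):
    """Eps-insensitive loss truncated at `cap` (assumed form of a ramp loss).

    Beyond the cap the loss stays constant, so a very large residual
    contributes no more than a moderately large one.
    """
    return min(eps_insensitive_loss(residual, eps), cap)

# Residuals of increasing size; the last one plays the role of an outlier.
for r in (0.05, 0.6, 5.0):
    print(r, eps_insensitive_loss(r), ramp_loss(r))
```

The ε-insensitive loss keeps growing with the residual, whereas the truncated loss is flat beyond the cap. The kinks and the flat region also illustrate the non-differentiable and non-convex formulation mentioned in the abstract.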

17.
The small-world network, proposed by Watts and Strogatz, has been studied extensively for more than ten years. In this paper, a generalized small-world network is proposed, which extends several small-world network models. Furthermore, some properties of a special type of generalized small-world network with a given expected number of edges are investigated, such as the degree distribution and the isoperimetric number. These results are used to derive a lower bound for the clustering coefficient and an upper bound for the diameter of this network. In other words, we prove mathematically that the generalized small-world network with a given expected number of edges has a large clustering coefficient and a small diameter.
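The clustering coefficient discussed above can be illustrated on the classical Watts and Strogatz construction. The abstract does not specify the generalized model, so the sketch below uses only the classical one (a ring lattice whose edges are rewired with probability `p`); the function names and the rewiring details are assumptions for illustration.

```python
import random

def watts_strogatz(n, k, p, seed=0):
    """Adjacency sets of a Watts-Strogatz graph.

    n nodes on a ring, each joined to its k nearest neighbours (k even);
    each lattice edge is rewired to a random new endpoint with probability p.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if rng.random() < p and w in adj[v]:
                # Candidate endpoints: not v itself and not already a neighbour.
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    u = rng.choice(candidates)
                    adj[v].remove(w)
                    adj[w].remove(v)
                    adj[v].add(u)
                    adj[u].add(v)
    return adj

def average_clustering(adj):
    """Mean over all nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for v, nbrs in adj.items():
        d = len(nbrs)
        if d < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)

print(average_clustering(watts_strogatz(20, 4, 0.0)))  # unrewired ring lattice
print(average_clustering(watts_strogatz(20, 4, 0.2)))  # some edges rewired
```

With `p = 0` the graph is the plain ring lattice, whose clustering coefficient for `k = 4` is 3(k − 2) / (4(k − 1)) = 0.5. Rewiring keeps the number of edges fixed while introducing long-range shortcuts.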

18.
19.
20.
Advances in Data Analysis and Classification - Clustering via marked point processes and influence space, Is-ClusterMPP, is a new unsupervised clustering algorithm through adaptive MCMC sampling of...
