Similar Literature
20 similar documents were retrieved.
1.
Abstract

A method of robust estimation of multivariate location and shape that has attracted a lot of attention recently is Rousseeuw's minimum volume ellipsoid estimator (MVE). This estimator has a high breakdown point but is difficult to compute successfully. In this article, we apply methods of heuristic search to this problem, including simulated annealing, genetic algorithms, and tabu search, and compare the results to the undirected random search algorithm that is often cited. Heuristic search provides several effective algorithms that are far more computationally efficient than random search. Furthermore, random search, as currently implemented, is shown to be ineffective for larger problems.
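As background on the computational side of the problem, the following is a minimal sketch (ours, not the authors' code) of the undirected random-subsampling baseline that heuristic search is compared against: draw many elemental subsets of p + 1 points, inflate each subset's ellipsoid until it covers at least half of the data, and keep the smallest volume found. Function and variable names are illustrative.

```python
import numpy as np

def mve_random_search(X, n_subsets=3000, seed=0):
    """Approximate the minimum volume ellipsoid (MVE) estimator by
    undirected random subsampling (the baseline compared in the article)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = (n + p + 1) // 2                      # coverage required for the MVE
    best = {"volume": np.inf, "center": None, "shape": None}
    for _ in range(n_subsets):
        idx = rng.choice(n, size=p + 1, replace=False)
        S = X[idx]
        center = S.mean(axis=0)
        cov = np.cov(S, rowvar=False)
        if np.linalg.det(cov) <= 0:           # degenerate subset, skip it
            continue
        d2 = np.einsum("ij,jk,ik->i", X - center, np.linalg.inv(cov), X - center)
        m2 = np.partition(d2, h - 1)[h - 1]   # inflate so that h points are covered
        volume = np.sqrt(np.linalg.det(cov) * m2 ** p)  # up to a dimensional constant
        if volume < best["volume"]:
            best = {"volume": volume, "center": center, "shape": m2 * cov}
    return best
```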

2.
Abstract

An improved resampling algorithm for S estimators reduces the number of times the objective function is evaluated and increases the speed of convergence. With this algorithm, S estimates can be computed in less time than least median of squares (LMS) for regression and minimum volume ellipsoid (MVE) for location/scatter estimates with the same accuracy. Here accuracy refers to the randomness due to the algorithm. S estimators are also more statistically efficient than the LMS and MVE estimators, that is, they have less variability due to the randomness of the data.

3.
Abstract

All known robust location and scale estimators with high breakdown point for multivariate samples are very expensive to compute. In practice, this computation has to be carried out using an approximate subsampling procedure. In this article we describe an alternative subsampling scheme, applicable to both the Stahel-Donoho estimator and the minimum volume ellipsoid estimator, with the property that the number of subsamples required can be substantially reduced with respect to the standard subsampling procedures used in both cases. We also discuss some bias and variability properties of the estimator obtained from the proposed subsampling process.

4.
Consider the problem of computing a (1+ε)-approximation to the minimum volume axis-aligned ellipsoid (MVAE) enclosing a set of m points in ℝ^n. We first provide an extension and improvement to the algorithm proposed in Kumar and Yıldırım (2008) [5] (the KY algorithm) for the MVAE problem. The main challenge of the MVAE problem is that there is no closed-form solution in the line search step (the choice of beta); the KY algorithm therefore proposed a particular choice of beta that leads to its complexity and core set results. We further analyze the line search step to derive a new beta, relying on an analysis of derivatives up to fourth order. This choice of beta leads to improved complexity and core set results. The second modification incorporates "away steps" into the first one at each iteration and attains the same complexity and core set results as the first. In addition, since the second modification uses the idea of "dropping points", it has the potential to compute smaller core sets in practice. Some numerical results are given to show the efficiency of the modified algorithms.
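For very small instances, the MVAE problem can also be written directly as a convex program and handed to a general-purpose solver. The sketch below is an assumption-laden illustration that relies on cvxpy, not the KY-type first-order algorithm analyzed in the abstract: it parametrizes the axis-aligned ellipsoid as {x : ||diag(a)x − b|| ≤ 1} and maximizes Σ_j log a_j, which is equivalent to minimizing the volume.

```python
import cvxpy as cp
import numpy as np

def mvae_small(points):
    """Minimum volume axis-aligned enclosing ellipsoid via a direct convex
    formulation; practical only for small instances, unlike the first-order
    KY-type algorithms discussed in the abstract."""
    m, n = points.shape
    a = cp.Variable(n, pos=True)          # reciprocals of the semi-axis lengths
    b = cp.Variable(n)                    # b = diag(a) @ center
    constraints = [cp.norm(cp.multiply(a, points[i]) - b, 2) <= 1 for i in range(m)]
    problem = cp.Problem(cp.Maximize(cp.sum(cp.log(a))), constraints)
    problem.solve()
    center = b.value / a.value
    semi_axes = 1.0 / a.value             # ellipsoid: sum_j ((x_j - c_j)/s_j)^2 <= 1
    return center, semi_axes

if __name__ == "__main__":
    pts = np.random.default_rng(1).normal(size=(50, 3))
    c, s = mvae_small(pts)
    print("center:", c, "semi-axes:", s)
```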

5.
Investigating the minimum weight triangulation of a point set under constraints is an important approach toward solving the general minimum weight triangulation problem. In this paper, we consider the minimum weight triangulation of a sparse point set, and present an O(n^4) algorithm to compute a triangulation of such a set. The sparseness property can be converted into a new sufficient condition for finding subgraphs of the minimum weight triangulation. A special point set is exhibited to show that our new subgraph of the minimum weight triangulation cannot be found by any previously known method.

6.
Abstract

Polynomial splines are often used in statistical regression models for smooth response functions. When the number and location of the knots are optimized, the approximating power of the spline is improved and the model is nonparametric with locally determined smoothness. However, finding the optimal knot locations is a historically difficult problem. We present a new estimation approach that improves computational properties by penalizing coalescing knots. The resulting estimator is easier to compute than the unpenalized estimates of knot positions, eliminates unnecessary "corners" in the fitted curve, and, in simulation studies, shows no increase in the loss. A number of GCV- and AIC-type criteria for choosing the number of knots are evaluated via simulation.

7.
We consider the problem of computing minimum geometric hitting sets, in which, given a set of geometric objects and a set of points, the goal is to compute the smallest subset of points that hits all the geometric objects. The problem is known to be strongly NP-hard even for simple geometric objects like unit disks in the plane. Therefore, unless P = NP, it is not possible to obtain a fully polynomial-time approximation scheme (FPTAS) for such problems. We give the first PTAS for this problem when the geometric objects are half-spaces in ℝ^3 and when they are an r-admissible set of regions in the plane (this includes pseudo-disks, which are 2-admissible). Quite surprisingly, our algorithm is a very simple local-search algorithm which iterates over local improvements only.
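To convey how simple such a local-search scheme can be, here is a hedged sketch for the special case of points and unit disks: starting from a feasible hitting set, it repeatedly tries to drop one chosen point, or to replace two chosen points with one unchosen point, while keeping feasibility. This is only an illustration of the flavor of the algorithm; the PTAS in the paper uses larger swaps and a careful analysis. All names are ours.

```python
from itertools import combinations

def hits(point, disk_center, radius=1.0):
    """A point hits a disk iff it lies within `radius` of the disk's center."""
    (px, py), (cx, cy) = point, disk_center
    return (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2

def is_hitting_set(chosen, points, disks):
    return all(any(hits(points[i], d) for i in chosen) for d in disks)

def local_search_hitting_set(points, disks):
    # start from every point that hits at least one disk (feasible if any hitting set exists)
    chosen = {i for i in range(len(points)) if any(hits(points[i], d) for d in disks)}
    improved = True
    while improved:
        improved = False
        # try dropping a single chosen point
        for i in list(chosen):
            if is_hitting_set(chosen - {i}, points, disks):
                chosen -= {i}
                improved = True
        # try replacing two chosen points with one unchosen point (a 2-for-1 swap)
        for i, j in combinations(sorted(chosen), 2):
            for k in set(range(len(points))) - chosen:
                candidate = (chosen - {i, j}) | {k}
                if is_hitting_set(candidate, points, disks):
                    chosen = candidate
                    improved = True
                    break
            if improved:
                break
    return chosen
```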

8.
Finding the set of nearest neighbors for a query point of interest appears in a variety of algorithms for machine learning and pattern recognition. Examples include k nearest neighbor classification, information retrieval, case-based reasoning, manifold learning, and nonlinear dimensionality reduction. In this work, we propose a new approach for determining a distance metric from the data for finding such neighboring points. For a query point of interest, our approach learns a generalized quadratic distance (GQD) metric based on the statistical properties of a "small" neighborhood of the point of interest. The locally learned GQD metric captures information such as the density, curvature, and intrinsic dimensionality of the points falling in this particular neighborhood. Unfortunately, learning the GQD parameters under such a local learning mechanism is a challenging problem with a high computational overhead. To address these challenges, we estimate the GQD parameters using the minimum volume covering ellipsoid (MVCE) of a set of points. The advantage of the MVCE is twofold. First, the MVCE together with the local learning approach approximates the functionality of a well-known robust estimator for covariance matrices. Second, computing the MVCE is a convex optimization problem which, in addition to having a unique global solution, can be efficiently solved using a first-order optimization algorithm. We validate our metric learning approach on a large variety of datasets and show that the proposed metric has promising results when compared with five algorithms from the literature for supervised metric learning.
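The abstract's remark that the MVCE can be solved with a first-order method is often realized with Khachiyan-style barycentric coordinate ascent. The sketch below is our illustration of that standard scheme (not necessarily the solver used by the authors); it returns the shape matrix A and center c of an approximate covering ellipsoid {x : (x − c)ᵀA(x − c) ≤ 1}.

```python
import numpy as np

def mvce_khachiyan(P, tol=1e-6, max_iter=10000):
    """Approximate minimum volume covering ellipsoid of the rows of P via
    Khachiyan's first-order (Frank-Wolfe style) weight updates."""
    m, n = P.shape
    Q = np.column_stack([P, np.ones(m)])      # lift to homogeneous coordinates
    u = np.full(m, 1.0 / m)                   # weights on the points
    for _ in range(max_iter):
        X = Q.T @ (u[:, None] * Q)            # (n+1) x (n+1) weighted scatter
        M = np.einsum("ij,jk,ik->i", Q, np.linalg.inv(X), Q)
        j = np.argmax(M)
        step = (M[j] - n - 1.0) / ((n + 1.0) * (M[j] - 1.0))
        if step < tol:                        # stop when the largest violation is small
            break
        u = (1.0 - step) * u
        u[j] += step
    c = P.T @ u                               # ellipsoid center
    S = P.T @ (u[:, None] * P) - np.outer(c, c)
    A = np.linalg.inv(S) / n                  # shape matrix of the ellipsoid
    return A, c
```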

9.
Abstract

The extraction of sinusoidal signals from time-series data is a classic problem of ongoing interest in the statistics and signal processing literatures. Obtaining least squares estimates is difficult because the sum of squares has local minima O(1/n) apart in the frequencies. In practice the frequencies are often estimated using ad hoc and inefficient methods. Problems of data quality have received little attention. An elemental set is a subset of the data containing the minimum number of points such that the unknown parameters in the model can be identified. This article shows that, using a variant of the classical method of Prony, parameter estimates for a sum of sinusoids can be obtained algebraically from an elemental set. Elemental set methods are used to construct finite algorithm estimators that approximately minimize the least squares, least trimmed sum of squares, or least median of squares criteria. The elemental set estimators prove able in simulations to resolve the frequencies to the correct local minima of the objective functions. When used as the first stage of an MM estimator, the constructed estimators based on the trimmed sum of squares and least median of squares criteria produce final estimators which have high breakdown properties and which are simultaneously efficient when no outliers are present. The approach can also be applied to sums of exponentials, and sums of damped sinusoids. The article includes simulations with one and two sinusoids and two data examples.
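As a toy version of the elemental-set idea, the hedged sketch below recovers a single sinusoid from three equally spaced noiseless samples using the Prony-type identity x(t−Δ) + x(t+Δ) = 2 cos(ωΔ) x(t), followed by a small linear solve for amplitude and phase. The article's construction handles several sinusoids and feeds many such elemental fits into robust criteria; this simplified single-sinusoid version is ours.

```python
import numpy as np

def sinusoid_from_elemental_set(x0, x1, x2, dt=1.0):
    """Recover A*cos(w*t + phi) from three consecutive noiseless samples
    x0 = x(-dt), x1 = x(0), x2 = x(dt), using x0 + x2 = 2*cos(w*dt)*x1."""
    c = (x0 + x2) / (2.0 * x1)
    if not -1.0 <= c <= 1.0:
        raise ValueError("samples are inconsistent with a single real sinusoid")
    w = np.arccos(c) / dt
    # solve x(t) = a*cos(w t) + b*sin(w t) using the samples at t = 0 and t = dt
    a = x1
    b = (x2 - a * np.cos(w * dt)) / np.sin(w * dt)
    amplitude = np.hypot(a, b)
    phase = np.arctan2(-b, a)        # so that x(t) = amplitude*cos(w*t + phase)
    return w, amplitude, phase

if __name__ == "__main__":
    t = np.array([-1.0, 0.0, 1.0])
    samples = 2.0 * np.cos(0.7 * t + 0.3)
    print(sinusoid_from_elemental_set(*samples))   # recovers (0.7, 2.0, 0.3)
```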

10.
The nonlinear convex programming problem of finding the minimum covering weighted ball of a given finite set of points in ℝ^n is solved by generating a finite sequence of subsets of the points and by finding the minimum covering weighted ball of each subset in the sequence until all points are covered. Each subset has at most n + 1 points and is affinely independent. The radii of the covering weighted balls are strictly increasing. The minimum covering weighted ball of each subset is found by using a directional search along either a ray or a circular arc, starting at the solution to the previous subset. The step size is computed explicitly at each iteration.
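For comparison only, the unweighted minimum enclosing ball admits a very short first-order approximation scheme of the Badoiu-Clarkson core-set type, sketched below. This is our illustration of a simpler relative of the problem, not the exact, finitely terminating directional-search method described in the abstract.

```python
import numpy as np

def approx_min_enclosing_ball(points, iterations=1000):
    """Approximate minimum enclosing ball of the rows of `points` via
    Badoiu-Clarkson iterations: repeatedly step toward the farthest point."""
    c = points[0].astype(float).copy()
    for i in range(1, iterations + 1):
        farthest = points[np.argmax(np.linalg.norm(points - c, axis=1))]
        c += (farthest - c) / (i + 1)            # shrinking step sizes
    radius = np.linalg.norm(points - c, axis=1).max()
    return c, radius

if __name__ == "__main__":
    pts = np.random.default_rng(0).normal(size=(200, 5))
    center, r = approx_min_enclosing_ball(pts)
    print(center, r)
```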

11.
For a finite set of points S, the (monochromatic) reverse nearest neighbor (RNN) rule associates with any query point q the subset of points in S that have q as their nearest neighbor. In the bichromatic reverse nearest neighbor (BRNN) rule, sets of red and blue points are given, and any blue query is associated with the subset of red points that have it as their nearest blue neighbor. In this paper we introduce and study new optimization problems in the plane based on the bichromatic reverse nearest neighbor (BRNN) rule. We provide efficient algorithms to compute a new blue point under criteria such as: (1) the number of associated red points is maximum (MAXCOV criterion); (2) the maximum distance to the associated red points is minimum (MINMAX criterion); (3) the minimum distance to the associated red points is maximum (MAXMIN criterion). These problems arise in the competitive location area, where competing facilities are established. Our solutions use techniques from computational geometry, such as the depth of an arrangement of disks and the upper envelope of surface patches in three dimensions.
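The efficient algorithms in the paper rely on arrangements of disks and upper envelopes, but the MAXCOV objective itself is easy to state in code. The following brute-force sketch (ours, for reference only) evaluates candidate locations for the new blue point by counting the red points that would switch their nearest blue neighbor to it.

```python
import numpy as np

def maxcov_brute_force(red, blue, candidates):
    """For each candidate new blue point, count red points whose nearest blue
    point (old blues plus the candidate) would be the candidate; return the best."""
    red = np.asarray(red, float)
    blue = np.asarray(blue, float)
    # distance from each red point to its current nearest blue point
    d_old = np.min(np.linalg.norm(red[:, None, :] - blue[None, :, :], axis=2), axis=1)
    best_count, best_q = -1, None
    for q in np.asarray(candidates, float):
        d_new = np.linalg.norm(red - q, axis=1)
        count = int(np.sum(d_new < d_old))      # red points captured by q
        if count > best_count:
            best_count, best_q = count, q
    return best_q, best_count

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    reds, blues = rng.uniform(size=(30, 2)), rng.uniform(size=(5, 2))
    grid = np.stack(np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50)), -1).reshape(-1, 2)
    print(maxcov_brute_force(reds, blues, grid))
```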

12.
13.
Mahalanobis-type distances in which the shape matrix is derived from a consistent, high-breakdown robust multivariate location and scale estimator have an asymptotic chi-squared distribution, as is the case with those derived from the ordinary covariance matrix. For example, Rousseeuw's minimum covariance determinant (MCD) is a robust estimator with a high breakdown point. However, even in quite large samples, the chi-squared approximation to the distances of the sample data from the MCD center with respect to the MCD shape is poor. We provide an improved F approximation that gives accurate outlier rejection points for various sample sizes.
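A hedged sketch of the distances in question, assuming scikit-learn and SciPy are available: it computes squared robust distances from an MCD fit and flags points beyond the usual chi-squared cutoff. The improved F-based cutoff derived in the article is not reproduced here; it would simply replace the threshold.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

def mcd_outlier_flags(X, alpha=0.025):
    """Flag observations whose squared robust distance from the MCD center,
    in the MCD shape metric, exceeds the chi-squared(p) quantile. The article's
    point is that this chi-squared cutoff can be inaccurate in finite samples."""
    X = np.asarray(X, float)
    p = X.shape[1]
    mcd = MinCovDet(random_state=0).fit(X)
    d2 = mcd.mahalanobis(X)                   # squared robust distances
    cutoff = chi2.ppf(1.0 - alpha, df=p)
    return d2, d2 > cutoff
```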

14.
This paper considers the discrete two-hub location problem. We need to choose two hubs from a set of nodes; the remaining nodes are connected to one of the two hubs, which act as switching points for internodal flows. A configuration which minimizes the total flow cost needs to be found. We show that the problem can be solved in polynomial time when the hub locations are fixed. Since the number of ways to choose the two hub locations is at most quadratic in the number of nodes, the two-hub location problem can be solved in polynomial time. We transform the quadratic 0–1 integer program of the single allocation problem in the fixed two-hub system into a linear program and show that all extreme points of the polytope defined by the LP are integral. Also, the problem can be transformed into a minimum cut problem, which can be solved efficiently by any polynomial-time minimum cut algorithm.
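To make the objective concrete, the sketch below evaluates the total flow cost of a fixed hub pair under a given single allocation and, for tiny instances, searches over allocations by brute force; the inter-hub discount factor alpha and all names are illustrative assumptions. The point of the paper is that this inner allocation problem can instead be solved in polynomial time via an LP or a minimum cut.

```python
from itertools import product
import numpy as np

def two_hub_cost(w, c, alloc, alpha=1.0):
    """Total cost of routing flows w[i, j] via the allocated hubs:
    c[i, alloc[i]] + alpha * c[alloc[i], alloc[j]] + c[alloc[j], j]."""
    n = w.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += w[i, j] * (c[i, alloc[i]] + alpha * c[alloc[i], alloc[j]] + c[alloc[j], j])
    return total

def best_allocation_brute_force(w, c, hubs, alpha=1.0):
    """Exponential brute force over single allocations, for tiny instances only."""
    n = w.shape[0]
    best = (np.inf, None)
    for choice in product(hubs, repeat=n):
        alloc = list(choice)
        for h in hubs:              # hub nodes are always allocated to themselves
            alloc[h] = h
        cost = two_hub_cost(w, c, alloc, alpha)
        if cost < best[0]:
            best = (cost, alloc)
    return best
```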

15.
The problem of estimating a set S from a random sample of points taken within S is considered. It is assumed that S is r-convex, which roughly means that a ball of radius r can roll freely along the outside of the set boundary. Under this assumption, the r-convex hull of the sample is a natural estimator of S. We obtain convergence rates for this estimator under both the distance in measure and the Hausdorff metric between sets. It is also proved that the boundary of the estimator consistently estimates the boundary of S in the Hausdorff sense.

16.
The canonical correlation (CANCOR) method for dimension reduction in a regression setting is based on the classical estimates of the first and second moments of the data and is therefore sensitive to outliers. In this paper, we study a weighted canonical correlation (WCANCOR) method, which recovers a subspace of the central dimension reduction subspace, and establish its asymptotic properties. In the proposed WCANCOR method, each observation is weighted based on its Mahalanobis distance to the location of the predictor distribution. Robust estimates of the location and scatter, such as the minimum covariance determinant (MCD) estimator of Rousseeuw [P.J. Rousseeuw, Multivariate estimation with high breakdown point, Mathematical Statistics and Applications B (1985) 283-297], can be used to compute the Mahalanobis distance. To determine the number of significant dimensions in WCANCOR, a weighted permutation test is considered. A comparison of SIR, CANCOR and WCANCOR is also made through simulation studies to show the robustness of WCANCOR to outlying observations. As an example, the Boston housing data are analyzed using the proposed WCANCOR method.

17.
The determination of minimum variance estimators in an unusual context is considered. The problem arises from an attempt to perform a regression with an unobservable dependent variable. The required minimum variance estimator is shown to satisfy a linear system of equations where the coefficient matrix has a simple structure. Uniqueness of the estimator is established by determining necessary and sufficient conditions on the data which guarantee positive definiteness of this coefficient matrix. Numerical aspects of the method of computation are also briefly explored.

18.
We consider the concept of a linear a priori accuracy estimate for approximate solutions to inverse problems with perturbed data. We establish that if such a linear estimate holds for a method of solving the inverse problem, then the inverse problem is well-posed in the sense of Tikhonov. We also find conditions that ensure the converse for a solution method that does not depend on the error levels of the data, namely V. K. Ivanov's well-known method of quasi-solutions: for inverse problems that are well-posed according to Tikhonov, this method guarantees the existence of linear estimates. When the error levels of the data are known, we propose a method for solving inverse problems that are well-posed according to Tikhonov, called the residual method on the correctness set (RMCS), which ensures linear estimates for approximate solutions. We give an algorithm for finding linear estimates in the RMCS.

19.
Given a point set S in the Euclidean plane, we present two classes of line-segment sets whose endpoints lie in S. The first class is a subset of every triangulation of S, and the second class is a subset of every minimum weight triangulation of S; the two classes are disjoint. Both classes can be computed in O(n^3) time and O(n) space.

20.
One of the problems in data analysis was earlier reduced to a specific NP-hard optimization problem: given a set of vectors in Euclidean space, find a subset of prescribed cardinality consisting of vectors that are "close" to each other under the criterion of minimum sum of squared distances. In this paper an efficient 2-approximation algorithm is proposed for solving this problem.
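One simple strategy with this flavor, sketched below, treats each input vector in turn as a tentative center, takes its M nearest vectors, and scores the subset by the sum of squared distances to the subset centroid (which differs from the sum of squared pairwise distances only by the factor M); whether this coincides with the authors' 2-approximation algorithm is an assumption on our part.

```python
import numpy as np

def select_close_subset(X, M):
    """For each vector taken as a tentative center, pick the M vectors nearest
    to it (including itself) and score the subset by the sum of squared
    distances to its centroid; return the best subset found."""
    X = np.asarray(X, float)
    n = X.shape[0]
    best_score, best_idx = np.inf, None
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        idx = np.argsort(d)[:M]                 # the M vectors closest to X[i]
        centroid = X[idx].mean(axis=0)
        score = np.sum((X[idx] - centroid) ** 2)
        if score < best_score:
            best_score, best_idx = score, idx
    return best_idx, best_score
```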
