A Deterministic Method for Robust Estimation of Multivariate Location and Shape |
| |
Authors: | Wendy L. Poston, Edward J. Wegman, Carey E. Priebe, Jeffrey L. Solka |
| |
Affiliation: | 1. Naval Surface Warfare Center, Dahlgren Division, Advanced Processors Group, Dahlgren, VA 22448, USA; 2. Center for Computational Statistics, George Mason University, Fairfax, VA 22030, USA; 3. Department of Mathematical Sciences, Johns Hopkins University, Baltimore, MD 21218, USA; 4. Naval Surface Warfare Center, Dahlgren Division, Advanced Computation Technology Group, Dahlgren, VA 22448, USA |
| |
Abstract: | The existence of outliers in a data set, and how to deal with them, is an important problem in statistics. The minimum volume ellipsoid (MVE) estimator is a robust estimator of location and covariance structure; however, its use has been limited because there are few computationally attractive methods for computing it. Determining the MVE consists of two parts: finding the subset of points to be used in the estimate, and finding the ellipsoid that covers this subset. This article addresses the first problem. The method also allows computation of the minimum covariance determinant (MCD) estimator. The proposed method of subset selection, called the effective independence distribution (EID) method, chooses the subset by minimizing determinants of matrices containing the data. The method is deterministic, yielding reproducible estimates of location and scatter for a given data set. The EID method of finding the MVE is applied to several regression data sets for which the true estimate is known. On these data sets, the EID method produces the subset of data more quickly than conventional procedures, with less than 6% relative error in the estimates. Timing results are also given to illustrate the feasibility of the method for larger data sets: for 10,000 points in 10 dimensions, the compute time is under 25 minutes. |
| |
Keywords: | Minimum covariance determinant; Minimum volume ellipsoid; Outliers; Robust estimators; Subset selection |
|
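The abstract describes subset selection driven by minimizing determinants of matrices containing the data, but it does not spell out the EID algorithm itself. The sketch below is therefore not the authors' procedure. It is a minimal illustration of one deterministic, determinant-driven selection rule, under the following assumptions: the data matrix is augmented with a column of ones, points are deleted one at a time, and each step deletes the point with the largest leverage. By the matrix determinant lemma, deleting row i multiplies det(Z^T Z) by (1 - h_ii), so this choice gives the largest single-step reduction of the determinant. The function name `greedy_min_det_subset` and the parameter `h` (the subset size to keep) are hypothetical.

```python
import numpy as np


def greedy_min_det_subset(X, h):
    """Return indices of h points kept by greedy determinant reduction.

    Illustrative assumption, not the published EID algorithm: at each step
    the point with the largest leverage is deleted, which is the single
    deletion that shrinks det(Z^T Z) the most.
    """
    X = np.asarray(X, dtype=float)
    idx = np.arange(X.shape[0])
    while idx.size > h:
        # Augment with a column of ones so the determinant reflects
        # scatter about the current subset mean.
        Z = np.column_stack([np.ones(idx.size), X[idx]])
        G = np.linalg.inv(Z.T @ Z)
        # Leverages h_ii = z_i^T (Z^T Z)^{-1} z_i, computed row by row.
        leverage = np.einsum("ij,jk,ik->i", Z, G, Z)
        # Deleting row i scales det(Z^T Z) by (1 - h_ii).
        idx = np.delete(idx, np.argmax(leverage))
    return idx


# Deterministic toy data: a 5 x 5 grid of inliers plus three gross outliers.
bulk = np.array([(i, j) for i in range(-2, 3) for j in range(-2, 3)], dtype=float)
outliers = np.array([(50.0, 50.0), (-60.0, 40.0), (45.0, -70.0)])
data = np.vstack([bulk, outliers])

subset = greedy_min_det_subset(data, h=25)
location = data[subset].mean(axis=0)           # location estimate from the subset
scatter = np.cov(data[subset], rowvar=False)   # scatter estimate from the subset
```

Because no random resampling is involved, repeated runs on the same data return the same subset, which mirrors the reproducibility property the abstract emphasizes. A greedy rule of this kind can be misled by tightly clustered outliers, so it should be read only as an illustration of the idea.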
|