Projection Pursuit Indexes Based on Orthonormal Function Expansions
Authors: Dianne Cook, Andreas Buja, Javier Cabrera
Affiliation: 1. Department of Statistics, Snedecor Hall, Iowa State University, Ames, IA 50011-1210, USA; 2. Bellcore, 445 South St., Morristown, NJ 07962-1910, USA; 3. Department of Statistics, Hill Center, Busch Campus, Rutgers University, New Brunswick, NJ 08904, USA
Abstract:

Projection pursuit is a procedure for searching high-dimensional data for “interesting” low-dimensional projections by optimizing a criterion function called the projection pursuit index. By empirically examining the optimization process for several projection pursuit indexes, we observed differences in the types of structure that maximized each index. We were especially curious about differences between two indexes based on expansions in orthogonal polynomials: the Legendre index and the Hermite index. Being fast to compute, these indexes are ideally suited for dynamic graphics implementations.

Both the Legendre and Hermite indexes are weighted L2 distances between the density of the projected data and a standard normal density. A general form for this type of index is introduced that encompasses both indexes. The form clarifies the effects of the weight function on the index's sensitivity to differences from normality, highlighting some conceptual problems with the Legendre and Hermite indexes. A new index, called the Natural Hermite index, which alleviates some of these problems, is introduced.
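
As a sketch of what such a weighted L2 distance can look like (the symbols f, \phi, and w are notation assumed here; the abstract itself does not give the formula), write f for the density of the projected data and \phi for the standard normal density:

```latex
I_w(f) = \int_{-\infty}^{\infty} \bigl( f(x) - \phi(x) \bigr)^2 \, w(x) \, dx
```

Under the definitions commonly given in the projection pursuit literature, w(x) = 1 corresponds to the Hermite index and w(x) = 1/(2\phi(x)) to the Legendre index. A weight of w(x) = \phi(x), which downweights the tails, is assumed here for the Natural Hermite index and should be checked against the full paper.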

A polynomial expansion of the data density reduces the form of the index to a sum of squares of the coefficients used in the expansion. This drew our attention to examining these coefficients as indexes in their own right. We found that the first two coefficients, and the lowest-order indexes produced by them, are the most useful ones for practical data exploration because they respond to structure that can be analytically identified, and because they have “long-sighted” vision that enables them to “see” large structure from a distance. Complementing this low-order behavior, the higher-order indexes are “short-sighted.” They are able to see intricate structure, but only when they are close to it.
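
The sum-of-squares form can be illustrated with a small sketch. It assumes the Legendre index in the form usually attributed to Friedman: the standardized projection x is mapped to [-1, 1] by y = 2Φ(x) - 1, and the index is a truncated sum of (2j + 1)/2 times the squared sample mean of the Legendre polynomial P_j(y). The function names, the truncation order, and the one-dimensional standardization step are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def legendre_values(y, order):
    """Return [P_0(y), ..., P_order(y)] using the three-term recurrence."""
    values = [1.0, y]
    for j in range(1, order):
        nxt = ((2 * j + 1) * y * values[j] - j * values[j - 1]) / (j + 1)
        values.append(nxt)
    return values[: order + 1]


def legendre_index(x, order=4):
    """Truncated Legendre index of a 1-D projection (assumed Friedman form).

    Returns (index, terms), where terms[j - 1] is the contribution of the
    j-th expansion coefficient, so each term can be inspected on its own.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if sd == 0.0:
        raise ValueError("projection has zero variance")

    # y = 2 * Phi(z) - 1 equals erf(z / sqrt(2)); a normal z gives uniform y.
    y = [math.erf((v - mean) / (sd * math.sqrt(2.0))) for v in x]

    # Sample means of P_j(y) estimate the expansion coefficients up to scale.
    means = [0.0] * (order + 1)
    for yi in y:
        for j, p in enumerate(legendre_values(yi, order)):
            means[j] += p / n

    terms = [(2 * j + 1) / 2.0 * means[j] ** 2 for j in range(1, order + 1)]
    return sum(terms), terms


# Synthetic projections: one normal, one with two well-separated clusters.
rng = random.Random(0)
normal_data = [rng.gauss(0.0, 1.0) for _ in range(2000)]
bimodal_data = [rng.gauss(rng.choice([-2.0, 2.0]), 0.5) for _ in range(2000)]

normal_index, normal_terms = legendre_index(normal_data)
bimodal_index, bimodal_terms = legendre_index(bimodal_data)
```

In this sketch the normal sample yields an index close to zero, while the clustered sample yields a clearly larger value. Inspecting `terms` shows how much each expansion order contributes, which is the sense in which individual coefficients can be treated as indexes in their own right.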

We also show some practical use of projection pursuit using the polynomial indexes, including a discovery of previously unseen structure in a set of telephone usage data, and two cautionary examples which illustrate that structure found is not always meaningful.
Keywords: Clustering; Density estimation; Exploratory multivariate data analysis; Nonnormality; Principal component analysis; Skewness