Risk bounds for model selection via penalization
Authors:Andrew Barron  Lucien Birgé  Pascal Massart
Affiliation:Department of Statistics, Yale University, P.O. Box 208290, New Haven, CT 06520-8290, USA. e-mail: barron@stat.yale.edu, US
URA 1321 “Statistique et modèles aléatoires”, Laboratoire de Probabilités, boîte 188, Université Paris VI, 4 Place Jussieu, F-75252 Paris Cedex 05, France. e-mail: lb@ccr.jussieu.fr, FR
URA 743 “Modélisation stochastique et Statistique”, Bat. 425, Université Paris Sud, Campus d'Orsay, F-91405 Orsay Cedex, France. e-mail: massart@stats.matups.fr, FR
Abstract: Performance bounds for criteria for model selection are developed using recent theory for sieves. The model selection criteria are based on an empirical loss or contrast function with an added penalty term motivated by empirical process theory and roughly proportional to the number of parameters needed to describe the model divided by the number of observations. Most of our examples involve density or regression estimation settings and we focus on the problem of estimating the unknown density or regression function. We show that the quadratic risk of the minimum penalized empirical contrast estimator is bounded by an index of the accuracy of the sieve. This accuracy index quantifies the trade-off among the candidate models between the approximation error and parameter dimension relative to sample size. If we choose a list of models which exhibit good approximation properties with respect to different classes of smoothness, the estimator can be simultaneously minimax rate optimal in each of those classes. This is what is usually called adaptation. The type of classes of smoothness in which one gets adaptation depends heavily on the list of models. If too many models are involved in order to get accurate approximation of many wide classes of functions simultaneously, it may happen that the estimator is only approximately adaptive (typically up to a slowly varying function of the sample size). We shall provide various illustrations of our method such as penalized maximum likelihood, projection or least squares estimation. The models will involve commonly used finite dimensional expansions such as piecewise polynomials with fixed or variable knots, trigonometric polynomials, wavelets, neural nets and related nonlinear expansions defined by superposition of ridge functions.

Received: 7 July 1995 / Revised version: 1 November 1997
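The selection rule described in the abstract can be illustrated with a minimal sketch: fit each candidate model by least squares, then pick the one minimizing the empirical contrast plus a penalty proportional to (model dimension) / (sample size). The polynomial model list, the synthetic data, and the penalty constant `lam` below are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (illustrative; not from the paper)
n = 200
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)

def empirical_contrast(d):
    """Least-squares contrast for the polynomial model of degree d.

    Returns the mean squared residual and the model dimension d + 1.
    """
    X = np.vander(x, d + 1)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(np.mean(resid ** 2)), d + 1

# Penalty constant: a hypothetical choice (here tied to the known noise
# variance 0.3**2 for the simulation); the paper's theory dictates how
# such constants should scale, not this specific value.
lam = 2 * 0.3 ** 2

# Penalized empirical contrast: contrast + lam * dim / n, minimized
# over the list of candidate models.
scores = []
for d in range(15):
    contrast, dim = empirical_contrast(d)
    scores.append(contrast + lam * dim / n)

best_degree = int(np.argmin(scores))
```

The selected `best_degree` balances fit and complexity: low-degree models pay in approximation error, high-degree models pay the dimension penalty.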
Keywords: Mathematics subject classifications (1991): Primary 62G05, 62G07; Secondary 41A25
This article is indexed in SpringerLink and other databases.
