Approximation Methods for Supervised Learning |
| |
Authors: | Ronald DeVore Gerard Kerkyacharian Dominique Picard Vladimir Temlyakov |
| |
Affiliation: | (1) Department of Mathematics, University of South Carolina, Columbia, SC 29208, USA;(2) Universite Paris X-Nanterre, 200 Avenue de la Republique, F 92001 Nanterre cedex, France;(3) Laboratoire de Probabilites et Modeles Aletoires CNRS-UMR 7599, Universite Paris VI et Universite Paris VII, 16 rue de Clisson, F-750013 Paris , France |
| |
Abstract: | Let ρ be an unknown Borel measure defined on the space Z := X × Y with X ⊂ ℝd and Y = [-M,M]. Given a set z of m samples zi =(xi,yi) drawn according to ρ, the problem of estimating a regression function fρ using these samples is considered. The main focus is to understand what is the rate of approximation, measured either in expectation or probability, that can be obtained under a given prior fρ ∈ Θ, i.e., under the assumption that fρ is in the set Θ, and what are possible algorithms for obtaining optimal or semioptimal (up to logarithms) results. The optimal rate of decay in terms of m is established for many priors given either in terms of smoothness of fρ or its rate of approximation measured in one of several ways. This optimal rate is determined by two types of results. Upper bounds are established using various tools in approximation such as entropy, widths, and linear and nonlinear approximation. Lower bounds are proved using Kullback-Leibler information together with Fano inequalities and a certain type of entropy. A distinction is drawn between algorithms which employ knowledge of the prior in the construction of the estimator and those that do not. Algorithms of the second type which are universally optimal for a certain range of priors are given. |
| |
Keywords: | Learning theory Regression estimation Entropy Nonlinear methods Upper and lower estimates |
本文献已被 SpringerLink 等数据库收录! |
|