Approximation in Learning Theory
Authors:VN Temlyakov
Institution:(1) Department of Mathematics, University of South Carolina, Columbia, SC 29208, USA
Abstract:This paper addresses some problems of supervised learning in the setting formulated by Cucker and Smale. Supervised learning, or learning-from-examples, refers to a process that builds, on the base of available data of inputs x_i and outputs y_i, i = 1,...,m, a function that best represents the relation between the inputs x ∈ X and the corresponding outputs y ∈ Y. The goal is to find an estimator f_z, on the base of the given data z := ((x_1,y_1),...,(x_m,y_m)), that approximates well the regression function f_ρ (or its projection) of an unknown Borel probability measure ρ defined on Z = X × Y. We assume that (x_i,y_i), i = 1,...,m, are independent and distributed according to ρ. We discuss the following two problems: I. the projection learning problem (improper function learning problem); II. universal (adaptive) estimators in the proper function learning problem. In the first problem we do not impose any restrictions on the Borel measure ρ except our standard assumption that |y| ≤ M a.e. with respect to ρ. In this case we use the data z to estimate (approximate) the L_2(ρ_X) projection (f_ρ)_W of f_ρ onto a function class W of our choice. Here, ρ_X is the marginal probability measure. In [KT1,2] this problem has been studied for W satisfying the decay condition ε_n(W,B) ≤ Dn^{-r} on the entropy numbers ε_n(W,B) of W in a Banach space B, in the case B = C(X) or B = L_2(ρ_X). In this paper we obtain the upper estimates in the case ε_n(W, L_1(ρ_X)) ≤ Dn^{-r}, with the extra assumption that W is convex. In the second problem we assume that the unknown measure ρ satisfies some conditions. Following the standard way from nonparametric statistics, we formulate these conditions in the form f_ρ ∈ Θ. Next, we assume that the only a priori information available is that f_ρ belongs to a class Θ (unknown) from a known collection {Θ} of classes. We want to build an estimator that provides an approximation of f_ρ close to the optimal one for the class Θ.
Along with standard penalized least squares estimators we consider a new method of constructing universal estimators. This method combines two powerful ideas in building universal estimators. The first is the use of penalized least squares estimators; this idea works well in a general setting with rather abstract methods of approximation. The second is thresholding, which works very well when wavelet expansions are used as an approximation tool. A new estimator, which we call the big jump estimator, uses least squares estimators and chooses the right model by a thresholding criterion instead of penalization. In this paper we illustrate how ideas and methods of approximation theory can be used in learning theory, both in formulating a problem and in solving it.
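To make the general scheme of penalized least squares model selection concrete, here is a minimal numerical sketch (not the paper's construction): least squares fits over a nested family of polynomial models, with the model chosen by minimizing empirical risk plus a complexity penalty. The data-generating function, the penalty constant lam, and the polynomial family are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: the target function, the penalty lam * (d+1)/m,
# and the choice of nested polynomial models are assumptions for this demo.
rng = np.random.default_rng(0)
m = 200
x = rng.uniform(-1.0, 1.0, m)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(m)  # bounded outputs, |y| <= M a.e.

def empirical_risk(deg):
    """Least squares fit of the given polynomial degree; mean squared error on the data z."""
    coeffs = np.polyfit(x, y, deg)
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

# Penalized least squares: minimize empirical risk + penalty growing with model size.
lam = 0.5
degrees = range(1, 11)
penalized = {d: empirical_risk(d) + lam * (d + 1) / m for d in degrees}
best = min(penalized, key=penalized.get)
print("selected degree:", best)
```

The penalty term balances the monotone decrease of empirical risk against model complexity; the paper's big jump estimator replaces this penalization step by a thresholding criterion for the model choice.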
This article is indexed in SpringerLink and other databases.