Approximation in Learning Theory
Authors:VN Temlyakov
Institution:(1) Department of Mathematics, University of South Carolina, Columbia, SC 29208, USA
Abstract:This paper addresses some problems of supervised learning in the setting formulated by Cucker and Smale. Supervised learning, or learning-from-examples, refers to a process that builds, on the base of available data of inputs x_i and outputs y_i, i = 1,...,m, a function that best represents the relation between the inputs x ∈ X and the corresponding outputs y ∈ Y. The goal is to find an estimator f_z, on the base of the given data z := ((x_1,y_1),...,(x_m,y_m)), that approximates well the regression function f_ρ (or its projection) of an unknown Borel probability measure ρ defined on Z = X × Y. We assume that (x_i,y_i), i = 1,...,m, are independent and distributed according to ρ. We discuss the following two problems: I. the projection learning problem (improper function learning problem); II. universal (adaptive) estimators in the proper function learning problem. In the first problem we do not impose any restrictions on the Borel measure ρ except our standard assumption that |y| ≤ M a.e. with respect to ρ. In this case we use the data z to estimate (approximate) the L_2(ρ_X) projection (f_ρ)_W of f_ρ onto a function class W of our choice. Here, ρ_X is the marginal probability measure. In [KT1,2] this problem has been studied for W satisfying the decay condition ε_n(W,B) ≤ Dn^{-r} on the entropy numbers ε_n(W,B) of W in a Banach space B, in the case B = C(X) or B = L_2(ρ_X). In this paper we obtain the upper estimates in the case ε_n(W, L_1(ρ_X)) ≤ Dn^{-r}, with the extra assumption that W is convex. In the second problem we assume that the unknown measure ρ satisfies some conditions. Following the standard way from nonparametric statistics, we formulate these conditions in the form f_ρ ∈ Θ. Next, we assume that the only a priori information available is that f_ρ belongs to a class Θ (unknown) from a known collection {Θ} of classes. We want to build an estimator that provides an approximation of f_ρ close to the optimal one for the class Θ.
Along with standard penalized least squares estimators we consider a new method of constructing universal estimators. This method combines two powerful ideas in building universal estimators. The first is the use of penalized least squares estimators; this idea works well in a general setting with rather abstract methods of approximation. The second is thresholding, which works very well when wavelet expansions are used as an approximation tool. A new estimator, which we call the big jump estimator, uses least squares estimators and chooses the right model by a thresholding criterion instead of penalization. In this paper we illustrate how ideas and methods of approximation theory can be used in learning theory, both in formulating a problem and in solving it.
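To make the general scheme of penalized least squares model selection concrete, here is a minimal numerical sketch (not the paper's construction): least squares fits over a nested family of polynomial models, with the model chosen by minimizing empirical risk plus a complexity penalty. The data-generating function, the penalty constant lam, and the polynomial family are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: the target function, the penalty lam * (d+1)/m,
# and the choice of nested polynomial models are assumptions for this demo.
rng = np.random.default_rng(0)
m = 200
x = rng.uniform(-1.0, 1.0, m)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(m)  # bounded outputs, |y| <= M a.e.

def empirical_risk(deg):
    """Least squares fit of the given polynomial degree; mean squared error on the data z."""
    coeffs = np.polyfit(x, y, deg)
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

# Penalized least squares: minimize empirical risk + penalty growing with model size.
lam = 0.5
degrees = range(1, 11)
penalized = {d: empirical_risk(d) + lam * (d + 1) / m for d in degrees}
best = min(penalized, key=penalized.get)
print("selected degree:", best)
```

The penalty term balances the monotone decrease of empirical risk against model complexity; the paper's big jump estimator replaces this penalization step by a thresholding criterion for the model choice.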
This article is indexed in SpringerLink and other databases.