Similar Literature

20 similar records found.
1.
A standard assumption in the theoretical study of learning algorithms for regression is uniform boundedness of the output sample values. This excludes the common case of Gaussian noise. In this paper we investigate the learning algorithm for regression generated by the least squares regularization scheme in reproducing kernel Hilbert spaces, without the assumption of uniformly bounded sampling. By imposing conditions on the growth of the moments of the output variable, we derive learning rates in terms of the regularity of the regression function and the capacity of the hypothesis space. The novelty of our analysis is a new covering number argument for bounding the sample error.
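To make the setting concrete, here is a minimal sketch of the least squares regularization scheme in an RKHS (kernel ridge regression) with Gaussian-noise outputs; the Gaussian kernel, the regularization parameter lam, and the synthetic data are illustrative choices, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=0.5):
    # k(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma**2))

rng = np.random.default_rng(0)
m = 60
X = rng.uniform(0, 1, (m, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.3, m)  # Gaussian noise: unbounded outputs

lam = 1e-2                        # regularization parameter
K = gaussian_kernel(X, X)
# f_z = argmin (1/m) sum_i (f(x_i) - y_i)^2 + lam ||f||_K^2
# => expansion coefficients solve (K + lam * m * I) c = y
c = np.linalg.solve(K + lam * m * np.eye(m), y)

X_test = np.linspace(0, 1, 5)[:, None]
print(gaussian_kernel(X_test, X) @ c)
```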

2.
Uniform boundedness of the output variables is a standard assumption in most theoretical analyses of regression algorithms. This standard assumption has recently been weakened to a moment hypothesis in the least squares regression (LSR) setting. Although there is a large literature on error analysis for LSR under the moment hypothesis, very little is known about the statistical properties of support vector machine regression with unbounded sampling. In this paper, we fill this gap in the literature. Without any boundedness restriction on the output sampling, we establish an ad hoc convergence analysis for support vector machine regression under very mild conditions.
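A minimal illustration of support vector machine regression on unbounded (Gaussian-noise) outputs, using scikit-learn's epsilon-insensitive SVR; the kernel and hyperparameter values are assumptions made for the sketch, not the paper's settings.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
m = 200
X = rng.uniform(-1, 1, (m, 1))
y = np.cos(3 * X[:, 0]) + rng.normal(0, 0.4, m)  # Gaussian noise, so y is unbounded

# epsilon-insensitive SVR with a Gaussian (RBF) kernel
model = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=2.0)
model.fit(X, y)
print(model.predict(np.array([[0.0], [0.5]])))
```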

3.
蔡佳  王承 《中国科学:数学》2013,43(6):613-624
This paper discusses the coefficient regularization problem with least squares loss, in data-dependent hypothesis spaces and under unbounded sampling. The learning scheme here differs essentially from the earlier reproducing kernel Hilbert space setting: the kernel is only required to be continuous and bounded, not symmetric or positive definite; the regularizer is the ℓ2-norm of the expansion coefficients of the function with respect to the samples; and the sample outputs are unbounded. These differences add extra difficulty to the error analysis. The purpose of this paper is to give concentration estimates of the error via ℓ2-empirical covering numbers when the sample outputs are not uniformly bounded. By introducing an appropriate Hilbert space together with the ℓ2-empirical covering number technique, satisfactory learning rates are obtained, depending on the capacity of the hypothesis space and the regularity of the regression function.
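A small sketch of the coefficient-based scheme described above: the kernel below is continuous and bounded but deliberately neither symmetric nor positive definite, and the regularizer is the ℓ2-norm of the expansion coefficients; the specific kernel and parameter values are illustrative assumptions.

```python
import numpy as np

def asym_kernel(X, Z, sigma=0.4):
    # a continuous, bounded kernel that is deliberately NOT symmetric
    return np.exp(-np.abs(X[:, None, 0] - Z[None, :, 0]) / sigma) * (1 + 0.3 * Z[None, :, 0])

rng = np.random.default_rng(2)
m = 80
X = rng.uniform(0, 1, (m, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.2, m)           # unbounded outputs

lam = 1e-2
K = asym_kernel(X, X)                               # K_ij = K(x_i, x_j), no PSD needed
# f_z(x) = sum_j c_j K(x, x_j), regularizer lam * ||c||_2^2:
#   c = argmin (1/m) ||K c - y||^2 + lam ||c||^2  =>  (K^T K + lam m I) c = K^T y
c = np.linalg.solve(K.T @ K + lam * m * np.eye(m), K.T @ y)
print(asym_kernel(np.array([[0.5]]), X) @ c)
```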

4.
An online gradient method with momentum for two-layer feedforward neural networks is considered. The momentum coefficient is chosen adaptively to accelerate and stabilize the learning of the network weights. Corresponding convergence results are proved: weak convergence holds under a uniform boundedness assumption on the activation function and its derivatives; moreover, if the stationary point set of the error function is finite, strong convergence holds as well.
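A toy sketch of an online gradient method with momentum for a two-layer network; the adaptive rule below (damping the momentum coefficient when the momentum term opposes the current descent direction) is one plausible heuristic, not necessarily the coefficient choice analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n_in, n_hid = 2, 8
W1 = rng.normal(0, 0.5, (n_hid, n_in))          # hidden-layer weights
w2 = rng.normal(0, 0.5, n_hid)                  # output weights
v1, v2 = np.zeros_like(W1), np.zeros_like(w2)   # momentum terms

eta = 0.05
for step in range(2000):
    x = rng.uniform(-1, 1, n_in)                # one sample at a time (online)
    t = np.sin(x[0]) * x[1]                     # target
    h = np.tanh(W1 @ x)
    e = w2 @ h - t
    g2 = e * h                                  # gradient wrt w2
    g1 = np.outer(e * w2 * (1 - h**2), x)       # gradient wrt W1
    # adaptive momentum: keep mu large while the momentum term still points
    # in a descent direction, damp it otherwise (illustrative rule only)
    mu = 0.9 if (np.sum(v1 * -g1) + np.sum(v2 * -g2)) >= 0 else 0.1
    v1 = mu * v1 - eta * g1
    v2 = mu * v2 - eta * g2
    W1 += v1
    w2 += v2
print("final sample error:", 0.5 * e**2)
```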

5.
Estimation of regression functions from independent and identically distributed data is considered. The L2 error with integration with respect to the design measure is used as the error criterion. In the analysis of the rate of convergence of estimates, boundedness assumptions on X are usually made in addition to smoothness assumptions on the regression function and moment conditions on Y. In this article we consider partitioning and nearest neighbor estimates and show that, by replacing the boundedness assumption on X with a proper moment condition, the same rate of convergence can be shown as for bounded data.
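A minimal nearest neighbor regression estimate with an unbounded (Gaussian) design, the situation the article's moment condition on X is meant to cover; the data and the choice of k are illustrative.

```python
import numpy as np

def knn_regress(X_train, y_train, x, k=5):
    # nearest neighbor regression estimate: average y over the k nearest x's
    idx = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
    return y_train[idx].mean()

rng = np.random.default_rng(4)
m = 500
X = rng.standard_normal((m, 1))             # unbounded design: X is Gaussian
y = X[:, 0] ** 2 + rng.normal(0, 0.5, m)    # moment conditions on Y hold

print(knn_regress(X, y, np.array([1.0])))   # true regression value is 1.0
```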

6.
The expected residual minimization (ERM) formulation for the stochastic nonlinear complementarity problem (SNCP) is studied in this paper. We show that, under a mild assumption, the involved function is a stochastic R0 function if and only if the objective function in the ERM formulation is coercive. Moreover, we model the traffic equilibrium problem (TEP) under uncertainty as an SNCP and show that the objective function in the ERM formulation is a stochastic R0 function. Numerical experiments show that the ERM-SNCP model for the TEP under uncertainty has various desirable properties. This work was partially supported by a Grant-in-Aid from the Japan Society for the Promotion of Science. The authors thank Professor Guihua Lin for pointing out an error in Proposition 2.1 in an earlier version of this paper. The authors are also grateful to the referees for their insightful comments.
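A sketch of an ERM formulation for a toy stochastic NCP, using the standard Fischer-Burmeister residual and a sample average approximation of the expected residual; the mapping F and the distribution of the random parameter are invented for illustration and are far simpler than the paper's traffic equilibrium model.

```python
import numpy as np
from scipy.optimize import minimize

def fb(a, b):
    # Fischer-Burmeister NCP function: phi(a,b)=0  <=>  a>=0, b>=0, a*b=0
    return np.sqrt(a**2 + b**2) - a - b

rng = np.random.default_rng(5)
omegas = rng.normal(1.0, 0.2, 50)            # samples of the random parameter

def F(x, w):
    # a toy stochastic mapping (stand-in for the paper's TEP mapping)
    return np.array([2 * x[0] + w * x[1] - 1, x[0] + 3 * x[1] - w])

def erm_objective(x):
    # expected residual, approximated by a sample average
    return np.mean([np.sum(fb(x, F(x, w)) ** 2) for w in omegas])

res = minimize(erm_objective, x0=np.array([0.5, 0.5]), method="Nelder-Mead")
print(res.x, erm_objective(res.x))
```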

7.
The multi-class classification problem is considered by an empirical risk minimization (ERM) approach. The hypothesis space for the learning algorithm is taken to be a ball of a Banach space of continuous functions. When the regression function lies in some interpolation space, satisfactory learning rates for the excess misclassification error are provided in terms of covering numbers of the unit ball of the Banach space. A comparison theorem is proved and is used to bound the excess misclassification error by means of the excess generalization error.

8.
The quantile regression problem is considered by learning schemes based on ℓ1-regularization and Gaussian kernels. The purpose of this paper is to present concentration estimates for the algorithms. Our analysis shows that the convergence behavior of ℓ1-quantile regression with Gaussian kernels is almost the same as that of the RKHS-based learning schemes. Furthermore, the previous analysis for kernel-based quantile regression usually requires that the output sample values are uniformly bounded, which excludes the common case with Gaussian noise. Our error analysis presented in this paper can give satisfactory convergence rates even for unbounded sampling processes. Besides, numerical experiments are given which support the theoretical results.
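A rough sketch of ℓ1-regularized quantile regression with a Gaussian kernel, minimizing the pinball (quantile) loss plus an ℓ1 penalty by subgradient descent; the step size, kernel width, and data are illustrative assumptions, and the paper's analysis is not tied to this particular solver.

```python
import numpy as np

def gauss(X, Z, sigma=0.3):
    return np.exp(-(X[:, None, 0] - Z[None, :, 0]) ** 2 / (2 * sigma**2))

rng = np.random.default_rng(6)
m, tau, lam = 150, 0.75, 1e-3
X = rng.uniform(0, 1, (m, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.3, m)   # unbounded Gaussian noise

K = gauss(X, X)
c = np.zeros(m)
lr = 0.05
for it in range(3000):
    r = y - K @ c                                    # residuals
    # subgradient of the averaged pinball loss: d/dr = tau if r>0 else tau-1
    g_loss = -K.T @ np.where(r > 0, tau, tau - 1) / m
    g = g_loss + lam * np.sign(c)                    # plus the l1 subgradient
    c -= lr * g
print("estimated 0.75-quantile at x=0.25:", gauss(np.array([[0.25]]), X) @ c)
```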

9.
One of the main goals of machine learning is to study the generalization performance of learning algorithms. The previous main results describing the generalization ability of learning algorithms are usually based on independent and identically distributed (i.i.d.) samples. However, independence is a very restrictive assumption for both theory and real-world applications. In this paper we go far beyond this classical framework by establishing bounds on the rate of relative uniform convergence for the Empirical Risk Minimization (ERM) algorithm with uniformly ergodic Markov chain samples. We not only obtain generalization bounds for the ERM algorithm, but also show that the ERM algorithm with uniformly ergodic Markov chain samples is consistent. The established theory underlies the application of ERM-type learning algorithms.

10.
Evaluating the generalization performance of learning algorithms has been the main thread of theoretical research in machine learning. The previous bounds describing the generalization performance of the empirical risk minimization (ERM) algorithm are usually established for independent and identically distributed (i.i.d.) samples. In this paper we go far beyond this classical framework by establishing generalization bounds for the ERM algorithm with uniformly ergodic Markov chain (u.e.M.c.) samples. We prove bounds on the rate of uniform convergence and relative uniform convergence of the ERM algorithm with u.e.M.c. samples, and show that the ERM algorithm with u.e.M.c. samples is consistent. The established theory underlies the application of ERM-type learning algorithms.

11.
In this paper, we obtain some results on the boundedness and asymptotic behavior of the sequence generated by the proximal point algorithm without a summability assumption on the error sequence. We also study the rate of convergence to the minimum value of a proper, convex, and lower semicontinuous function. Finally, we consider the proximal point algorithm for solving equilibrium problems.
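A minimal proximal point iteration for a proper, convex, lower semicontinuous function, with the proximal subproblem solved numerically; the objective f and the step parameter lam are illustrative.

```python
from scipy.optimize import minimize_scalar

def f(x):
    # a proper, convex, lower semicontinuous objective (minimizer: x = 1)
    return abs(x - 2.0) + 0.5 * x**2

def prox(x, lam=1.0):
    # prox_{lam f}(x) = argmin_z f(z) + (1/(2 lam)) (z - x)^2
    return minimize_scalar(lambda z: f(z) + (z - x) ** 2 / (2 * lam)).x

x = 10.0
for k in range(50):
    x = prox(x)          # proximal point iteration: x_{k+1} = prox_{lam f}(x_k)
print("x* =", x, " f(x*) =", f(x))
```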

12.
Solutions of learning problems by Empirical Risk Minimization (ERM), and almost-ERM when the minimizer does not exist, need to be consistent, so that they may be predictive. They also need to be well-posed in the sense of being stable, so that they can be used robustly. We propose a statistical form of stability, defined as leave-one-out (LOO) stability. We prove that for bounded loss classes LOO stability is (a) sufficient for generalization, that is, convergence in probability of the empirical error to the expected error, for any algorithm satisfying it, and (b) necessary and sufficient for consistency of ERM. Thus LOO stability is a weak form of stability that represents a sufficient condition for generalization for symmetric learning algorithms while subsuming the classical conditions for consistency of ERM. In particular, we conclude that a certain form of well-posedness and consistency are equivalent for ERM. Dedicated to Charles A. Micchelli on his 60th birthday. Mathematics subject classifications (2000): 68T05, 68T10, 68Q32, 62M20. Tomaso Poggio: corresponding author.
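An empirical illustration of the leave-one-out idea: retrain with one point removed and measure how much the loss at that point changes; the Ridge regression learner and the data are stand-ins chosen for the sketch, not part of the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(7)
m = 40
X = rng.uniform(-1, 1, (m, 1))
y = 1.5 * X[:, 0] + rng.normal(0, 0.2, m)

def loss(model, x, t):
    return (model.predict(x.reshape(1, -1))[0] - t) ** 2

full = Ridge(alpha=1.0).fit(X, y)
gaps = []
for i in range(m):
    keep = np.arange(m) != i
    loo = Ridge(alpha=1.0).fit(X[keep], y[keep])    # retrain without point i
    gaps.append(abs(loss(loo, X[i], y[i]) - loss(full, X[i], y[i])))
print("max LOO perturbation:", max(gaps))           # small => empirically LOO stable
```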

13.
In this paper we establish error estimates for multi-penalty regularization under a general smoothness assumption in the context of learning theory. One of the motivations for this work is to study the convergence analysis of two-parameter regularization theoretically in the manifold learning setting. In this spirit, we obtain error bounds for the manifold learning problem using the more general framework of multi-penalty regularization. We propose a new parameter choice rule, the "balanced-discrepancy principle", and analyze the convergence of the scheme with the help of the estimated error bounds. We show that multi-penalty regularization with the proposed parameter choice exhibits convergence rates similar to single-penalty regularization. Finally, on a series of test samples, we demonstrate the superiority of multi-parameter regularization over single-penalty regularization.
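A sketch of a two-parameter (multi-penalty) kernel scheme in the manifold learning spirit, combining an RKHS-norm penalty with a graph Laplacian penalty; the kernel, the graph construction, and the fixed parameter values are illustrative assumptions (the paper's balanced-discrepancy principle for choosing the parameters is not implemented here).

```python
import numpy as np

def gauss(X, Z, sigma=0.4):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

rng = np.random.default_rng(8)
m = 60
X = rng.uniform(0, 1, (m, 2))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.1, m)

K = gauss(X, X)
W = gauss(X, X, sigma=0.2)                 # similarity graph on the inputs
L = np.diag(W.sum(1)) - W                  # graph Laplacian (manifold penalty)

lam1, lam2 = 1e-3, 1e-4                    # the two regularization parameters
# minimize (1/m)||K c - y||^2 + lam1 c^T K c + lam2 (K c)^T L (K c);
# the first-order condition reduces to the linear system below
c = np.linalg.solve(K + lam1 * m * np.eye(m) + lam2 * m * L @ K, y)
print((K @ c)[:5])
```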

14.
We continue our study [S. Smale, D.X. Zhou, Shannon sampling and function reconstruction from point values, Bull. Amer. Math. Soc. 41 (2004) 279–305] of Shannon sampling and function reconstruction. In this paper, the error analysis is improved. We then show how our approach can be applied to learning theory: a functional analysis framework is presented, and dimension-independent probability estimates are given not only for the error in the L2 spaces, but also for the error in the reproducing kernel Hilbert space where the learning algorithm is performed. Covering number arguments are replaced by estimates of integral operators.

15.
The optimization of the output matrix for a discrete-time, single-output, linear stochastic system is approached from two different points of view. First, we investigate the problem of minimizing the steady-state filter error variance with respect to a time-invariant output matrix subject to a norm constraint. Second, we propose a filter algorithm in which the output matrix at time k is chosen so as to maximize the difference at time k+1 between the variance of the prediction error and that of the a posteriori error. For this filter, boundedness of the covariance and asymptotic stability are investigated. Several numerical experiments are reported: they give information about the limiting behavior of the sequence of output matrices generated by the algorithm and the corresponding error covariance. They also enable us to make a comparison with the results obtained by solving the former problem. This work was supported by the Italian Ministry of Education (MPI 40%), Rome, Italy.

16.
Finite-time iterative learning control
For the case of arbitrary initial states, an iterative learning control method is discussed, with the help of the concept of an initial rectifying attractor, that achieves practical complete tracking for uncertain time-varying systems. The closed-loop system contains a finite-time control action, achieving zero-error tracking on a prespecified interval, and the initial segment of the system output trajectory can also be planned in advance. Partially saturated learning and fully saturated learning are discussed separately, and the uniform boundedness of all variables in the closed-loop system as well as the uniform convergence of the error sequence are proved. The boundedness proofs benefit from the proposed saturated learning algorithms; in particular, the fully saturated learning algorithm guarantees the range of the parameter estimates.
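A toy saturated (amplitude-limited) P-type iterative learning control loop with an arbitrary initial state at each trial; the plant, the learning gain, and the saturation bound are invented for illustration, and the paper's initial rectifying attractor construction is not reproduced.

```python
import numpy as np

def plant(u, x0):
    # a toy stable first-order discrete-time plant: x_t = 0.8 x_{t-1} + u_t, y_t = x_t
    x, y = x0, np.zeros_like(u)
    for t in range(len(u)):
        x = 0.8 * x + u[t]
        y[t] = x
    return y

T, gamma, u_max = 50, 0.5, 5.0
y_d = np.sin(np.linspace(0, 2 * np.pi, T))     # desired trajectory
u = np.zeros(T)

rng = np.random.default_rng(9)
for k in range(30):                            # iteration (trial) axis
    x0 = rng.normal(0, 0.1)                    # arbitrary initial state each trial
    e = y_d - plant(u, x0)
    # saturated P-type learning update: u_{k+1} = sat(u_k + gamma * e_k)
    u = np.clip(u + gamma * e, -u_max, u_max)
print("final max |error|:", np.max(np.abs(e)))
```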

17.
The adaptive model reference tracking problem for a class of nonlinear systems with nonlinearly appearing uncertain parameters is studied. The nonlinear terms are assumed to be convex or concave in the uncertain parameters. The condition required in previous related work, that the reference model matrix have real eigenvalues less than zero, is removed. Both state feedback and output feedback control are considered. In the output feedback case, the nonlinear terms are assumed to satisfy a Lipschitz condition with an unknown Lipschitz constant. Based on a min-max method, a design method for adaptive controllers is proposed. The controller is continuous, guarantees that all variables of the closed-loop system are bounded, and asymptotically and exactly tracks the reference model. An example illustrates the usefulness of the results.

18.
The problem of learning from data involving function values and gradients is considered in a framework of least squares regularized regression in reproducing kernel Hilbert spaces. The algorithm is implemented by a linear system whose coefficient matrix involves block matrices for generating both Graph Laplacians and Hessians. The additional data for function gradients improve the learning performance of the algorithm. Error analysis is done by means of sampling operators for the sample error and integral operators in Sobolev spaces for the approximation error.
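A 1-D sketch of regression from both function values and gradient data: the design matrix stacks the kernel block with its derivative block, and the coefficients are found by ridge least squares; this simplified system is an assumption-laden stand-in for the paper's block linear system with Graph Laplacians and Hessians.

```python
import numpy as np

sigma = 0.3
def k(x, z):
    # Gaussian kernel on 1-D points
    return np.exp(-(x[:, None] - z[None, :]) ** 2 / (2 * sigma**2))
def dk(x, z):
    # derivative of k in its first argument
    return -(x[:, None] - z[None, :]) / sigma**2 * k(x, z)

rng = np.random.default_rng(10)
m = 40
x = np.sort(rng.uniform(0, 1, m))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.05, m)             # function values
g = 2 * np.pi * np.cos(2 * np.pi * x) + rng.normal(0, 0.2, m)  # gradient data

lam = 1e-6
A = np.vstack([k(x, x), dk(x, x)])   # block design: values on top, gradients below
b = np.concatenate([y, g])
# ridge least squares: (A^T A + lam I) c = A^T b
c = np.linalg.solve(A.T @ A + lam * np.eye(m), A.T @ b)
print("f(0.25) =", (k(np.array([0.25]), x) @ c)[0])   # true value: sin(pi/2) = 1
```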

19.
The online gradient algorithm has been widely used for feedforward neural network training. In this paper, we prove a weak convergence theorem for an online gradient algorithm with a penalty term, assuming that the training examples are input in a stochastic way. The monotonicity of the error function during the iteration and the boundedness of the weights are both guaranteed. We also present a numerical experiment to support our results.
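A minimal online gradient loop with an L2 penalty term on the weights of a small feedforward network, with stochastic inputs; the architecture, penalty coefficient, and target function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)
n_in, n_hid = 3, 6
W1 = rng.normal(0, 0.3, (n_hid, n_in))
w2 = rng.normal(0, 0.3, n_hid)

eta, lam = 0.05, 1e-3          # learning rate and penalty coefficient
for step in range(5000):
    x = rng.uniform(-1, 1, n_in)          # stochastic input, one example per step
    t = x[0] * x[1] + 0.5 * x[2]
    h = np.tanh(W1 @ x)
    e = w2 @ h - t
    # penalized error: E = (1/2) e^2 + (lam/2)(||W1||^2 + ||w2||^2);
    # the penalty term keeps the weights bounded during training
    g2 = e * h + lam * w2
    g1 = np.outer(e * w2 * (1 - h**2), x) + lam * W1
    w2 -= eta * g2
    W1 -= eta * g1
print("weight norms:", np.linalg.norm(W1), np.linalg.norm(w2))
```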

20.
Previous results describing the generalization ability of the Empirical Risk Minimization (ERM) algorithm are usually based on the assumption of independent and identically distributed (i.i.d.) samples. In this paper we go far beyond this classical framework by establishing the first exponential bound on the rate of uniform convergence of the ERM algorithm with V-geometrically ergodic Markov chain samples. As an application of this bound, we also obtain generalization bounds for the ERM algorithm with V-geometrically ergodic Markov chain samples and prove that the algorithm is consistent. The main results obtained in this paper extend the previously known results for i.i.d. observations to the case of V-geometrically ergodic Markov chain samples.
