Similar Literature

20 similar records found.
1.
A standard assumption in the theoretical study of learning algorithms for regression is uniform boundedness of the output sample values. This excludes the common case of Gaussian noise. In this paper we investigate the learning algorithm for regression generated by the least squares regularization scheme in reproducing kernel Hilbert spaces, without the assumption of uniformly bounded sampling. By imposing conditions on the growth of the moments of the output variable, we derive learning rates in terms of the regularity of the regression function and the capacity of the hypothesis space. The novelty of our analysis is a new covering number argument for bounding the sample error.
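To make the setting concrete, here is a minimal sketch of the least squares regularization scheme in an RKHS (kernel ridge regression) with Gaussian-noise outputs; the Gaussian kernel, the regularization parameter lam, and the synthetic data are illustrative choices, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=0.5):
    # k(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma**2))

rng = np.random.default_rng(0)
m = 60
X = rng.uniform(0, 1, (m, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.3, m)  # Gaussian noise: unbounded outputs

lam = 1e-2                        # regularization parameter
K = gaussian_kernel(X, X)
# f_z = argmin (1/m) sum_i (f(x_i) - y_i)^2 + lam ||f||_K^2
# => expansion coefficients solve (K + lam * m * I) c = y
c = np.linalg.solve(K + lam * m * np.eye(m), y)

X_test = np.linspace(0, 1, 5)[:, None]
print(gaussian_kernel(X_test, X) @ c)
```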

2.
Uniform boundedness of the output variables is a standard assumption in most theoretical analyses of regression algorithms. This standard assumption has recently been weakened to a moment hypothesis in the least squares regression (LSR) setting. Although there is a large literature on error analysis for LSR under the moment hypothesis, very little is known about the statistical properties of support vector machine regression with unbounded sampling. In this paper, we fill this gap in the literature. Without any boundedness restriction on the output sampling, we establish an ad hoc convergence analysis for support vector machine regression under very mild conditions.
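A minimal illustration of support vector machine regression on unbounded (Gaussian-noise) outputs, using scikit-learn's epsilon-insensitive SVR; the kernel and hyperparameter values are assumptions made for the sketch, not the paper's settings.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
m = 200
X = rng.uniform(-1, 1, (m, 1))
y = np.cos(3 * X[:, 0]) + rng.normal(0, 0.4, m)  # Gaussian noise, so y is unbounded

# epsilon-insensitive SVR with a Gaussian (RBF) kernel
model = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=2.0)
model.fit(X, y)
print(model.predict(np.array([[0.0], [0.5]])))
```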

3.
蔡佳  王承 《中国科学:数学》2013,43(6):613-624
This paper discusses the coefficient regularization problem with least squares loss, in data-dependent hypothesis spaces and under unbounded sampling. The learning scheme here differs essentially from the earlier reproducing kernel Hilbert space setting: the kernel is only required to be continuous and bounded, not symmetric or positive definite; the regularizer is the ℓ2-norm of the expansion coefficients of the function with respect to the samples; and the sample outputs are unbounded. These differences add extra difficulty to the error analysis. The purpose of this paper is to give concentration estimates of the error via ℓ2-empirical covering numbers when the sample outputs are not uniformly bounded. By introducing an appropriate Hilbert space together with the ℓ2-empirical covering number technique, satisfactory learning rates are obtained, depending on the capacity of the hypothesis space and the regularity of the regression function.
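A small sketch of the coefficient-based scheme described above: the kernel below is continuous and bounded but deliberately neither symmetric nor positive definite, and the regularizer is the ℓ2-norm of the expansion coefficients; the specific kernel and parameter values are illustrative assumptions.

```python
import numpy as np

def asym_kernel(X, Z, sigma=0.4):
    # a continuous, bounded kernel that is deliberately NOT symmetric
    return np.exp(-np.abs(X[:, None, 0] - Z[None, :, 0]) / sigma) * (1 + 0.3 * Z[None, :, 0])

rng = np.random.default_rng(2)
m = 80
X = rng.uniform(0, 1, (m, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.2, m)           # unbounded outputs

lam = 1e-2
K = asym_kernel(X, X)                               # K_ij = K(x_i, x_j), no PSD needed
# f_z(x) = sum_j c_j K(x, x_j), regularizer lam * ||c||_2^2:
#   c = argmin (1/m) ||K c - y||^2 + lam ||c||^2  =>  (K^T K + lam m I) c = K^T y
c = np.linalg.solve(K.T @ K + lam * m * np.eye(m), K.T @ y)
print(asym_kernel(np.array([[0.5]]), X) @ c)
```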

4.
An online gradient method with momentum for two-layer feedforward neural networks is considered. The momentum coefficient is chosen adaptively to accelerate and stabilize the learning of the network weights. Corresponding convergence results are proved: weak convergence holds under a uniform boundedness assumption on the activation function and its derivatives; moreover, if the stationary point set of the error function is finite, strong convergence holds as well.
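A toy sketch of an online gradient method with momentum for a two-layer network; the adaptive rule below (damping the momentum coefficient when the momentum term opposes the current descent direction) is one plausible heuristic, not necessarily the coefficient choice analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n_in, n_hid = 2, 8
W1 = rng.normal(0, 0.5, (n_hid, n_in))          # hidden-layer weights
w2 = rng.normal(0, 0.5, n_hid)                  # output weights
v1, v2 = np.zeros_like(W1), np.zeros_like(w2)   # momentum terms

eta = 0.05
for step in range(2000):
    x = rng.uniform(-1, 1, n_in)                # one sample at a time (online)
    t = np.sin(x[0]) * x[1]                     # target
    h = np.tanh(W1 @ x)
    e = w2 @ h - t
    g2 = e * h                                  # gradient wrt w2
    g1 = np.outer(e * w2 * (1 - h**2), x)       # gradient wrt W1
    # adaptive momentum: keep mu large while the momentum term still points
    # in a descent direction, damp it otherwise (illustrative rule only)
    mu = 0.9 if (np.sum(v1 * -g1) + np.sum(v2 * -g2)) >= 0 else 0.1
    v1 = mu * v1 - eta * g1
    v2 = mu * v2 - eta * g2
    W1 += v1
    w2 += v2
print("final sample error:", 0.5 * e**2)
```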

5.
Estimation of regression functions from independent and identically distributed data is considered. The L2 error with integration with respect to the design measure is used as the error criterion. In the analysis of the rate of convergence of estimates, boundedness assumptions on X are usually made in addition to smoothness assumptions on the regression function and moment conditions on Y. In this article we consider partitioning and nearest neighbor estimates and show that, by replacing the boundedness assumption on X with a proper moment condition, the same rate of convergence can be shown as for bounded data.
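A minimal nearest neighbor regression estimate with an unbounded (Gaussian) design, the situation the article's moment condition on X is meant to cover; the data and the choice of k are illustrative.

```python
import numpy as np

def knn_regress(X_train, y_train, x, k=5):
    # nearest neighbor regression estimate: average y over the k nearest x's
    idx = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
    return y_train[idx].mean()

rng = np.random.default_rng(4)
m = 500
X = rng.standard_normal((m, 1))             # unbounded design: X is Gaussian
y = X[:, 0] ** 2 + rng.normal(0, 0.5, m)    # moment conditions on Y hold

print(knn_regress(X, y, np.array([1.0])))   # true regression value is 1.0
```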

6.
The expected residual minimization (ERM) formulation for the stochastic nonlinear complementarity problem (SNCP) is studied in this paper. We show that, under a mild assumption, the involved function is a stochastic R0 function if and only if the objective function in the ERM formulation is coercive. Moreover, we model the traffic equilibrium problem (TEP) under uncertainty as an SNCP and show that the objective function in the ERM formulation is a stochastic R0 function. Numerical experiments show that the ERM-SNCP model for the TEP under uncertainty has various desirable properties. This work was partially supported by a Grant-in-Aid from the Japan Society for the Promotion of Science. The authors thank Professor Guihua Lin for pointing out an error in Proposition 2.1 in an earlier version of this paper. The authors are also grateful to the referees for their insightful comments.
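A sketch of an ERM formulation for a toy stochastic NCP, using the standard Fischer-Burmeister residual and a sample average approximation of the expected residual; the mapping F and the distribution of the random parameter are invented for illustration and are far simpler than the paper's traffic equilibrium model.

```python
import numpy as np
from scipy.optimize import minimize

def fb(a, b):
    # Fischer-Burmeister NCP function: phi(a,b)=0  <=>  a>=0, b>=0, a*b=0
    return np.sqrt(a**2 + b**2) - a - b

rng = np.random.default_rng(5)
omegas = rng.normal(1.0, 0.2, 50)            # samples of the random parameter

def F(x, w):
    # a toy stochastic mapping (stand-in for the paper's TEP mapping)
    return np.array([2 * x[0] + w * x[1] - 1, x[0] + 3 * x[1] - w])

def erm_objective(x):
    # expected residual, approximated by a sample average
    return np.mean([np.sum(fb(x, F(x, w)) ** 2) for w in omegas])

res = minimize(erm_objective, x0=np.array([0.5, 0.5]), method="Nelder-Mead")
print(res.x, erm_objective(res.x))
```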

7.
The multi-class classification problem is considered by an empirical risk minimization (ERM) approach. The hypothesis space for the learning algorithm is taken to be a ball of a Banach space of continuous functions. When the regression function lies in some interpolation space, satisfactory learning rates for the excess misclassification error are provided in terms of covering numbers of the unit ball of the Banach space. A comparison theorem is proved and is used to bound the excess misclassification error by means of the excess generalization error.

8.
The quantile regression problem is considered by learning schemes based on ℓ1-regularization and Gaussian kernels. The purpose of this paper is to present concentration estimates for the algorithms. Our analysis shows that the convergence behavior of ℓ1-quantile regression with Gaussian kernels is almost the same as that of the RKHS-based learning schemes. Furthermore, the previous analysis for kernel-based quantile regression usually requires that the output sample values are uniformly bounded, which excludes the common case with Gaussian noise. Our error analysis presented in this paper can give satisfactory convergence rates even for unbounded sampling processes. Besides, numerical experiments are given which support the theoretical results.
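A rough sketch of ℓ1-regularized quantile regression with a Gaussian kernel, minimizing the pinball (quantile) loss plus an ℓ1 penalty by subgradient descent; the step size, kernel width, and data are illustrative assumptions, and the paper's analysis is not tied to this particular solver.

```python
import numpy as np

def gauss(X, Z, sigma=0.3):
    return np.exp(-(X[:, None, 0] - Z[None, :, 0]) ** 2 / (2 * sigma**2))

rng = np.random.default_rng(6)
m, tau, lam = 150, 0.75, 1e-3
X = rng.uniform(0, 1, (m, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.3, m)   # unbounded Gaussian noise

K = gauss(X, X)
c = np.zeros(m)
lr = 0.05
for it in range(3000):
    r = y - K @ c                                    # residuals
    # subgradient of the averaged pinball loss: d/dr = tau if r>0 else tau-1
    g_loss = -K.T @ np.where(r > 0, tau, tau - 1) / m
    g = g_loss + lam * np.sign(c)                    # plus the l1 subgradient
    c -= lr * g
print("estimated 0.75-quantile at x=0.25:", gauss(np.array([[0.25]]), X) @ c)
```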

9.
One of the main goals of machine learning is to study the generalization performance of learning algorithms. The previous main results describing the generalization ability of learning algorithms are usually based on independent and identically distributed (i.i.d.) samples. However, independence is a very restrictive assumption for both theory and real-world applications. In this paper we go far beyond this classical framework by establishing bounds on the rate of relative uniform convergence for the Empirical Risk Minimization (ERM) algorithm with uniformly ergodic Markov chain samples. We not only obtain generalization bounds for the ERM algorithm, but also show that the ERM algorithm with uniformly ergodic Markov chain samples is consistent. The established theory underlies the application of ERM-type learning algorithms.

10.
Evaluating the generalization performance of learning algorithms has been the main thread of theoretical research in machine learning. The previous bounds describing the generalization performance of the empirical risk minimization (ERM) algorithm are usually established for independent and identically distributed (i.i.d.) samples. In this paper we go far beyond this classical framework by establishing generalization bounds for the ERM algorithm with uniformly ergodic Markov chain (u.e.M.c.) samples. We prove bounds on the rate of uniform convergence and relative uniform convergence of the ERM algorithm with u.e.M.c. samples, and show that the ERM algorithm with u.e.M.c. samples is consistent. The established theory underlies the application of ERM-type learning algorithms.

11.
In this paper, we obtain some results on the boundedness and asymptotic behavior of the sequence generated by the proximal point algorithm without a summability assumption on the error sequence. We also study the rate of convergence to the minimum value of a proper, convex, and lower semicontinuous function. Finally, we consider the proximal point algorithm for solving equilibrium problems.
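A minimal proximal point iteration for a proper, convex, lower semicontinuous function, with the proximal subproblem solved numerically; the objective f and the step parameter lam are illustrative.

```python
from scipy.optimize import minimize_scalar

def f(x):
    # a proper, convex, lower semicontinuous objective (minimizer: x = 1)
    return abs(x - 2.0) + 0.5 * x**2

def prox(x, lam=1.0):
    # prox_{lam f}(x) = argmin_z f(z) + (1/(2 lam)) (z - x)^2
    return minimize_scalar(lambda z: f(z) + (z - x) ** 2 / (2 * lam)).x

x = 10.0
for k in range(50):
    x = prox(x)          # proximal point iteration: x_{k+1} = prox_{lam f}(x_k)
print("x* =", x, " f(x*) =", f(x))
```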

12.
Solutions of learning problems by Empirical Risk Minimization (ERM), and almost-ERM when the minimizer does not exist, need to be consistent, so that they may be predictive. They also need to be well-posed in the sense of being stable, so that they can be used robustly. We propose a statistical form of stability, defined as leave-one-out (LOO) stability. We prove that for bounded loss classes LOO stability is (a) sufficient for generalization, that is, convergence in probability of the empirical error to the expected error, for any algorithm satisfying it, and (b) necessary and sufficient for consistency of ERM. Thus LOO stability is a weak form of stability that represents a sufficient condition for generalization for symmetric learning algorithms while subsuming the classical conditions for consistency of ERM. In particular, we conclude that a certain form of well-posedness and consistency are equivalent for ERM. Dedicated to Charles A. Micchelli on his 60th birthday. Mathematics subject classifications (2000): 68T05, 68T10, 68Q32, 62M20. Tomaso Poggio: corresponding author.
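An empirical illustration of the leave-one-out idea: retrain with one point removed and measure how much the loss at that point changes; the Ridge regression learner and the data are stand-ins chosen for the sketch, not part of the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(7)
m = 40
X = rng.uniform(-1, 1, (m, 1))
y = 1.5 * X[:, 0] + rng.normal(0, 0.2, m)

def loss(model, x, t):
    return (model.predict(x.reshape(1, -1))[0] - t) ** 2

full = Ridge(alpha=1.0).fit(X, y)
gaps = []
for i in range(m):
    keep = np.arange(m) != i
    loo = Ridge(alpha=1.0).fit(X[keep], y[keep])    # retrain without point i
    gaps.append(abs(loss(loo, X[i], y[i]) - loss(full, X[i], y[i])))
print("max LOO perturbation:", max(gaps))           # small => empirically LOO stable
```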

13.
In this paper we establish error estimates for multi-penalty regularization under a general smoothness assumption in the context of learning theory. One of the motivations for this work is to study the convergence analysis of two-parameter regularization theoretically in the manifold learning setting. In this spirit, we obtain error bounds for the manifold learning problem using the more general framework of multi-penalty regularization. We propose a new parameter choice rule, the "balanced-discrepancy principle", and analyze the convergence of the scheme with the help of the estimated error bounds. We show that multi-penalty regularization with the proposed parameter choice exhibits convergence rates similar to single-penalty regularization. Finally, on a series of test samples, we demonstrate the superiority of multi-parameter regularization over single-penalty regularization.
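A sketch of a two-parameter (multi-penalty) kernel scheme in the manifold learning spirit, combining an RKHS-norm penalty with a graph Laplacian penalty; the kernel, the graph construction, and the fixed parameter values are illustrative assumptions (the paper's balanced-discrepancy principle for choosing the parameters is not implemented here).

```python
import numpy as np

def gauss(X, Z, sigma=0.4):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

rng = np.random.default_rng(8)
m = 60
X = rng.uniform(0, 1, (m, 2))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.1, m)

K = gauss(X, X)
W = gauss(X, X, sigma=0.2)                 # similarity graph on the inputs
L = np.diag(W.sum(1)) - W                  # graph Laplacian (manifold penalty)

lam1, lam2 = 1e-3, 1e-4                    # the two regularization parameters
# minimize (1/m)||K c - y||^2 + lam1 c^T K c + lam2 (K c)^T L (K c);
# the first-order condition reduces to the linear system below
c = np.linalg.solve(K + lam1 * m * np.eye(m) + lam2 * m * L @ K, y)
print((K @ c)[:5])
```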

14.
We continue our study [S. Smale, D.X. Zhou, Shannon sampling and function reconstruction from point values, Bull. Amer. Math. Soc. 41 (2004) 279–305] of Shannon sampling and function reconstruction. In this paper, the error analysis is improved. We then show how our approach can be applied to learning theory: a functional analysis framework is presented, and dimension-independent probability estimates are given not only for the error in the L2 spaces, but also for the error in the reproducing kernel Hilbert space where the learning algorithm is performed. Covering number arguments are replaced by estimates of integral operators.

15.
The optimization of the output matrix for a discrete-time, single-output, linear stochastic system is approached from two different points of view. First, we investigate the problem of minimizing the steady-state filter error variance with respect to a time-invariant output matrix subject to a norm constraint. Second, we propose a filter algorithm in which the output matrix at time k is chosen so as to maximize the difference at time k+1 between the variance of the prediction error and that of the a posteriori error. For this filter, boundedness of the covariance and asymptotic stability are investigated. Several numerical experiments are reported: they give information about the limiting behavior of the sequence of output matrices generated by the algorithm and the corresponding error covariance. They also enable us to make a comparison with the results obtained by solving the former problem. This work was supported by the Italian Ministry of Education (MPI 40%), Rome, Italy.

16.
Finite-time iterative learning control
For the case of arbitrary initial states, an iterative learning control method is discussed, with the help of the concept of an initial rectifying attractor, that achieves practical complete tracking for uncertain time-varying systems. The closed-loop system contains a finite-time control action, achieving zero-error tracking on a prespecified interval, and the initial segment of the system output trajectory can also be planned in advance. Partially saturated learning and fully saturated learning are discussed separately, and the uniform boundedness of all variables in the closed-loop system as well as the uniform convergence of the error sequence are proved. The boundedness proofs benefit from the proposed saturated learning algorithms; in particular, the fully saturated learning algorithm guarantees the range of the parameter estimates.
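A toy saturated (amplitude-limited) P-type iterative learning control loop with an arbitrary initial state at each trial; the plant, the learning gain, and the saturation bound are invented for illustration, and the paper's initial rectifying attractor construction is not reproduced.

```python
import numpy as np

def plant(u, x0):
    # a toy stable first-order discrete-time plant: x_t = 0.8 x_{t-1} + u_t, y_t = x_t
    x, y = x0, np.zeros_like(u)
    for t in range(len(u)):
        x = 0.8 * x + u[t]
        y[t] = x
    return y

T, gamma, u_max = 50, 0.5, 5.0
y_d = np.sin(np.linspace(0, 2 * np.pi, T))     # desired trajectory
u = np.zeros(T)

rng = np.random.default_rng(9)
for k in range(30):                            # iteration (trial) axis
    x0 = rng.normal(0, 0.1)                    # arbitrary initial state each trial
    e = y_d - plant(u, x0)
    # saturated P-type learning update: u_{k+1} = sat(u_k + gamma * e_k)
    u = np.clip(u + gamma * e, -u_max, u_max)
print("final max |error|:", np.max(np.abs(e)))
```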

17.
The adaptive model reference tracking problem for a class of nonlinear systems with nonlinearly appearing uncertain parameters is studied. The nonlinear terms are assumed to be convex or concave in the uncertain parameters. The condition required in previous related work, that the reference model matrix have real eigenvalues less than zero, is removed. Both state feedback and output feedback control are considered. In the output feedback case, the nonlinear terms are assumed to satisfy a Lipschitz condition with an unknown Lipschitz constant. Based on a min-max method, a design method for adaptive controllers is proposed. The controller is continuous, guarantees that all variables of the closed-loop system are bounded, and asymptotically and exactly tracks the reference model. An example illustrates the usefulness of the results.

18.
The problem of learning from data involving function values and gradients is considered in a framework of least squares regularized regression in reproducing kernel Hilbert spaces. The algorithm is implemented by a linear system whose coefficient matrix involves block matrices for generating both Graph Laplacians and Hessians. The additional data for function gradients improve the learning performance of the algorithm. Error analysis is done by means of sampling operators for the sample error and integral operators in Sobolev spaces for the approximation error.
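A 1-D sketch of regression from both function values and gradient data: the design matrix stacks the kernel block with its derivative block, and the coefficients are found by ridge least squares; this simplified system is an assumption-laden stand-in for the paper's block linear system with Graph Laplacians and Hessians.

```python
import numpy as np

sigma = 0.3
def k(x, z):
    # Gaussian kernel on 1-D points
    return np.exp(-(x[:, None] - z[None, :]) ** 2 / (2 * sigma**2))
def dk(x, z):
    # derivative of k in its first argument
    return -(x[:, None] - z[None, :]) / sigma**2 * k(x, z)

rng = np.random.default_rng(10)
m = 40
x = np.sort(rng.uniform(0, 1, m))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.05, m)             # function values
g = 2 * np.pi * np.cos(2 * np.pi * x) + rng.normal(0, 0.2, m)  # gradient data

lam = 1e-6
A = np.vstack([k(x, x), dk(x, x)])   # block design: values on top, gradients below
b = np.concatenate([y, g])
# ridge least squares: (A^T A + lam I) c = A^T b
c = np.linalg.solve(A.T @ A + lam * np.eye(m), A.T @ b)
print("f(0.25) =", (k(np.array([0.25]), x) @ c)[0])   # true value: sin(pi/2) = 1
```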

19.
The online gradient algorithm has been widely used for feedforward neural network training. In this paper, we prove a weak convergence theorem for an online gradient algorithm with a penalty term, assuming that the training examples are input in a stochastic way. The monotonicity of the error function during the iteration and the boundedness of the weights are both guaranteed. We also present a numerical experiment to support our results.
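A minimal online gradient loop with an L2 penalty term on the weights of a small feedforward network, with stochastic inputs; the architecture, penalty coefficient, and target function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)
n_in, n_hid = 3, 6
W1 = rng.normal(0, 0.3, (n_hid, n_in))
w2 = rng.normal(0, 0.3, n_hid)

eta, lam = 0.05, 1e-3          # learning rate and penalty coefficient
for step in range(5000):
    x = rng.uniform(-1, 1, n_in)          # stochastic input, one example per step
    t = x[0] * x[1] + 0.5 * x[2]
    h = np.tanh(W1 @ x)
    e = w2 @ h - t
    # penalized error: E = (1/2) e^2 + (lam/2)(||W1||^2 + ||w2||^2);
    # the penalty term keeps the weights bounded during training
    g2 = e * h + lam * w2
    g1 = np.outer(e * w2 * (1 - h**2), x) + lam * W1
    w2 -= eta * g2
    W1 -= eta * g1
print("weight norms:", np.linalg.norm(W1), np.linalg.norm(w2))
```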

20.
Previous results describing the generalization ability of the Empirical Risk Minimization (ERM) algorithm are usually based on the assumption of independent and identically distributed (i.i.d.) samples. In this paper we go far beyond this classical framework by establishing the first exponential bound on the rate of uniform convergence of the ERM algorithm with V-geometrically ergodic Markov chain samples. As an application of this bound, we also obtain generalization bounds for the ERM algorithm with V-geometrically ergodic Markov chain samples and prove that the algorithm is consistent. The main results obtained in this paper extend the previously known results for i.i.d. observations to the case of V-geometrically ergodic Markov chain samples.
