Similar Literature
20 similar documents found.
1.
Error bounds, which refer to inequalities that bound the distance of vectors in a test set to a given set by a residual function, have proven to be extremely useful in analyzing the convergence rates of a host of iterative methods for solving optimization problems. In this paper, we present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function. Such a class encapsulates not only fairly general constrained minimization problems but also various regularized loss minimization formulations in machine learning, signal processing, and statistics. Using our framework, we show that a number of existing error bound results can be recovered in a unified and transparent manner. To further demonstrate the power of our framework, we apply it to a class of nuclear-norm regularized loss minimization problems and establish a new error bound for this class under a strict complementarity-type regularity condition. We then complement this result by constructing an example to show that the said error bound could fail to hold without the regularity condition. We believe that our approach will find further applications in the study of error bounds for structured convex optimization problems.
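As a concrete illustration of a residual function (not taken from the paper), the sketch below uses the proximal-gradient residual r(x) = ‖x − prox_{λ‖·‖₁}(x − ∇f(x))‖ for an ℓ1-regularized least squares instance, a standard choice for this problem class; r(x) = 0 exactly at minimizers, so error bounds of the kind studied control the distance to the solution set by a multiple of r(x). The test problem and all names are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_grad_residual(x, grad_f, lam):
    # Residual function r(x) = ||x - prox_{lam*||.||_1}(x - grad_f(x))||;
    # r(x) = 0 exactly at minimizers of f + lam*||.||_1.
    return np.linalg.norm(x - soft_threshold(x - grad_f(x), lam))

# Illustrative instance: f(x) = 0.5*||Ax - b||^2 with an l1 regularizer.
rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((20, 5)), rng.standard_normal(20), 0.1
grad_f = lambda x: A.T @ (A @ x - b)

x = np.zeros(5)
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(500):                       # proximal gradient iterations
    x = soft_threshold(x - step * grad_f(x), step * lam)
print(prox_grad_residual(x, grad_f, lam))  # residual is tiny near a minimizer
```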

2.
In this paper, we consider the robust regression problem associated with Huber loss in the framework of functional linear model and reproducing kernel Hilbert spaces. We propose an Ivanov regularized empirical risk minimization estimation procedure to approximate the slope function of the linear model in the presence of outliers or heavy-tailed noises. By appropriately tuning the scale parameter of the Huber loss, we establish explicit rates of convergence for our estimates in terms of excess prediction risk under mild assumptions. Our study in the paper justifies the efficiency of Huber regression for functional data from a theoretical viewpoint.
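For reference, here is a minimal sketch of the Huber loss with its scale parameter σ, the loss the paper tunes; the data are made up and only illustrate how the loss caps the influence of large residuals.

```python
import numpy as np

def huber_loss(r, sigma):
    # Huber loss with scale parameter sigma: quadratic for |r| <= sigma,
    # linear beyond, which caps the influence of outliers.
    quad = np.abs(r) <= sigma
    return np.where(quad, 0.5 * r ** 2, sigma * np.abs(r) - 0.5 * sigma ** 2)

residuals = np.array([0.1, 0.5, 5.0, 50.0])
print(huber_loss(residuals, sigma=1.0))  # grows only linearly in the outliers
print(0.5 * residuals ** 2)              # the squared loss explodes at 50.0
```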

3.
Tao Ting, Pan Shaohua, Bi Shujun. Journal of Global Optimization, 2021, 81(4): 991–1017.

This paper is concerned with the squared F(robenius)-norm regularized factorization form for noisy low-rank matrix recovery problems. Under a suitable assumption on the restricted condition number of the Hessian matrix of the loss function, we establish an error bound to the true matrix for non-strict critical points whose rank is no greater than that of the true matrix. Then, for the squared F-norm regularized factorized least squares loss function, we establish its KL property of exponent 1/2 on the global optimal solution set under the noisy, full-sample setting, and establish this property for a certain class of critical points under the noisy, partial-sample setting. These theoretical findings are also confirmed by solving the squared F-norm regularized factorization problem with an accelerated alternating minimization method.

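A minimal sketch of plain alternating minimization for the squared F-norm regularized factorization in the full-sample setting follows; the paper uses an accelerated variant and also treats partial samples, so this shows only the basic mechanics, on illustrative data.

```python
import numpy as np

def altmin_fnorm(M, r, lam=1e-2, iters=200, seed=0):
    """Alternating minimization for the squared F-norm regularized factorization
        min_{U,V} 0.5*||U V^T - M||_F^2 + (lam/2)*(||U||_F^2 + ||V||_F^2).
    Each block update is a ridge regression with a closed form."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.standard_normal((m, r))
    V = rng.standard_normal((n, r))
    I = lam * np.eye(r)
    for _ in range(iters):
        U = M @ V @ np.linalg.inv(V.T @ V + I)    # minimize over U with V fixed
        V = M.T @ U @ np.linalg.inv(U.T @ U + I)  # minimize over V with U fixed
    return U, V

rng = np.random.default_rng(1)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))  # rank-2 truth
M += 0.01 * rng.standard_normal(M.shape)                         # small noise
U, V = altmin_fnorm(M, r=2)
print(np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))  # small relative error
```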

4.
We consider solution methods for a class of stochastic linear complementarity problems, the aim being to minimize the regularized expected residual defined through an NCP function. A sequence of observations is generated by the quasi-Monte Carlo method, and we prove that every accumulation point of the minimizers of the discrete approximation problems is a minimizer of the expected residual (ERM) of the corresponding stochastic linear complementarity problem. Sufficient conditions are also obtained for the solution set of the ERM problem to be bounded. We further show that the ERM method yields robust solutions with stability and minimal sensitivity.
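A hedged sketch of the ERM idea follows: it minimizes a sample average of the squared Fischer–Burmeister residual over sampled LCP data. The paper generates observations by quasi-Monte Carlo; plain pseudo-random sampling and a generic optimizer are used here purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def fb(a, b):
    # Fischer-Burmeister NCP function: fb(a, b) = 0 iff a >= 0, b >= 0, a*b = 0
    return a + b - np.sqrt(a ** 2 + b ** 2)

def expected_residual(x, Ms, qs):
    # Sample-average approximation of the expected squared NCP residual
    # over sampled LCP data (M(omega), q(omega)).
    return np.mean([np.sum(fb(x, M @ x + q) ** 2) for M, q in zip(Ms, qs)])

rng = np.random.default_rng(0)
n, N = 4, 200
Ms = [2.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n)) for _ in range(N)]
qs = [-np.ones(n) + 0.1 * rng.standard_normal(n) for _ in range(N)]

res = minimize(expected_residual, x0=np.ones(n), args=(Ms, qs),
               method="Nelder-Mead")
print(res.x)  # approximate ERM solution of the stochastic LCP
```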

5.
We view regularized learning of a function in a Banach space from its finite samples as an optimization problem. Within the framework of reproducing kernel Banach spaces, we prove the representer theorem for the minimizer of regularized learning schemes with a general loss function and a nondecreasing regularizer. When the loss function and the regularizer are differentiable, a characterization equation for the minimizer is also established.
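The paper's result is set in reproducing kernel Banach spaces; the familiar Hilbert-space special case with the square loss is sketched below, where the representer theorem reduces the minimizer to a finite kernel expansion. The kernel and data are illustrative.

```python
import numpy as np

def kernel_ridge(X, y, lam, k):
    # Representer theorem (RKHS special case): the regularized minimizer is a
    # finite kernel expansion f(x) = sum_i alpha_i * k(x_i, x); for the square
    # loss the coefficients solve a linear system.
    n = len(X)
    K = np.array([[k(a, b) for b in X] for a in X])
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    return lambda x: sum(a * k(xi, x) for a, xi in zip(alpha, X))

gauss = lambda a, b: np.exp(-np.sum((a - b) ** 2))   # illustrative kernel
X = np.linspace(0, 1, 30).reshape(-1, 1)
y = np.sin(4 * X[:, 0]) + 0.1 * np.random.default_rng(0).standard_normal(30)
f = kernel_ridge(X, y, lam=1e-3, k=gauss)
print(f(np.array([0.5])))   # prediction from the kernel expansion
```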

6.
Analysis of Support Vector Machines Regression
Support vector machines regression (SVMR) is a regularized learning algorithm in reproducing kernel Hilbert spaces with a loss function called the ε-insensitive loss. Compared with the well-understood least squares regression, the study of SVMR is not yet satisfactory, especially as regards quantitative estimates of the convergence of the algorithm. This paper provides an error analysis for SVMR and introduces some recently developed methods for analyzing classification algorithms, such as the projection operator and the iteration technique. The main result is an explicit learning rate for the SVMR algorithm under some assumptions.
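For concreteness, a small sketch of the ε-insensitive loss (the numerical values are assumed for illustration):

```python
import numpy as np

def eps_insensitive(y, f, eps=0.1):
    # SVMR's epsilon-insensitive loss: zero inside the tube |y - f| <= eps,
    # linear outside, which induces sparsity in the support vectors.
    return np.maximum(np.abs(y - f) - eps, 0.0)

y_true = np.array([1.0, 1.0, 1.0])
y_pred = np.array([1.05, 1.3, 2.0])
print(eps_insensitive(y_true, y_pred))  # [0.0, 0.2, 0.9]
```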

7.
We study the predictive performance of ℓ1-regularized linear regression in a model-free setting, including the case where the number of covariates is substantially larger than the sample size. We introduce a new analysis method that avoids the boundedness problems that typically arise in model-free empirical minimization. Our technique provides an answer to a conjecture of Greenshtein and Ritov (Bernoulli 10(6):971–988, 2004) regarding the “persistence” rate for linear regression and allows us to prove an oracle inequality for the error of the regularized minimizer. It also demonstrates that empirical risk minimization gives optimal rates (up to log factors) of convex aggregation of a set of estimators of a regression function.
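A minimal ISTA sketch of the ℓ1-regularized least squares estimator in the p ≫ n regime the paper studies; the solver and problem sizes here are illustrative, not the paper's analysis method.

```python
import numpy as np

def ista(A, b, lam, iters=1000):
    # Iterative soft-thresholding for the l1-regularized least squares
    # (lasso) problem  min_x 0.5*||Ax - b||^2 + lam*||x||_1.
    x = np.zeros(A.shape[1])
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    for _ in range(iters):
        z = x - (A.T @ (A @ x - b)) / L     # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # shrinkage
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))          # many more covariates than samples
x_true = np.zeros(200); x_true[:5] = 1.0    # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(50)
print(ista(A, b, lam=0.1)[:8])              # leading coefficients recover the support
```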

8.
We consider an optimization reformulation approach for the generalized Nash equilibrium problem (GNEP) that uses the regularized gap function of a quasi-variational inequality (QVI). The regularized gap function for QVI is in general not differentiable, but only directionally differentiable. Moreover, a simple condition has yet to be established, under which any stationary point of the regularized gap function solves the QVI. We tackle these issues for the GNEP in which the shared constraints are given by linear equalities, while the individual constraints are given by convex inequalities. First, we formulate the minimization problem involving the regularized gap function and show the equivalence to GNEP. Next, we establish the differentiability of the regularized gap function and show that any stationary point of the minimization problem solves the original GNEP under some suitable assumptions. Then, by using a barrier technique, we propose an algorithm that sequentially solves minimization problems obtained from GNEPs with the shared equality constraints only. Further, we discuss the case of shared inequality constraints and present an algorithm that utilizes the transformation of the inequality constraints to equality constraints by means of slack variables. We present some results of numerical experiments to illustrate the proposed approach.
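A sketch of the regularized gap function for the simpler case of a variational inequality over a fixed box (in the paper's GNEP/QVI the feasible set depends on x, which is the main difficulty); the affine map and box below are made up.

```python
import numpy as np

def reg_gap(x, F, lo, hi, alpha=1.0):
    # Fukushima's regularized gap function for VI(F, C) with a box C:
    #   g(x) = max_{y in C} <F(x), x - y> - (alpha/2)*||x - y||^2.
    # The inner maximizer has the closed form y = Proj_C(x - F(x)/alpha).
    Fx = F(x)
    y = np.clip(x - Fx / alpha, lo, hi)
    return Fx @ (x - y) - 0.5 * alpha * np.sum((x - y) ** 2)

F = lambda x: np.array([[2.0, 0.5], [0.5, 2.0]]) @ x + np.array([-1.0, -1.0])
lo, hi = np.zeros(2), np.ones(2)
print(reg_gap(np.array([0.4, 0.4]), F, lo, hi))  # ~0 at the VI solution
print(reg_gap(np.array([0.9, 0.1]), F, lo, hi))  # positive elsewhere
```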

9.
To achieve robustness against outliers or heavy-tailed sampling distributions, we consider an Ivanov regularized empirical risk minimization scheme associated with a modified Huber loss for nonparametric regression in a reproducing kernel Hilbert space. By tuning the scale and regularization parameters in accordance with the sample size, we develop nonasymptotic concentration results for this adaptive estimator. Specifically, we establish the best convergence rates for the prediction error when the conditional distribution satisfies a weak moment condition.
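As an illustration only: the sketch below fits a kernel regressor with the plain Huber loss by iteratively reweighted least squares in Tikhonov form, whereas the paper analyzes an Ivanov-regularized scheme with a modified Huber loss. It shows how the scale parameter σ down-weights outliers.

```python
import numpy as np

def kernel_huber_irls(X, y, sigma, lam=1e-2, iters=20):
    # Kernel regression with Huber loss via iteratively reweighted least
    # squares: residuals beyond the scale sigma get weight sigma/|r|, so
    # outliers lose influence. (Tikhonov form used here for simplicity.)
    n = len(X)
    K = np.exp(-((X[:, None] - X[None, :]) ** 2))  # Gaussian kernel matrix
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)  # ridge warm start
    for _ in range(iters):
        r = K @ alpha - y
        w = sigma / np.maximum(np.abs(r), sigma)   # Huber IRLS weights in (0, 1]
        W = np.diag(w)
        alpha = np.linalg.solve(W @ K + lam * n * np.eye(n), w * y)
    return alpha, K

rng = np.random.default_rng(0)
X = np.linspace(0, 3, 40)
y = np.sin(X) + 0.1 * rng.standard_normal(40)
y[::10] += 5.0                                     # inject heavy outliers
alpha, K = kernel_huber_irls(X, y, sigma=1.0)
print(np.max(np.abs(K @ alpha - np.sin(X))))       # the outliers are down-weighted
```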

10.
In this paper, we establish a strong convergence theorem regarding a regularized variant of the projected subgradient method for nonsmooth, nonstrictly convex minimization in real Hilbert spaces. Only one projection step is needed per iteration and the involved stepsizes are controlled so that the algorithm is of practical interest. To this aim, we develop new techniques of analysis which can be adapted to many other non-Fejérian methods.
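A hedged sketch of a regularized projected subgradient iteration of the general shape discussed, with illustrative (not the paper's) step-size and regularization sequences:

```python
import numpy as np

def reg_projected_subgrad(subgrad, proj, x0, iters=5000):
    # Regularized projected subgradient iteration (illustrative parameters):
    #   x_{k+1} = P_C( x_k - t_k * (g_k + eps_k * x_k) ),  g_k in the
    # subdifferential of f at x_k, with a vanishing Tikhonov term eps_k -> 0.
    x = x0
    for k in range(1, iters + 1):
        t_k, eps_k = 1.0 / k, 1.0 / np.sqrt(k)
        x = proj(x - t_k * (subgrad(x) + eps_k * x))
    return x

# Toy problem: minimize the nonsmooth f(x) = ||x - c||_1 over the unit ball.
c = np.array([2.0, 0.0])
subgrad = lambda x: np.sign(x - c)
proj = lambda x: x / max(1.0, np.linalg.norm(x))
print(reg_projected_subgrad(subgrad, proj, np.zeros(2)))  # approx (1, 0)
```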

11.
In this paper we first establish a Lagrange multiplier condition characterizing a regularized Lagrangian duality for quadratic minimization problems with finitely many linear equality and quadratic inequality constraints, where the linear constraints are not relaxed in the regularized Lagrangian dual. In particular, in the case of a quadratic optimization problem with a single quadratic inequality constraint such as the linearly constrained trust-region problems, we show that the Slater constraint qualification (SCQ) is necessary and sufficient for the regularized Lagrangian duality in the sense that the regularized duality holds for each quadratic objective function over the constraints if and only if (SCQ) holds. A new theorem of the alternative for systems involving both equality constraints and two quadratic inequality constraints plays a key role. We also provide classes of quadratic programs, including a class of CDT-subproblems with linear equality constraints, where (SCQ) ensures regularized Lagrangian duality.

12.
Methods for analyzing or learning from “fuzzy data” have attracted increasing attention in recent years. In many cases, however, existing methods (for precise, non-fuzzy data) are extended to the fuzzy case in an ad-hoc manner, and without carefully considering the interpretation of a fuzzy set when being used for modeling data. Distinguishing between an ontic and an epistemic interpretation of fuzzy set-valued data, and focusing on the latter, we argue that a “fuzzification” of learning algorithms based on an application of the generic extension principle is not appropriate. In fact, the extension principle fails to properly exploit the inductive bias underlying statistical and machine learning methods, although this bias, at least in principle, offers a means for “disambiguating” the fuzzy data. Alternatively, we therefore propose a method which is based on the generalization of loss functions in empirical risk minimization, and which performs model identification and data disambiguation simultaneously. Elaborating on the fuzzification of specific types of losses, we establish connections to well-known loss functions in regression and classification. We compare our approach with related methods and illustrate its use in logistic regression for binary classification.
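A minimal sketch of the underlying idea for the epistemic view, using crisp intervals as the simplest special case of fuzzy data: the model is charged the smallest loss over values compatible with each observation, so model fitting and data disambiguation happen together. The paper's construction handles genuinely fuzzy sets; this toy example is only indicative.

```python
import numpy as np

def infimum_loss(interval, pred):
    # Generalized loss for epistemic interval-valued data: charge the model
    # the smallest squared loss over values compatible with the observation.
    lo, hi = interval
    closest = np.clip(pred, lo, hi)   # best value compatible with the data
    return (pred - closest) ** 2

# Fit a constant model to interval-valued observations by grid search.
intervals = [(0.0, 1.0), (0.8, 2.0), (0.5, 1.5)]
grid = np.linspace(-1, 3, 401)
risk = [sum(infimum_loss(iv, c) for iv in intervals) for c in grid]
print(grid[int(np.argmin(risk))])     # a value consistent with all intervals
```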

13.
We consider a generalized risk process which consists of a subordinator plus a spectrally negative Lévy process. Our interest is to estimate the expected discounted penalty function (EDPF) from a set of data which is practical in the insurance framework. We construct an empirical type estimator of the Laplace transform of the EDPF and obtain it by a regularized Laplace inversion. The asymptotic behavior of the estimator under a high frequency assumption is investigated.
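The first ingredient, an empirical plug-in estimator of a Laplace transform, can be sketched in a few lines (the regularized inversion step is substantially more involved and is omitted; the exponential data are illustrative):

```python
import numpy as np

def laplace_hat(s, sample):
    # Empirical estimator of the Laplace transform L(s) = E[exp(-s X)]
    return np.mean(np.exp(-s * sample))

rng = np.random.default_rng(0)
claims = rng.exponential(scale=2.0, size=5000)  # illustrative claim sizes

s = 0.5
print(laplace_hat(s, claims))   # empirical estimate
print(1.0 / (1.0 + 2.0 * s))    # exact transform of Exp(mean 2): 1/(1+2s)
```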

14.
In regularized kernel methods, the solution of a learning problem is found by minimizing a functional consisting of an empirical risk and a regularization term. In this paper, we study the existence of an optimal solution for multi-kernel regularized learning. First, we improve a previous result on this problem due to Micchelli and Pontil, and prove that an optimal solution exists whenever the kernel set is compact. Second, we consider the problem for Gaussian kernels with variance σ∈(0,∞), and give some conditions under which an optimal solution exists.

15.
In this paper, we investigate the generalization performance of a regularized ranking algorithm in a reproducing kernel Hilbert space associated with the least squares ranking loss. An explicit expression for the solution via a sampling operator is derived and plays an important role in our analysis. A convergence analysis for learning a ranking function is provided, based on a novel capacity-independent approach, which yields results stronger than those of previous studies of the ranking problem.
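For reference, the empirical least squares ranking risk has the pairwise form sketched below; the linear toy data are illustrative.

```python
import numpy as np

def ls_ranking_risk(f, X, y):
    # Empirical least squares ranking risk over all pairs:
    #   (1/n^2) * sum_{i,j} ((y_i - y_j) - (f(x_i) - f(x_j)))^2
    fx = np.array([f(x) for x in X])
    dy = y[:, None] - y[None, :]
    df = fx[:, None] - fx[None, :]
    return np.mean((dy - df) ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal(50)
y = 2 * X + 0.1 * rng.standard_normal(50)
print(ls_ranking_risk(lambda x: 2 * x, X, y))  # near zero for the right ranker
print(ls_ranking_risk(lambda x: -x, X, y))     # large for a reversed ranker
```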

16.
Gaussians are important tools for learning from data of large dimensions. The variance of a Gaussian kernel is a measure of the frequency range of the function components, or features, retrieved by learning algorithms induced by the Gaussian. The learning ability and approximation power increase as the variance of the Gaussian decreases. Thus, it is natural to use Gaussians with decreasing variances for online algorithms when samples arrive one by one. In this paper, we consider fully online classification algorithms associated with a general loss function and varying Gaussians, which are closely related to regularization schemes in reproducing kernel Hilbert spaces. Learning rates are derived in terms of the smoothness of a target function associated with the probability measure controlling sampling and with the loss function. A critical estimate is given for the norm of the difference of regularized target functions as the variance of the Gaussian changes. Concrete learning rates are presented for the online learning algorithm with the least squares loss.
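A hedged sketch of an online classification algorithm with step-dependent Gaussian variances follows; the logistic loss, step size, and variance schedule are assumptions for illustration, not the paper's choices.

```python
import numpy as np

def online_gauss_classify(stream, sigmas, eta=0.5, lam=0.01):
    # Online regularized kernel classification with varying Gaussians:
    # example x_t enters the expansion with its own variance sigma_t, and the
    # regularization term shrinks all previous coefficients each step.
    centers, coefs, widths = [], [], []
    def f(x):
        return sum(a * np.exp(-np.sum((x - c) ** 2) / (2 * s ** 2))
                   for a, c, s in zip(coefs, centers, widths))
    for t, (x, y) in enumerate(stream):
        margin = y * f(x)
        grad = -y / (1.0 + np.exp(margin))              # logistic loss derivative
        coefs[:] = [(1 - eta * lam) * a for a in coefs] # shrink old terms
        centers.append(x); coefs.append(-eta * grad); widths.append(sigmas[t])
    return f

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
y = np.sign(X[:, 0])                              # labels from the first feature
sigmas = 1.0 / (1 + 0.02 * np.arange(100))        # slowly decreasing variances
f = online_gauss_classify(zip(X, y), sigmas)
print(np.mean(np.sign([f(x) for x in X]) == y))   # training accuracy
```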

17.
In the present paper, we investigate the learning rate of ℓ2-coefficient regularized classification with strong loss and data-dependent kernel functional spaces. The results show that the learning rate is influenced by the strong convexity.

18.
The multi-class classification problem is considered by an empirical risk minimization (ERM) approach. The hypothesis space for the learning algorithm is taken to be a ball of a Banach space of continuous functions. When the regression function lies in some interpolation space, satisfactory learning rates for the excess misclassification error are provided in terms of covering numbers of the unit ball of the Banach space. A comparison theorem is proved and is used to bound the excess misclassification error by means of the excess generalization error.

19.
In this paper, we study the backward–forward algorithm as a splitting method to solve structured monotone inclusions, and convex minimization problems in Hilbert spaces. It has a natural link with the forward–backward algorithm and has the same computational complexity, since it involves the same basic blocks, but organized differently. Surprisingly enough, this kind of iteration arises when studying the time discretization of the regularized Newton method for maximally monotone operators. First, we show that these two methods enjoy remarkable involutive relations, which go far beyond the evident inversion of the order in which the forward and backward steps are applied. Next, we establish several convergence properties for both methods, some of which were unknown even for the forward–backward algorithm. This brings further insight into this well-known scheme. Finally, we specialize our results to structured convex minimization problems, the gradient-projection algorithms, and give a numerical illustration of theoretical interest.
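A minimal sketch contrasting the two orderings on an ℓ1-regularized least squares toy problem: forward-backward applies the gradient step and then the prox, backward-forward the reverse, and for the latter the minimizer is read off as the prox of the limit point. All parameters are illustrative.

```python
import numpy as np

def prox_l1(z, t):
    # Proximal operator of t * ||.||_1 (the "backward" step)
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10)); b = rng.standard_normal(40); lam = 0.5
grad = lambda x: A.T @ (A @ x - b)          # the "forward" (gradient) step map
gamma = 1.0 / np.linalg.norm(A, 2) ** 2

# Forward-backward: gradient step on the smooth part, then prox.
x = np.zeros(10)
for _ in range(2000):
    x = prox_l1(x - gamma * grad(x), gamma * lam)

# Backward-forward: prox first, then gradient step; the minimizer is the
# prox of the limit point.
u = np.zeros(10)
for _ in range(2000):
    z = prox_l1(u, gamma * lam)
    u = z - gamma * grad(z)
print(np.linalg.norm(x - prox_l1(u, gamma * lam)))  # both find the same point
```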

20.
This paper is concerned with estimating the regression function f_ρ in supervised learning by utilizing piecewise polynomial approximations on adaptively generated partitions. The main point of interest is algorithms that, with high probability, are optimal in terms of the least squares error achieved for a given number m of observed data. In a previous paper [1], we developed, for each β > 0, an algorithm for piecewise constant approximation which is proven to provide such optimal-order estimates with probability larger than 1 − m^{−β}. In this paper we consider the case of higher-degree polynomials. We show that for general probability measures ρ, empirical least squares minimization will not provide optimal error estimates with high probability. We go further in identifying certain conditions on the probability measure ρ which allow optimal estimates with high probability.
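A sketch of the basic building block, a least squares piecewise constant estimator on a fixed uniform dyadic partition; the paper's algorithms generate the partition adaptively, which this toy omits.

```python
import numpy as np

def piecewise_constant_fit(x, y, depth):
    # Least squares piecewise constant estimator on a uniform dyadic
    # partition of [0, 1] into 2**depth cells: each cell predicts the
    # mean of the responses falling into it.
    edges = np.linspace(0.0, 1.0, 2 ** depth + 1)
    cells = np.clip(np.digitize(x, edges) - 1, 0, 2 ** depth - 1)
    means = np.array([y[cells == c].mean() if np.any(cells == c) else 0.0
                      for c in range(2 ** depth)])
    return lambda t: means[np.clip(np.digitize(t, edges) - 1, 0, 2 ** depth - 1)]

rng = np.random.default_rng(0)
x = rng.uniform(size=500)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(500)
f = piecewise_constant_fit(x, y, depth=4)
print(np.mean((f(x) - np.sin(2 * np.pi * x)) ** 2))  # small squared error
```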

