Similar Documents

20 similar documents found.
1.
Abstract

We investigate a new method for regression trees which obtains estimates and predictions subject to constraints on the coefficients representing the effects of splits in the tree. The procedure leads to both shrinking of the node estimates and pruning of branches in the tree, and for some problems it gives better predictions than the cost-complexity pruning used in the classification and regression tree (CART) algorithm. The new method is based on the least absolute shrinkage and selection operator (LASSO) method developed by Tibshirani.
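For reference, the LASSO criterion of Tibshirani that the method builds on can be written as follows, with β collecting the split-effect coefficients and t the shrinkage bound; this is the generic form, not the authors' exact tree parameterization:

\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \Big( y_i - \sum_{j} x_{ij}\,\beta_j \Big)^{2} \quad \text{subject to} \quad \sum_{j} |\beta_j| \le t .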

2.
In high-dimensional data settings where p ≫ n, many penalized regularization approaches have been studied for simultaneous variable selection and estimation. However, in the presence of covariates with weak effects, many existing variable selection methods, including the Lasso and its generalizations, cannot distinguish covariates with weak contributions from those with none. Thus, prediction based only on a subset model of selected covariates can be inefficient. In this paper, we propose a post-selection shrinkage estimation strategy to improve the prediction performance of a selected subset model. Such a post-selection shrinkage estimator (PSE) is data adaptive and constructed by shrinking a post-selection weighted ridge estimator in the direction of a selected candidate subset. Under an asymptotic distributional quadratic risk criterion, its prediction performance is explored analytically. We show that the proposed PSE performs better than the post-selection weighted ridge estimator. More importantly, it significantly improves the prediction performance of any candidate subset model selected by most existing Lasso-type variable selection methods. The relative performance of the post-selection PSE is demonstrated by both simulation studies and real-data analysis.

3.
Variable selection is an important aspect of high-dimensional statistical modeling, particularly in regression and classification. In the regularization framework, various penalty functions are used to perform variable selection by putting relatively large penalties on small coefficients. The L1 penalty is a popular choice because of its convexity, but it produces biased estimates for the large coefficients. The L0 penalty is attractive for variable selection because it directly penalizes the number of nonzero coefficients. However, the optimization involved is discontinuous and nonconvex, and therefore it is very challenging to implement. Moreover, its solution may not be stable. In this article, we propose a new penalty that combines the L0 and L1 penalties. We implement this new penalty by developing a global optimization algorithm using mixed integer programming (MIP). We compare this combined penalty with several other penalties via simulated examples as well as real applications. The results show that the new penalty outperforms both the L0 and L1 penalties in terms of variable selection while maintaining good prediction accuracy.
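One natural way to write such a combined penalty, together with the standard big-M device for encoding the L0 term as a mixed integer program, is sketched below; λ_0, λ_1 and M are assumed tuning constants, and this is a schematic formulation rather than the authors' exact one:

\min_{\beta,\,z} \; \|y - X\beta\|_2^2 \; + \; \lambda_0 \sum_{j} z_j \; + \; \lambda_1 \sum_{j} |\beta_j|
\quad \text{subject to} \quad |\beta_j| \le M z_j, \;\; z_j \in \{0,1\} \;\; \text{for all } j .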

4.
《Optimization》2012,61(6):843-853
In this paper we consider different classes of nonconvex quadratic problems that can be solved in polynomial time. We present an algorithm for the problem of minimizing the product of two linear functions over a polyhedron P in R^n. The complexity of the algorithm depends on the number of vertices of the projection of P onto the R^2 space. In the worst case this algorithm requires an exponential number of steps, but its expected computational time complexity is polynomial. In addition, we give a characterization of the number of isolated local minimum areas for problems of this form.

Furthermore, we consider indefinite quadratic problems with variables restricted to be nonnegative. These problems can be solved in polynomial time if the number of negative eigenvalues of the associated symmetric matrix is fixed.
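In standard notation, the first problem class above can be written as follows, with c and d the two cost vectors; the complexity statement refers to the vertices of the image of P under the map x ↦ (c^T x, d^T x), which is the usual reading of the projection onto R^2 and is stated here as an assumption:

\min_{x} \; (c^{\top} x)(d^{\top} x) \quad \text{subject to} \quad x \in P = \{ x \in \mathbb{R}^n : Ax \le b \} .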

5.
In this paper, a self-weighted composite quantile regression estimation procedure is developed to estimate the unknown parameters in an infinite variance autoregressive (IVAR) model. The proposed estimator is asymptotically normal and more efficient than a single quantile regression estimator. At the same time, the adaptive least absolute shrinkage and selection operator (LASSO) for variable selection is also suggested. We show that the adaptive LASSO based on the self-weighted composite quantile regression enjoys the oracle properties. Simulation studies and a real data example are conducted to examine the performance of the proposed approaches.
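Schematically, a self-weighted composite quantile regression criterion with an adaptive LASSO penalty takes the form below for an AR(p) model, where ρ_τ(u) = u(τ − I(u < 0)) is the check loss, τ_1 < … < τ_K are the quantile levels, w_t are the self-weights and ŵ_j the adaptive weights; the specific weights used in the paper are not reproduced here:

\min_{b_1,\dots,b_K,\,\phi} \; \sum_{k=1}^{K} \sum_{t=p+1}^{n} w_t \, \rho_{\tau_k}\!\big( y_t - b_k - \phi_1 y_{t-1} - \cdots - \phi_p y_{t-p} \big) \; + \; n \lambda \sum_{j=1}^{p} \hat{w}_j \, |\phi_j| .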

6.
We consider the integer program P: max{cx | Ax = y; x ∈ N^n}. Using the generating function of an associated counting problem, and a generalized residue formula of Brion and Vergne, we explicitly relate P with its continuous linear programming (LP) analogue and provide a characterization of its optimal value. In particular, dual variables λ ∈ R^m have discrete analogues z ∈ C^m, related in a simple manner. Moreover, both optimal values of P and the LP obey the same formula, using z for P and |z| for the LP. One retrieves (and refines) the so-called group relaxations of Gomory which, in this dual approach, arise naturally from a detailed analysis of a generalized residue formula of Brion and Vergne. Finally, we also provide an explicit formulation of a dual problem P*, the analogue of the dual LP in linear programming.

7.
Abstract

This article makes three contributions. First, we introduce a computationally efficient estimator for the component functions in additive nonparametric regression, exploiting a different motivation from the marginal integration estimator of Linton and Nielsen. Our method provides a reduction in computation of order n, which is highly significant in practice. Second, we define an efficient estimator of the additive components by inserting the preliminary estimator into a backfitting algorithm but taking one step only, and establish that it is equivalent, in various senses, to the oracle estimator based on knowing the other components. Our two-step estimator is minimax superior to that considered in Opsomer and Ruppert, due to its better bias. Third, we define a bootstrap algorithm for computing pointwise confidence intervals and show that it achieves the correct coverage.

8.
We propose an algorithm to estimate the common density s of a stationary process X_1, ..., X_n. We suppose that the process is either β-mixing or τ-mixing. We provide a model selection procedure based on a generalization of Mallows' C_p and we prove oracle inequalities for the selected estimator under a few prior assumptions on the collection of models and on the mixing coefficients. We prove that our estimator is adaptive over a class of Besov spaces, namely, we prove that it achieves the same rates of convergence as in the i.i.d. framework.

9.
In this paper we give a new convergence analysis of a projective scaling algorithm. We consider a long-step affine scaling algorithm applied to a homogeneous linear programming problem obtained from the original linear programming problem. This algorithm takes a fixed fraction λ ≤ 2/3 of the way towards the boundary of the nonnegative orthant at each iteration. The iteration sequence for the original problem is obtained by pulling back the homogeneous iterates onto the original feasible region with a conical projection, which generates the same search direction as the original projective scaling algorithm at each iterate. The recent convergence results for the long-step affine scaling algorithm by the authors are applied to this algorithm to obtain some convergence results on the projective scaling algorithm. Specifically, we will show (i) polynomiality of the algorithm with complexities of O(nL) and O(n^2 L) iterations for λ < 2/3 and λ = 2/3, respectively; (ii) global convergence of the algorithm when the optimal face is unbounded; (iii) convergence of the primal iterates to a relative interior point of the optimal face; (iv) convergence of the dual estimates to the analytic center of the dual optimal face; and (v) convergence of the reduction rate of the objective function value to 1 − λ.
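For orientation, the textbook long-step primal affine scaling update for min c^T x subject to Ax = b, x ≥ 0, with X_k = diag(x^k), is shown below; this is the standard form, not the paper's homogeneous construction:

w^k = (A X_k^2 A^{\top})^{-1} A X_k^2 c, \qquad r^k = c - A^{\top} w^k, \qquad
x^{k+1} = x^k - \frac{\lambda}{\max_i (X_k r^k)_i} \, X_k^2 r^k ,

so that each iteration moves a fixed fraction λ of the distance to the boundary of the nonnegative orthant.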

10.
We propose an outer-approximation (cutting plane) method for minimizing a function f(X) subject to semi-definite constraints on the variables X ∈ R^n. A number of efficient algorithms have been proposed when the objective function is linear. However, there are very few practical algorithms when the objective function is nonlinear. The algorithm proposed here is a kind of outer-approximation (cutting plane) method, which has been successfully applied to several low rank global optimization problems, including generalized convex multiplicative programming problems and generalized linear fractional programming problems. We will show that this algorithm works well when f is convex and n is relatively small. Also, we will provide the proof of its convergence under various technical assumptions.

11.
In this paper we introduce an algorithm for the construction of compactly supported interpolating scaling vectors on ℝ^d with certain symmetry properties. In addition, we give an explicit construction method for corresponding symmetric dual scaling vectors and multiwavelets. As the main ingredients of our recipe we derive some implementable conditions for accuracy, symmetry, and biorthogonality of a scaling vector in terms of its mask. Our method is substantiated by several bivariate examples for quincunx and box-spline dilation matrices.

12.
We study the distributions of the LASSO, SCAD, and thresholding estimators, in finite samples and in the large-sample limit. The asymptotic distributions are derived for both the case where the estimators are tuned to perform consistent model selection and for the case where the estimators are tuned to perform conservative model selection. Our findings complement those of Knight and Fu [K. Knight, W. Fu, Asymptotics for lasso-type estimators, Annals of Statistics 28 (2000) 1356–1378] and Fan and Li [J. Fan, R. Li, Variable selection via non-concave penalized likelihood and its oracle properties, Journal of the American Statistical Association 96 (2001) 1348–1360]. We show that the distributions are typically highly non-normal regardless of how the estimator is tuned, and that this property persists in large samples. The uniform convergence rate of these estimators is also obtained, and is shown to be slower than n^{-1/2} in case the estimator is tuned to perform consistent model selection. An impossibility result regarding estimation of the estimators' distribution function is also provided.
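For reference, two of the penalties compared here are standard: the LASSO penalty p_λ(θ) = λ|θ| and the SCAD penalty of Fan and Li, which is defined through its derivative (for θ > 0, with a > 2, commonly a = 3.7):

p'_{\lambda}(\theta) = \lambda \left\{ I(\theta \le \lambda) + \frac{(a\lambda - \theta)_{+}}{(a-1)\lambda} \, I(\theta > \lambda) \right\} .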

13.
Automatic model selection for partially linear models
We propose and study a unified procedure for variable selection in partially linear models. A new type of double-penalized least squares is formulated, using the smoothing spline to estimate the nonparametric part and applying a shrinkage penalty on parametric components to achieve model parsimony. Theoretically we show that, with proper choices of the smoothing and regularization parameters, the proposed procedure can be as efficient as the oracle estimator [J. Fan, R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association 96 (2001) 1348–1360]. We also study the asymptotic properties of the estimator when the number of parametric effects diverges with the sample size. Frequentist and Bayesian estimates of the covariance and confidence intervals are derived for the estimators. One great advantage of this procedure is its linear mixed model (LMM) representation, which greatly facilitates its implementation by using standard statistical software. Furthermore, the LMM framework enables one to treat the smoothing parameter as a variance component and hence conveniently estimate it together with other regression coefficients. Extensive numerical studies are conducted to demonstrate the effective performance of the proposed procedure.
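Schematically, the double-penalized least squares criterion described above has the following structure, with x_i the parametric covariates, t_i the nonparametric covariate, λ_1 the smoothing parameter of the spline penalty, and p_{λ_2}(·) a shrinkage penalty on the parametric coefficients; this is a sketch of the form, not the paper's exact notation:

\min_{\beta,\,g} \; \sum_{i=1}^{n} \big( y_i - x_i^{\top}\beta - g(t_i) \big)^{2} \; + \; \lambda_1 \int \{ g''(t) \}^{2} \, dt \; + \; n \sum_{j} p_{\lambda_2}(|\beta_j|) .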

14.
This paper studies adaptive LASSO (least absolute shrinkage and selection operator) variable selection and coefficient estimation for measurement error models. We first present adaptive LASSO parameter estimators for linear models and partially linear models in which the covariates are measured with error, study the asymptotic properties of these estimators under some regularity conditions, and prove that, with a suitable choice of the tuning parameter, the adaptive LASSO estimators possess the oracle property. We then discuss the algorithm for computing the estimators and the selection of the penalty and smoothing parameters. Finally, the performance of the adaptive LASSO variable selection method is examined through simulation studies and a real data analysis; the results show that both variable selection and parameter estimation perform well.
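One common corrected-least-squares form of the adaptive LASSO under covariate measurement error is sketched below, where W_i = X_i + U_i is the observed surrogate, Σ_uu the measurement error covariance, and ŵ_j = |β̃_j|^{-γ} adaptive weights built from an initial consistent estimator β̃; this is a standard formulation stated for illustration, not necessarily the exact criterion used in the paper:

\min_{\beta} \; \sum_{i=1}^{n} \big( Y_i - W_i^{\top}\beta \big)^{2} \; - \; n\, \beta^{\top} \Sigma_{uu}\, \beta \; + \; n \lambda \sum_{j} \hat{w}_j \, |\beta_j| .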

15.
A random model approach for the LASSO
The least absolute shrinkage and selection operator (LASSO) is a method of estimation for linear models similar to ridge regression. It shrinks the effect estimates, potentially shrinking some to be identically zero. The amount of shrinkage is governed by a single parameter. Using a random model formulation of the LASSO, this parameter can be specified as the ratio of dispersion parameters. These parameters are estimated using an approximation to the marginal likelihood of the observed data. The observed score equations from the approximation are biased and hence are adjusted by subtracting an empirical estimate of the expected value. After estimation, the model effects can be tested (via simulation), as the distribution of the observed data given that all model effects are zero is known. Two related simulation studies are presented that show that dispersion parameter estimation results in effect estimates that are competitive with other estimation methods (including other LASSO methods).

16.
Abstract

Consider a partially linear regression model with an unknown vector parameter β, an unknown function g(.), and unknown heteroscedastic error variances. Chen and You proposed a semiparametric generalized least squares estimator (SGLSE) for β, which takes the heteroscedasticity into account to increase efficiency. For inference based on this SGLSE, it is necessary to construct a consistent estimator for its asymptotic covariance matrix. However, when there exists within-group correlation, the traditional delta method and the delete-1 jackknife estimation fail to offer such a consistent estimator. In this paper, a delete-group jackknife method, obtained by deleting grouped partial residuals, is examined. It is shown that the delete-group jackknife method can indeed provide a consistent estimator for the asymptotic covariance matrix in the presence of within-group correlations. This result is an extension of that in [21].

17.
We investigate a robust penalized logistic regression algorithm based on a minimum distance criterion. Influential outliers are often associated with the explosion of parameter vector estimates, but in the context of standard logistic regression, the bias due to outliers always causes the parameter vector to implode, that is, shrink toward the zero vector. Thus, using LASSO-like penalties to perform variable selection in the presence of outliers can result in missed detections of relevant covariates. We show that by choosing a minimum distance criterion together with an elastic net penalty, we can simultaneously find a parsimonious model and avoid estimation implosion even in the presence of many outliers in the important small n large p situation. Minimizing the penalized minimum distance criterion is a challenging problem due to its nonconvexity. To meet the challenge, we develop a simple and efficient MM (majorization–minimization) algorithm that can be adapted gracefully to the small n large p context. Performance of our algorithm is evaluated on simulated and real datasets. This article has supplementary materials available online.
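As a generic illustration of the MM principle invoked above (not the paper's minimum-distance criterion), the sketch below majorizes the L1 part of an elastic-net penalized least squares problem by a quadratic at the current iterate, so that each MM step reduces to a weighted ridge solve; all names, data, and tuning values are hypothetical.

import numpy as np

def mm_elastic_net(X, y, lam1=1.0, lam2=1.0, n_iter=200, eps=1e-8):
    # Minimize ||y - X b||^2 + lam1*||b||_1 + lam2*||b||_2^2 by MM:
    # each |b_j| is majorized by b_j^2 / (2|b_j_old|) + |b_j_old| / 2,
    # so every MM iteration is a closed-form weighted ridge solve.
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    b = np.linalg.solve(XtX + lam2 * np.eye(p), Xty)      # ridge warm start
    for _ in range(n_iter):
        d = lam1 / (2.0 * np.maximum(np.abs(b), eps))      # majorizer weights
        b_new = np.linalg.solve(XtX + np.diag(d) + lam2 * np.eye(p), Xty)
        if np.max(np.abs(b_new - b)) < 1e-10:              # simple stopping rule
            return b_new
        b = b_new
    return b

# Toy usage on a small-n, large-p problem (hypothetical data).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 200))
beta_true = np.zeros(200)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(50)
print(np.round(mm_elastic_net(X, y, lam1=5.0, lam2=1.0)[:6], 2))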

18.
We introduce an algorithm which, in the context of nonlinear regression on vector-valued explanatory variables, aims to choose those combinations of vector components that provide best prediction. The algorithm is constructed specifically so that it devotes attention to components that might be of relatively little predictive value by themselves, and so might be ignored by more conventional methodology for model choice, but which, in combination with other difficult-to-find components, can be particularly beneficial for prediction. The design of the algorithm is also motivated by a desire to choose vector components that become redundant once appropriate combinations of other, more relevant components are selected. Our theoretical arguments show these goals are met in the sense that, with probability converging to 1 as sample size increases, the algorithm correctly determines a small, fixed number of variables on which the regression mean, g say, depends, even if dimension diverges to infinity much faster than n. Moreover, the estimated regression mean based on those variables approximates g with an error that, to first order, equals the error which would arise if we were told in advance the correct variables. In this sense, the estimator achieves oracle performance. Our numerical work indicates that the algorithm is suitable for very high dimensional problems, where it keeps computational labor in check by using a novel sequential argument, and also for more conventional prediction problems, where dimension is relatively low.

19.
Given a sequence of independent random variables with a common continuous distribution, we consider the online decision problem where one seeks to minimize the expected value of the time that is needed to complete the selection of a monotone increasing subsequence of a prespecified length n. This problem is dual to some online decision problems that have been considered earlier, and this dual problem has some notable advantages. In particular, the recursions and equations of optimality lead with relative ease to asymptotic formulas for mean and variance of the minimal selection time. Random Struct. Alg., 49, 235–252, 2016

20.
Consider a repeated measurement partially linear regression model with an unknown vector parameter β, an unknown function g(.), and unknown heteroscedastic error variances. In order to improve the semiparametric generalized least squares estimator (SGLSE) of β, we propose an iterative weighted semiparametric least squares estimator (IWSLSE) and show that it improves upon the SGLSE in terms of asymptotic covariance matrix. An adaptive procedure is given to determine the number of iterations. We also show that when the number of replicates is less than or equal to two, the IWSLSE cannot improve upon the SGLSE. These results are generalizations of those in [2] to the case of semiparametric regressions.
