Similar Articles
20 similar articles found.
1.
The Lasso is a very well-known penalized regression model, which adds an L1 penalty with parameter λ1 on the coefficients to the squared error loss function. The Fused Lasso extends this model by also putting an L1 penalty with parameter λ2 on the difference of neighboring coefficients, assuming there is a natural ordering. In this article, we develop a path algorithm for solving the Fused Lasso Signal Approximator that computes the solutions for all values of λ1 and λ2. We also present an approximate algorithm that has considerable speed advantages for a moderate trade-off in accuracy. In the Online Supplement for this article, we provide proofs and further details for the methods developed in the article.
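Both penalties are built from the L1 proximal (soft-thresholding) operator; in the special case λ2 = 0 the FLSA objective is separable and its solution is exactly elementwise soft-thresholding of y. A minimal numpy sketch with synthetic data (this is not the paper's path algorithm):

```python
import numpy as np

def flsa_objective(beta, y, lam1, lam2):
    """0.5 * ||y - beta||^2 + lam1 * ||beta||_1 + lam2 * sum_i |beta_{i+1} - beta_i|."""
    return (0.5 * np.sum((y - beta) ** 2)
            + lam1 * np.sum(np.abs(beta))
            + lam2 * np.sum(np.abs(np.diff(beta))))

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

y = np.array([3.0, 0.5, -2.0])
beta_hat = soft_threshold(y, 1.0)   # exact FLSA solution when lam2 = 0
print(beta_hat)                     # [ 2.  0. -1.]
```

For λ2 > 0 the fusion term couples neighboring coordinates, which is what the paper's path algorithm handles over all (λ1, λ2) values.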

2.
The Lasso is a widely used variable selection method in machine learning, well suited to regression problems with sparse structure. When the sample size is very large, or when massive datasets are stored across different machines, distributed computation is an important way to reduce computing time and improve efficiency. Starting from an equivalent optimization formulation of the Lasso, this paper applies the ADMM algorithm to this model, whose optimization variables are separable, constructs a distributed algorithm for Lasso variable selection, and proves...
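A single-machine numpy sketch of the ADMM updates for the Lasso objective 0.5·||Ax − b||² + λ||x||₁ (in a truly distributed version the x-update would be split across machines holding different rows of A; the step size rho and iteration count here are illustrative assumptions):

```python
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_admm(A, b, lam, rho=1.0, n_iter=300):
    """Solve 0.5*||A x - b||^2 + lam*||x||_1 via ADMM with the split x = z."""
    n = A.shape[1]
    # Factor (A^T A + rho I) once; each x-update is then two triangular solves.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Atb = A.T @ b
    x = z = u = np.zeros(n)
    for _ in range(n_iter):
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        z = soft(x + u, lam / rho)   # proximal step for the L1 term
        u = u + x - z                # dual (running residual) update
    return z

# Orthonormal design: the Lasso solution is soft-thresholding of b.
A = np.eye(4)
b = np.array([3.0, -0.5, 2.0, 0.0])
x_hat = lasso_admm(A, b, lam=1.0)
print(x_hat)   # converges to [2, 0, 1, 0]
```

The consensus form of this splitting is what makes the algorithm natural to distribute: each machine updates its local x from its own data block, and only z and the dual variables are exchanged.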

3.
Regularization methods, including Lasso, group Lasso, and SCAD, typically focus on selecting variables with strong effects while ignoring weak signals. This may result in biased prediction, especially when weak signals outnumber strong signals. This paper aims to incorporate weak signals in variable selection, estimation, and prediction. We propose a two-stage procedure, consisting of variable selection and postselection estimation. The variable selection stage involves a covariance-insured screening for detecting weak signals, whereas the postselection estimation stage involves a shrinkage estimator for jointly estimating strong and weak signals selected from the first stage. We term the proposed method as the covariance-insured screening-based postselection shrinkage estimator. We establish asymptotic properties for the proposed method and show, via simulations, that incorporating weak signals can improve estimation and prediction performance. We apply the proposed method to predict the annual gross domestic product rates based on various socioeconomic indicators for 82 countries.

4.
In this article, for Lasso penalized linear regression models in high-dimensional settings, we propose a modified cross-validation (CV) method for selecting the penalty parameter. The methodology is extended to other penalties, such as Elastic Net. We conduct extensive simulation studies and real data analysis to compare the performance of the modified CV method with other methods. It is shown that the popular K-fold CV method includes many noise variables in the selected model, while the modified CV works well in a wide range of coefficient and correlation settings. Supplementary materials containing the computer code are available online.
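For context, the standard K-fold CV baseline that the modified method is compared against can be run with scikit-learn's LassoCV. The data below are synthetic (3 true signals among 20 candidates); this sketches the baseline, not the paper's modified CV:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]          # 3 true signals, 17 noise variables
y = X @ beta + rng.standard_normal(n)

# Standard K-fold CV choice of the Lasso penalty parameter.
fit = LassoCV(cv=5, random_state=0).fit(X, y)
print(fit.alpha_)                         # CV-selected penalty
print(np.flatnonzero(fit.coef_).size)     # number of selected variables
```

The abstract's point is that this selected set tends to include noise variables alongside the true signals, which the modified CV aims to avoid.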

5.
We study the high-dimensional statistical analysis of the Lasso under non-Gaussian noise. Assuming only that the error noise has a finite second moment, we establish high-dimensional error bounds for the Lasso, generalizing the main existing theoretical results on the Lasso. The results are of both theoretical and practical value.

6.

We study the asymptotic properties of a new version of the Sparse Group Lasso estimator (SGL), called adaptive SGL. This new version includes two distinct regularization parameters, one for the Lasso penalty and one for the Group Lasso penalty, and we consider the adaptive version of this regularization, where both penalties are weighted by preliminary random coefficients. The asymptotic properties are established in a general framework, where the data are dependent and the loss function is convex. We prove that this estimator satisfies the oracle property: the sparsity-based estimator recovers the true underlying sparse model and is asymptotically normally distributed. We also study its asymptotic properties in a double-asymptotic framework, where the number of parameters diverges with the sample size. We show by simulations and on real data that the adaptive SGL outperforms other oracle-like methods in terms of estimation precision and variable selection.


7.
We study the properties of the Lasso in the high-dimensional partially linear model where the number of variables in the linear part can be greater than the sample size. We use truncated series expansion based on polynomial splines to approximate the nonparametric component in this model. Under a sparsity assumption on the regression coefficients of the linear component and some regularity conditions, we derive the oracle inequalities for the prediction risk and the estimation error. We also provide sufficient conditions under which the Lasso estimator is selection consistent for the variables in the linear part of the model. In addition, we derive the rate of convergence of the estimator of the nonparametric function. We conduct simulation studies to evaluate the finite sample performance of variable selection and nonparametric function estimation.

8.
The Lasso is a popular model selection and estimation procedure for linear models that enjoys nice theoretical properties. In this paper, we study the Lasso estimator for fitting autoregressive time series models. We adopt a double asymptotic framework where the maximal lag may increase with the sample size. We derive theoretical results establishing various types of consistency. In particular, we derive conditions under which the Lasso estimator for the autoregressive coefficients is model selection consistent, estimation consistent and prediction consistent. Simulation study results are reported.
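The setup can be sketched by building a lagged design matrix and fitting an ordinary Lasso to it. The data below are a synthetic AR(1) series, and the penalty level is an illustrative assumption, not the paper's tuning rule:

```python
import numpy as np
from sklearn.linear_model import Lasso

def ar_design(x, max_lag):
    """Regress x_t on (x_{t-1}, ..., x_{t-max_lag}): return design X and target y."""
    y = x[max_lag:]
    X = np.column_stack([x[max_lag - j: len(x) - j] for j in range(1, max_lag + 1)])
    return X, y

# Simulate an AR(1) process x_t = 0.8 x_{t-1} + eps_t, then fit with many lags.
rng = np.random.default_rng(1)
T, max_lag = 500, 10
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()

X, y = ar_design(x, max_lag)
fit = Lasso(alpha=0.05).fit(X, y)
print(np.round(fit.coef_, 2))   # lag-1 coefficient dominates; higher lags shrink
```

The "maximal lag growing with the sample size" regime in the abstract corresponds to letting `max_lag` diverge as `T` grows.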

9.
Methods for Handling Multicollinearity
Multicollinearity (collinearity for short) is an important problem in multiple linear regression analysis, and eliminating its harmful effects has long been a focus of regression analysis. Common methods for handling severe collinearity include ridge regression, principal component regression, stepwise regression, partial least squares, and Lasso regression. This paper compares these methods, discusses their respective advantages and disadvantages, and illustrates with worked examples how to choose an appropriate method for handling collinearity.

10.
This paper brings the Lasso variable selection method for multiple linear regression into the study of index tracking and stock index futures arbitrage strategies. We propose using the LARS algorithm to solve the nonnegativity-constrained Lasso selection of a spot portfolio, giving practitioners a new way to choose the stocks that make up the spot portfolio. Empirical results show that the spot portfolio obtained with the proposed method achieves a smaller tracking error than existing methods in the literature while holding fewer stocks. We also use the method to study cash-futures arbitrage in simulated trading on the CSI 300 index, obtaining results of significant market value.
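A minimal sketch of a nonnegativity-constrained, LARS-based Lasso selection of tracking weights, using synthetic return data (the return scales, true weights, and penalty level are illustrative assumptions, not the paper's setup):

```python
import numpy as np
from sklearn.linear_model import LassoLars

rng = np.random.default_rng(2)
n, p = 250, 30                                 # n days of returns, p candidate stocks
R = rng.standard_normal((n, p)) * 0.01         # hypothetical daily stock returns
w_true = np.zeros(p)
w_true[[0, 3, 7]] = [0.5, 0.3, 0.2]            # index driven by three stocks
index_ret = R @ w_true + rng.standard_normal(n) * 0.001

# LARS-based Lasso with a nonnegativity constraint on the portfolio weights.
fit = LassoLars(alpha=1e-5, positive=True).fit(R, index_ret)
w = fit.coef_
print(np.flatnonzero(w))   # a sparse, all-nonnegative spot portfolio
```

The `positive=True` option restricts the LARS path to nonnegative coefficients, matching the no-short-selling constraint on a spot portfolio.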

11.
This paper discusses parameter estimation for linear regression models with additional information in the setting of aggregated data. We propose an aggregate mixed estimator of the regression coefficients, study its relative efficiency with respect to the Peter-Karsten estimator and the least squares estimator, and obtain upper and lower bounds for the relative efficiency.

12.
We propose a procedure for constructing a sparse estimator of a multivariate regression coefficient matrix that accounts for correlation of the response variables. This method, which we call multivariate regression with covariance estimation (MRCE), involves penalized likelihood with simultaneous estimation of the regression coefficients and the covariance structure. An efficient optimization algorithm and a fast approximation are developed for computing MRCE. Using simulation studies, we show that the proposed method outperforms relevant competitors when the responses are highly correlated. We also apply the new method to a finance example on predicting asset returns. An R package containing this dataset and code for computing MRCE and its approximation is available online.

13.
Cross-validation (CV) is often used to select the regularization parameter in high-dimensional problems. However, when applied to the sparse modeling method Lasso, CV leads to models that are unstable in high dimensions, and consequently not suited for reliable interpretation. In this article, we propose a model-free criterion ESCV based on a new estimation stability (ES) metric and CV. ESCV finds a locally ES-optimal model smaller than the CV choice, so that it fits the data well while also enjoying the estimation stability property. We demonstrate that ESCV is an effective alternative to CV at a similar, easily parallelizable computational cost. In particular, we compare the two approaches with respect to several performance measures when applied to the Lasso on both simulated and real datasets. For the dependent predictors common in practice, our main finding is that ESCV cuts down false positive rates, often by a large margin, while sacrificing little of the true positive rate. ESCV usually outperforms CV in terms of parameter estimation while giving similar performance in terms of prediction. For the two real datasets from neuroscience and cell biology, the models found by ESCV are less than half the size of the CV models, yet preserve CV's predictive performance and are corroborated by subject knowledge and independent work. We also discuss some regularization parameter alignment issues that arise in both approaches. Supplementary materials are available online.
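A rough sketch of the estimation-stability idea, on synthetic data: refit the Lasso on CV training sets and measure how much the fitted signal varies across refits. This is an assumed simplification for illustration, not the paper's exact ESCV criterion:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

def es_metric(X, y, alpha, n_splits=5):
    """Variance of the fitted signal X @ beta_hat across CV training sets,
    normalized by the squared norm of the average fit (a sketch of the ES
    statistic that ESCV is built on)."""
    fits = []
    for train, _ in KFold(n_splits, shuffle=True, random_state=0).split(X):
        beta = Lasso(alpha=alpha).fit(X[train], y[train]).coef_
        fits.append(X @ beta)
    fits = np.array(fits)
    mean_fit = fits.mean(axis=0)
    return np.mean(np.sum((fits - mean_fit) ** 2, axis=1)) / np.sum(mean_fit ** 2)

rng = np.random.default_rng(3)
X = rng.standard_normal((120, 15))
y = 2 * X[:, 0] - X[:, 1] + rng.standard_normal(120) * 0.5
es = es_metric(X, y, alpha=0.05)
print(es)   # small values = estimates agree across training sets
```

ESCV then searches over the regularization path for a penalty that is locally optimal under this stability measure rather than under CV prediction error alone.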

14.
Exploiting the connection between the Lasso and support vector machines (SVM) and their respective strengths, this paper proposes three combined forecasting schemes based on the Lasso and SVM: serial, parallel, and embedded, and applies them to forecasting grain prices in China. Empirical results show that the serial and embedded combinations achieve higher forecasting accuracy than the individual methods.

15.
Finite mixture regression models are useful for modeling the relationship between response and predictors arising from different subpopulations. In this article, we study high-dimensional predictors and high-dimensional response and propose two procedures to cluster observations according to the link between predictors and the response. To reduce the dimension, we propose to use the Lasso estimator, which takes into account the sparsity, and a maximum likelihood estimator penalized by the rank, to take into account the matrix structure. To choose the number of components and the sparsity level, we construct a collection of models, varying those two parameters, and we select a model among this collection with a non-asymptotic criterion. We extend these procedures to functional data, where predictors and responses are functions. For this purpose, we use a wavelet-based approach. For each situation, we provide algorithms and apply and evaluate our methods both on simulated and real datasets, to understand how they work in practice.

16.
We present a version of Catoni's “progressive mixture estimator” [3] suited for a general regression framework. Following basically Catoni's steps, we derive strong non-asymptotic upper bounds for the Kullback-Leibler risk in this framework. We give a more explicit form for this bound when the models considered are regression trees, present a modified version of the estimator in an extended framework and propose an approximate computation using a Metropolis algorithm.

17.
Two New Relative Efficiencies of a Biased Estimator Versus the Least Squares Estimator
This paper examines a class of biased estimators of the regression coefficients in a linear regression model, compares them with the least squares estimator (LSE) under the mean squared error matrix criterion, and derives upper and lower bounds for two new relative efficiencies of this class of biased estimators with respect to the LSE.

18.
This paper studies the best linear unbiased estimator (BLUE) of the regression coefficients. Under a weighted balanced loss function, we derive the BLUE of the regression coefficients, propose relative efficiencies comparing the BLUE with the least squares estimator, and give their upper and lower bounds.

19.
In high-dimensional data settings where p ≫ n, many penalized regularization approaches have been studied for simultaneous variable selection and estimation. However, in the presence of covariates with weak effects, many existing variable selection methods, including Lasso and its variants, cannot distinguish covariates with weak contributions from those with none. Thus, prediction based only on a subset model of selected covariates can be inefficient. In this paper, we propose a post-selection shrinkage estimation strategy to improve the prediction performance of a selected subset model. Such a post-selection shrinkage estimator (PSE) is data adaptive and is constructed by shrinking a post-selection weighted ridge estimator in the direction of a selected candidate subset. Under an asymptotic distributional quadratic risk criterion, its prediction performance is explored analytically. We show that the proposed PSE performs better than the post-selection weighted ridge estimator. More importantly, it significantly improves the prediction performance of any candidate subset model selected by most existing Lasso-type variable selection methods. The relative performance of the PSE is demonstrated by both simulation studies and real-data analysis. Copyright © 2016 John Wiley & Sons, Ltd.

20.
In many biomedical studies, identifying effects of covariate interactions on survival is a major goal. Important examples are treatment–subgroup interactions in clinical trials, and gene–gene or gene–environment interactions in genomic studies. A common problem when implementing a variable selection algorithm in such settings is the requirement that the model must satisfy the strong heredity constraint, wherein an interaction may be included in the model only if the interaction’s component variables are included as main effects. We propose a modified Lasso method for the Cox regression model that adaptively selects important single covariates and pairwise interactions while enforcing the strong heredity constraint. The proposed method is based on a modified log partial likelihood including two adaptively weighted penalties, one for main effects and one for interactions. A two-dimensional tuning parameter for the penalties is determined by generalized cross-validation. Asymptotic properties are established, including consistency and rate of convergence, and it is shown that the proposed selection procedure has oracle properties, given proper choice of regularization parameters. Simulations illustrate that the proposed method performs reliably across a range of different scenarios.
