20 similar documents found (search time: 15 ms)
1.
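The Lasso is a commonly used variable selection method in machine learning, suited to regression problems with sparsity. When the sample size is very large, or when massive data are stored on different machines, distributed computing is one of the important ways to reduce computation time and improve efficiency. Starting from an equivalent optimization formulation of the Lasso model, this paper applies the ADMM algorithm to this formulation, whose optimization variables are separable, constructs a distributed algorithm for Lasso variable selection, and proves...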
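As a rough illustration of the idea in this abstract, the sketch below runs the textbook global-consensus form of ADMM on a Lasso problem whose rows are split across several "machines". It is not the paper's algorithm, which the truncated abstract does not spell out. The function names, the penalty parameter `rho`, the iteration count, and the toy data are all assumptions made for the example.

```python
def soft_threshold(v, k):
    """Soft-thresholding operator S_k(v) = sign(v) * max(|v| - k, 0)."""
    if v > k:
        return v - k
    if v < -k:
        return v + k
    return 0.0


def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def consensus_admm_lasso(blocks, lam, rho=1.0, iters=500):
    """Minimize sum_i 0.5*||A_i x - b_i||^2 + lam*||x||_1.

    blocks is a list of (A_i, b_i) pairs, one per machine (hypothetical layout).
    Each machine keeps a local copy x_i; z is the shared consensus variable.
    """
    p = len(blocks[0][0][0])
    n_blocks = len(blocks)
    # Per-machine quantities that never change: A_i^T A_i + rho*I and A_i^T b_i.
    grams, atbs = [], []
    for A, b in blocks:
        grams.append([[sum(row[j] * row[k] for row in A) + (rho if j == k else 0.0)
                       for k in range(p)] for j in range(p)])
        atbs.append([sum(row[j] * bi for row, bi in zip(A, b)) for j in range(p)])
    z = [0.0] * p
    xs = [[0.0] * p for _ in range(n_blocks)]
    us = [[0.0] * p for _ in range(n_blocks)]  # scaled dual variables
    for _ in range(iters):
        # Local x-updates: independent ridge-type solves, one per machine.
        for i in range(n_blocks):
            rhs = [atbs[i][j] + rho * (z[j] - us[i][j]) for j in range(p)]
            xs[i] = solve(grams[i], rhs)
        # Global z-update: soft-threshold the average of x_i + u_i.
        z = [soft_threshold(sum(xs[i][j] + us[i][j] for i in range(n_blocks)) / n_blocks,
                            lam / (rho * n_blocks)) for j in range(p)]
        # Dual updates: accumulate each machine's disagreement with the consensus.
        for i in range(n_blocks):
            us[i] = [us[i][j] + xs[i][j] - z[j] for j in range(p)]
    return z


# Toy data: two machines, each holding one row of an identity design.
blocks = [([[1.0, 0.0]], [3.0]), ([[0.0, 1.0]], [0.5])]
print(consensus_admm_lasso(blocks, lam=1.0))
```

In this toy problem the stacked design is the identity, so the Lasso solution is the soft-thresholded response, (2, 0), which gives a simple check on the iteration. Only the x-updates touch the data, which is what makes the scheme suitable for data stored on separate machines.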
2.
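Two-stage estimation for semiparametric regression models with censored data   Total citations: 4, self-citations: 0, citations by others: 4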
邱瑾 《高校应用数学学报(A辑)》1998,13(3):281-288
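Consider the semiparametric regression model y_i = x_i′β + g(t_i) + e_i, 1 ≤ i ≤ n, where g is an unknown function on R^1 and β is a p×1 vector of parameters to be estimated. This paper considers the estimation of β and g when y_i is randomly censored. Exploiting the additivity of the model, the synthetic data method is used to obtain a two-stage estimator β̃_n^* of β and an estimator g_n^* of g, and their strong consistency is proved.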
3.
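Strong consistency of estimators in semiparametric regression models under NA samples   Total citations: 3, self-citations: 0, citations by others: 3
Consider the fixed-design semiparametric regression model y_i = x_i β + g(t_i) + e_i, i = 1, 2, ..., n. When {e_i} is an NA (negatively associated) sequence with E e_i = 0 and E e_i^2 = σ_i^2, strong consistency is obtained for a class of estimators.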
4.
5.
Ashley Petersen Daniela Witten Noah Simon 《Journal of computational and graphical statistics》2016,25(4):1005-1025
We consider the problem of predicting an outcome variable using p covariates that are measured on n independent observations, in a setting in which additive, flexible, and interpretable fits are desired. We propose the fused lasso additive model (FLAM), in which each additive function is estimated to be piecewise constant with a small number of adaptively chosen knots. FLAM is the solution to a convex optimization problem, for which a simple algorithm with guaranteed convergence to a global optimum is provided. FLAM is shown to be consistent in high dimensions, and an unbiased estimator of its degrees of freedom is proposed. We evaluate the performance of FLAM in a simulation study and on two datasets. Supplemental materials are available online, and the R package flam is available on CRAN.
6.
《Journal of computational and graphical statistics》2013,22(4):984-1006
The Lasso is a very well-known penalized regression model, which adds an L1 penalty with parameter λ1 on the coefficients to the squared error loss function. The Fused Lasso extends this model by also putting an L1 penalty with parameter λ2 on the difference of neighboring coefficients, assuming there is a natural ordering. In this article, we develop a path algorithm for solving the Fused Lasso Signal Approximator that computes the solutions for all values of λ1 and λ2. We also present an approximate algorithm that has considerable speed advantages for a moderate trade-off in accuracy. In the Online Supplement for this article, we provide proofs and further details for the methods developed in the article.
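For orientation, the objective described in words above can be written out as follows. This is the form in which the Fused Lasso Signal Approximator is usually stated; the abstract gives no formula, so the 1/2 scaling and the symbols y_i and β_i are assumptions of this sketch.

```latex
\hat{\beta}(\lambda_1, \lambda_2)
  = \arg\min_{\beta \in \mathbb{R}^n}
    \frac{1}{2} \sum_{i=1}^{n} (y_i - \beta_i)^2
    + \lambda_1 \sum_{i=1}^{n} |\beta_i|
    + \lambda_2 \sum_{i=2}^{n} |\beta_i - \beta_{i-1}|
```

Setting λ2 = 0 leaves only the first penalty, in which case each coordinate is obtained by soft-thresholding y_i at level λ1.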
7.
Luis Leon-Novelo 《Statistics & probability letters》2012,82(3):438-445
It is becoming more typical in regression problems today to have the situation where “p>n”, that is, where the number of covariates is greater than the number of observations. Approaches to this problem include such strategies as model selection and dimension reduction, and, of course, a Bayesian approach. However, the discrepancy between p and n can be so large, especially in genomic data, that examining the limiting case where p→∞ can be a relevant calculation. Here we look at the effect of a prior distribution on the coefficients, and in particular characterize the conditions under which, as p→∞, the prior does not overwhelm the data. Specifically, we find that the prior variance on the growing number of covariates must approach zero at rate 1/p, otherwise the prior will overwhelm the data and the posterior distribution of the regression coefficient will equal the prior distribution.
8.
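Some problems in linear regression diagnostics   Total citations: 3, self-citations: 0, citations by others: 3
This paper proposes several new models and methods for linear regression diagnostics. We study, for the first time, a mixed model that combines variance weighting and mean shift, and derive the corresponding diagnostic statistics. The paper also introduces a penalty function approach and uses it as a tool to discuss influence measures for several biased estimators. Finally, it proposes diagnostic statistics based on the centroid, which perform well in identifying outliers.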
9.
Oscar Hernan Madrid-Padilla James Scott 《Journal of computational and graphical statistics》2017,26(3):537-546
We present an approach for penalized tensor decomposition (PTD) that estimates smoothly varying latent factors in multiway data. This generalizes existing work on sparse tensor decomposition and penalized matrix decompositions, in a manner parallel to the generalized lasso for regression and smoothing problems. Our approach presents many nontrivial challenges at the intersection of modeling and computation, which are studied in detail. An efficient coordinate-wise optimization algorithm for PTD is presented, and its convergence properties are characterized. The method is applied both to simulated data and real data on flu hospitalizations in Texas and motion-capture data from video cameras. These results show that our penalized tensor decomposition can offer major improvements on existing methods for analyzing multiway data that exhibit smooth spatial or temporal features.
10.
Brian R. Gaines Juhyun Kim Hua Zhou 《Journal of computational and graphical statistics》2013,22(4):861-871
We compare alternative computing strategies for solving the constrained lasso problem. As its name suggests, the constrained lasso extends the widely used lasso to handle linear constraints, which allow the user to incorporate prior information into the model. In addition to quadratic programming, we employ the alternating direction method of multipliers (ADMM) and also derive an efficient solution path algorithm. Through both simulations and benchmark data examples, we compare the different algorithms and provide practical recommendations in terms of efficiency and accuracy for various sizes of data. We also show that, for an arbitrary penalty matrix, the generalized lasso can be transformed to a constrained lasso, while the converse is not true. Thus, our methods can also be used for estimating a generalized lasso, which has wide-ranging applications. Code for implementing the algorithms is freely available in both the Matlab toolbox SparseReg and the Julia package ConstrainedLasso. Supplementary materials for this article are available online.
11.
We consider selecting a regression model, using a variant of the general-to-specific algorithm in PcGets, when there are more variables than observations. We look at the special case where the variables are single impulse dummies, one defined for each observation. We show that this setting is unproblematic if tackled appropriately, and obtain the asymptotic distribution of the mean and variance in a location-scale model, under the null that no impulses matter. Monte Carlo simulations confirm the null distributions and suggest extensions to highly non-normal cases. An erratum to this article can be found at
12.
Nicholas A. Johnson 《Journal of computational and graphical statistics》2013,22(2):246-260
We propose a dynamic programming algorithm for the one-dimensional Fused Lasso Signal Approximator (FLSA). The proposed algorithm has a linear running time in the worst case. A similar approach is developed for the task of least squares segmentation, and simulations indicate substantial performance improvement over existing algorithms. Examples of R and C implementations are provided in the online Supplementary materials, posted on the journal web site.
13.
In the framework of supervised learning, we prove that the iterative algorithm introduced in Umanità and Villa (2010) [22] allows us to estimate in a consistent way the relevant features of the regression function under the a priori assumption that it admits a sparse representation on a fixed dictionary.
14.
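A comparative study of modeling methods for polynomial regression   Total citations: 18, self-citations: 0, citations by others: 18
In practice, when regression models are used to explain the relationship between explanatory and response variables, the explanatory variables are often related to one another through powers and products. In such cases a polynomial regression model is a reasonable choice. Because the explanatory variables in a polynomial regression model are strongly correlated, estimating the regression coefficients by ordinary least squares leads to large errors. To improve the predictive accuracy and reliability of polynomial regression models, this paper proposes modeling with principal component analysis and with partial least squares regression, and compares the two approaches on simulated data.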
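The collinearity that motivates this abstract is easy to see numerically. The short sketch below, with made-up data (the integers 1 to 10) and a hypothetical helper `pearson`, computes the sample correlation between successive polynomial terms. It only illustrates the problem the paper addresses and does not implement the paper's PCA or PLS procedures.

```python
import math


def pearson(u, v):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# Illustrative regressor values (an assumption, not data from the paper).
x = [float(i) for i in range(1, 11)]
x2 = [v ** 2 for v in x]
x3 = [v ** 3 for v in x]

r_x_x2 = pearson(x, x2)    # correlation between x and x^2
r_x2_x3 = pearson(x2, x3)  # correlation between x^2 and x^3
print(round(r_x_x2, 3), round(r_x2_x3, 3))
```

Both correlations come out above 0.95 on this grid, so the columns of the polynomial design matrix are nearly linearly dependent and the ordinary least squares coefficients become unstable.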
15.
16.
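Asymptotic properties of two-stage estimators in semiparametric regression models   Total citations: 6, self-citations: 0, citations by others: 6
Consider the semiparametric regression model Y = X′β + g(T) + e, where β ∈ R^p is an unknown parameter vector, g(t) is an unknown function defined on [0,1], and e is a random error. This paper studies the estimators β_n, g_n(t), and σ_n^2 of β, g(t), and σ^2. Under suitable conditions, their asymptotic normality is proved and the optimal convergence rate of g_n(t) is given.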
17.
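Under the penalized least squares criterion, an iterative method is applied to the semiparametric regression model L_i = A_i^T X + s(t_i) + Δ_i (i = 1, 2, ..., n) to obtain estimates of the parametric and nonparametric components. The feasibility of the method is then proved theoretically, and an upper bound on the error and the maximum number of iterations are given. Finally, a simulated example illustrates the effectiveness of the method.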
18.
The Lasso is a popular model selection and estimation procedure for linear models that enjoys nice theoretical properties. In this paper, we study the Lasso estimator for fitting autoregressive time series models. We adopt a double asymptotic framework where the maximal lag may increase with the sample size. We derive theoretical results establishing various types of consistency. In particular, we derive conditions under which the Lasso estimator for the autoregressive coefficients is model selection consistent, estimation consistent and prediction consistent. Simulation study results are reported.
19.
Wenjiang J. Fu 《Journal of computational and graphical statistics》2013,22(3):397-416
Bridge regression, a special family of penalized regressions with penalty function Σ|βj|^γ, γ ≥ 1, is considered. A general approach to solve for the bridge estimator is developed. A new algorithm for the lasso (γ = 1) is obtained by studying the structure of the bridge estimators. The shrinkage parameter γ and the tuning parameter λ are selected via generalized cross-validation (GCV). Comparison between the bridge model (γ ≥ 1) and several other shrinkage models, namely the ordinary least squares regression (λ = 0), the lasso (γ = 1) and ridge regression (γ = 2), is made through a simulation study. It is shown that the bridge regression performs well compared to the lasso and ridge regression. These methods are demonstrated through an analysis of prostate cancer data. Some computational advantages and limitations are discussed.
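Written out, the estimator this abstract refers to takes the following penalized least squares form. The abstract gives only the penalty term, so the squared error loss and the symbols y_i, x_i, n, and p are assumptions of this sketch.

```latex
\hat{\beta}(\lambda, \gamma)
  = \arg\min_{\beta \in \mathbb{R}^p}
    \sum_{i=1}^{n} \bigl( y_i - x_i^{\top} \beta \bigr)^2
    + \lambda \sum_{j=1}^{p} |\beta_j|^{\gamma}
```

The special cases named in the abstract follow directly: λ = 0 gives ordinary least squares, γ = 1 gives the lasso, and γ = 2 gives ridge regression.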
20.
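Consistency conditions for the BIC method of selecting explanatory variables in linear models, revisited   Total citations: 2, self-citations: 0, citations by others: 2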
孙道德 《高校应用数学学报(A辑)》1995,(1):26-33
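In many situations, for a linear regression model we are interested in selecting sufficiently many important predictors. This paper points out an error in the proof in [1] of the strong consistency of the well-known BIC variable selection method and gives a new set of strong consistency conditions. Under these conditions, we also prove that the BIC selection method is strongly consistent. The new conditions are both easy to verify and widely applicable.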