Similar literature
 20 similar documents found (search time: 46 ms)
1.
Abstract

Proposed by Tibshirani, the least absolute shrinkage and selection operator (LASSO) estimates a vector of regression coefficients by minimizing the residual sum of squares subject to a constraint on the ℓ1-norm of the coefficient vector. The LASSO estimator typically has one or more zero elements and thus shares characteristics of both shrinkage estimation and variable selection. In this article we treat the LASSO as a convex programming problem and derive its dual. Consideration of the primal and dual problems together leads to important new insights into the characteristics of the LASSO estimator and to an improved method for estimating its covariance matrix. Using these results we also develop an efficient algorithm for computing LASSO estimates which is usable even in cases where the number of regressors exceeds the number of observations. An S-Plus library based on this algorithm is available from StatLib.
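
The shrinkage-and-selection behaviour described in the abstract above can be illustrated with a minimal sketch. It is not the primal-dual algorithm of the article: it assumes the penalized (Lagrangian) form of the problem, which corresponds to the constrained form for a matching bound, and solves it by plain coordinate descent with soft-thresholding. The function names `soft_threshold` and `lasso_coordinate_descent` and the parameter `lam` are hypothetical, chosen only for this illustration.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise 0.5 * sum_i (y_i - x_i . beta)^2 + lam * sum_j |beta_j|.

    X is a list of rows, y a list of responses (illustrative sketch only,
    no intercept and no standardisation of the columns).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Squared column norms, used to rescale each coordinate update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that ignores beta_j.
            rho = 0.0
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


# Orthogonal toy design: the large coefficient is shrunk, the small one is set to zero.
print(lasso_coordinate_descent([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0))
```

In this toy design the first coefficient is shrunk from 3 to 2 and the second is set exactly to zero, which is the mix of shrinkage and selection the abstract refers to.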

2.
Smart transportation technologies require real‐time traffic prediction to be both fast and scalable to full urban networks. We discuss a method that is able to meet this challenge while accounting for nonlinear traffic dynamics and space‐time dependencies of traffic variables. Nonlinearity is taken into account by a union of non‐overlapping linear regimes characterized by a sequence of temporal thresholds. In each regime, for each measurement location, a penalized estimation scheme, namely the adaptive least absolute shrinkage and selection operator (LASSO), is implemented to perform model selection and coefficient estimation simultaneously. Both outlier-robust least absolute deviation estimates and conventional LASSO estimates are considered. The methodology is illustrated on 5‐minute average speed data from three highway networks. Copyright © 2012 John Wiley & Sons, Ltd.

3.
In this paper, a self-weighted composite quantile regression estimation procedure is developed to estimate the unknown parameter in an infinite variance autoregressive (IVAR) model. The proposed estimator is asymptotically normal and more efficient than a single quantile regression estimator. At the same time, the adaptive least absolute shrinkage and selection operator (LASSO) is suggested for variable selection. We show that the adaptive LASSO based on the self-weighted composite quantile regression enjoys the oracle properties. Simulation studies and a real data example are conducted to examine the performance of the proposed approaches.

4.
Single-index models have found applications in econometrics and biometrics, where multidimensional regression models are often encountered. This article proposes a nonparametric estimation approach that combines wavelet methods for nonequispaced designs with Bayesian models. We consider a wavelet series expansion of the unknown regression function and set prior distributions for the wavelet coefficients and the other model parameters. To ensure model identifiability, the direction parameter is represented via its polar coordinates. We employ ad hoc hierarchical mixture priors that perform shrinkage on wavelet coefficients and use Markov chain Monte Carlo methods for a posteriori inference. We investigate an independence-type Metropolis-Hastings algorithm to produce samples for the direction parameter. Our method leads to simultaneous estimates of the link function and of the index parameters. We present results on both simulated and real data, where we look at comparisons with other methods.

5.
This paper introduces an estimation method based on Least Squares Support Vector Machines (LS-SVMs) for approximating time-varying as well as constant parameters in deterministic parameter-affine delay differential equations (DDEs). The proposed method reduces the parameter estimation problem to an algebraic optimization problem. Thus, as opposed to conventional approaches, it avoids iterative simulation of the given dynamical system and therefore a significant speedup can be achieved in the parameter estimation procedure. The solution obtained by the proposed approach can be further utilized for initialization of the conventional nonconvex optimization methods for parameter estimation of DDEs. Approximate LS-SVM based models for the state and its derivative are first estimated from the observed data. These estimates are then used for estimation of the unknown parameters of the model. Numerical results are presented and discussed for demonstrating the applicability of the proposed method.

6.
The non-parametric estimation of average causal effects in observational studies often relies on controlling for confounding covariates through smoothing regression methods such as kernel, splines or local polynomial regression. Such regression methods are tuned via smoothing parameters which regulate the number of degrees of freedom used in the fit. In this paper we propose data-driven methods for selecting smoothing parameters when the targeted parameter is an average causal effect. For this purpose, we propose to estimate the exact expression of the mean squared error of the estimators. Asymptotic approximations indicate that the smoothing parameters minimizing this mean squared error converge to zero faster than the optimal smoothing parameter for the estimation of the regression functions. In a simulation study we show that the proposed data-driven methods for selecting the smoothing parameters yield lower empirical mean squared error than other available methods such as cross-validation.

7.
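Semiparametric methods for estimating the long-memory parameter of financial time series rely mainly on frequency-domain analysis, in which bandwidth selection is an indispensable step. Different bandwidths can yield markedly different estimates of the long-memory parameter, and even contradictory conclusions, which in turn affects the assessment of the stationarity of the series. This paper proposes a two-step method for selecting the bandwidth in semiparametric long-memory estimation for financial time series and for then estimating the long-memory parameter. First, to overcome the tendency of semiparametric methods to ignore short-run structure, the short-memory structure of the ARFIMA(p,d,q) process is determined by an information criterion. Second, the short-memory model is fitted to the differenced series, and the bandwidth and the long-memory parameter estimate are chosen according to the quality of the fit. Numerical simulations show that, taking the smallest root mean squared error of the long-memory parameter estimate as the criterion, the two-step method outperforms other methods. In an empirical study of daily realized volatility of the SSE 50 Index, the two-step method gives the smallest forecast error among long-memory models; compared with short-memory models, it has an advantage at medium-term forecast horizons.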

8.
This paper investigates estimation in a class of single-index varying coefficient regression models when some covariates are contaminated with measurement errors. A bias-corrected least squares procedure based on the observed data is proposed. By replacing the nonparametric single index part with a local linear approximation, an iterative algorithm for estimating the index parameter is proposed. More importantly, a special case is identified in which the naive procedure provides consistent estimates for the single index parameters. Large sample properties of the proposed estimators are established. The finite sample performance of the proposed estimators is evaluated by simulation studies.

9.
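Since Box and Meyer first raised the problem of identifying and estimating dispersion effects in unreplicated factorial experiments, various methods for estimating dispersion effects (both iterative and non-iterative) have been proposed. In particular, Brenneman and Nair gave a review of these methods and verified that the modified Harvey (MH) method outperforms the others. In this paper, for the log-linear model, a non-iterative method for estimating dispersion effects is proposed, based on averaging the residuals of multiple location models at the model selection stage. In most of the simulated experimental models, the proposed method has a smaller mean squared error than the MH method, and it can be applied when absolute residuals are zero or small, a case in which the MH method is not applicable. We also consider the theoretical properties of the estimator and present a real-data analysis.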

10.
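This paper mainly considers the estimation of the parameter matrix in the growth curve model. First, based on the growth curve model after the Potthoff-Roy transformation, penalized least squares estimators of the parameter matrix are given using different penalty functions: the hard thresholding function, LASSO, ENET, the modified LASSO, and SACD. Next, for the untransformed growth curve model, the penalized least squares estimator is defined directly, and a numerical algorithm for computing it is given based on the Nelder-Mead method. Finally, the proposed estimation methods are assessed in a simulation study. The results show that the adaptive LASSO performs comparatively well in estimation.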

11.
In multinomial logit models, the identifiability of parameter estimates is typically obtained by side constraints that specify one of the response categories as reference category. When parameters are penalized, shrinkage of estimates should not depend on the reference category. In this paper we investigate ridge regression for the multinomial logit model with symmetric side constraints, which yields parameter estimates that are independent of the reference category. In simulation studies the results are compared with the usual maximum likelihood estimates and an application to real data is given.

12.
Partially linear models are a commonly used class of semiparametric models. This paper focuses on variable selection and parameter estimation for partially linear models via the adaptive LASSO method. First, based on profile least squares and the adaptive LASSO, the adaptive LASSO estimator for partially linear models is constructed, and the selection of the penalty parameter and the bandwidth is discussed. Under some regularity conditions, the consistency and asymptotic normality of the estimator are investigated, and it is proved that the adaptive LASSO estimator has the oracle properties. The proposed method can be easily implemented. Finally, a Monte Carlo simulation study is conducted to assess the finite sample performance of the proposed variable selection procedure; the results show that the adaptive LASSO estimator behaves well.
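
For orientation, the adaptive LASSO is commonly written in the following generic linear-model form. This is the standard textbook definition rather than the profile least squares version constructed in the paper above, and the notation (initial consistent estimator \tilde{\beta}, exponent \gamma, tuning parameter \lambda_n) is assumed here for illustration.

```latex
\hat{\beta}^{\mathrm{AL}}
  = \arg\min_{\beta}\;
    \sum_{i=1}^{n} \bigl(y_i - x_i^{\top}\beta\bigr)^2
    + \lambda_n \sum_{j=1}^{p} \hat{w}_j\,|\beta_j|,
\qquad
\hat{w}_j = \frac{1}{|\tilde{\beta}_j|^{\gamma}},\quad \gamma > 0 .
```

Coefficients with a small initial estimate receive a large weight and are penalized more heavily, which is the mechanism behind the oracle properties mentioned in several of the abstracts listed here.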

13.
Standard errors for the maximum likelihood estimates of the regression parameters in the logistic-proportional-hazards cure model are proposed using an approximate profile likelihood approach and a nonparametric likelihood. Two methods are given and are compared with the standard errors obtained from the inverse of the joint observed information matrix of the regression parameters and the nuisance hazard parameters. The observed information matrix is derived and is shown to be an approximation of the conditional information matrix of the regression parameters given the hazard parameters. Simulations indicate that the standard errors obtained from the inverse of the observed information matrix based on the profile likelihood and the full likelihood are comparable and appropriate. The coverage rates for the logistic regression parameter are generally good. The proportional hazards regression parameter shows reasonable coverage rates under ideal conditions but lower coverage rates when the incidence proportion is low or when censoring is heavy. The three methods are applied to a data set to investigate the effects of radiation therapy on tonsil cancer.

14.
For analyzing correlated binary data with high-dimensional covariates, we propose in this paper a two-stage shrinkage approach. First, we construct a weighted least-squares (WLS) type function using a special weighting scheme on the non-conservative vector field of the generalized estimating equations (GEE) model. Second, we define a penalized WLS in the spirit of the adaptive LASSO for simultaneous variable selection and parameter estimation. The proposed procedure enjoys the oracle properties in a high-dimensional framework where the number of parameters grows to infinity with the number of clusters. Moreover, we prove the consistency of the sandwich formula of the covariance matrix even when the working correlation matrix is misspecified. For the selection of the tuning parameter, we develop a consistent penalized quadratic form (PQF) function criterion. The performance of the proposed method is assessed through a comparison with the existing methods and through an application to a crossover trial in a pain relief study.

15.
This paper deals with classical and Bayesian estimation for the two-parameter exponential distribution, with scale and location parameters, under randomly censored data. The censoring time is also assumed to follow a two-parameter exponential distribution with a different scale but the same location parameter. The main emphasis of this paper is on the location parameter, which has not yet been studied under random censoring in the literature. Fitting and using the exponential distribution on the range \((0, \infty )\), especially when the minimum observation in the data set is significantly large, will give estimates far from accurate. First we obtain the maximum likelihood estimates of the unknown parameters with their variances and asymptotic confidence intervals. Some other classical methods of estimation such as the method of moments, L-moments and least squares are also employed. Next, we discuss the Bayesian estimation of the unknown parameters using Gibbs sampling procedures under the generalized entropy loss function with inverted gamma priors and Highest Posterior Density credible intervals. We also consider some reliability and experimental characteristics and their estimates. A Monte Carlo simulation study is performed to compare the proposed estimates. Two real data examples are given to illustrate the importance of the location parameter.

16.
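This paper studies adaptive LASSO (least absolute shrinkage and selection operator) variable selection and coefficient estimation for measurement error models. First, adaptive LASSO parameter estimators are given for the linear model and the partially linear model when the covariates are measured with error; the asymptotic properties of the estimators are studied under some regularity conditions, and it is proved that, with a suitably chosen tuning parameter, the adaptive LASSO estimators have the oracle property. Second, the algorithm for computing the estimators and the selection of the penalty and smoothing parameters are discussed. Finally, the performance of the adaptive LASSO variable selection method is examined through simulations and a real-data analysis; the results show that both variable selection and parameter estimation perform well.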

17.
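To address the skewness found in much real-world data, a mode regression model for skew-normal data is constructed. Because missing data also occur frequently, imputation methods are used to handle the incomplete data set; to compare imputation performance, statistical inference is studied for the case where the response variable is missing at random. Maximum likelihood estimates of the parameters of the mode regression model are obtained by Gauss-Newton iteration, and the model is compared under three imputation schemes: mean imputation, regression imputation, and mode imputation. Simulations and a real-data analysis ...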

18.
The MAPK pathway is one of the best-known systems in oncogene research on eukaryotes because of its important role in cell life. In this study, we perform parameter estimation for a realistic MAPK system using western blotting data. For inference, we use the modified diffusion bridge algorithm with a data augmentation technique, modelling the realistically complex system via the Euler–Maruyama approximation. This approximation, which is the discretized version of the diffusion model, can be seen as an alternative OR approach to the (hidden) Markov chain method in stochastic modelling of biochemical systems where the data can be fully or partially observed and the time-course measurements are thought to be collected at small time steps. The modified diffusion bridge technique, which is based on Markov Chain Monte Carlo (MCMC) methods, enables us to accurately estimate the model parameters, presented as the stochastic reaction rate constants of the diffusion model, in high dimensional systems despite the loss in computational efficiency. In estimating the parameters, we face dependency challenges caused by the complexity of the decision-making problems in the MCMC updates at different stages. We resolve them by checking the singularity of the system at every stage of the updates. In modelling, we also consider approaches with and without measurement error in all states. To evaluate the performance of both models, we first implement them in a toy system. From the results, we observe that the model with measurement error performs better than the model without measurement error in terms of the mixing of the MCMC runs and the accuracy of the estimates; it is therefore used for the parameter estimation of the realistic MAPK pathway. From the outcomes, we consider that the suggested approach can be seen as a promising alternative method for inference of parameters via different OR techniques in systems biology.
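
The Euler–Maruyama approximation mentioned in the abstract above can be sketched for a scalar diffusion as follows. This shows only the discretization step, not the modified diffusion bridge or the MCMC scheme of the study; the function name `euler_maruyama`, its arguments, and the example drift and diffusion functions are hypothetical.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng=None):
    """Simulate one path of dX = drift(X) dt + diffusion(X) dW on a fixed time grid."""
    rng = rng or random.Random()
    x = x0
    path = [x0]
    for _ in range(n_steps):
        # Brownian increment over a step of length dt: N(0, dt).
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


# Illustrative mean-reverting process with constant noise (assumed example).
path = euler_maruyama(lambda x: -0.5 * x, lambda x: 0.3,
                      x0=1.0, dt=0.01, n_steps=1000, rng=random.Random(0))
print(len(path))
```

Setting the diffusion term to zero reduces the scheme to the ordinary explicit Euler method, which gives a simple way to sanity-check an implementation.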

19.
Bayesian networks with mixtures of truncated exponentials (MTEs) support efficient inference algorithms and provide a flexible way of modeling hybrid domains (domains containing both discrete and continuous variables). On the other hand, estimating an MTE from data has turned out to be a difficult task, and most prevalent learning methods treat parameter estimation as a regression problem. The drawback of this approach is that by not directly attempting to find the parameter estimates that maximize the likelihood, there is no principled way of performing subsequent model selection using those parameter estimates. In this paper we describe an estimation method that directly aims at learning the parameters of an MTE potential following a maximum likelihood approach. Empirical results demonstrate that the proposed method yields significantly better likelihood results than existing regression-based methods. We also show how model selection, which in the case of univariate MTEs amounts to partitioning the domain and selecting the number of exponential terms, can be performed using the BIC score.

20.
This article deals with non-linear model parameter estimation from experimental data. Because a rigorous identifiability analysis is difficult to perform for non-linear models, parameter estimation is performed in such a way that uncertainty in the estimated parameter values is represented by the range of model use results when the model is used for a certain purpose. Using this approach, the article presents a simulation study where the objective is to discover whether the estimation of model parameters can be improved, so that a small enough range of model use results is obtained. The results of the study indicate that from plant measurements available for the estimation of model parameters, it is possible to extract data that are important for the estimation of model parameters relative to a certain model use. If these data are improved by a proper measurement campaign (e.g. proper choice of measured variables, better accuracy, higher measurement frequency), it is to be expected that a valid model for a certain model use will be obtained. The simulation study is performed for an activated sludge model from wastewater treatment, while the estimation of model parameters is done by Monte Carlo simulation.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号