Similar articles
 Found 20 similar articles; search took 15 ms
1.
We study the properties of the Lasso in the high-dimensional partially linear model where the number of variables in the linear part can be greater than the sample size. We use truncated series expansion based on polynomial splines to approximate the nonparametric component in this model. Under a sparsity assumption on the regression coefficients of the linear component and some regularity conditions, we derive the oracle inequalities for the prediction risk and the estimation error. We also provide sufficient conditions under which the Lasso estimator is selection consistent for the variables in the linear part of the model. In addition, we derive the rate of convergence of the estimator of the nonparametric function. We conduct simulation studies to evaluate the finite sample performance of variable selection and nonparametric function estimation.

2.
Semiparametric linear transformation models have received much attention due to their high flexibility in modeling survival data. A useful estimating equation procedure was recently proposed by Chen et al. (2002) [21] for linear transformation models to jointly estimate parametric and nonparametric terms. They showed that this procedure can yield a consistent and robust estimator. However, the problem of variable selection for linear transformation models has been less studied, partially because a convenient loss function is not readily available in this context. In this paper, we propose a simple yet powerful approach to achieve both sparse and consistent estimation for linear transformation models. The main idea is to derive a profiled score from the estimating equation of Chen et al. [21], construct a loss function based on the profiled score and its variance, and then minimize the loss subject to some shrinkage penalty. Under regularity conditions, we show that the resulting estimator is consistent for both model estimation and variable selection. Furthermore, the estimated parametric terms are asymptotically normal and can achieve a higher efficiency than that yielded by the estimating equations. For computation, we suggest a one-step approximation algorithm which can take advantage of LARS and build the entire solution path efficiently. Performance of the new procedure is illustrated through numerous simulations and real examples, including a microarray data set.

3.
The varying-coefficient model is flexible and powerful for modeling the dynamic changes of regression coefficients. We study the problem of variable selection and estimation in this model in the sparse, high-dimensional case. We develop a concave group selection approach for this problem using basis function expansion and study its theoretical and empirical properties. We also apply the group Lasso for variable selection and estimation in this model and study its properties. Under appropriate conditions, we show that the group least absolute shrinkage and selection operator (Lasso) selects a model whose dimension is comparable to the underlying model, regardless of the large number of unimportant variables. In order to improve the selection results, we show that the group minimax concave penalty (MCP) has the oracle selection property in the sense that it correctly selects important variables with probability converging to one under suitable conditions. By comparison, the group Lasso does not have the oracle selection property. Both approaches are evaluated in simulation studies and demonstrated on a data example.
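The contrast between the group Lasso and the group MCP in the abstract above can be made concrete by evaluating the two penalties on the same coefficient groups. The following is only a sketch: the function names, the parameter `gamma`, and the choice not to weight groups by their size are assumptions made for illustration, not details taken from the paper.

```python
import math


def mcp(t, lam, gamma):
    """MCP value at t >= 0: lam*t - t^2/(2*gamma) up to gamma*lam, constant beyond."""
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def group_norm(group):
    """Euclidean norm of one coefficient group."""
    return math.sqrt(sum(b * b for b in group))


def group_mcp_penalty(groups, lam, gamma):
    """Apply the MCP to each group norm (assumption: no group-size weights)."""
    return sum(mcp(group_norm(g), lam, gamma) for g in groups)


def group_lasso_penalty(groups, lam):
    """Group Lasso penalty: lam times the sum of group norms."""
    return lam * sum(group_norm(g) for g in groups)
```

For a large group such as `[3.0, 4.0]` with `lam = 1.0` and `gamma = 3.0`, the group Lasso penalty keeps growing with the norm, while the MCP levels off at `gamma * lam**2 / 2`, so large coefficients are not shrunk further.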

4.
In this article we study the simultaneous estimation of the means in Poisson decomposable graphical models. We derive some classes of estimators which improve on the maximum likelihood estimator under the normalized squared losses. Our estimators are based on the argument in Chou [Simultaneous estimation in discrete multivariate exponential families, Ann. Statist. 19 (1991) 314-328.] and shrink the maximum likelihood estimator depending on the marginal frequencies of variables forming a complete subgraph of the conditional independence graph.

5.
Parameters of Gaussian multivariate models are often estimated using the maximum likelihood approach. In spite of its merits, this methodology is not practical when the sample size is very large, as, for example, in the case of massive georeferenced data sets. In this paper, we study the asymptotic properties of the estimators that minimize three alternatives to the likelihood function, designed to increase the computational efficiency. This is achieved by applying the information sandwich technique to expansions of the pseudo-likelihood functions as quadratic forms of independent normal random variables. Theoretical calculations are given for a first-order autoregressive time series and then extended to a two-dimensional autoregressive process on a lattice. We compare the efficiency of the three estimators to that of the maximum likelihood estimator as well as among themselves, using numerical calculations of the theoretical results and simulations.

6.

We study the asymptotic properties of a new version of the Sparse Group Lasso estimator (SGL), called adaptive SGL. This new version includes two distinct regularization parameters, one for the Lasso penalty and one for the Group Lasso penalty, and we consider the adaptive version of this regularization, where both penalties are weighted by preliminary random coefficients. The asymptotic properties are established in a general framework, where the data are dependent and the loss function is convex. We prove that this estimator satisfies the oracle property: the sparsity-based estimator recovers the true underlying sparse model and is asymptotically normally distributed. We also study its asymptotic properties in a double-asymptotic framework, where the number of parameters diverges with the sample size. We show by simulations and on real data that the adaptive SGL outperforms other oracle-like methods in terms of estimation precision and variable selection.
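To illustrate how the two penalties in the Sparse Group Lasso interact, the sketch below computes the proximal operator of the combined penalty for a single group: elementwise soft-thresholding for the Lasso part, followed by a shrinkage of the whole group for the Group Lasso part. It uses the plain, unweighted penalty; the adaptive version described above would rescale `lam_lasso` and `lam_group` by preliminary weights. All names here are illustrative assumptions.

```python
import math


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def sgl_prox(group, lam_lasso, lam_group):
    """Proximal operator of lam_lasso*||b||_1 + lam_group*||b||_2 for one group."""
    # Step 1: the Lasso part acts on each coordinate separately.
    shrunk = [soft_threshold(z, lam_lasso) for z in group]
    # Step 2: the Group Lasso part shrinks the whole vector, or removes it.
    norm = math.sqrt(sum(v * v for v in shrunk))
    if norm <= lam_group:
        return [0.0] * len(group)
    scale = 1.0 - lam_group / norm
    return [scale * v for v in shrunk]
```

With `sgl_prox([3.0, 0.5], 1.0, 1.0)` the second coordinate is removed by the Lasso step and the surviving coordinate is then shrunk by the group step, which shows sparsity both within and across groups.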


7.
In this paper, we carry out an in-depth theoretical investigation for inference with missing response and covariate data for general regression models. We assume that the missing data are missing at random (MAR) or missing completely at random (MCAR) throughout. Previous theoretical investigations in the literature have focused only on missing covariates or missing responses, but not both. Here, we consider theoretical properties of the estimates under three different estimation settings: complete case (CC) analysis, a complete response (CR) analysis that involves an analysis of those subjects with only completely observed responses, and the all case (AC) analysis, which is an analysis based on all of the cases. Under each scenario, we derive general expressions for the likelihood and devise estimation schemes based on the EM algorithm. We carry out a theoretical investigation of the three estimation methods in the normal linear model and analytically characterize the loss of information for each method, as well as derive and compare the asymptotic variances for each method assuming the missing data are MAR or MCAR. In addition, a theoretical investigation of bias for the CC method is also carried out. A simulation study and a real dataset are given to illustrate the methodology.

8.
In high‐dimensional data settings where p ≫ n, many penalized regularization approaches have been studied for simultaneous variable selection and estimation. However, in the presence of covariates with weak effects, many existing variable selection methods, including the Lasso and its generalizations, cannot distinguish covariates with weak contributions from those with none. Thus, prediction based only on a subset model of selected covariates can be inefficient. In this paper, we propose a post selection shrinkage estimation strategy to improve the prediction performance of a selected subset model. Such a post selection shrinkage estimator (PSE) is data adaptive and constructed by shrinking a post selection weighted ridge estimator in the direction of a selected candidate subset. Under an asymptotic distributional quadratic risk criterion, its prediction performance is explored analytically. We show that the proposed post selection PSE performs better than the post selection weighted ridge estimator. More importantly, it significantly improves the prediction performance of any candidate subset model selected by most existing Lasso‐type variable selection methods. The relative performance of the post selection PSE is demonstrated by both simulation studies and real‐data analysis. Copyright © 2016 John Wiley & Sons, Ltd.

9.
This article proposes a new approach to the robust estimation of a mixed autoregressive and moving average (ARMA) model. It is based on the indirect inference method that originally was proposed for models with an intractable likelihood function. The estimation algorithm proposed is based on an auxiliary autoregressive representation whose parameters are first estimated on the observed time series and then on data simulated from the ARMA model. To simulate data, the parameters of the ARMA model have to be set. By varying these we can minimize a distance between the simulation-based and the observation-based auxiliary estimate. The argument of the minimum then yields an estimator for the parameterization of the ARMA model. This simulation-based estimation procedure inherits the properties of the auxiliary model estimator. For instance, robustness is achieved with GM estimators. An essential feature of the introduced estimator, compared to existing robust estimators for ARMA models, is its theoretical tractability, which allows us to show consistency and asymptotic normality. Moreover, it is possible to characterize the influence function and the breakdown point of the estimator. In a small sample Monte Carlo study it is found that the new estimator performs fairly well when compared with existing procedures. Furthermore, with two real examples, we also compare the proposed inferential method with two different approaches based on outlier detection.
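The indirect inference loop described above (estimate an auxiliary autoregression on the observed series, simulate from the model for candidate parameters, and pick the candidate whose simulated auxiliary estimate is closest) can be sketched in a deliberately simplified setting. The assumptions are substantial and are mine, not the article's: the model is reduced to an MA(1), the auxiliary model is an AR(1) fitted by ordinary least squares rather than a GM estimator (so this sketch is not robust), and the minimization is a plain grid search. All function names are hypothetical.

```python
import random


def simulate_ma1(theta, shocks):
    """MA(1) path x_t = e_t + theta * e_{t-1} built from a fixed list of shocks."""
    return [shocks[t] + theta * shocks[t - 1] for t in range(1, len(shocks))]


def ar1_coefficient(x):
    """Least-squares AR(1) coefficient (lag-1 autocorrelation) of the series."""
    mean = sum(x) / len(x)
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, len(x)))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def indirect_inference_ma1(observed, n_sim=4000, seed=0, grid_size=99):
    """Grid-search theta so the simulated auxiliary estimate matches the observed one."""
    rng = random.Random(seed)
    # The same shocks are reused for every theta, so the distance varies smoothly in theta.
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n_sim + 1)]
    target = ar1_coefficient(observed)
    best_theta, best_dist = None, float("inf")
    for k in range(grid_size):
        theta = -0.98 + 1.96 * k / (grid_size - 1)
        dist = (ar1_coefficient(simulate_ma1(theta, shocks)) - target) ** 2
        if dist < best_dist:
            best_theta, best_dist = theta, dist
    return best_theta
```

Replacing `ar1_coefficient` with a robust auxiliary estimator is the step that, according to the abstract, would carry robustness over to the final estimate.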

10.
We consider a panel data semiparametric partially linear regression model with an unknown vector β of regression coefficients, an unknown nonparametric function g(·) for the nonlinear component, and unobservable serially correlated errors. The correlated errors are modeled by a vector autoregressive process which involves a constant intraclass correlation. Applying the pilot estimators of β and g(·), we construct estimators of the autoregressive coefficients, the intraclass correlation and the error variance, and investigate their asymptotic properties. Fitting the error structure results in a new semiparametric two-step estimator of β, which is shown to be asymptotically more efficient than the usual semiparametric least squares estimator in terms of asymptotic covariance matrix. Asymptotic normality of this new estimator is established, and a consistent estimator of its asymptotic covariance matrix is presented. Furthermore, a corresponding estimator of g(·) is also provided. These results can be used to make asymptotically efficient statistical inference. Some simulation studies are conducted to illustrate the finite sample performance of the proposed estimators.

11.
In this paper we consider a model for dependent censoring and derive a consistent asymptotically normal estimator for the underlying survival distribution from a sample of censored data. The methodology is illustrated with an application to the analysis of cancer data. Some simulations to evaluate the performance of our estimator are also presented. The results indicate that our estimator performs reasonably well in comparison to the other dependent censoring survival curve estimators.

12.
We consider the simultaneous linear minimax estimation problem in linear models with ellipsoidal constraints imposed on an unknown parameter. Using convex analysis, we derive necessary and sufficient optimality conditions for a matrix to define the linear minimax estimator. For certain regions of the set of characteristics of linear models and constraints, we exploit these optimality conditions and get explicit formulae for linear minimax estimators.

13.
This paper is concerned with the estimation of a fixed effects panel data partially linear regression model in which the idiosyncratic errors follow an autoregressive process. For fixed effects short time series panel data, the commonly used autoregressive error structure fitting method does not yield a consistent estimator of the autoregressive coefficients. Here we propose an alternative estimation method and show that the resulting estimator of the autoregressive coefficients is consistent and that the method works for an autoregressive error structure of any order. Moreover, combining the B-spline approximation, the profile least squares dummy variable (PLSDV) technique and the consistently estimated autoregressive error structure, we develop a weighted PLSDV estimator for the parametric component and a weighted B-spline series (BS) estimator for the nonparametric component. The weighted PLSDV estimator is shown to be asymptotically normal and asymptotically more efficient than the one which ignores the autoregressive error structure. In addition, this paper derives the asymptotic bias of the weighted BS estimator and establishes its asymptotic normality as well. Simulation studies and an example of application are conducted to illustrate the finite sample performance of the proposed procedures.

14.
The asymptotic distribution of the quasi-maximum likelihood (QML) estimator is established for generalized autoregressive conditional heteroskedastic (GARCH) processes, when the true parameter may have zero coefficients. This asymptotic distribution is the projection of a normal vector distribution onto a convex cone. The results are derived under mild conditions. For an important subclass of models, no moment condition is imposed on the GARCH process. The main practical implication of these results concerns the estimation of overidentified GARCH models.
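For readers unfamiliar with the QML criterion, the sketch below evaluates the Gaussian quasi-likelihood loss of a GARCH(1,1) model, which is the quantity a QML estimator minimizes over (omega, alpha, beta). Restricting to order (1,1) and the way the variance recursion is initialized are assumptions made for this illustration; the article covers general GARCH processes. Setting `alpha` or `beta` to zero corresponds to the boundary case the abstract is concerned with.

```python
import math


def garch11_qml_loss(eps, omega, alpha, beta):
    """Average Gaussian quasi-likelihood loss: mean of log(s2_t) + eps_t^2 / s2_t."""
    # Assumption: start the recursion at the unconditional variance when it
    # exists (alpha + beta < 1), otherwise at omega.
    s2 = omega / (1.0 - alpha - beta) if alpha + beta < 1.0 else omega
    total = 0.0
    for e in eps:
        total += math.log(s2) + e * e / s2
        # GARCH(1,1) recursion for the next conditional variance.
        s2 = omega + alpha * e * e + beta * s2
    return total / len(eps)
```

A QML estimate would be obtained by minimizing this function over `omega > 0`, `alpha >= 0`, `beta >= 0` with any constrained optimizer; that step is omitted here.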

15.
We study the distributions of the LASSO, SCAD, and thresholding estimators, in finite samples and in the large-sample limit. The asymptotic distributions are derived both for the case where the estimators are tuned to perform consistent model selection and for the case where the estimators are tuned to perform conservative model selection. Our findings complement those of Knight and Fu [K. Knight, W. Fu, Asymptotics for lasso-type estimators, Annals of Statistics 28 (2000) 1356–1378] and Fan and Li [J. Fan, R. Li, Variable selection via non-concave penalized likelihood and its oracle properties, Journal of the American Statistical Association 96 (2001) 1348–1360]. We show that the distributions are typically highly non-normal regardless of how the estimator is tuned, and that this property persists in large samples. The uniform convergence rate of these estimators is also obtained, and is shown to be slower than n^{-1/2} when the estimator is tuned to perform consistent model selection. An impossibility result regarding estimation of the estimators’ distribution function is also provided.
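The three estimators compared above have simple closed forms in the orthonormal-design case, where each acts componentwise on the least-squares estimate `z`. The sketch below writes out those rules; the orthonormal-design restriction is an assumption of this illustration, and `a = 3.7` is the default SCAD constant suggested by Fan and Li (2001). The jumps and flat regions of these functions are one way to see why the resulting sampling distributions are far from normal.

```python
def sign(z):
    """Return -1, 0 or 1 according to the sign of z."""
    return (z > 0) - (z < 0)


def hard_threshold(z, lam):
    """Keep z unchanged if |z| exceeds lam, otherwise set it to zero."""
    return z if abs(z) > lam else 0.0


def soft_threshold(z, lam):
    """LASSO rule under an orthonormal design: shrink |z| by lam, floor at zero."""
    return sign(z) * max(abs(z) - lam, 0.0)


def scad_threshold(z, lam, a=3.7):
    """SCAD rule: soft-thresholding near zero, no shrinkage for large |z|."""
    if abs(z) <= 2.0 * lam:
        return soft_threshold(z, lam)
    if abs(z) <= a * lam:
        # Linear interpolation between the soft rule and the identity.
        return ((a - 1.0) * z - sign(z) * a * lam) / (a - 2.0)
    return z
```

For example, with `lam = 1.0` an input of `5.0` is returned unchanged by the hard and SCAD rules but shrunk to `4.0` by the soft rule.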

16.
The censored single-index model provides a flexible way of modelling the association between a response and a set of predictor variables when the response variable is randomly censored and the link function is unknown. It presents a technique for “dimension reduction” in semiparametric censored regression models and generalizes the existing accelerated failure time models for survival analysis. This paper proposes two methods for estimation of single-index models with randomly censored samples. We first transform the censored data into synthetic data or pseudo-responses unbiasedly, then obtain estimates of the index coefficients by the rOPG or rMAVE procedures of Xia (2006) [1]. Finally, we estimate the unknown nonparametric link function using techniques for univariate censored nonparametric regression. The estimators for the index coefficients are shown to be root-n consistent and asymptotically normal. In addition, the estimator for the unknown regression function is a local linear kernel regression estimator and can be estimated with the same efficiency as if the parameters were known. Monte Carlo simulations are conducted to illustrate the proposed methodologies.

17.
In this paper we study the asymptotic properties of the adaptive Lasso estimate in high-dimensional sparse linear regression models with heteroscedastic errors. It is demonstrated that model selection properties and asymptotic normality of the selected parameters remain valid but with a suboptimal asymptotic variance. A weighted adaptive Lasso estimate is introduced and investigated. In particular, it is shown that the new estimate performs consistent model selection and that linear combinations of the estimates corresponding to the non-vanishing components are asymptotically normally distributed with a smaller variance than those obtained by the “classical” adaptive Lasso. The results are illustrated in a data example and by means of a small simulation study.

18.
19.
In this paper we introduce the least-trimmed squares estimator for multivariate regression. We give three equivalent formulations of the estimator and obtain its breakdown point. A fast algorithm for its computation is proposed. We prove Fisher-consistency at the multivariate regression model with elliptically symmetric error distribution and derive the influence function. Simulations investigate the finite-sample efficiency and robustness of the estimator. To increase the efficiency of the estimator, we also consider a one-step reweighted estimator.
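The idea behind least-trimmed squares is easiest to see in the simplest special case, a univariate location estimate: choose the subset of `h` observations with the smallest sum of squared deviations from its own mean and report that mean. The sketch below covers only this special case, not the multivariate regression estimator or the fast algorithm of the article; it relies on the fact that, in one dimension, the optimal subset is contiguous in the sorted sample. The function name is an assumption.

```python
def lts_location(values, h):
    """LTS location estimate: mean of the h-subset with the smallest sum of squares."""
    xs = sorted(values)
    best_mean, best_ss = None, float("inf")
    # In one dimension the best h-subset is a window of the sorted data.
    for start in range(len(xs) - h + 1):
        window = xs[start:start + h]
        mean = sum(window) / h
        ss = sum((v - mean) ** 2 for v in window)
        if ss < best_ss:
            best_mean, best_ss = mean, ss
    return best_mean
```

On the sample `[1.0, 1.1, 1.2, 5.0, 100.0]` with `h = 3`, the estimate stays close to 1.1 while the ordinary mean is pulled above 21 by the two outlying values.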

20.
The asymptotic distribution for the local linear estimator in nonparametric regression models is established under a general parametric error covariance with dependent and heterogeneously distributed regressors. A two-step estimation procedure that incorporates the parametric information in the error covariance matrix is proposed. Sufficient conditions for its asymptotic normality are given and its efficiency relative to the local linear estimator is established. We give examples of how our results are useful in some recently studied regression models. A Monte Carlo study confirms the asymptotic theory predictions and compares our estimator with some recently proposed alternative estimation procedures.
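As a reminder of what the baseline local linear estimator computes, the sketch below fits a kernel-weighted least-squares line around a point `x0` and returns its intercept. It ignores the error covariance structure entirely, so it corresponds to the first, unweighted estimator in the abstract and not to the proposed two-step procedure. The Gaussian kernel and the function name are assumptions of this illustration.

```python
import math


def local_linear(x, y, x0, bandwidth):
    """Local linear estimate of E[y | x = x0] with a Gaussian kernel."""
    w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    # Weighted moments of the centered regressor and the response.
    s0 = sum(w)
    s1 = sum(wi * (xi - x0) for wi, xi in zip(w, x))
    s2 = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * (xi - x0) * yi for wi, xi, yi in zip(w, x, y))
    # Intercept of the weighted least-squares line fitted around x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)
```

A useful sanity check is that the estimator reproduces exactly linear data: for `y = 2x + 1` it returns `2 * x0 + 1` at any `x0`, whatever the bandwidth.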
