Similar literature
 20 similar articles found (search time: 281 ms)
1.
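Methods for Handling Multicollinearity   Total citations: 2, self-citations: 0, citations by others: 2
Multicollinearity, or collinearity for short, is an important problem in multiple linear regression analysis, and eliminating the harm it causes has long been a focus of regression analysis. Common methods for handling severe collinearity currently include ridge regression, principal component regression, stepwise regression, partial least squares, and Lasso regression. This paper compares these methods, describes their advantages and disadvantages, and works through an example to help in choosing an appropriate method for handling collinearity.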
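As a rough illustration of why a penalized method such as ridge regression helps under collinearity, the following minimal sketch fits two nearly identical predictors with and without a ridge penalty. It is not taken from the paper: the toy data, the helper names `solve_2x2` and `ridge_2d`, and the penalty value 1.0 are all assumptions chosen for illustration, and the model has no intercept.

```python
import math

def solve_2x2(a, b, c, d, e, f):
    """Solve [[a, b], [c, d]] @ [u, v] = [e, f] by Cramer's rule."""
    det = a * d - b * c
    return ((e * d - b * f) / det, (a * f - e * c) / det)

def ridge_2d(x1, x2, y, lam):
    """Ridge estimate for two predictors, no intercept: (X'X + lam*I)^(-1) X'y."""
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    t1 = sum(u * w for u, w in zip(x1, y))
    t2 = sum(v * w for v, w in zip(x2, y))
    return solve_2x2(s11 + lam, s12, s12, s22 + lam, t1, t2)

def norm(b):
    """Euclidean length of a two-component coefficient vector."""
    return math.hypot(b[0], b[1])

# Hypothetical toy data: x2 is almost a copy of x1, so the columns are nearly collinear,
# and y is roughly x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.01, 1.98, 3.02, 3.99, 5.01]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols = ridge_2d(x1, x2, y, 0.0)    # lam = 0 reduces to ordinary least squares
ridge = ridge_2d(x1, x2, y, 1.0)  # lam = 1.0 is an arbitrary illustrative choice

print("OLS coefficients:  ", ols)
print("ridge coefficients:", ridge)
print("coefficient norms: ", norm(ols), norm(ridge))
```

On this toy data the unpenalized fit assigns coefficients of opposite sign to the two near-duplicate predictors, while the penalized fit places both close to 1. In practice the penalty would be chosen by cross-validation or a similar criterion rather than fixed in advance.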

2.
We propose a method for estimating nonstationary spatial covariance functions by representing a spatial process as a linear combination of some local basis functions with uncorrelated random coefficients and some stationary processes, based on spatial data sampled in space with repeated measurements. By incorporating a large collection of local basis functions with various scales at various locations and stationary processes with various degrees of smoothness, the model is flexible enough to represent a wide variety of nonstationary spatial features. The covariance estimation and model selection are formulated as a regression problem with the sample covariances as the response and the covariances corresponding to the local basis functions and the stationary processes as the predictors. A constrained least squares approach is applied to select appropriate basis functions and stationary processes as well as estimate parameters simultaneously. In addition, a constrained generalized least squares approach is proposed to further account for the dependencies among the response variables. A simulation experiment shows that our method performs well in both covariance function estimation and spatial prediction. The methodology is applied to a U.S. precipitation dataset for illustration. Supplemental materials relating to the application are available online.

3.
Generalized linear models have been more widely used than linear models, which exclude categorical variables. Penalized methods have become an effective tool for studying ultrahigh-dimensional generalized linear models. In this paper, we study theoretical results for the adaptive Lasso in generalized linear models with a diverging number of parameters and ultrahigh dimensionality. The asymptotic results are examined in several simulation studies.

4.
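Varying-Coefficient Generalized Linear Models and Their Estimation   Total citations: 8, self-citations: 0, citations by others: 8
Building on the classical generalized linear model, this paper assumes that the coefficients of the regressors are arbitrary functions of points in some metric space, and thereby proposes a class of varying-coefficient generalized linear models with a broad range of applications. This increases the flexibility and adaptability of the model, which is also suitable for the statistical analysis of spatial data. Based on local weighted maximum likelihood estimation, the paper discusses fitting and statistical inference for varying-coefficient generalized linear models, together with the choice of the associated local weighting scheme and its smoothing parameter.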

5.
Non-Gaussian spatial data are common in many fields. When fitting regressions for such data, one needs to account for spatial dependence to ensure reliable inference for the regression coefficients. The two most commonly used regression models for spatially aggregated data are the automodel and the areal generalized linear mixed model (GLMM). These models induce spatial dependence in different ways but share the smoothing approach, which is intuitive but problematic. This article develops a new regression model for areal data. The new model is called copCAR because it is copula-based and employs the areal GLMM’s conditional autoregression (CAR). copCAR overcomes many of the drawbacks of the automodel and the areal GLMM. Specifically, copCAR (1) is flexible and intuitive, (2) permits positive spatial dependence for all types of data, (3) permits efficient computation, and (4) provides reliable spatial regression inference and information about dependence strength. An implementation is provided by R package copCAR, which is available from the Comprehensive R Archive Network, and supplementary materials are available online.

6.
Many least-squares problems involve affine equality and inequality constraints. Although there are a variety of methods for solving such problems, most statisticians find constrained estimation challenging. The current article proposes a new path-following algorithm for quadratic programming that replaces hard constraints by what are called exact penalties. Similar penalties arise in ℓ1 regularization in model selection. In the regularization setting, penalties encapsulate prior knowledge, and penalized parameter estimates represent a trade-off between the observed data and the prior knowledge. Classical penalty methods of optimization, such as the quadratic penalty method, solve a sequence of unconstrained problems that put greater and greater stress on meeting the constraints. In the limit as the penalty constant tends to ∞, one recovers the constrained solution. In the exact penalty method, squared penalties are replaced by absolute value penalties, and the solution is recovered for a finite value of the penalty constant. The exact path-following method starts at the unconstrained solution and follows the solution path as the penalty constant increases. In the process, the solution path hits, slides along, and exits from the various constraints. Path following in Lasso penalized regression, in contrast, starts with a large value of the penalty constant and works its way downward. In both settings, inspection of the entire solution path is revealing. Just as with the Lasso and generalized Lasso, it is possible to plot the effective degrees of freedom along the solution path. For a strictly convex quadratic program, the exact penalty algorithm can be framed entirely in terms of the sweep operator of regression analysis. A few well-chosen examples illustrate the mechanics and potential of path following. This article has supplementary materials available online.
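The contrast between quadratic and exact penalties described in this abstract can be seen on a one-dimensional toy problem: minimize (x - 2)^2 subject to x <= 1, whose constrained solution is x = 1. The sketch below is an illustration under that assumed problem, not the article's algorithm; the function names and the closed-form minimizers are derived here for this example only.

```python
def quadratic_penalty_solution(rho):
    """Minimizer of (x - 2)**2 + rho * max(0, x - 1)**2.

    The minimizer lies in x > 1, where stationarity
    2*(x - 2) + 2*rho*(x - 1) = 0 gives x = (2 + rho) / (1 + rho).
    This is greater than 1 for every finite rho.
    """
    return (2.0 + rho) / (1.0 + rho)

def exact_penalty_solution(rho):
    """Minimizer of (x - 2)**2 + rho * max(0, x - 1).

    For rho < 2 stationarity gives x = 2 - rho / 2, which is still infeasible.
    For rho >= 2 zero lies in the subdifferential at x = 1, so the
    constrained solution is reached exactly.
    """
    return max(1.0, 2.0 - rho / 2.0)

# Follow both solution paths as the penalty constant increases from 0.
for rho in [0.0, 1.0, 2.0, 10.0, 1000.0]:
    print(rho, quadratic_penalty_solution(rho), exact_penalty_solution(rho))
```

Both paths start at the unconstrained solution x = 2 when the penalty constant is 0. The absolute-value penalty reaches x = 1 at the finite value 2 and stays there, whereas the squared penalty only approaches 1 in the limit.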

7.
For a censored response variable against a projected covariate, a generalized linear model with an unknown link function can cover almost all existing models under censorship. Its special cases include the accelerated failure time model with censored data. Such a model in the uncensored case is called the single-index model in econometrics. In this paper, we systematically study the asymptotic properties. We derive the central limit theorem and the law of the iterated logarithm for an estimator of the direction parameter. We also obtain the optimal convergence rate of an estimator of the unknown link function in the model.

8.
We consider the problem of testing for a constant nonparametric effect in a general semiparametric regression model when there is a potential for interaction between the parametrically and nonparametrically modeled variables. The work was originally motivated by a unique testing problem in genetic epidemiology (Chatterjee et al., 2006) that involved a typical generalized linear model but with an additional term reminiscent of the Tukey 1-degree-of-freedom formulation, and their interest was in testing for main effects of the genetic variables, while gaining statistical power by allowing for a possible interaction between genes and the environment. Later work (Maity et al., 2009) involved the possibility of modeling the environmental variable nonparametrically, but they focused on whether there was a parametric main effect for the genetic variables. In this paper, we consider the complementary problem, where the interest is in testing for the main effect of the nonparametrically modeled environmental variable. We derive a generalized likelihood ratio test for this hypothesis, show how to implement it, and provide evidence that our method can improve statistical power when compared to standard partially linear models with main effects only. We use the method for the primary purpose of analyzing data from a case-control study of colorectal adenoma.

9.
The widespread availability of digital spatial data and the capabilities of Geographic Information Systems (GIS) make it possible to easily synthesize spatial data from a variety of sources. More often than not, data have been collected at different geographic scales, and each of the scales may be different from the one of interest. Geographic information systems effortlessly handle these types of problems through raster and geoprocessing operations based on proportional allocation and centroid smoothing techniques. However, these techniques do not provide a measure of uncertainty in the estimates and lack the ability to incorporate important covariate information that may be used to improve the estimates. They also often ignore the different spatial supports (e.g., shape and orientation) of the data. On the other hand, statistical solutions to change-of-support problems are rather specific and difficult to implement. In this article, we present a general geostatistical framework for linking geographic data from different sources. This framework incorporates aggregation and disaggregation of spatial data, as well as prediction problems involving overlapping geographic units. It explicitly incorporates the supports of the data, can adjust for covariate values measured on different spatial units at different scales, provides a measure of uncertainty for the resulting predictions, and is computationally feasible within a GIS. The new framework we develop also includes a new approach for simultaneous estimation of mean and covariance functions from aggregated data using generalized estimating equations.

10.
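Ding Yang (丁洋). 《中国科学:数学》 (Scientia Sinica Mathematica), 2012, 42(4): 353-360
The joint linear complexity of multisequences is an important measure of the security of word-based stream cipher systems. Using the one-to-one correspondence between m-fold multisequences with terms in F_q and single sequences with terms in F_{q^m}, Meidl and Özbudak defined the generalized joint linear complexity of a multisequence as the linear complexity of the corresponding single sequence. In this paper, we use constant field extensions of algebraic curves to study the generalized joint linear complexity of two classes of multisequences. Furthermore, we show that these two classes of multisequences have both high joint linear complexity and high generalized joint linear complexity.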

11.
In this paper we study the asymptotic properties of the adaptive Lasso estimate in high-dimensional sparse linear regression models with heteroscedastic errors. It is demonstrated that model selection properties and asymptotic normality of the selected parameters remain valid but with a suboptimal asymptotic variance. A weighted adaptive Lasso estimate is introduced and investigated. In particular, it is shown that the new estimate performs consistent model selection and that linear combinations of the estimates corresponding to the non-vanishing components are asymptotically normally distributed with a smaller variance than those obtained by the “classical” adaptive Lasso. The results are illustrated in a data example and by means of a small simulation study.
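For readers unfamiliar with the "classical" adaptive Lasso mentioned here, the following minimal sketch shows how its data-driven weights change the shrinkage. It assumes an orthonormal design (X'X = I) and the objective (1/2)||y - Xb||^2 + lam * sum_j w_j |b_j| with weights w_j = 1 / |z_j|^gamma, where z_j are the least squares estimates. Under those assumptions the solution is coordinatewise soft-thresholding. This is not the weighted variant proposed in the paper, and the function names and numbers are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_orthonormal(z, lam):
    """Plain Lasso under an orthonormal design: the same threshold lam for every coefficient."""
    return [soft_threshold(zj, lam) for zj in z]

def adaptive_lasso_orthonormal(z, lam, gamma=1.0):
    """Adaptive Lasso under an orthonormal design.

    Each coefficient gets its own threshold lam / |z_j|**gamma, so large
    initial estimates are shrunk less and small ones are shrunk more.
    """
    out = []
    for zj in z:
        if zj == 0.0:
            out.append(0.0)  # infinite weight: the coefficient stays at zero
        else:
            out.append(soft_threshold(zj, lam / abs(zj) ** gamma))
    return out

# Hypothetical least squares estimates: one strong signal and two weak ones.
z = [3.0, 0.5, -0.2]
lam = 0.5

lasso = lasso_orthonormal(z, lam)
adaptive = adaptive_lasso_orthonormal(z, lam)
print("Lasso:         ", lasso)
print("adaptive Lasso:", adaptive)
```

In this toy case both estimators set the two weak coefficients to zero, but the adaptive version shrinks the strong coefficient by lam / 3 instead of lam. That reduced shrinkage of large coefficients is the mechanism behind its better asymptotic behavior.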

12.
The iterative convex minorant (ICM) algorithm proposed by Groeneboom and Wellner is fast in computing the NPMLE of the distribution function for interval censored data without covariates. We reformulate the ICM as a generalized gradient projection method (GGP), which leads to a natural extension to the Cox model. It is also easily extended to support Tibshirani's Lasso method. Some simulation results are also shown. For illustration we reanalyze two real datasets.

13.
This article considers generalized partially linear models when the linear covariate is measured with additive error. We propose estimators of the parameter and the nonparametric function by using local linear regression, the SIMEX technique, and generalized estimating equations. The asymptotic normality of the estimators of the parameter, and the bias and variance of the estimators of the nonparametric component, are derived under appropriate assumptions. In addition, the generalization to clustered measurements is discussed. The approaches are applied to the analysis of data from the Framingham Heart Study. A simulation experiment is conducted for illustration.

14.
15.
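Generalized partially linear models extend both generalized linear models and partially linear models, and are a widely used class of semiparametric models. This paper discusses Bayesian estimation of the parameters and Bayes-factor-based model selection for this model when both the linear covariates and the response variable have data that are not missing at random. In the analysis, penalized splines are used to estimate the nonparametric component of the model, and a Bayesian hierarchical model is built. To address the poor mixing caused by highly correlated parameters during Gibbs sampling and the instability that arises as the dimension grows, latent variables are introduced as augmented data and a collapsed Gibbs sampling method is applied, which improves convergence. In addition, to avoid computing multiple integrals, the M-H algorithm is used to estimate the marginal density function, from which the Bayes factor is computed, providing a criterion for model selection and comparison. Finally, simulations and a real example verify the effectiveness of the proposed method.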

16.

We study the asymptotic properties of a new version of the Sparse Group Lasso estimator (SGL), called adaptive SGL. This new version includes two distinct regularization parameters, one for the Lasso penalty and one for the Group Lasso penalty, and we consider the adaptive version of this regularization, where both penalties are weighted by preliminary random coefficients. The asymptotic properties are established in a general framework, where the data are dependent and the loss function is convex. We prove that this estimator satisfies the oracle property: the sparsity-based estimator recovers the true underlying sparse model and is asymptotically normally distributed. We also study its asymptotic properties in a double-asymptotic framework, where the number of parameters diverges with the sample size. We show by simulations and on real data that the adaptive SGL outperforms other oracle-like methods in terms of estimation precision and variable selection.

17.
Penalized estimation has become an established tool for regularization and model selection in regression models. A variety of penalties with specific features are available, and effective algorithms for specific penalties have been proposed. But not much is available to fit models with a combination of different penalties. When modeling the rent data of Munich as in our application, various types of predictors call for a combination of a Ridge, a group Lasso and a Lasso-type penalty within one model. We propose to approximate penalties that are (semi-)norms of scalar linear transformations of the coefficient vector in generalized structured models, so that penalties of various kinds can be combined in one model. The approach is very general, such that the Lasso, the fused Lasso, the Ridge, the smoothly clipped absolute deviation penalty, the elastic net and many more penalties are embedded. The computation is based on conventional penalized iteratively re-weighted least squares algorithms and is hence easy to implement. New penalties can be incorporated quickly. The approach is extended to penalties with vector-based arguments. There are several possibilities for choosing the penalty parameter(s). A software implementation is available. Some illustrative examples show promising results.

18.
Making use of a linear operator, which is defined here by means of a Hadamard product (or convolution) involving the generalized hypergeometric function, the authors introduce and investigate the various properties and characteristics of two novel classes of meromorphically multivalent functions. They also apply the familiar concept of neighborhoods of analytic functions to these classes of meromorphically multivalent functions.

19.
We study the properties of the Lasso in the high-dimensional partially linear model where the number of variables in the linear part can be greater than the sample size. We use truncated series expansion based on polynomial splines to approximate the nonparametric component in this model. Under a sparsity assumption on the regression coefficients of the linear component and some regularity conditions, we derive the oracle inequalities for the prediction risk and the estimation error. We also provide sufficient conditions under which the Lasso estimator is selection consistent for the variables in the linear part of the model. In addition, we derive the rate of convergence of the estimator of the nonparametric function. We conduct simulation studies to evaluate the finite sample performance of variable selection and nonparametric function estimation.

20.
In this paper we discuss the asymptotic properties of quantile processes under random censoring. In contrast to most work in this area we prove weak convergence of an appropriately standardized quantile process under the assumption that the quantile regression model is only linear in the region, where the process is investigated. Additionally, we also discuss properties of the quantile process in sparse regression models including quantile processes obtained from the Lasso and adaptive Lasso. The results are derived by a combination of modern empirical process theory, classical martingale methods and a recent result of Kato (2009).
