Similar literature
 20 similar documents found.
1.
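This paper uses counting-process techniques and the von Mises method to study the large-sample properties of the bootstrap for the Cox regression model for censored survival data with time-varying covariates. It shows that, under some regularity conditions, applying the bootstrap to this model is valid: the bootstrap processes of the partial maximum likelihood estimator of the regression coefficients and of the nonparametric maximum likelihood estimator of the baseline hazard rate are consistent.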

2.
This article considers Markov chain computational methods for incorporating uncertainty about the dimension of a parameter when performing inference within a Bayesian setting. A general class of methods is proposed for performing such computations, based upon a product space representation of the problem which is similar to that of Carlin and Chib. It is shown that all of the existing algorithms for incorporation of model uncertainty into Markov chain Monte Carlo (MCMC) can be derived as special cases of this general class of methods. In particular, we show that the popular reversible jump method is obtained when a special form of Metropolis–Hastings (M–H) algorithm is applied to the product space. Furthermore, the Gibbs sampling method and the variable selection method are shown to derive straightforwardly from the general framework. We believe that these new relationships between methods, which were until now seen as diverse procedures, are an important aid to the understanding of MCMC model selection procedures and may assist in the future development of improved procedures. Our discussion also sheds some light upon the important issues of “pseudo-prior” selection in the case of the Carlin and Chib sampler and choice of proposal distribution in the case of reversible jump. Finally, we propose efficient reversible jump proposal schemes that take advantage of any analytic structure that may be present in the model. These proposal schemes are compared with a standard reversible jump scheme for the problem of model order uncertainty in autoregressive time series, demonstrating the improvements which can be achieved through careful choice of proposals.

3.
Regression density estimation is the problem of flexibly estimating a response distribution as a function of covariates. An important approach to regression density estimation uses finite mixture models and our article considers flexible mixtures of heteroscedastic regression (MHR) models where the response distribution is a normal mixture, with the component means, variances, and mixture weights all varying as a function of covariates. Our article develops fast variational approximation (VA) methods for inference. Our motivation is that alternative computationally intensive Markov chain Monte Carlo (MCMC) methods for fitting mixture models are difficult to apply when it is desired to fit models repeatedly in exploratory analysis and model choice. Our article makes three contributions. First, a VA for MHR models is described where the variational lower bound is in closed form. Second, the basic approximation can be improved by using stochastic approximation (SA) methods to perturb the initial solution to attain higher accuracy. Third, the advantages of our approach for model choice and evaluation compared with MCMC-based approaches are illustrated. These advantages are particularly compelling for time series data where repeated refitting for one-step-ahead prediction in model choice and diagnostics and in rolling-window computations is very common. Supplementary materials for the article are available online.

4.
This paper is devoted to nonparametric estimation, through the -risk, of a regression function based on observations with spherically symmetric errors, which are dependent random variables (except in the normal case). We apply a model selection approach using improved estimates. In a nonasymptotic setting, an upper bound for the risk is obtained (oracle inequality). Moreover, asymptotic properties are given, such as upper and lower bounds for the risk, which provide the optimal rate of convergence for penalized estimators.

5.
We propose an objective Bayesian approach to the selection of covariates and their penalized spline transformations in generalized additive models. The methodology is based on a combination of continuous mixtures of g-priors for model parameters and a multiplicity-correction prior for the models themselves. We introduce our approach in the normal model and extend it to nonnormal exponential families. A simulation study and an application with a binary outcome are provided. An efficient implementation is available in the R package hypergsplines. Supplementary materials for this article are available online.

6.
Markov chain Monte Carlo (MCMC) is nowadays a standard approach to numerical computation of integrals of the posterior density π of the parameter vector η. Unfortunately, Bayesian inference using MCMC is computationally intractable when the posterior density π is expensive to evaluate. In many such problems, it is possible to identify a minimal subvector β of η responsible for the expensive computation in the evaluation of π. We propose two approaches, DOSKA and INDA, that approximate π by interpolation in ways that exploit this computational structure to mitigate the curse of dimensionality. DOSKA interpolates π directly while INDA interpolates π indirectly by interpolating functions, for example, a regression function, upon which π depends. Our primary contribution is derivation of a Gaussian processes interpolant that provably improves over some of the existing approaches by reducing the effective dimension of the interpolation problem from dim(η) to dim(β). This allows a dramatic reduction of the number of expensive evaluations necessary to construct an accurate approximation of π when dim(η) is high but dim(β) is low.

We illustrate the proposed approaches in a case study for a spatio-temporal linear model for air pollution data in the greater Boston area.

Supplemental materials include proofs, details, and software implementation of the proposed procedures.

7.
Many problems in genomics are related to variable selection where high-dimensional genomic data are treated as covariates. Such genomic covariates often have certain structures and can be represented as vertices of an undirected graph. Biological processes also vary as functions depending upon some biological state, such as time. High-dimensional variable selection where covariates are graph-structured and the underlying model is nonparametric presents an important but largely unaddressed statistical challenge. Motivated by the problem of regression-based motif discovery, we consider the problem of variable selection for high-dimensional nonparametric varying-coefficient models and introduce a sparse structured shrinkage (SSS) estimator based on basis function expansions and a novel smoothed penalty function. We present an efficient algorithm for computing the SSS estimator. Results on model selection consistency and estimation bounds are derived. Moreover, finite-sample performances are studied via simulations, and the effects of high-dimensionality and structural information of the covariates are especially highlighted. We apply our method to a motif-finding problem using a yeast cell-cycle gene expression dataset and word counts in genes’ promoter sequences. Our results demonstrate that the proposed method can result in better variable selection and prediction for high-dimensional regression when the underlying model is nonparametric and covariates are structured. Supplemental materials for the article are available online.

8.
Multivariate adaptive regression splines (MARS) is a popular nonparametric regression tool often used for prediction and for uncovering important data patterns between the response and predictor variables. The standard MARS algorithm assumes responses are normally distributed and independent, but in this article we relax both of these assumptions by extending MARS to generalized estimating equations. We refer to this MARS-for-GEEs algorithm as “MARGE.” Our algorithm makes use of fast forward selection techniques, such that in the univariate case, MARGE has similar computation speed to a standard MARS implementation. Through simulation we show that the proposed algorithm has better predictive performance than the original MARS algorithm when using correlated and/or nonnormal response data. MARGE is also competitive with alternatives in the literature, especially for problems with multiple interacting predictors. We apply MARGE to various ecological examples with different data types. Supplementary material for this article is available online.

9.
We consider Bayesian nonparametric regression through random partition models. Our approach involves the construction of a covariate-dependent prior distribution on partitions of individuals. Our goal is to use covariate information to improve predictive inference. To do so, we propose a prior on partitions based on the Potts clustering model associated with the observed covariates. Covariate proximity thus drives both the formation of clusters and the prior predictive distribution. The resulting prior model is flexible enough to support many different types of likelihood models. We focus the discussion on nonparametric regression. Implementation details are discussed for the specific case of multivariate multiple linear regression. The proposed model performs well in terms of model fitting and prediction when compared to alternative nonparametric regression approaches. We illustrate the methodology with an application to the health status of nations at the turn of the 21st century. Supplementary materials are available online.

10.
The calculation of nonparametric quantile regression curve estimates is often computationally intensive, as typically an expensive nonlinear optimization problem is involved. This article proposes a fast and easy-to-implement method for computing such estimates. The main idea is to approximate the costly nonlinear optimization by a sequence of well-studied penalized least squares-type nonparametric mean regression estimation problems. The new method can be paired with different nonparametric smoothing methods and can also be applied to higher dimensional settings. Therefore, it provides a unified framework for computing different types of nonparametric quantile regression estimates, and it also greatly broadens the scope of the applicability of quantile regression methodology. This wide applicability and the practical performance of the proposed method are illustrated with smoothing spline and wavelet curve estimators, for both uni- and bivariate settings. Results from numerical experiments suggest that estimates obtained from the proposed method are superior to many competitors. This article has supplementary material online.
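The abstract does not state the objective being optimized. Quantile regression is conventionally defined through the check (pinball) loss, which is non-smooth at zero and is one reason direct optimization is costly. The sketch below only illustrates that loss in the simplest setting with no covariates, where minimizing it recovers a sample quantile; it is not the article's least-squares approximation scheme, and the function names are hypothetical.

```python
def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_by_check_loss(ys, tau):
    """Return an observed value that minimises the total check loss.

    A minimiser of sum_i rho_tau(y_i - q) over q can always be found
    among the observed values, so scanning them is sufficient.
    """
    return min(sorted(ys), key=lambda q: sum(check_loss(y - q, tau) for y in ys))


# Toy data (made up for illustration): one large outlier.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
median = quantile_by_check_loss(sample, 0.5)
upper = quantile_by_check_loss(sample, 0.9)
```

With `tau = 0.5` the loss is half the absolute error, so the minimizer is the median and the outlier has no pull on it.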

11.
Geographic information systems (GIS) organize spatial data in multiple two-dimensional arrays called layers. In many applications, a response of interest is observed on a set of sites in the landscape, and it is of interest to build a regression model from the GIS layers to predict the response at unsampled sites. Model selection in this context then consists not only of selecting appropriate layers, but also of choosing appropriate neighborhoods within those layers. We formalize this problem as a linear model and propose the use of Lasso to simultaneously select variables, choose neighborhoods, and estimate parameters. Spatially dependent errors are accounted for using generalized least squares and spatial smoothness in selected coefficients is incorporated through use of an a priori spatial covariance structure. This leads to a modification of the Lasso procedure, called spatial Lasso. The spatial Lasso can be implemented by a fast algorithm and it performs well in numerical examples, including an application to prediction of soil moisture. The methodology is also extended to generalized linear models. Supplemental materials including R computer code and data analyzed in this article are available online.

12.
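For the problem of estimating the dispersion parameter in negative binomial regression models with covariates, the maximum likelihood and bootstrap maximum likelihood estimation methods are extended, and the quality of the resulting estimators is studied in terms of absolute deviation through simulation studies and real-data analysis. The results show that both the covariates and the sample size affect the estimation of the dispersion parameter.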

13.
This article is concerned with multivariate density estimation. We discuss deficiencies in two popular multivariate density estimators, mixture and copula estimators, and propose a new class of estimators that combines the advantages of both mixture and copula modeling, while being more robust to their weaknesses. Our method adapts any multivariate density estimator using information obtained by separately estimating the marginals. We propose two marginally adapted estimators based on a multivariate mixture of normals and a mixture of factor analyzers estimators. These estimators are implemented using computationally efficient split-and-elimination variational Bayes algorithms. It is shown through simulation and real-data examples that the marginally adapted estimators are capable of improving on their original estimators and compare favorably with other existing methods. Supplementary materials for this article are available online.

14.
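Missing data with missing covariates is one of the active topics in modern statistical analysis. The problem becomes even harder to handle when the data are also heavy-tailed, skewed, and heteroscedastic. To address this, the paper proposes an inverse-probability-weighted quantile regression estimator for studying the relationship between the response and the covariates. It has clear advantages over classical estimation methods: on the one hand, the estimator uses all available data and allows the missing covariates to be highly correlated with the response; on the other hand, the estimator is consistent and asymptotically normal at all quantile levels. Simulations verify the effectiveness of the method in finite samples, and the method is further extended to linear multiple regression models and nonparametric regression models.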
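The weighting idea can be illustrated in the simplest possible case. The sketch below is an intercept-only toy, not the paper's estimator: each complete case is weighted by the inverse of its probability of being complete, and the weighted quantile of the responses is returned. The function name, the argument names, and the assumption that the completeness probabilities are known are all hypothetical.

```python
def ipw_quantile(ys, complete, probs, tau):
    """Weighted tau-quantile of the responses over complete cases only.

    ys       -- responses for all cases
    complete -- True where the case has no missing covariates
    probs    -- assumed-known probability that each case is complete
    Each complete case gets weight 1 / prob, so cases that resemble
    frequently incomplete ones count for more.
    """
    cases = sorted((y, 1.0 / p) for y, c, p in zip(ys, complete, probs) if c)
    total = sum(w for _, w in cases)
    cumulative = 0.0
    for y, w in cases:
        cumulative += w
        # Smallest response whose weighted CDF reaches tau.
        if cumulative >= tau * total:
            return y
    return cases[-1][0]


# Toy data (made up): the last case is incomplete and is dropped.
ys = [1.0, 2.0, 3.0, 10.0]
complete = [True, True, True, False]
probs = [0.25, 0.5, 1.0, 0.5]
weighted_median = ipw_quantile(ys, complete, probs, 0.5)
```

Here the complete cases carry weights 4, 2, and 1, so the weighted median moves to the first response, whereas the unweighted median of the complete cases would be the second.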

15.
Model selection algorithms are required to efficiently traverse the space of models. In problems with high-dimensional and possibly correlated covariates, efficient exploration of the model space becomes a challenge. To overcome this, a multiset is placed on the model space to enable efficient exploration of multiple model modes with minimal tuning. The multiset model selection (MSMS) framework is based on independent priors for the parameters and model indicators on variables. Posterior model probabilities can be easily obtained from multiset averaged posterior model probabilities in MSMS. The effectiveness of MSMS is demonstrated for linear and generalized linear models. Supplementary material for this article is available online.

16.
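Compared with traditional media marketing, search engine advertising has been hugely successful because of its precision and low cost. However, existing click-through-rate (CTR) models for search engine advertising cannot effectively handle large data volumes and high feature dimensionality, which greatly reduces prediction accuracy. This paper builds a CTR prediction model based on the LASSO variable selection method, which effectively overcomes the shortcomings of existing CTR models in handling high-dimensional and sparse data. The model is validated on bidding data from a company; the results show that the key factors affecting CTR are the trademark information and regional information in the ad keywords and the cost per click. The findings provide a theoretical basis for enterprises formulating search engine advertising strategies.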
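How LASSO performs variable selection can be seen in a small example. The sketch below is a generic cyclic coordinate-descent solver for the linear LASSO objective, written with the standard library only; it is not the paper's CTR model, and the function names and toy data are hypothetical. Coefficients whose correlation with the residual falls below the penalty `lam` are set exactly to zero, which is what removes variables from the model.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of n rows with p features each; y is a list of n responses.
    No intercept is fitted, so centre the data beforehand if needed.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta because beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature can never enter the model
            # Correlation of feature j with the residual that excludes beta[j].
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy orthogonal design (made up): feature 0 is strong, feature 1 is weak.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
beta = lasso_coordinate_descent(X, y, lam=0.1)
```

On this toy design the strong coefficient is shrunk below its least-squares value of 2 and the weak one is dropped entirely.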

17.
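For the three-segment linear regression model, this paper takes a Bayesian viewpoint and gives the marginal posterior distributions of the change points and the parameters, the conditional posterior distributions of the parameters, and their point estimates.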

18.
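When the regression coefficients and the error variance in a linear model have a normal-inverse-Gamma prior, the simultaneous Bayes estimators of the regression coefficients and the error variance are derived. Under the mean squared error matrix criterion and the Bayes Pitman closeness criterion, the superiority of the Bayes estimator of the regression coefficients over the least squares (LS) estimator is studied; the superiority of the Bayes estimator of the error variance over the LS estimator under the mean squared error criterion is also discussed.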
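The abstract does not reproduce the estimators. As a hedged sketch, the block below gives the standard conjugate update under one common parameterization of the normal-inverse-Gamma prior, with posterior means taken as the Bayes estimators (which assumes quadratic loss). The notation ($\mu_0$, $V_0$, $a_0$, $b_0$, and the inverse-Gamma density proportional to $(\sigma^2)^{-(a_0+1)} e^{-b_0/\sigma^2}$) is an assumption for illustration and may differ from the paper's.

```latex
% Assumed model and prior (illustrative notation, not taken from the paper):
%   y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I_n), \quad X \in \mathbb{R}^{n \times p},
%   \beta \mid \sigma^2 \sim N(\mu_0, \sigma^2 V_0), \quad \sigma^2 \sim IG(a_0, b_0).
\begin{align*}
V_n &= \bigl(X^\top X + V_0^{-1}\bigr)^{-1}, &
\mu_n &= V_n \bigl(V_0^{-1}\mu_0 + X^\top y\bigr), \\
a_n &= a_0 + \tfrac{n}{2}, &
b_n &= b_0 + \tfrac{1}{2}\bigl(y^\top y + \mu_0^\top V_0^{-1}\mu_0 - \mu_n^\top V_n^{-1}\mu_n\bigr), \\
\hat\beta_{B} &= E(\beta \mid y) = \mu_n, &
\hat\sigma^2_{B} &= E(\sigma^2 \mid y) = \frac{b_n}{a_n - 1} \quad (a_n > 1), \\
\hat\beta_{LS} &= (X^\top X)^{-1} X^\top y, &
\hat\sigma^2_{LS} &= \frac{\lVert y - X\hat\beta_{LS} \rVert^2}{n - p}.
\end{align*}
```

Under this parameterization, $\hat\beta_{B}$ is a matrix-weighted compromise between the prior mean $\mu_0$ and the LS estimator, which is the kind of shrinkage that comparisons under the mean squared error matrix criterion evaluate.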

19.
Regression models with interaction effects have been widely used in multivariate analysis to improve model flexibility and prediction accuracy. In functional data analysis, however, due to the challenges of estimating three-dimensional coefficient functions, interaction effects have not been considered for function-on-function linear regression. In this article, we propose function-on-function regression models with interaction and quadratic effects. For a model with specified main and interaction effects, we propose an efficient estimation method that enjoys a minimum prediction error property and has good predictive performance in practice. Moreover, converting the estimation of three-dimensional coefficient functions of the interaction effects to the estimation of two- and one-dimensional functions separately, our method is computationally efficient. We also propose adaptive penalties to account for varying magnitudes and roughness levels of coefficient functions. In practice, the forms of the models are usually unspecified. We propose a stepwise procedure for model selection based on a predictive criterion. This method is implemented in our R package FRegSigComp. Supplemental materials are available online.

20.
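The censored regression model is an important model with wide applications in econometrics. However, its variable selection problem has received relatively little attention in the existing literature. This paper proposes a LASSO-type variable selection and estimation method, called the diverse penalized $L_1$ constraint method, or DPLC for short. In addition, we give the large-sample asymptotic properties of the estimators of the nonzero regression coefficients. Finally, extensive simulation studies show that DPLC performs as well as the usual best-subset selection method in variable selection and estimation.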
