Similar literature
1.
When the data are heavy-tailed or contain outliers, conventional variable selection methods based on penalized least squares or likelihood functions perform poorly. Using Bayesian inference, we study the Bayesian variable selection problem for median linear models. A Bayesian estimation method is proposed that draws on Bayesian model selection theory and places a spike-and-slab prior on the regression coefficients; an efficient Gibbs sampling procedure for the posterior is also given. Extensive numerical simulations and an analysis of the Boston house price data illustrate the effectiveness of the proposed method.

2.
In this paper, a Bayesian hierarchical model for variable selection and estimation in the context of binary quantile regression is proposed. Existing approaches to variable selection in a binary classification context are sensitive to outliers, heteroskedasticity or other anomalies of the latent response. The method proposed in this study overcomes these problems in an attractive and straightforward way. A Laplace likelihood and Laplace priors for the regression parameters are proposed and estimated with Bayesian Markov Chain Monte Carlo. The resulting model is equivalent to the frequentist lasso procedure. A conceptual result is that by doing so, the binary regression model is moved from a Gaussian to a full Laplacian framework without sacrificing much computational efficiency. In addition, an efficient Gibbs sampler to estimate the model parameters is proposed that is superior to the Metropolis algorithm that is used in previous studies on Bayesian binary quantile regression. Both the simulation studies and the real data analysis indicate that the proposed method performs well in comparison to the other methods. Moreover, as the base model is binary quantile regression, a much more detailed insight into the effects of the covariates is provided by the approach. An implementation of the lasso procedure for binary quantile regression models is available in the R-package bayesQR.

3.
We propose the Bayesian adaptive Lasso (BaLasso) for variable selection and coefficient estimation in linear regression. The BaLasso is adaptive to the signal level by adopting different shrinkage for different coefficients. Furthermore, we provide a model selection machinery for the BaLasso by assessing the posterior conditional mode estimates, motivated by the hierarchical Bayesian interpretation of the Lasso. Our formulation also permits prediction using a model averaging strategy. We discuss other variants of this new approach and provide a unified framework for variable selection using flexible penalties. Empirical evidence of the attractiveness of the method is demonstrated via extensive simulation studies and data analysis.

4.
Based on the double penalized estimation method, a new variable selection procedure is proposed for partially linear models with longitudinal data. The proposed procedure can avoid the effects of the nonparametric estimator on the variable selection for the parametric components. Under some regularity conditions, the rate of convergence and asymptotic normality of the resulting estimators are established. In addition, to improve efficiency for regression coefficients, the estimation of the working covariance matrix is involved in the proposed iterative algorithm. Some simulation studies are carried out to demonstrate that the proposed method performs well.

5.
We consider median regression with a LASSO-type penalty term for variable selection. With a fixed number of variables in the regression model, a two-stage method is proposed for simultaneous estimation and variable selection where the degree of penalty is adaptively chosen. A Bayesian information criterion type approach is proposed and used to obtain a data-driven procedure which is proved to automatically select asymptotically optimal tuning parameters. It is shown that the resultant estimator achieves the so-called oracle property. The combination of median regression and the LASSO penalty is computationally easy to implement via standard linear programming. A random perturbation scheme can be used to obtain a simple estimator of the standard error. Simulation studies are conducted to assess the finite-sample performance of the proposed method. We illustrate the methodology with a real example.
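For orientation, the objective that such a method minimizes can be sketched as the sum of absolute residuals plus an L1 penalty. The abstract does not give its exact penalty or its linear-programming formulation, so the function name `penalized_median_objective` and the single tuning parameter `lam` below are illustrative assumptions; the sketch only evaluates the objective and does not solve it.

```python
def penalized_median_objective(beta, X, y, lam):
    """Evaluate sum_i |y_i - x_i' beta| + lam * sum_j |beta_j|.

    Hypothetical helper: a single penalty level `lam` is assumed here,
    whereas the abstract describes an adaptively chosen degree of penalty.
    """
    # Least absolute deviation (median regression) part of the objective.
    fit = sum(
        abs(yi - sum(b * xij for b, xij in zip(beta, xi)))
        for xi, yi in zip(X, y)
    )
    # LASSO-type L1 penalty on the coefficients.
    penalty = lam * sum(abs(b) for b in beta)
    return fit + penalty
```

Because both terms are piecewise linear in `beta`, minimizing this kind of objective can be cast as a linear program, which is consistent with the abstract's remark about linear programming.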

6.
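Longitudinal data are often analyzed with normal mixed-effects models. However, violations of the normality assumption often lead to invalid inference. Compared with conventional mean regression, quantile regression gives a complete picture of the conditional distribution of the response and also yields robust estimates under non-normal error distributions. This paper considers quantile regression estimation and variable selection for longitudinal mixed-effects models with right-censored responses. First, the inverse censoring probability weighting method is used to obtain the parameter estimates. Second, variable selection is addressed by combining inverse censoring probability weighting with the LASSO penalty. Monte Carlo simulations show that the proposed method outperforms the estimation approach that simply deletes the censored observations. Finally, an AIDS data set is analyzed to illustrate the method in practice.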

7.
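Studies in plant genetics and genomics show that the gene loci affecting many important agronomic traits are not sparse: the traits are influenced by a large number of small-effect genes as well as by gene interaction terms. Based on flowering-time data for rapeseed, an important oil crop, this paper studies gene selection under moderate sparsity and proposes a two-step Bayes model selection method. When interactions between genes are considered, the dimension of the model grows sharply; together with the special structure of the data, this means that the usual variable selection methods perform poorly....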

8.
Composite quantile regression and variable selection for partially linear single-index models
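This paper proposes the composite minimizing average check loss estimation (CMACLE) method to carry out composite quantile regression (CQR) for partial linear single-index models (PLSIM). A consistent estimator of the parametric part in the CQR sense is first constructed using a high-dimensional kernel function. Starting from this consistent estimator, an index kernel function is then used to obtain estimators of the parameters and the nonparametric function that attain the optimal convergence rate. The asymptotic normality of these estimators is established, and the asymptotic relative efficiency of the CQR estimator of PLSIM is compared with that of the minimum average variance estimator (MAVE). Furthermore, a variable selection method for PLSIM within the CQR framework is proposed and shown to have the oracle property. Simulations and a real-data analysis examine the finite-sample performance of the proposed method and confirm its merits.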
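As background on the loss involved, the sketch below shows the standard check loss and a composite version summed over several quantile levels, where each level has its own intercept and the regression part is shared. It is not the CMACLE estimator itself: the names `check_loss` and `composite_check_loss` are hypothetical, and `residuals` is assumed to hold the responses minus the shared regression fit.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def composite_check_loss(residuals, intercepts, taus):
    """Sum the check loss over all quantile levels and observations.

    residuals  : y_i minus the shared regression part (assumed precomputed)
    intercepts : one intercept b_k per quantile level tau_k
    taus       : the quantile levels, each in (0, 1)
    """
    return sum(
        check_loss(r - b, tau)
        for tau, b in zip(taus, intercepts)
        for r in residuals
    )
```

With a single level `tau = 0.5` the composite loss reduces to half the sum of absolute residuals, which is the median regression case.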

9.
The main challenge in working with gene expression microarrays is that the sample size is small compared to the large number of variables (genes). In many studies, the main focus is on finding a small subset of the genes, which are the most important ones for differentiating between different types of cancer, for simpler and cheaper diagnostic arrays. In this paper, a sparse Bayesian variable selection method in a probit model is proposed for gene selection and classification. We assign a sparse prior for regression parameters and perform variable selection by indexing the covariates of the model with a binary vector. The correlation prior for the binary vector assigned in this paper is able to distinguish models with the same size. The performance of the proposed method is demonstrated with one simulated data set and two well-known real data sets, and the results show that our method is comparable with other existing methods in variable selection and classification.

10.
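The generalized partially linear model extends both the generalized linear model and the partially linear model and is a widely used semiparametric model. This paper discusses Bayes estimation of the parameters and Bayes-factor-based model selection for this model when both the linear covariates and the response are subject to data that are missing not at random. In the analysis, penalized splines are used to estimate the nonparametric component, and a Bayes hierarchical model is built. To address the poor mixing caused by highly correlated parameters in Gibbs sampling and the instability that arises as the dimension increases, latent variables are introduced as augmented data and a collapsed Gibbs sampler is applied, which improves convergence. In addition, to avoid computing multiple integrals, the M-H algorithm is used to estimate the marginal density function, from which the Bayes factor is computed, providing a criterion for model selection and comparison. Finally, simulations and a real example verify the effectiveness of the proposed method.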

11.
A Bayesian model selection procedure for comparing models subject to inequality and/or equality constraints is proposed. An encompassing prior approach is used, and a general form of the Bayes factor of a constrained model against the encompassing model is derived. A simple estimation method is proposed which can estimate the Bayes factors for all candidate models simultaneously by using one set of samples from the encompassing model. A simulation study and a real data analysis demonstrate the performance of the method.
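To make the idea of reusing one set of encompassing-model samples concrete, the sketch below uses the proportion-ratio estimate that is common in the encompassing prior literature for inequality constraints: the share of posterior draws that satisfy the constraint divided by the share of prior draws that do. The abstract does not state its own formula, so this is an assumption, and it does not cover equality constraints. The function name and arguments are hypothetical.

```python
def encompassing_bayes_factor(prior_draws, posterior_draws, satisfies):
    """Estimate the Bayes factor of a constrained model vs. the encompassing model.

    prior_draws, posterior_draws : draws from the encompassing prior/posterior
    satisfies                    : predicate returning True if a draw meets
                                   the (inequality) constraint
    """
    prior_frac = sum(1 for d in prior_draws if satisfies(d)) / len(prior_draws)
    post_frac = sum(1 for d in posterior_draws if satisfies(d)) / len(posterior_draws)
    if prior_frac == 0:
        raise ValueError("no prior draw satisfies the constraint")
    return post_frac / prior_frac
```

Because only the predicate changes between candidate models, the same two sets of draws can be reused to score every constrained model.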

12.

Variable selection for multivariate nonparametric regression models usually involves parameterized approximation for nonparametric functions in the objective function. However, this parameterized approximation often increases the number of parameters significantly, leading to the “curse of dimensionality” and inaccurate estimation. In this paper, we propose a novel and easily implemented approach to do variable selection in nonparametric models without parameterized approximation, enabling selection consistency to be achieved. The proposed method is applied to do variable selection for additive models. A two-stage procedure with selection and adaptive estimation is proposed, and the properties of this method are investigated. This two-stage algorithm is adaptive to the smoothness of the underlying components, and the estimation consistency can reach a parametric rate if the underlying model is really parametric. Simulation studies are conducted to examine the performance of the proposed method. Furthermore, a real data example is analyzed for illustration.


13.
A flexible nonparametric method is proposed for classifying high-dimensional data with a complex structure. The proposed method can be regarded as an extended version of linear logistic discriminant procedures, in which the linear predictor is replaced by a radial-basis-expansion predictor. Radial basis functions with a hyperparameter are used to take the information on covariates and class labels into account; this was nearly impossible within the previously proposed hybrid learning framework. The penalized maximum likelihood estimation procedure is employed to obtain stable parameter estimates. A crucial issue in the model-construction process is the choice of a suitable model from candidates. This issue is examined from information-theoretic and Bayesian viewpoints, and we employ the model evaluation criteria of Ando et al. (Japanese Journal of Applied Statistics, 31, 123–139, 2002). The proposed method is applicable not only to high-dimensional data but also to the variable selection problem. Real data analysis and Monte Carlo experiments show that our proposed method performs well in classifying future observations in practical situations. The simulation results also show that the use of the hyperparameter in the basis functions improves the prediction performance.

14.
We consider nonparametric estimation of a smooth function of one variable. Global selection procedures cannot sufficiently account for local sparseness of the covariate nor can they adapt to local curvature of the regression function. We propose a new method for selecting local smoothing parameters which takes into account sparseness and adapts to local curvature. A Bayesian type argument provides an initial smoothing parameter which adapts to the local sparseness of the covariate and provides the basis for local bandwidth selection procedures which further adjust the bandwidth according to the local curvature of the regression function. Simulation evidence indicates that the proposed method can result in reduction of both pointwise mean squared error and integrated mean squared error.

15.
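Based on the modified Cholesky decomposition, this paper studies Bayesian estimation and Bayesian statistical diagnostics for semiparametric joint mean-covariance models with longitudinal data, where the nonparametric part is approximated by B-splines. A hybrid algorithm that combines Gibbs sampling with the Metropolis-Hastings algorithm is used to obtain Bayesian estimates of the unknown parameters and Bayesian case-deletion influence diagnostics. The magnitudes of the diagnostic statistics are used to identify outliers in the data. Simulation studies and a real-data analysis both show that the proposed Bayesian estimation and diagnostic methods are feasible and effective.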

16.
Regularization methods, including Lasso, group Lasso, and SCAD, typically focus on selecting variables with strong effects while ignoring weak signals. This may result in biased prediction, especially when weak signals outnumber strong signals. This paper aims to incorporate weak signals in variable selection, estimation, and prediction. We propose a two-stage procedure, consisting of variable selection and postselection estimation. The variable selection stage involves a covariance-insured screening for detecting weak signals, whereas the postselection estimation stage involves a shrinkage estimator for jointly estimating strong and weak signals selected from the first stage. We term the proposed method as the covariance-insured screening-based postselection shrinkage estimator. We establish asymptotic properties for the proposed method and show, via simulations, that incorporating weak signals can improve estimation and prediction performance. We apply the proposed method to predict the annual gross domestic product rates based on various socioeconomic indicators for 82 countries.

17.
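In multivariate nonparametric models, the choice of bandwidth and order is crucial to the performance of local polynomial estimators. Based on the cross-validation criterion, this paper proposes an adaptive Bayesian bandwidth selection method. Given an error density function, the method derives the corresponding likelihood function and constructs the posterior density of the bandwidth parameters. Estimates of both the order and the bandwidth are then obtained from the posterior expectation of the bandwidth. Numerical simulations show that the method is more accurate than the rule-of-thumb method and less time-consuming than cross-validation. Moreover, compared with the Nadaraya-Watson estimator, the proposed bandwidth selection method adapts better to multivariate nonparametric models. Finally, a real data set illustrates that the proposed Bayesian bandwidth selection performs well in finite samples.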
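To show where the bandwidth enters, here is a minimal sketch of the Nadaraya-Watson estimator that the abstract uses as a comparison. It assumes a single covariate and a Gaussian kernel, neither of which is specified in the abstract; the name `nadaraya_watson` and the argument `h` are illustrative, and the Bayesian bandwidth selection itself is not implemented.

```python
import math


def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of ys at the point x0 with bandwidth h.

    A Gaussian kernel and a scalar covariate are assumed for illustration.
    """
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: h is too small relative to the data spread.
        raise ValueError("bandwidth too small for the given data")
    return sum(w * y for w, y in zip(weights, ys)) / total
```

A small `h` makes the estimate follow nearby observations closely, while a large `h` pulls it toward the overall mean of `ys`, which is why the choice of bandwidth matters so much.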

18.
In this article, we consider nonparametric smoothing and variable selection in varying-coefficient models. Varying-coefficient models are commonly used for analyzing the time-dependent effects of covariates on responses measured repeatedly (such as longitudinal data). We present the P-spline estimator in this context and show its estimation consistency for a diverging number of knots (or B-spline basis functions). The combination of P-splines with nonnegative garrote (which is a variable selection method) leads to good estimation and variable selection. Moreover, we consider APSO (additive P-spline selection operator), which combines a P-spline penalty with a regularization penalty, and show its estimation and variable selection consistency. The methods are illustrated with a simulation study and real-data examples. The proofs of the theoretical results as well as one of the real-data examples are provided in the online supplementary materials.

19.
The reliability of the Weibull distribution with homogeneous heavily censored data is analyzed in this study. The universal model of heavily censored data and existing methods, including maximum likelihood, least-squares, E-Bayesian estimation, and hierarchical Bayesian methods, are introduced. An improved method is proposed based on Bayesian inference and the least-squares method. The method focuses on the Bayes estimates of the failure probabilities for all samples. The conjugate prior distribution of the failure probability is set, and an optimization model is developed by maximizing the information entropy of the prior distribution to determine the hyper-parameters. By integrating the likelihood function, the posterior distribution of the failure probability is then derived to yield the Bayes estimate of the failure probability. The estimates of the reliability parameters are obtained by fitting the distribution curve using the least-squares method. The four existing methods are compared with the proposed method in terms of applicability, precision, efficiency, robustness, and simplicity. Specifically, the closed-form expressions concerning E-Bayesian estimation and hierarchical Bayesian methods are derived and used. The comparisons demonstrate that the improved method is superior. Finally, three illustrative examples are presented to show the application of the proposed method.
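The final curve-fitting step can be sketched with the usual Weibull linearization: for F(t) = 1 - exp(-(t/eta)^m), ln(-ln(1 - F)) = m ln t - m ln eta, so an ordinary least-squares line through the transformed points yields the shape m and scale eta. The sketch assumes the failure probability estimates are already available (the abstract obtains them by Bayes estimation, which is not reproduced here); the function name and the choice to regress the transformed probabilities on ln t are assumptions.

```python
import math


def weibull_least_squares(times, failure_probs):
    """Fit Weibull shape m and scale eta by least squares on the linearized CDF.

    times         : positive test times t_i
    failure_probs : estimated failure probabilities F(t_i), each in (0, 1)
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - p)) for p in failure_probs]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    shape = sxy / sxx                       # slope of the fitted line is m
    intercept = y_bar - shape * x_bar       # intercept is -m * ln(eta)
    scale = math.exp(-intercept / shape)
    return shape, scale
```

At least two distinct times are needed; with exact probabilities from a Weibull distribution the fit recovers the generating parameters up to rounding.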

20.
An empirical Bayes method to select basis functions and knots in multivariate adaptive regression splines (MARS) is proposed, which takes advantage of both frequentist model selection approaches and Bayesian approaches. A penalized likelihood is maximized to estimate regression coefficients for selected basis functions, and an approximated marginal likelihood is maximized to select knots and variables involved in basis functions. Moreover, the Akaike Bayes information criterion (ABIC) is used to determine the number of basis functions. It is shown that the proposed method gives an estimate of the regression structure that is relatively parsimonious and more stable for some example data sets.
