1.
We focus on Bayesian variable selection in regression models. One challenge is to search the huge model space adequately while identifying high posterior probability regions. Over the past decades, the main focus has been on the use of Markov chain Monte Carlo (MCMC) algorithms for these purposes. In this article, we propose a new computational approach based on sequential Monte Carlo (SMC), which we refer to as particle stochastic search (PSS). We illustrate PSS through applications to linear regression and probit models.
2.
Jeff Goldsmith Lei Huang Ciprian M. Crainiceanu 《Journal of computational and graphical statistics》2013,22(1):46-64
We develop scalar-on-image regression models when images are registered multidimensional manifolds. We propose a fast and scalable Bayesian inferential procedure to estimate the image coefficient. The central idea is the combination of an Ising prior distribution, which controls a latent binary indicator map, and an intrinsic Gaussian Markov random field, which controls the smoothness of the nonzero coefficients. The model is fit using a single-site Gibbs sampler, which allows fitting within minutes for hundreds of subjects with predictor images containing thousands of locations. The code is simple and is provided in the online Appendix (see the “Supplementary Materials” section). We apply this method to a neuroimaging study where cognitive outcomes are regressed on measures of white-matter microstructure at every voxel of the corpus callosum for hundreds of subjects.
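As a rough illustration of the single-site Gibbs step described above (not the authors' code), the sketch below updates a binary indicator map under an Ising prior on a toy one-dimensional image. The Ising parameters a and b and the placeholder log_lik_ratio function are assumptions made for illustration; in the full model the likelihood term would come from the regression with GMRF-distributed coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D "image" with L locations; gamma[l] = 1 means location l has a nonzero coefficient.
L = 50
a, b = -2.0, 1.5          # Ising parameters: overall sparsity (a) and spatial clustering (b)
gamma = rng.integers(0, 2, size=L)

def log_lik_ratio(l, gamma):
    """Placeholder for log p(data | gamma[l]=1, rest) - log p(data | gamma[l]=0, rest).
    In the real model this comes from the regression likelihood with the GMRF prior;
    here it is set to 0 so only the Ising prior drives the update."""
    return 0.0

def gibbs_sweep(gamma):
    """One single-site Gibbs sweep over the indicator map under an Ising prior:
    log-odds(gamma[l]=1 | rest) = a + b * (sum of neighboring indicators) + log-likelihood ratio."""
    for l in range(L):
        nbr = 0
        if l > 0:
            nbr += gamma[l - 1]
        if l < L - 1:
            nbr += gamma[l + 1]
        logit = a + b * nbr + log_lik_ratio(l, gamma)
        p1 = 1.0 / (1.0 + np.exp(-logit))
        gamma[l] = rng.random() < p1
    return gamma

for _ in range(100):
    gamma = gibbs_sweep(gamma)
print("active locations:", gamma.sum())
```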
3.
《Journal of computational and graphical statistics》2013,22(2):362-382
Bayesian approaches to prediction and the assessment of predictive uncertainty in generalized linear models are often based on averaging predictions over different models, and this requires methods for accounting for model uncertainty. When there are linear dependencies among potential predictor variables in a generalized linear model, existing Markov chain Monte Carlo algorithms for sampling from the posterior distribution on the model and parameter space in Bayesian variable selection problems may not work well. This article describes a sampling algorithm based on the Swendsen-Wang algorithm for the Ising model, which works well when the predictors are far from orthogonal. In variable selection problems for generalized linear models, we can index different models by a binary parameter vector, where each binary variable indicates whether or not a given predictor variable is included in the model. The posterior distribution on the model is a distribution on this collection of binary strings, and by thinking of this posterior as a binary spatial field we apply a sampling scheme inspired by the Swendsen-Wang algorithm for the Ising model in order to sample from the model posterior distribution. The algorithm we describe extends a similar algorithm for variable selection problems in linear models. The benefits of the algorithm are demonstrated for both real and simulated data.
4.
Anandamayee Majumdar Debashis Paul 《Journal of computational and graphical statistics》2016,25(3):727-747
We introduce new classes of stationary spatial processes with asymmetric, sub-Gaussian marginal distributions using the idea of expectiles. We derive theoretical properties of the proposed processes. Moreover, we use the proposed spatial processes to formulate a spatial regression model for point-referenced data where the spatially correlated errors have skewed marginal distributions. We introduce a Bayesian computational procedure for model fitting and inference for this class of spatial regression models. We compare the performance of the proposed method with the traditional Gaussian process-based spatial regression through simulation studies and by applying it to a dataset on air pollution in California.
5.
《Journal of computational and graphical statistics》2013,22(1):76-94
This article presents a Markov chain Monte Carlo algorithm for both variable and covariance selection in the context of logistic mixed effects models. This algorithm allows us to sample solely from standard densities with no additional tuning. We apply a stochastic search variable selection approach to select explanatory variables as well as to determine the structure of the random effects covariance matrix. Prior determination of explanatory variables and random effects is not a prerequisite because the final structure is chosen in a data-driven manner in the course of the modeling procedure. To illustrate the method, we give two bank data examples.
6.
Daniel Taylor-Rodriguez Andrew Womack Nikolay Bliznyuk 《Journal of computational and graphical statistics》2016,25(2):515-535
This article investigates Bayesian variable selection when there is a hierarchical dependence structure on the inclusion of predictors in the model. In particular, we study the type of dependence found in polynomial response surfaces of orders two and higher, whose model spaces are required to satisfy weak or strong heredity conditions. These conditions restrict the inclusion of higher-order terms depending upon the inclusion of lower-order parent terms. We develop classes of priors on the model space, investigate their theoretical and finite sample properties, and provide a Metropolis–Hastings algorithm for searching the space of models. The tools proposed allow fast and thorough exploration of model spaces that account for hierarchical polynomial structure in the predictors and provide control of the inclusion of false positives in high posterior probability models.
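The heredity conditions mentioned above have a simple operational meaning, sketched below under the assumption that a model is a set of terms and each term is a tuple of variable indices (so (1, 2) denotes x1*x2 and (1, 1) denotes x1 squared); this encoding is mine, not the paper's, and the sketch only checks the conditions rather than constructing the priors.

```python
def parents(term):
    """Immediate lower-order parents of a term, where a term is a tuple of variable
    indices with multiplicity, e.g., (1, 1) is x1^2 and (1, 2) is x1*x2."""
    return {tuple(sorted(term[:i] + term[i + 1:])) for i in range(len(term))} - {()}

def satisfies_heredity(model, mode="strong"):
    """Check whether every higher-order term in `model` (a set of terms) has
    all of its parents included (strong) or at least one parent included (weak)."""
    for term in model:
        if len(term) < 2:
            continue
        included = [p in model for p in parents(term)]
        ok = all(included) if mode == "strong" else any(included)
        if not ok:
            return False
    return True

# Example: {x1, x2, x1*x2} satisfies both conditions; {x1, x1*x2} only weak heredity.
print(satisfies_heredity({(1,), (2,), (1, 2)}, "strong"))  # True
print(satisfies_heredity({(1,), (1, 2)}, "strong"))        # False
print(satisfies_heredity({(1,), (1, 2)}, "weak"))          # True
```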
7.
Matthew T. Pratola Hugh A. Chipman James R. Gattiker David M. Higdon Robert McCulloch William N. Rust 《Journal of computational and graphical statistics》2013,22(3):830-852
Bayesian additive regression trees (BART) is a Bayesian approach to flexible nonlinear regression that has been shown to be competitive with the best modern predictive methods, such as those based on bagging and boosting. BART offers some advantages. For example, the stochastic search Markov chain Monte Carlo (MCMC) algorithm can provide a more complete search of the model space, and variation across MCMC draws can capture the level of uncertainty in the usual Bayesian way. The BART prior is robust in that reasonable results are typically obtained with a default prior specification. However, the publicly available implementation of the BART algorithm in the R package BayesTree is not fast enough to be considered interactive with over a thousand observations, and is unlikely to even run with 50,000 to 100,000 observations. In this article, we show how the BART algorithm may be modified and then computed using single program, multiple data (SPMD) parallel computation implemented with the Message Passing Interface (MPI) library. The approach scales nearly linearly in the number of processor cores, enabling the practitioner to perform statistical inference on massive datasets. Our approach can also handle datasets too massive to fit on any single data repository.
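The SPMD pattern referred to above can be illustrated with mpi4py, independently of BART itself: each rank holds a shard of the data, computes a local sufficient statistic, and an allreduce combines the pieces. This is only a minimal sketch of the communication pattern, with a simple mean update standing in for the per-node residual sums a tree sampler would need; the file name in the run command is hypothetical.

```python
# Run with, e.g.: mpiexec -n 4 python spmd_sketch.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rng = np.random.default_rng(rank)            # each rank holds its own shard of the data
y_local = rng.normal(loc=2.0, size=10_000)

# Local sufficient statistics for a simple mean update (a stand-in for the per-node
# residual sums a tree-based sampler would need at each MCMC iteration).
local_sum, local_n = y_local.sum(), y_local.size

global_sum = comm.allreduce(local_sum, op=MPI.SUM)
global_n = comm.allreduce(local_n, op=MPI.SUM)

if rank == 0:
    print("global mean combined from", size, "ranks:", global_sum / global_n)
```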
8.
Lizhen Xu Radu V. Craiu Lei Sun Andrew D. Paterson 《Journal of computational and graphical statistics》2016,25(2):405-425
Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes. The models studied here can incorporate both continuous and binary responses, and can account for serial and cluster correlations. We consider Bayesian estimation for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample from the posterior distribution. We evaluate the proposed method via extensive simulations and demonstrate its utility with an application to an association study of various complication outcomes related to Type 1 diabetes. This article has supplementary material online.
9.
Univariate or multivariate ordinal responses are often assumed to arise from a latent continuous parametric distribution, with covariate effects that enter linearly. We introduce a Bayesian nonparametric modeling approach for univariate and multivariate ordinal regression, which is based on mixture modeling for the joint distribution of latent responses and covariates. The modeling framework enables highly flexible inference for ordinal regression relationships, avoiding assumptions of linearity or additivity in the covariate effects. In standard parametric ordinal regression models, computational challenges arise from identifiability constraints and from parameter estimation that requires nonstandard inferential techniques. A key feature of the nonparametric model is that it achieves inferential flexibility while avoiding these difficulties. In particular, we establish full support of the nonparametric mixture model under fixed cut-off points that relate, through discretization, the latent continuous responses to the ordinal responses. The practical utility of the modeling approach is illustrated through application to two datasets from econometrics, an example involving regression relationships for ozone concentration, and a multirater agreement problem. Supplementary materials with technical details on theoretical results and on computation are available online.
10.
《Journal of computational and graphical statistics》2013,22(2):373-387
This article proposes a new Bayesian approach for monotone curve fitting based on the isotonic regression model. The unknown monotone regression function is approximated by a cubic spline and the constraints are represented by the intersection of quadratic cones. We treat the number and locations of knots as free parameters and use reversible jump Markov chain Monte Carlo to obtain posterior samples of knot configurations. Given the number and locations of the knots, second-order cone programming is used to estimate the remaining parameters. Simulation results suggest the method performs well and we illustrate the approach using the ASA car data.
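A much-simplified, non-Bayesian sketch of the constrained-fitting idea is given below: instead of cubic splines with quadratic-cone constraints, it fits the values directly with monotonicity imposed as linear constraints, using cvxpy. It is meant only to show how a monotone fit can be cast as a convex program, not to reproduce the paper's reversible jump or second-order cone formulation.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 80)
y = np.log1p(5 * x) + rng.normal(scale=0.1, size=x.size)   # noisy increasing signal

f = cp.Variable(x.size)
objective = cp.Minimize(cp.sum_squares(y - f))
constraints = [f[1:] >= f[:-1]]               # monotonicity at the observation points
cp.Problem(objective, constraints).solve()

fit = f.value
print("fitted values are nondecreasing:", bool(np.all(np.diff(fit) >= -1e-8)))
```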
11.
Bledar A. Konomi Huiyan Sang Bani K. Mallick 《Journal of computational and graphical statistics》2013,22(3):802-829
Gaussian process models have been widely used in spatial statistics but face tremendous modeling and computational challenges for very large nonstationary spatial datasets. To address these challenges, we develop a Bayesian modeling approach using a nonstationary covariance function constructed from adaptively selected partitions. The partitioned nonstationary class allows one to knit together local covariance parameters into a valid global nonstationary covariance for prediction, where the local covariance parameters are estimated within each partition to reduce computational cost. To further facilitate the computations in local covariance estimation and global prediction, we use the full-scale covariance approximation (FSA) approach for the Bayesian inference of our model. One of our contributions is to model the partitions stochastically by embedding a modified treed partitioning process into the hierarchical model, which leads to automated partitioning and substantial computational benefits. We illustrate the utility of our method with simulation studies and the global Total Ozone Mapping Spectrometer (TOMS) data. Supplementary materials for this article are available online.
12.
《Journal of computational and graphical statistics》2013,22(2):378-394
In this article we study penalized regression splines (P-splines), which are low-order basis splines with a penalty to avoid undersmoothing. Such P-splines are typically not spatially adaptive and hence can have trouble when functions vary rapidly. Our approach is to model the penalty parameter inherent in the P-spline method as a heteroscedastic regression function. We develop a full Bayesian hierarchical structure to do this and use Markov chain Monte Carlo techniques to draw random samples from the posterior for inference. The advantage of using a Bayesian approach to P-splines is that it allows for simultaneous estimation of the smooth functions and the underlying penalty curve, in addition to providing uncertainty intervals for the estimated curve. The Bayesian credible intervals obtained for the estimated curve are shown to have pointwise coverage probabilities close to nominal. The method is extended to additive models with simultaneous spline-based penalty functions for the unknown functions. In simulations, the approach achieves very competitive performance with the current best frequentist P-spline method in terms of frequentist mean squared error and coverage probabilities of the credible intervals, and performs better than some of the other Bayesian methods.
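For readers unfamiliar with P-splines, the following is a minimal non-Bayesian sketch with a single global penalty parameter: a cubic B-spline basis and a second-order difference penalty, solved by penalized least squares. The paper's contribution is precisely to replace the single penalty with a spatially varying penalty function and to carry out full Bayesian inference by MCMC, neither of which is attempted here.

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

degree, n_basis = 3, 20
# Knot vector with repeated boundary knots; n_basis cubic B-splines need n_basis + degree + 1 knots.
interior = np.linspace(0, 1, n_basis - degree + 1)
knots = np.concatenate([[0.0] * degree, interior, [1.0] * degree])

# Design matrix: evaluate each basis function at the observation points.
B = np.column_stack([BSpline(knots, np.eye(n_basis)[j], degree)(x) for j in range(n_basis)])

D = np.diff(np.eye(n_basis), n=2, axis=0)   # second-order difference penalty matrix
lam = 1.0                                   # single global penalty (the paper lets this vary spatially)
beta = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
fitted = B @ beta
print("residual SD:", np.std(y - fitted))
```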
13.
When the data have heavy tails or contain outliers, conventional variable selection methods based on penalized least squares or likelihood functions perform poorly. Using Bayesian inference, we study the Bayesian variable selection problem for median linear models. We propose a Bayesian estimation method that combines Bayesian model selection theory with a spike-and-slab prior on the regression coefficients, and we provide an efficient posterior Gibbs sampling procedure. Extensive numerical simulations and an analysis of the Boston housing price data illustrate the effectiveness of the proposed method.
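The spike-and-slab indicator update can be sketched in its simplest, generic form (a George–McCulloch-style continuous spike with a Gaussian likelihood), which is not the paper's median-regression formulation: given a current draw of the coefficients, each inclusion indicator is refreshed from its conditional distribution. The prior inclusion probability and the spike and slab scales below are illustrative values.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

# Current draw of the regression coefficients within a larger Gibbs sampler:
# two "small" and two "large" coefficients for illustration.
beta = rng.normal(scale=[0.05, 2.0, 0.02, 1.5])
w = 0.5           # prior inclusion probability
tau_spike = 0.05  # spike: tight Gaussian around zero (continuous-spike variant)
tau_slab = 2.0    # slab: diffuse Gaussian

def update_indicators(beta, w, tau_spike, tau_slab):
    """P(gamma_j = 1 | beta_j) = w*N(beta_j; 0, tau_slab^2) /
       (w*N(beta_j; 0, tau_slab^2) + (1-w)*N(beta_j; 0, tau_spike^2))."""
    slab = w * norm.pdf(beta, scale=tau_slab)
    spike = (1 - w) * norm.pdf(beta, scale=tau_spike)
    p_incl = slab / (slab + spike)
    return (rng.random(beta.size) < p_incl).astype(int), p_incl

gamma, p_incl = update_indicators(beta, w, tau_spike, tau_slab)
print("inclusion probabilities:", np.round(p_incl, 3))
print("sampled indicators:", gamma)
```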
15.
Tore Selland Kleppe 《Journal of computational and graphical statistics》2013,22(3):493-507
Dynamically rescaled Hamiltonian Monte Carlo is introduced as a computationally fast and easily implemented method for performing full Bayesian analysis in hierarchical statistical models. The method relies on introducing a modified parameterization so that the reparameterized target distribution has close to constant scaling properties, and thus is easily sampled using standard (Euclidean metric) Hamiltonian Monte Carlo. Provided that the parameterizations of the conditional distributions specifying the hierarchical model are “constant information parameterizations” (CIPs), the relation between the modified and original parameterizations is bijective, explicitly computed, and admits exploitation of sparsity in the numerical linear algebra involved. CIPs for a large catalogue of statistical models are presented, and from the catalogue it is clear that many CIPs are currently routinely used in statistical computing. A relation between the proposed methodology and a class of explicitly integrated Riemann manifold Hamiltonian Monte Carlo methods is discussed. The methodology is illustrated on several example models, including a model for inflation rates with multiple levels of nonlinearly dependent latent variables. Supplementary materials for this article are available online.
16.
Hemant Ishwaran 《Journal of computational and graphical statistics》2013,22(4):779-799
The “leapfrog” hybrid Monte Carlo algorithm is a simple and effective MCMC method for fitting Bayesian generalized linear models with canonical link. The algorithm leads to large trajectories over the posterior and a rapidly mixing Markov chain, having superior performance over conventional methods in difficult problems like logistic regression with quasicomplete separation. This method offers a very attractive solution to this common problem, providing a way to identify datasets that exhibit quasicomplete separation, and to identify the covariates that are at the root of the problem. The method is also quite successful in fitting generalized linear models in which the link function is extended to include a feedforward neural network. With a large number of hidden units, however, or when the dataset becomes large, the computations required in calculating the gradient in each trajectory can become very demanding. In this case, it is best to mix the algorithm with multivariate random walk Metropolis–Hastings, which entails very little additional programming work.
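A textbook version of the leapfrog step described above, applied to a toy bivariate Gaussian target rather than a generalized linear model, is sketched below; the step size, trajectory length, and identity mass matrix are illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy target: zero-mean bivariate Gaussian with correlation 0.9.
Sigma = np.array([[1.0, 0.9], [0.9, 1.0]])
P = np.linalg.inv(Sigma)                      # precision matrix

def log_post(q):
    return -0.5 * q @ P @ q

def grad_log_post(q):
    return -P @ q

def hmc_step(q, eps=0.15, n_leapfrog=20):
    """One 'leapfrog' hybrid Monte Carlo step with an identity mass matrix."""
    p = rng.normal(size=q.size)                           # resample momentum
    current_H = -log_post(q) + 0.5 * p @ p
    q_new, p_new = q.copy(), p.copy()
    p_new += 0.5 * eps * grad_log_post(q_new)             # initial half step for momentum
    for _ in range(n_leapfrog):
        q_new += eps * p_new                              # full step for position
        p_new += eps * grad_log_post(q_new)               # full step for momentum
    p_new -= 0.5 * eps * grad_log_post(q_new)             # undo half of the last momentum step
    proposed_H = -log_post(q_new) + 0.5 * p_new @ p_new
    if np.log(rng.random()) < current_H - proposed_H:     # Metropolis accept/reject
        return q_new
    return q

q = np.zeros(2)
draws = []
for _ in range(2000):
    q = hmc_step(q)
    draws.append(q)
print("sample correlation:", np.corrcoef(np.array(draws).T)[0, 1])
```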
17.
Steven L. Scott 《Journal of computational and graphical statistics》2013,22(3):662-670
We postulate observations from a Poisson process whose rate parameter modulates between two values determined by an unobserved Markov chain. The theory switches from continuous to discrete time by considering the intervals between observations as a sequence of dependent random variables. A result from hidden Markov models allows us to sample from the posterior distribution of the model parameters given the observed event times using a Gibbs sampler with only two steps per iteration.
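A heavily simplified sketch of the two-step Gibbs idea is given below, under the assumption (which differs from the exact continuous-time model) that the hidden state is constant over each inter-event interval, so each interval is exponentially distributed with the rate of its state and the states form a discrete-time Markov chain with a transition matrix treated as known here. Step one samples the states by forward filtering and backward sampling; step two samples the rates from conjugate Gamma conditionals.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulate toy inter-event times from a two-state exponential hidden Markov model.
true_rates = np.array([0.5, 5.0])
A = np.array([[0.95, 0.05], [0.10, 0.90]])        # state transition matrix (assumed known)
T = 500
states = np.zeros(T, dtype=int)
for t in range(1, T):
    states[t] = rng.choice(2, p=A[states[t - 1]])
intervals = rng.exponential(1.0 / true_rates[states])

def ffbs(intervals, rates, A):
    """Forward filtering, backward sampling of the hidden states."""
    T = intervals.size
    lik = rates * np.exp(-np.outer(intervals, rates))      # T x 2 exponential densities
    alpha = np.zeros((T, 2))
    alpha[0] = 0.5 * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * lik[t]
        alpha[t] /= alpha[t].sum()
    z = np.zeros(T, dtype=int)
    z[-1] = rng.choice(2, p=alpha[-1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * A[:, z[t + 1]]
        z[t] = rng.choice(2, p=w / w.sum())
    return z

# Two-step Gibbs: (1) sample states given rates, (2) sample rates given states (Gamma(1,1) priors).
rates = np.array([1.0, 2.0])
for _ in range(200):
    z = ffbs(intervals, rates, A)
    for k in range(2):
        n_k, s_k = (z == k).sum(), intervals[z == k].sum()
        rates[k] = rng.gamma(shape=1.0 + n_k, scale=1.0 / (1.0 + s_k))
print("posterior draw of rates (sorted):", np.round(np.sort(rates), 2))
```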
18.
19.
《Journal of computational and graphical statistics》2013,22(4):811-830
This article proposes a new Bayesian approach to prediction on continuous covariates. The Bayesian partition model constructs arbitrarily complex regression and classification surfaces by splitting the covariate space into an unknown number of disjoint regions. Within each region the data are assumed to be exchangeable and to come from some simple distribution. Using conjugate priors, the marginal likelihoods of the models can be obtained analytically for any proposed partitioning of the space, where the number and locations of the regions are assumed unknown a priori. Markov chain Monte Carlo simulation techniques are used to obtain predictive distributions at the design points by averaging across posterior samples of partitions.
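To make the conjugacy argument concrete, the sketch below uses one simple conjugate choice (a Normal likelihood with known variance and a Normal prior on the region mean, both assumptions for illustration) to compute region-wise log marginal likelihoods and compare a split against no split; the paper's actual priors, partition prior, and MCMC over partitions are not reproduced.

```python
import numpy as np
from scipy.stats import norm

def log_marginal_normal(y, sigma=1.0, m0=0.0, v0=10.0):
    """Log marginal likelihood of y_i ~ N(mu, sigma^2) iid with mu ~ N(m0, v0),
    obtained by integrating mu out analytically."""
    n = y.size
    ybar = y.mean()
    S = np.sum((y - ybar) ** 2)
    return (-0.5 * (n - 1) * np.log(2 * np.pi * sigma**2)
            - 0.5 * np.log(n)
            - S / (2 * sigma**2)
            + norm.logpdf(ybar, loc=m0, scale=np.sqrt(sigma**2 / n + v0)))

# Comparing two candidate partitions of a 1D covariate: a split at 0.5 versus no split.
rng = np.random.default_rng(6)
x = rng.uniform(size=200)
y = np.where(x < 0.5, 0.0, 3.0) + rng.normal(size=200)

no_split = log_marginal_normal(y)
split = log_marginal_normal(y[x < 0.5]) + log_marginal_normal(y[x >= 0.5])
print("log marginal, no split:", round(no_split, 1), " split at 0.5:", round(split, 1))
```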
20.
Chunfeng Huang Tailen Hsing Noel Cressie Auroop R. Ganguly Vladimir A. Protopopescu Nageswara S. Rao 《Applied Stochastic Models in Business and Industry》2010,26(4):331-348
We consider a network of sensors that measure the intensities of a complex plume composed of multiple absorption–diffusion source components. We address the problem of estimating the plume parameters, including the spatial and temporal source origins and the parameters of the diffusion model for each source, based on a sequence of sensor measurements. The approach leads not only to multiple-source detection, but also to the characterization and prediction of the combined plume in space and time. The parameter estimation is formulated as a Bayesian inference problem, and the solution is obtained using a Markov chain Monte Carlo algorithm. The approach is applied to a simulation study, which shows that accurate parameter estimation is achievable.