Similar articles
 20 similar articles found
1.
Topic models, and more specifically the class of latent Dirichlet allocation (LDA), are widely used for probabilistic modeling of text. Markov chain Monte Carlo (MCMC) sampling from the posterior distribution is typically performed using a collapsed Gibbs sampler. We propose a parallel sparse partially collapsed Gibbs sampler and compare its speed and efficiency to state-of-the-art samplers for topic models on five well-known text corpora of differing sizes and properties. In particular, we propose and compare two different strategies for sampling the parameter block with latent topic indicators. The experiments show that the increase in statistical inefficiency from only partial collapsing is smaller than commonly assumed, and can be more than compensated by the speedup from parallelization and sparsity on larger corpora. We also prove that the partially collapsed samplers scale well with the size of the corpus. The proposed algorithm is fast, efficient, exact, and can be used in more modeling situations than the ordinary collapsed sampler. Supplementary materials for this article are available online.

2.
Gaussian Markov random fields (GMRF) are important families of distributions for the modeling of spatial data and have been extensively used in different areas of spatial statistics such as disease mapping, image analysis and remote sensing. GMRFs have been used for the modeling of spatial data, both as models for the sampling distribution of the observed data and as models for the prior of latent processes/random effects; we consider mainly the former use of GMRFs. We study a large class of GMRF models that includes several models previously proposed in the literature. An objective Bayesian analysis is presented for the parameters of the above class of GMRFs, where explicit expressions for the Jeffreys (two versions) and reference priors are derived, and for each of these priors results on posterior propriety of the model parameters are established. We describe a simple MCMC algorithm for sampling from the posterior distribution of the model parameters, and study frequentist properties of the Bayesian inferences resulting from the use of these automatic priors. Finally, we illustrate the use of the proposed GMRF model and reference prior for studying the spatial variability of lip cancer cases in the districts of Scotland over the period 1975-1980.

3.
We develop scalar-on-image regression models when images are registered multidimensional manifolds. We propose a fast and scalable Bayes’ inferential procedure to estimate the image coefficient. The central idea is the combination of an Ising prior distribution, which controls a latent binary indicator map, and an intrinsic Gaussian Markov random field, which controls the smoothness of the nonzero coefficients. The model is fit using a single-site Gibbs sampler, which allows fitting within minutes for hundreds of subjects with predictor images containing thousands of locations. The code is simple and is provided in the online Appendix (see the “Supplementary Materials” section). We apply this method to a neuroimaging study where cognitive outcomes are regressed on measures of white-matter microstructure at every voxel of the corpus callosum for hundreds of subjects.

4.
The Gibbs sampler is a popular Markov chain Monte Carlo routine for generating random variates from distributions otherwise difficult to sample. A number of implementations are available for running a Gibbs sampler, varying in the order in which the full conditional distributions used by the Gibbs sampler are cycled or visited. A common, and in fact the original, implementation is the random scan strategy, whereby the full conditional distributions are updated in a randomly selected order each iteration. In this paper, we introduce a random scan Gibbs sampler which adaptively updates the selection probabilities or “learns” from all previous random variates generated during the Gibbs sampling. In the process, we outline a number of variations on the random scan Gibbs sampler which allow the practitioner many choices for setting the selection probabilities, and prove convergence of the induced (Markov) chain to the stationary distribution of interest. Though we emphasize flexibility in user choice and specification of these random scan algorithms, we present a minimax random scan which determines the selection probabilities through decision theoretic considerations on the precision of estimators of interest. We illustrate and apply the results presented by using the adaptive random scan Gibbs sampler developed to sample from multivariate Gaussian target distributions, to automate samplers for posterior simulation under Dirichlet process mixture models, and to fit mixtures of distributions.
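The random scan strategy described in this abstract can be illustrated with a minimal sketch. It assumes a zero-mean, unit-variance bivariate Gaussian target with correlation `rho`, and it keeps the selection probabilities `probs` fixed; the adaptive and minimax updates of those probabilities proposed in the paper are not implemented here. The function names are hypothetical.

```python
import random


def random_scan_gibbs(rho, n_iter, probs=(0.5, 0.5), seed=0):
    """Random scan Gibbs sampler for a standard bivariate Gaussian.

    At each iteration one coordinate is picked with the FIXED selection
    probabilities `probs` (an assumption of this sketch) and redrawn from
    its full conditional, x_i | x_j ~ N(rho * x_j, 1 - rho^2).
    """
    rng = random.Random(seed)
    x = [0.0, 0.0]
    cond_sd = (1.0 - rho * rho) ** 0.5
    samples = []
    for _ in range(n_iter):
        i = 0 if rng.random() < probs[0] else 1  # randomly chosen coordinate
        x[i] = rng.gauss(rho * x[1 - i], cond_sd)
        samples.append((x[0], x[1]))
    return samples


def sample_correlation(samples):
    """Empirical correlation between the two coordinates of the draws."""
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n
    cov = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples) / n
    var_x = sum((s[0] - mean_x) ** 2 for s in samples) / n
    var_y = sum((s[1] - mean_y) ** 2 for s in samples) / n
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    draws = random_scan_gibbs(rho=0.8, n_iter=40000, seed=1)
    print(sample_correlation(draws))
```

Changing `probs` shifts how often each full conditional is visited, which is the quantity an adaptive scheme would tune.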

5.
The Gaussian geostatistical model has been widely used for modeling spatial data. However, this model suffers from a severe difficulty in computation: it requires users to invert a large covariance matrix. This is infeasible when the number of observations is large. In this article, we propose an auxiliary lattice-based approach for tackling this difficulty. By introducing an auxiliary lattice to the space of observations and defining a Gaussian Markov random field on the auxiliary lattice, our model completely avoids the requirement of matrix inversion. It is remarkable that the computational complexity of our method is only O(n), where n is the number of observations. Hence, our method can be applied to very large datasets with reasonable computational (CPU) times. The numerical results indicate that our model can approximate Gaussian random fields very well in terms of predictions, even for those with long correlation lengths. For real data examples, our model can generally outperform conventional Gaussian random field models in both prediction errors and CPU times. Supplemental materials for the article are available online.

6.
The Gibbs sampler, being a popular routine amongst Markov chain Monte Carlo sampling methodologies, has revolutionized the application of Monte Carlo methods in statistical computing practice. The performance of the Gibbs sampler relies heavily on the choice of sweep strategy, that is, the means by which the components or blocks of the random vector X of interest are visited and updated. We develop an automated, adaptive algorithm for implementing the optimal sweep strategy as the Gibbs sampler traverses the sample space. The decision rules through which this strategy is chosen are based on convergence properties of the induced chain and precision of statistical inferences drawn from the generated Monte Carlo samples. As part of the development, we analytically derive closed form expressions for the decision criteria of interest and present computationally feasible implementations of the adaptive random scan Gibbs sampler via a Gaussian approximation to the target distribution. We illustrate the results and algorithms presented by using the adaptive random scan Gibbs sampler developed to sample multivariate Gaussian target distributions, and screening test and image data. Research by RL and ZY supported in part by a US National Science Foundation FRG grant 0139948 and a grant from Lawrence Livermore National Laboratory, Livermore, California, USA.

7.
In this paper we develop a set of novel Markov chain Monte Carlo algorithms for Bayesian smoothing of partially observed non-linear diffusion processes. The sampling algorithms developed herein use a deterministic approximation to the posterior distribution over paths as the proposal distribution for a mixture of an independence and a random walk sampler. The approximating distribution is sampled by simulating an optimized time-dependent linear diffusion process derived from the recently developed variational Gaussian process approximation method. The novel diffusion bridge proposal derived from the variational approximation allows the use of a flexible blocking strategy that further improves mixing, and thus the efficiency, of the sampling algorithms. The algorithms are tested on two diffusion processes: one with double-well potential drift and another with SINE drift. The new algorithm’s accuracy and efficiency are compared with state-of-the-art hybrid Monte Carlo based path sampling. It is shown that in practical, finite sample applications the algorithm is accurate except in the presence of large observation errors and low observation densities, which lead to a multi-modal structure in the posterior distribution over paths. More importantly, the variational approximation assisted sampling algorithm outperforms hybrid Monte Carlo in terms of computational efficiency, except when the diffusion process is densely observed with small errors, in which case both algorithms are equally efficient.

8.
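This paper studies Bayesian statistical inference for the Poisson inverse Gaussian regression model. MCMC methods, including Gibbs sampling, the Metropolis-Hastings algorithm, and the Multiple-Try Metropolis algorithm, are used to compute joint Bayesian estimates of the model's unknown parameters and latent variables, and two goodness-of-fit statistics are introduced to assess the adequacy of the proposed Poisson inverse Gaussian regression model. Several simulation studies and one empirical analysis illustrate the feasibility of the method.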

9.
This article proposes a four-pronged approach to efficient Bayesian estimation and prediction for complex Bayesian hierarchical Gaussian models for spatial and spatiotemporal data. The method involves reparameterizing the covariance structure of the model, reformulating the means structure, marginalizing the joint posterior distribution, and applying a simplex-based slice sampling algorithm. The approach permits fusion of point-source data and areal data measured at different resolutions and accommodates nonspatial correlation and variance heterogeneity as well as spatial and/or temporal correlation. The method produces Markov chain Monte Carlo samplers with low autocorrelation in the output, so that fewer iterations are needed for Bayesian inference than would be the case with other sampling algorithms. Supplemental materials are available online.

10.
A discrete image of several colors is viewed as a discrete random field obtained by clipping or quantizing a Gaussian random field at several levels. Given a discrete image, parameters of the unobserved original Gaussian random field are estimated. Discrete images, statistically similar to the original image, are then obtained by generating different realizations of the Gaussian field and clipping them. To overcome the computational difficulties, the block Toeplitz covariance matrix of the Gaussian field is embedded into a block circulant matrix which is diagonalized by the fast Fourier transform. The Gibbs sampler is used to apply the stochastic EM algorithm for the estimation of the field's parameters.
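The circulant embedding idea mentioned in this abstract can be sketched in one dimension, where a Toeplitz covariance matrix is embedded into a circulant matrix whose eigenvalues are the discrete Fourier transform of its first row. This is only an illustration under stated assumptions: it is 1D rather than the block (2D) case of the paper, it uses a naive O(m^2) DFT from the standard library instead of a fast Fourier transform, and the exponential covariance in the usage lines is an arbitrary choice. All function names are hypothetical.

```python
import cmath
import math
import random


def dft(values):
    """Naive discrete Fourier transform (stand-in for an FFT routine)."""
    m = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * j * k / m)
                for j, v in enumerate(values)) for k in range(m)]


def circulant_eigenvalues(toeplitz_row):
    """Eigenvalues of the minimal circulant embedding of a symmetric
    Toeplitz matrix with first row c_0, ..., c_{n-1}."""
    # Circulant first row: c_0, ..., c_{n-1}, c_{n-2}, ..., c_1
    circ_row = list(toeplitz_row) + list(toeplitz_row[-2:0:-1])
    return [lam.real for lam in dft(circ_row)]


def simulate_gaussian(toeplitz_row, rng):
    """Draw one zero-mean Gaussian vector with the given Toeplitz covariance."""
    lam = circulant_eigenvalues(toeplitz_row)
    if min(lam) < -1e-9:
        raise ValueError("embedding is not nonnegative definite")
    m = len(lam)
    # Complex white noise scaled by the square roots of the eigenvalues
    w = [math.sqrt(max(l, 0.0) / m) * complex(rng.gauss(0, 1), rng.gauss(0, 1))
         for l in lam]
    y = dft(w)
    # The real part of the first n entries has the target covariance
    return [v.real for v in y[:len(toeplitz_row)]]


if __name__ == "__main__":
    row = [math.exp(-0.5 * k) for k in range(8)]  # assumed covariance
    print(simulate_gaussian(row, random.Random(0)))
```

Clipping such a simulated field at fixed thresholds would give a discrete realization of the kind the abstract describes.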

11.
Contour maps are widely used to display estimates of spatial fields. Instead of showing the estimated field, a contour map only shows a fixed number of contour lines for different levels. However, despite the ubiquitous use of these maps, the uncertainty associated with them has been given a surprisingly small amount of attention. We derive measures of the statistical uncertainty, or quality, of contour maps, and use these to decide an appropriate number of contour lines, which relates to the uncertainty in the estimated spatial field. For practical use in geostatistics and medical imaging, computational methods are constructed that can be applied to Gaussian Markov random fields and, in particular, be used in combination with integrated nested Laplace approximations for latent Gaussian models. The methods are demonstrated on simulated data and an application to temperature estimation is presented.

12.
A Markov random field (MRF) is a useful technical tool for modeling dynamical systems exhibiting some type of spatio-temporal variability. In this paper, we propose optimal filters for the states of a partially observed temporal Markov random field. We also discuss parameter estimation. This generalizes an earlier work by Elliott and Aggoun [1].

13.
To every Markov process with a symmetric transition density, there correspond two random fields over the state space: a Gaussian field (the free field) φ and the occupation field T, which describes the amount of time the particle spends at each state. A relation between these two random fields is established which is useful both for field theory and for the theory of Markov processes.

14.
The evolution of DNA sequences can be described by discrete state continuous time Markov processes on a phylogenetic tree. We consider neighbor-dependent evolutionary models where the instantaneous rate of substitution at a site depends on the states of the neighboring sites. Neighbor-dependent substitution models are analytically intractable and must be analyzed using either approximate or simulation-based methods. We describe statistical inference of neighbor-dependent models using a Markov chain Monte Carlo expectation maximization (MCMC-EM) algorithm. In the MCMC-EM algorithm, the high-dimensional integrals required in the EM algorithm are estimated using MCMC sampling. The MCMC sampler requires simulation of sample paths from a continuous time Markov process, conditional on the beginning and ending states and the paths of the neighboring sites. An exact path sampling algorithm is developed for this purpose.

15.
We study a class of Gaussian random fields with negative correlations. These fields are easy to simulate. They are defined in a natural way from a Markov chain that has the index space of the Gaussian field as its state space. In parallel with Dynkin's investigation of Gaussian fields having covariance given by the Green's function of a Markov process, we develop connections between the occupation times of the Markov chain and the prediction properties of the Gaussian field. Our interest in such fields was initiated by their appearance in random matrix theory.

16.
Time series models have several functions: comprehending the functional dependence of the variable of interest on covariates, forecasting the dependent variable for future values of covariates and estimating variance disintegration, co-integration and steady-state relations. Although the regression function in a time series model has been extensively modeled both parametrically and nonparametrically, modeling of the error autocorrelation is mainly restricted to the parametric setup. A proper modeling of autocorrelation not only helps to reduce the bias in the regression function estimate, but also enriches forecasting via a better forecast of the error term. In this article, we present a nonparametric modeling of the autocorrelation function under a Bayesian framework. Moving into the frequency domain from the time domain, we place a Gaussian process prior on the log of the spectral density, which is then updated by using a Whittle approximation for the likelihood function (Whittle likelihood). The posterior computation is simplified due to the fact that the Whittle likelihood is approximated by the likelihood of a normal mixture distribution with log-spectral density as a location shift parameter, where the mixture is of only five components with known means, variances, and mixture probabilities. The problem then becomes conjugate conditional on the mixture components, and a Gibbs sampler is used to initiate the unknown mixture components as latent variables. We present a simulation study for performance comparison, and apply our method to two real data examples.
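As a rough illustration of the Whittle approximation named in this abstract, the sketch below evaluates the Whittle log-likelihood, -Σ_j [log f(ω_j) + I(ω_j)/f(ω_j)], where I is the periodogram and f a candidate spectral density. It assumes a parametric AR(1) spectral density purely for demonstration; the Gaussian process prior, the five-component normal mixture approximation and the Gibbs sampler of the paper are not implemented. The function names are hypothetical.

```python
import cmath
import math
import random


def periodogram(x):
    """Periodogram at the Fourier frequencies 2*pi*j/n, 0 < j < n/2."""
    n = len(x)
    freqs, values = [], []
    for j in range(1, (n - 1) // 2 + 1):
        w = 2 * math.pi * j / n
        d = sum(xt * cmath.exp(-1j * w * t) for t, xt in enumerate(x))
        freqs.append(w)
        values.append(abs(d) ** 2 / (2 * math.pi * n))
    return freqs, values


def ar1_spectral_density(w, phi, sigma2=1.0):
    """Spectral density of an AR(1) process (assumed model for this sketch)."""
    return sigma2 / (2 * math.pi * (1 - 2 * phi * math.cos(w) + phi * phi))


def whittle_loglik(x, phi, sigma2=1.0):
    """Whittle log-likelihood of the series x under an AR(1) spectral density."""
    freqs, pgram = periodogram(x)
    total = 0.0
    for w, i_w in zip(freqs, pgram):
        f_w = ar1_spectral_density(w, phi, sigma2)
        total += math.log(f_w) + i_w / f_w
    return -total


def simulate_ar1(n, phi, seed=0):
    """Simulate a stationary AR(1) series with unit innovation variance."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1 / math.sqrt(1 - phi * phi))]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x


if __name__ == "__main__":
    series = simulate_ar1(200, 0.6, seed=3)
    for phi in (-0.6, 0.0, 0.6):
        print(phi, whittle_loglik(series, phi))
```

In the nonparametric setting of the abstract, the parametric `ar1_spectral_density` would be replaced by a spectral density whose logarithm carries the Gaussian process prior.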

17.
Fitting hierarchical Bayesian models to spatially correlated datasets using Markov chain Monte Carlo (MCMC) techniques is computationally expensive. Complicated covariance structures of the underlying spatial processes, together with high-dimensional parameter space, mean that the number of calculations required grows cubically with the number of spatial locations at each MCMC iteration. This necessitates efficient model parameterizations that hasten the convergence and improve the mixing of the associated algorithms. We consider partially centered parameterizations (PCPs) which lie on a continuum between what are known as the centered (CP) and noncentered parameterizations (NCP). By introducing a weight matrix we remove the conditional posterior correlation between the fixed and the random effects, and hence construct a PCP which achieves immediate convergence for a three-stage model, based on multiple Gaussian processes with known covariance parameters. When the covariance parameters are unknown we dynamically update the parameterization within the sampler. The PCP outperforms both the CP and the NCP and leads to a fully automated algorithm which has been demonstrated in two simulation examples. The effectiveness of the spatially varying PCP is illustrated with a practical dataset of nitrogen dioxide concentration levels. Supplemental materials consisting of appendices, datasets, and computer code to reproduce the results are available online.

18.
We describe various sets of conditional independence relationships, sufficient for qualitatively comparing non-vanishing squared partial correlations of a Gaussian random vector. These sufficient conditions are satisfied by several graphical Markov models. Rules for comparing degree of association among the vertices of such Gaussian graphical models are also developed. We apply these rules to compare conditional dependencies on Gaussian trees. In particular for trees, we show that such dependence can be completely characterised by the length of the paths joining the dependent vertices to each other and to the vertices conditioned on. We also apply our results to postulate rules for model selection for polytree models. Our rules apply to mutual information of Gaussian random vectors as well.

19.
This article compares three binary Markov random fields (MRFs) which are popular Bayesian priors for spatial smoothing. These are the Ising prior and two priors based on latent Gaussian MRFs. Concern is given to the selection of a suitable Markov chain Monte Carlo (MCMC) sampling scheme for each prior. The properties of the three priors and sampling schemes are investigated in the context of three empirical examples. The first is a simulated dataset, the second involves a confocal fluorescence microscopy dataset, while the third is based on the analysis of functional magnetic resonance imaging (fMRI) data. In the case of the Ising prior, single site and multi-site Swendsen-Wang sampling schemes are both considered. The single site scheme is shown to work consistently well, while it is shown that the Swendsen-Wang algorithm can have convergence problems. The sampling schemes for the priors are extended to generate the smoothing parameters, so that estimation becomes fully automatic. Although this works well, it is found that for highly contiguous images fixing smoothing parameters to very high values can improve results by injecting additional prior information concerning the level of contiguity in the image. The relative properties of the three binary MRFs are investigated, and it is shown how the Ising prior in particular defines sharp edges and encourages clustering. In addition, one of the latent Gaussian MRF priors is shown to be unable to distinguish between higher levels of smoothing. In the context of the fMRI example we also undertake a simulation study.
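The single site scheme for the Ising prior mentioned in this abstract can be sketched as follows. The sketch assumes spins in {-1, +1} on a square grid with free boundaries, a first-order (four-neighbor) neighborhood, and a prior proportional to exp(β Σ x_i x_j) over neighboring pairs with a fixed smoothing parameter `beta`; it draws from the prior only, with no data likelihood and no sampling of the smoothing parameter. The function names are hypothetical.

```python
import math
import random


def ising_gibbs(size, beta, n_sweeps, seed=0):
    """Single-site Gibbs sampling from an Ising prior on a size x size grid."""
    rng = random.Random(seed)
    grid = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]
    for _ in range(n_sweeps):
        for r in range(size):
            for c in range(size):
                # Sum of the (up to four) neighboring spins
                s = 0
                if r > 0:
                    s += grid[r - 1][c]
                if r < size - 1:
                    s += grid[r + 1][c]
                if c > 0:
                    s += grid[r][c - 1]
                if c < size - 1:
                    s += grid[r][c + 1]
                # Full conditional: P(x = +1 | neighbors) = 1 / (1 + exp(-2*beta*s))
                p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * s))
                grid[r][c] = 1 if rng.random() < p_plus else -1
    return grid


def neighbor_agreement(grid):
    """Fraction of horizontally or vertically adjacent pairs with equal spins."""
    size = len(grid)
    agree = total = 0
    for r in range(size):
        for c in range(size):
            if c + 1 < size:
                total += 1
                agree += grid[r][c] == grid[r][c + 1]
            if r + 1 < size:
                total += 1
                agree += grid[r][c] == grid[r + 1][c]
    return agree / total


if __name__ == "__main__":
    print(neighbor_agreement(ising_gibbs(12, 0.8, 40, seed=2)))
    print(neighbor_agreement(ising_gibbs(12, 0.0, 40, seed=2)))
```

Larger values of `beta` favor agreement between neighbors, which is the clustering behavior the abstract attributes to the Ising prior.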

20.
We describe adaptive Markov chain Monte Carlo (MCMC) methods for sampling posterior distributions arising from Bayesian variable selection problems. Point-mass mixture priors are commonly used in Bayesian variable selection problems in regression. However, for generalized linear and nonlinear models where the conditional densities cannot be obtained directly, the resulting mixture posterior may be difficult to sample using standard MCMC methods due to multimodality. We introduce an adaptive MCMC scheme that automatically tunes the parameters of a family of mixture proposal distributions during simulation. The resulting chain adapts to sample efficiently from multimodal target distributions. For variable selection problems point-mass components are included in the mixture, and the associated weights adapt to approximate marginal posterior variable inclusion probabilities, while the remaining components approximate the posterior over nonzero values. The resulting sampler transitions efficiently between models, performing parameter estimation and variable selection simultaneously. Ergodicity and convergence are guaranteed by limiting the adaptation based on recent theoretical results. The algorithm is demonstrated on a logistic regression model, a sparse kernel regression, and a random field model from statistical biophysics; in each case the adaptive algorithm dramatically outperforms traditional MH algorithms. Supplementary materials for this article are available online.
