Similar Articles
20 similar articles found.
1.
We describe adaptive Markov chain Monte Carlo (MCMC) methods for sampling posterior distributions arising from Bayesian variable selection problems. Point-mass mixture priors are commonly used in Bayesian variable selection problems in regression. However, for generalized linear and nonlinear models where the conditional densities cannot be obtained directly, the resulting mixture posterior may be difficult to sample using standard MCMC methods due to multimodality. We introduce an adaptive MCMC scheme that automatically tunes the parameters of a family of mixture proposal distributions during simulation. The resulting chain adapts to sample efficiently from multimodal target distributions. For variable selection problems, point-mass components are included in the mixture, and the associated weights adapt to approximate marginal posterior variable inclusion probabilities, while the remaining components approximate the posterior over nonzero values. The resulting sampler transitions efficiently between models, performing parameter estimation and variable selection simultaneously. Ergodicity and convergence are guaranteed by limiting the adaptation based on recent theoretical results. The algorithm is demonstrated on a logistic regression model, a sparse kernel regression, and a random field model from statistical biophysics; in each case the adaptive algorithm dramatically outperforms traditional Metropolis-Hastings (MH) algorithms. Supplementary materials for this article are available online.
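The adaptive mixture-proposal sampler described above is specific to that article, but the underlying idea of diminishing adaptation is easy to illustrate. The sketch below (Python, illustrative only) tunes the scale of a plain random-walk Metropolis sampler with a Robbins-Monro step that shrinks as 1/sqrt(i), one simple way to keep the chain ergodic; the toy target, step sizes and acceptance-rate goal are arbitrary choices, and this is not the authors' point-mass mixture algorithm.

```python
import numpy as np

def adaptive_rw_metropolis(log_post, x0, n_iter=5000, target_accept=0.44, seed=0):
    """Random-walk Metropolis with diminishing (Robbins-Monro) adaptation of
    the proposal scale; adaptation vanishes as 1/sqrt(i)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    log_scale = 0.0
    lp = log_post(x)
    samples = np.empty((n_iter, x.size))
    for i in range(1, n_iter + 1):
        prop = x + np.exp(log_scale) * rng.standard_normal(x.size)
        lp_prop = log_post(prop)
        accept = np.log(rng.uniform()) < lp_prop - lp
        if accept:
            x, lp = prop, lp_prop
        # Robbins-Monro step toward the target acceptance rate
        log_scale += (float(accept) - target_accept) / np.sqrt(i)
        samples[i - 1] = x
    return samples

# toy target: standard bivariate normal, started far from the mode
draws = adaptive_rw_metropolis(lambda x: -0.5 * np.sum(x**2), x0=[3.0, -3.0])
print(draws[1000:].mean(axis=0), draws[1000:].std(axis=0))
```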

2.
3.
Hidden Markov models are used as tools for pattern recognition in a number of areas, ranging from speech processing to biological sequence analysis. Profile hidden Markov models represent a class of so-called “left–right” models that have an architecture that is specifically relevant to classification of proteins into structural families based on their amino acid sequences. Standard learning methods for such models employ a variety of heuristics applied to the expectation-maximization implementation of the maximum likelihood estimation procedure in order to find the global maximum of the likelihood function. Here, we compare maximum likelihood estimation to fully Bayesian estimation of parameters for profile hidden Markov models with a small number of parameters. We find that, relative to maximum likelihood methods, Bayesian methods assign higher scores to data sequences that are distantly related to the pattern consensus, show better performance in classifying these sequences correctly, and continue to perform robustly with regard to misspecification of the number of model parameters. Though our study is limited in scope, we expect our results to remain relevant for models with a large number of parameters and other types of left–right hidden Markov models.
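Whatever the estimation method, sequences are scored against a hidden Markov model via the forward algorithm. The following minimal log-space sketch computes that score for a toy two-state, two-symbol HMM; it is not a protein profile HMM, and the parameter values are arbitrary.

```python
import numpy as np
from scipy.special import logsumexp

def hmm_log_likelihood(obs, log_pi, log_A, log_B):
    """Forward algorithm in log space: log P(observations | HMM), the score
    used to decide whether a sequence belongs to a family."""
    log_alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # sum over previous states, then emit the next symbol
        log_alpha = logsumexp(log_alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return logsumexp(log_alpha)

# toy model: initial distribution, transition matrix A[i, j], emission matrix B[i, symbol]
log_pi = np.log([0.6, 0.4])
log_A = np.log([[0.7, 0.3], [0.2, 0.8]])
log_B = np.log([[0.9, 0.1], [0.3, 0.7]])
print(hmm_log_likelihood([0, 1, 1, 0], log_pi, log_A, log_B))
```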

4.
The unknown or unobservable risk factors in survival analysis cause heterogeneity between individuals. Frailty models are used in survival analysis to account for the unobserved heterogeneity in individual risks of disease and death. In this paper, we suggest the shared gamma frailty model with the reversed hazard rate. We introduce a Bayesian estimation procedure using MCMC techniques to estimate the parameters involved in the model and compare the frailty model with the baseline model. We apply the proposed models to an Australian twin data set and identify the better-fitting model.

5.
In this article we discuss the problem of assessing the performance of Markov chain Monte Carlo (MCMC) algorithms on the basis of simulation output. In essence, we extend the original ideas of Gelman and Rubin and, more recently, Brooks and Gelman, to problems where we are able to split the variation inherent within the MCMC simulation output into two distinct groups. We show how such a diagnostic may be useful in assessing the performance of MCMC samplers addressing model choice problems, such as the reversible jump MCMC algorithm. In the model choice context, we show how the reversible jump MCMC simulation output for parameters that retain a coherent interpretation throughout the simulation can be used to assess convergence. By considering various decompositions of the sampling variance of this parameter, we can assess the performance of our MCMC sampler in terms of its mixing properties both within and between models, and we illustrate our approach in both the graphical Gaussian models and normal mixtures contexts. Finally, we provide an example of the application of our diagnostic to the assessment of the influence of different starting values on MCMC simulation output, thereby illustrating the wider utility of our method beyond the Bayesian model choice and reversible jump MCMC context.
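The diagnostic above extends the Gelman-Rubin statistic, whose basic within/between-chain variance decomposition can be sketched in a few lines. This is the standard scalar R-hat, not the authors' extended model-choice diagnostic, and the simulated chains are purely illustrative.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for a scalar parameter.
    `chains` has shape (m, n): m chains of length n, burn-in already removed."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)           # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(1)
good = rng.normal(size=(4, 2000))                      # well-mixed chains
stuck = good + np.array([[0.0], [0.0], [3.0], [3.0]])  # two chains sit in another mode
print(gelman_rubin(good), gelman_rubin(stuck))         # ~1.0 versus clearly > 1.1
```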

6.
Extreme value theory has been widely used in analyzing catastrophic risk. The theory states that the generalized Pareto distribution (GPD) can be used to estimate the limiting distribution of excesses over a sufficiently high threshold, so that tail behavior can be analyzed. However, the central behavior is also important, because it may affect the estimation of the GPD parameters, and the evaluation of catastrophe insurance premiums also depends on the central behavior. This paper proposes four mixture models for earthquake catastrophe losses and proposes Bayesian approaches to estimate the unknown parameters and the threshold in these mixture models. MCMC methods are used to calculate the Bayesian estimates of the model parameters, and deviance information criterion values are obtained for model comparison. Earthquake losses in Yunnan province are analyzed to illustrate the proposed methods. Results show that the estimates of the threshold and of the GPD shape and scale parameters differ considerably across the models. Value-at-risk and expected shortfall for the proposed mixture models are calculated at different confidence levels.
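Given a threshold u, GPD shape and scale estimates, and the empirical exceedance probability, tail value-at-risk and expected shortfall follow from the standard peaks-over-threshold formulas. The sketch below uses invented parameter values purely for illustration; they are not estimates from the Yunnan earthquake data.

```python
import numpy as np

def gpd_var_es(p, u, xi, sigma, zeta_u):
    """Tail VaR and expected shortfall at level p for a GPD(xi, sigma) fitted
    to excesses over threshold u, where zeta_u = P(X > u).
    Standard POT formulas; require xi != 0 for VaR and xi < 1 for ES."""
    var_p = u + (sigma / xi) * (((1.0 - p) / zeta_u) ** (-xi) - 1.0)
    es_p = var_p / (1.0 - xi) + (sigma - xi * u) / (1.0 - xi)
    return var_p, es_p

# hypothetical values: threshold 50 (loss units), xi = 0.3, sigma = 20,
# and 8% of observed losses exceeding the threshold
for p in (0.95, 0.99, 0.995):
    var_p, es_p = gpd_var_es(p, u=50.0, xi=0.3, sigma=20.0, zeta_u=0.08)
    print(f"p={p}: VaR={var_p:.1f}, ES={es_p:.1f}")
```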

7.
This paper studies Bayesian inference for the Poisson inverse-Gaussian regression model. MCMC methods, including Gibbs sampling, the Metropolis-Hastings algorithm and the Multiple-Try Metropolis algorithm, are applied to compute joint Bayesian estimates of the unknown model parameters and latent variables, and two goodness-of-fit statistics are introduced to assess the adequacy of the proposed Poisson inverse-Gaussian regression model. Several simulation studies and an empirical analysis illustrate the feasibility of the method.
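Of the samplers mentioned, Multiple-Try Metropolis is probably the least familiar. The following sketch implements the symmetric-proposal variant with weights proportional to the posterior density, applied to a toy Gaussian target rather than the paper's Poisson inverse-Gaussian model; the number of trials, step size and starting point are arbitrary.

```python
import numpy as np
from scipy.special import logsumexp

def multiple_try_metropolis(log_post, x0, n_iter=5000, k=5, step=0.5, seed=0):
    """Multiple-Try Metropolis with a symmetric Gaussian proposal and
    weights w(y) = pi(y); a minimal sketch, not the paper's sampler."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    out = np.empty((n_iter, x.size))
    for i in range(n_iter):
        trials = x + step * rng.standard_normal((k, x.size))
        lw = np.array([log_post(t) for t in trials])
        j = rng.choice(k, p=np.exp(lw - logsumexp(lw)))    # pick one trial point
        y = trials[j]
        refs = y + step * rng.standard_normal((k - 1, x.size))
        lw_ref = np.append([log_post(r) for r in refs], log_post(x))
        # accept with probability min(1, sum(pi(trials)) / sum(pi(reference points)))
        if np.log(rng.uniform()) < logsumexp(lw) - logsumexp(lw_ref):
            x = y
        out[i] = x
    return out

draws = multiple_try_metropolis(lambda b: -0.5 * np.sum(b**2), x0=[4.0, 4.0])
print(draws[1000:].mean(axis=0))
```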

8.
Data generated in forestry biometrics are not normal in the statistical sense, as they rarely follow the normal regression model. Hence, there is a need to develop models and methods for non-normal data in forest biometric applications. Because of their generality, Bayesian methods can be applied in situations where Gaussian regression models do not fit the data. Data on diameter at breast height (dbh), a very important characteristic in forestry, are fitted to Weibull and gamma models in the Bayesian paradigm, and comparisons are made with the classical counterparts. MCMC simulation tools are used throughout this study, and the Bayesian computations are carried out in the R software.

9.
Parametric mortality models capture the cross section of mortality rates. These models fit the older ages better because the cross section of mortality at younger and middle ages is more complex. Dynamic parametric mortality models fit a time series to the parameters, such as a vector autoregression (VAR), in order to capture trends and uncertainty in mortality improvements. We consider the full age range using the Heligman and Pollard (1980) model, a cross-sectional mortality model with parameters that capture specific features of different age ranges. We make the Heligman–Pollard model dynamic using a Bayesian vector autoregressive (BVAR) model for the parameters and compare it with more commonly used VAR models. We fit the models to Australian data; Australia has a mortality experience similar to that of many developed countries. We show how the BVAR models improve forecast accuracy compared to VAR models and quantify parameter risk, which is shown to be significant.
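For reference, the Heligman–Pollard (1980) law expresses the odds of death q_x/(1 − q_x) as the sum of a child-mortality term, an accident hump and a Gompertz-like old-age term. The sketch below evaluates the curve over the full age range; the parameter values are invented for illustration and are not fitted Australian estimates.

```python
import numpy as np

def heligman_pollard_qx(ages, A, B, C, D, E, F, G, H):
    """Heligman-Pollard (1980) law: q_x / (1 - q_x) = A^((x+B)^C)
    + D*exp(-E*(ln x - ln F)^2) + G*H^x; returns q_x over `ages`."""
    x = np.asarray(ages, dtype=float)
    ratio = (A ** ((x + B) ** C)                              # infant/child decline
             + D * np.exp(-E * (np.log(x) - np.log(F)) ** 2)  # accident hump
             + G * H ** x)                                     # old-age term
    return ratio / (1.0 + ratio)                               # convert odds back to q_x

# purely illustrative parameter values
qx = heligman_pollard_qx(np.arange(1, 101),
                         A=5e-4, B=0.01, C=0.10, D=1e-3, E=10.0, F=20.0,
                         G=5e-5, H=1.10)
print(qx[[0, 19, 59, 99]])   # q_1, q_20, q_60, q_100
```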

10.
Complex hierarchical models lead to a complicated likelihood and then, in a Bayesian analysis, to complicated posterior distributions. To obtain Bayes estimates such as the posterior mean or Bayesian confidence regions, it is therefore necessary to simulate the posterior distribution using a method such as an MCMC algorithm. These algorithms often get slower as the number of observations increases, especially when the latent variables are considered. To improve the convergence of the algorithm, we propose to decrease the number of parameters to simulate at each iteration by using a Laplace approximation on the nuisance parameters. We provide a theoretical study of the impact that such an approximation has on the target posterior distribution. We prove that the distance between the true target distribution and the approximation becomes of order O(N^{-a}) with a ∈ (0, 1), a close to 1, as the number of observations N increases. A simulation study illustrates the theoretical results. The approximated MCMC algorithm behaves extremely well on an example which is driven by a study on HIV patients.
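The integration over nuisance parameters can be illustrated with a one-dimensional Laplace approximation. The gamma-kernel test integral below has a closed form, so the accuracy of the approximation can be checked directly; this is a generic sketch, not the approximated MCMC algorithm of the paper, and the bounds and test values are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

def laplace_log_integral(log_f, bounds):
    """Laplace approximation to log of the integral of exp(log_f(lam)) over a
    scalar nuisance parameter: maximize log_f, then use a finite-difference
    second derivative at the mode."""
    res = minimize_scalar(lambda lam: -log_f(lam), bounds=bounds, method="bounded")
    lam_hat = res.x
    h = 1e-4 * max(abs(lam_hat), 1.0)
    d2 = (log_f(lam_hat + h) - 2.0 * log_f(lam_hat) + log_f(lam_hat - h)) / h**2
    return log_f(lam_hat) + 0.5 * np.log(2.0 * np.pi / -d2)

# gamma-kernel test case: integral of lam^a * exp(-b*lam) over (0, inf)
# equals Gamma(a+1) / b^(a+1), so the approximation can be checked exactly
a, b = 12.0, 3.0
approx = laplace_log_integral(lambda lam: a * np.log(lam) - b * lam, bounds=(1e-6, 100.0))
exact = gammaln(a + 1.0) - (a + 1.0) * np.log(b)
print(approx, exact)   # the two log-integrals should agree closely
```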

11.
A hierarchical model is developed for the joint mortality analysis of pension scheme datasets. The proposed model allows for a rigorous statistical treatment of missing data. While our approach works for any missing-data pattern, we are particularly interested in a scenario where some covariates are observed for members of one pension scheme but not the other. Therefore, our approach allows for the joint modelling of datasets which contain different information about individual lives. The proposed model generalizes the specification of parametric models when accounting for covariates. We consider parameter uncertainty using Bayesian techniques. Model parametrization is analysed in order to obtain an efficient MCMC sampler, and model selection is addressed. The inferential framework described here accommodates any missing-data pattern and turns out to be useful for analysing statistical relationships among covariates. Finally, we assess the financial impact of using the covariates, and of the optimal use of the whole available sample when combining data from different mortality experiences.

12.
We present a unified semiparametric Bayesian approach based on Markov random field priors for analyzing the dependence of multicategorical response variables on time, space and further covariates. The general model extends dynamic, or state space, models for categorical time series and longitudinal data by including spatial effects as well as nonlinear effects of metrical covariates in flexible semiparametric form. Trend and seasonal components, different types of covariates and spatial effects are all treated within the same general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference is fully Bayesian and uses MCMC techniques for posterior analysis. The approach in this paper is based on latent semiparametric utility models and is particularly useful for probit models. The methods are illustrated by applications to unemployment data and a forest damage survey.

13.
Stochastic epidemic models describe the dynamics of an epidemic as a disease spreads through a population. Typically, only a fraction of cases are observed at a set of discrete times. The absence of complete information about the time evolution of an epidemic gives rise to a complicated latent variable problem in which the state space size of the epidemic grows large as the population size increases. This makes analytically integrating over the missing data infeasible for populations of even moderate size. We present a data augmentation Markov chain Monte Carlo (MCMC) framework for Bayesian estimation of stochastic epidemic model parameters, in which measurements are augmented with subject-level disease histories. In our MCMC algorithm, we propose each new subject-level path, conditional on the data, using a time-inhomogeneous continuous-time Markov process with rates determined by the infection histories of other individuals. The method is general, and may be applied to a broad class of epidemic models with only minimal modifications to the model dynamics and/or emission distribution. We present our algorithm in the context of multiple stochastic epidemic models in which the data are binomially sampled prevalence counts, and apply our method to data from an outbreak of influenza in a British boarding school. Supplementary material for this article is available online.
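A minimal example of the kind of latent epidemic path being imputed is a stochastic SIR model simulated exactly with the Gillespie algorithm. The rates below and the roughly boarding-school-sized population are illustrative, and this forward simulator is not the data-augmentation sampler itself.

```python
import numpy as np

def gillespie_sir(S, I, beta, gamma, rng=None, t_max=100.0):
    """Exact stochastic simulation of a closed SIR epidemic with infection
    rate beta*S*I/N and recovery rate gamma*I."""
    rng = rng or np.random.default_rng()
    N, R, t = S + I, 0, 0.0
    events = [(t, S, I)]
    while I > 0 and t < t_max:
        rate_inf = beta * S * I / N
        rate_rec = gamma * I
        total = rate_inf + rate_rec
        t += rng.exponential(1.0 / total)      # time to the next event
        if rng.uniform() < rate_inf / total:
            S, I = S - 1, I + 1                # infection event
        else:
            I, R = I - 1, R + 1                # recovery event
        events.append((t, S, I))
    return np.array(events)

path = gillespie_sir(S=762, I=1, beta=2.0, gamma=0.5, rng=np.random.default_rng(3))
print(path.shape, path[-1])   # event count and final (time, S, I) state
```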

14.
This paper studies B-spline Bayes estimation for generalized nonparametric models. The regression function is expanded in a B-spline basis; instead of fixing the number of knots, a uniform noninformative prior is placed on the number of knots and a normal prior on the spline coefficients, and the regression function is estimated by the posterior mean of the B-spline expansion. An MCMC scheme for computing the B-spline Bayes estimate of the regression function is given. A simulation study of nonparametric logistic regression shows that the B-spline Bayes estimator performs very well.
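The B-spline basis expansion underlying this estimator can be sketched as follows: each column of the design matrix is one basis function evaluated at the observed covariates. The knot locations, degree and simulated logistic-type data below are arbitrary illustrative choices, not those of the paper.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_design(x, knots_inner, degree=3):
    """Design matrix of B-spline basis functions evaluated at x,
    with clamped boundary knots at min(x) and max(x)."""
    t = np.concatenate([np.repeat(x.min(), degree + 1),
                        knots_inner,
                        np.repeat(x.max(), degree + 1)])
    n_basis = len(t) - degree - 1
    eye = np.eye(n_basis)
    # evaluate each basis function by using a unit coefficient vector
    return np.column_stack([BSpline(t, eye[j], degree)(x) for j in range(n_basis)])

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
eta = np.sin(2 * np.pi * x)                        # true smooth predictor
y = rng.binomial(1, 1 / (1 + np.exp(-3 * eta)))    # binary (logistic-type) responses
B = bspline_design(x, knots_inner=np.linspace(0.1, 0.9, 9))
print(B.shape)   # (200, number of basis functions)
```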

15.
We develop nonhomogeneous Poisson process (NHPP) models to characterize categorized event data, with application to modelling the discovery process for categorized software defects. Conditioning on the total number of defects, multivariate models are proposed for modelling the defects by type. A latent vector autoregressive structure is used to characterize dependencies among the different types. We show how Bayesian inference can be achieved via MCMC procedures, with a posterior prediction-based L-measure used for model selection. The results are illustrated for defects of different types found during the System Test phase of a large operating system software development project.

16.
Evaluation tests for air surveillance radars are often formulated in terms of the probability of detecting a target at a specified range. Statistical methods applied in these tests do not explore all the data in a full probabilistic model, which is crucial when dealing with small samples. The collected data are arranged longitudinally at different levels (altitudes), indexed both in time and in distance. In this context we propose the application of dynamic Bayesian hierarchical models as an efficient way to incorporate the complete data set. Markov chain Monte Carlo (MCMC) methods are used to make inference and to evaluate the proposed models.

17.
In the present paper we study switching state space models from a Bayesian point of view. We discuss various MCMC methods for Bayesian estimation, among them unconstrained Gibbs sampling, constrained sampling and permutation sampling. We address in detail the problem of unidentifiability, and discuss potential information available from an unidentified model. Furthermore the paper discusses issues in model selection such as selecting the number of states or testing for the presence of Markov switching heterogeneity. The model likelihoods of all possible hypotheses are estimated by using the method of bridge sampling. We conclude the paper with applications to simulated data as well as to modelling the U.S./U.K. real exchange rate.

18.
Bayesian model selection depends on the integrated likelihood of the data given the model. Newton and Raftery’s harmonic mean estimator (HME) is simple to implement by computing the likelihood of the data at MCMC draws from the posterior distribution. Alternative methods in the literature require additional simulations or more extensive computations. In theory HME is consistent but can have an infinite variance. In practice, the computed HME often is simulation pseudo-biased. This article identifies the source of the pseudo-bias and recommends several algorithms for adjusting the HME to remove it. The pseudo-bias can be substantial and can negatively affect HME’s ability to select the correct model in Bayesian model selection. The pseudo-bias often causes the computed HME to overestimate the integrated likelihood, and the amount of pseudo-bias tends to be larger for more complex models. When the computed HME errs, it tends to select models that are too complex. Simulation studies of linear and logistic regression models demonstrate that the adjusted HME effectively removes the pseudo-bias, is more accurate, and indicates more reliably the best model.

Supplemental materials are available online. These materials include the appendices and a Gauss program.
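The basic estimator is easy to state: the integrated likelihood is approximated by the harmonic mean of the likelihood over posterior draws, computed stably on the log scale. The conjugate normal toy example below has a known marginal likelihood, so the tendency of the raw (unadjusted) HME to overestimate it can be seen directly; the adjusted estimators of the article are not reproduced here.

```python
import numpy as np
from scipy.special import logsumexp

def log_harmonic_mean_estimator(log_lik_draws):
    """Newton-Raftery harmonic mean estimate of the log integrated likelihood
    from log-likelihood values evaluated at posterior draws."""
    ll = np.asarray(log_lik_draws, dtype=float)
    return -(logsumexp(-ll) - np.log(ll.size))

# toy conjugate model: y_i ~ N(mu, 1) with prior mu ~ N(0, 1),
# so exact posterior draws and the exact marginal likelihood are available
rng = np.random.default_rng(7)
y = rng.normal(1.0, 1.0, size=50)
n = y.size
post_var = 1.0 / (n + 1.0)
mu_draws = rng.normal(post_var * y.sum(), np.sqrt(post_var), size=20000)
log_lik = np.array([-0.5 * np.sum((y - m) ** 2) - 0.5 * n * np.log(2 * np.pi)
                    for m in mu_draws])
exact = -0.5 * (n * np.log(2 * np.pi) + np.log(1.0 + n)
                + np.sum(y**2) - y.sum()**2 / (1.0 + n))
print(log_harmonic_mean_estimator(log_lik), exact)   # HME typically sits above the truth
```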

19.
In this paper, the use of a stochastic optimization algorithm as a model search tool is proposed for the Bayesian variable selection problem in generalized linear models. Combining aspects of three well-known stochastic optimization algorithms, namely simulated annealing, the genetic algorithm and tabu search, a powerful model search algorithm is produced. After choosing suitable priors, the posterior model probability is used as a criterion function for the algorithm; in cases where it is not analytically tractable, a Laplace approximation is used. The proposed algorithm is illustrated on normal linear and logistic regression models, for simulated and real-life examples, and it is shown that, with a very low computational cost, it achieves improved performance when compared with popular MCMC algorithms, such as MCMC model composition, as well as with “vanilla” versions of simulated annealing, the genetic algorithm and tabu search.
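A stripped-down version of the model-search idea, using plain simulated annealing with BIC standing in for the (Laplace-approximated) posterior model probability, might look like the following. The hybrid algorithm of the paper also mixes in genetic-algorithm and tabu-search moves, which are not shown; the cooling schedule, data and tuning constants here are arbitrary.

```python
import numpy as np

def bic_linear(X, y, gamma):
    """BIC of the Gaussian linear model using only columns where gamma == 1
    (an intercept is always included); a stand-in model-search criterion."""
    n = y.size
    Xg = np.column_stack([np.ones(n), X[:, gamma.astype(bool)]])
    beta, *_ = np.linalg.lstsq(Xg, y, rcond=None)
    rss = np.sum((y - Xg @ beta) ** 2)
    return n * np.log(rss / n) + Xg.shape[1] * np.log(n)

def anneal_search(X, y, n_iter=2000, t0=2.0, seed=0):
    """Simulated-annealing search over inclusion vectors: flip one variable at
    a time and accept worse models with a temperature-dependent probability."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    gamma = rng.integers(0, 2, p)
    best = cur = bic_linear(X, y, gamma)
    best_gamma = gamma.copy()
    for i in range(n_iter):
        temp = t0 / np.log(i + 2)              # slowly decreasing temperature
        cand = gamma.copy()
        cand[rng.integers(p)] ^= 1             # flip one inclusion indicator
        new = bic_linear(X, y, cand)
        if new < cur or rng.uniform() < np.exp((cur - new) / temp):
            gamma, cur = cand, new
        if cur < best:
            best, best_gamma = cur, gamma.copy()
    return best_gamma, best

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 10))
y = X[:, [0, 3, 7]] @ np.array([2.0, -1.5, 1.0]) + rng.normal(size=150)
print(anneal_search(X, y))   # should recover variables 0, 3 and 7
```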

20.
We describe a serial algorithm called feature-inclusion stochastic search, or FINCS, that uses online estimates of edge-inclusion probabilities to guide Bayesian model determination in Gaussian graphical models. FINCS is compared to MCMC, to Metropolis-based search methods, and to the popular lasso; it is found to be superior along a variety of dimensions, leading to better sets of discovered models, greater speed and stability, and reasonable estimates of edge-inclusion probabilities. We illustrate FINCS on an example involving mutual-fund data, where we compare the model-averaged predictive performance of models discovered with FINCS to those discovered by competing methods.
