Found 20 similar documents (search time: 15 ms)
1.
《Journal of computational and graphical statistics》2013,22(3):702-718
When using a model-based approach to geostatistical problems, often, due to the complexity of the models, inference relies on Markov chain Monte Carlo methods. This article focuses on the generalized linear spatial models, and demonstrates that parameter estimation and model selection using Markov chain Monte Carlo maximum likelihood is a feasible and very useful technique. A dataset of radionuclide concentrations on Rongelap Island is used to illustrate the techniques. For this dataset we demonstrate that the log-link function is not a good choice, and that there exists additional nonspatial variation which cannot be attributed to the Poisson error distribution. We also show that the interpretation of this additional variation as either micro-scale variation or measurement error has a significant impact on predictions. The techniques presented in this article would also be useful for other types of geostatistical models.
2.
Steven J. Lewis Alpan Raval John E. Angus 《Mathematical and Computer Modelling》2008,47(11-12):1198-1216
Hidden Markov models are used as tools for pattern recognition in a number of areas, ranging from speech processing to biological sequence analysis. Profile hidden Markov models represent a class of so-called “left–right” models that have an architecture that is specifically relevant to classification of proteins into structural families based on their amino acid sequences. Standard learning methods for such models employ a variety of heuristics applied to the expectation-maximization implementation of the maximum likelihood estimation procedure in order to find the global maximum of the likelihood function. Here, we compare maximum likelihood estimation to fully Bayesian estimation of parameters for profile hidden Markov models with a small number of parameters. We find that, relative to maximum likelihood methods, Bayesian methods assign higher scores to data sequences that are distantly related to the pattern consensus, show better performance in classifying these sequences correctly, and continue to perform robustly with regard to misspecification of the number of model parameters. Though our study is limited in scope, we expect our results to remain relevant for models with a large number of parameters and other types of left–right hidden Markov models.
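As background for how such models score sequences, the following is a minimal sketch (my own illustration, not the authors' code) of the forward algorithm in log space for a discrete HMM; left–right architectures are simply the case where transitions back to earlier states have probability zero. All parameter values in the usage example are made up.

```python
import math

def _logsumexp(xs):
    """Numerically stable log(sum(exp(xs)))."""
    m = max(xs)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_loglik(obs, init, trans, emit):
    """Forward algorithm in log space for a discrete HMM.
    obs: list of symbol indices; init[i], trans[i][j], emit[i][k] are
    probabilities. Zero probabilities are allowed, as in left-right
    models where backward transitions are forbidden."""
    def lg(p):
        return math.log(p) if p > 0 else float("-inf")
    n = len(init)
    # initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [lg(init[i]) + lg(emit[i][obs[0]]) for i in range(n)]
    # recursion: alpha_t(j) = [sum_i alpha_{t-1}(i) a_ij] * b_j(o_t)
    for o in obs[1:]:
        alpha = [_logsumexp([alpha[i] + lg(trans[i][j]) for i in range(n)])
                 + lg(emit[j][o]) for j in range(n)]
    return _logsumexp(alpha)
```

For a two-state left–right model with `init = [1, 0]`, `trans = [[0.5, 0.5], [0, 1]]`, and `emit = [[0.9, 0.1], [0.1, 0.9]]`, the sequence `[0, 1]` has likelihood 0.9 × (0.5 × 0.1 + 0.5 × 0.9) = 0.45, which the function recovers.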
3.
Bayesian hierarchical models have been used for smoothing splines, thin-plate splines, and L-splines. In analyzing high dimensional data sets, additive models and backfitting methods are often used. A full Bayesian analysis for such models may include a large number of random effects, many of which are not intuitive, so researchers typically use noninformative improper or nearly improper priors. We investigate propriety of the posterior for these cases. Our findings extend known results for normal linear mixed models to certain cases with Bayesian additive smoothing spline models. Supported by National Science Foundation grant SES-0351523 and by National Institutes of Health grants R01-CA100760 and R01-MH071418.
4.
Tore Selland Kleppe 《Journal of computational and graphical statistics》2013,22(3):493-507
Dynamically rescaled Hamiltonian Monte Carlo is introduced as a computationally fast and easily implemented method for performing full Bayesian analysis in hierarchical statistical models. The method relies on introducing a modified parameterization so that the reparameterized target distribution has close to constant scaling properties, and thus is easily sampled using standard (Euclidian metric) Hamiltonian Monte Carlo. Provided that the parameterizations of the conditional distributions specifying the hierarchical model are “constant information parameterizations” (CIPs), the relation between the modified and original parameterizations is bijective, explicitly computed, and admits exploitation of sparsity in the numerical linear algebra involved. CIPs for a large catalogue of statistical models are presented, and from the catalogue, it is clear that many CIPs are currently routinely used in statistical computing. A relation between the proposed methodology and a class of explicitly integrated Riemann manifold Hamiltonian Monte Carlo methods is discussed. The methodology is illustrated on several example models, including a model for inflation rates with multiple levels of nonlinearly dependent latent variables. Supplementary materials for this article are available online.
5.
Jeffrey W. Miller 《Journal of computational and graphical statistics》2019,28(2):476-480
The gamma distribution arises frequently in Bayesian models, but there is not an easy-to-use conjugate prior for the shape parameter of a gamma. This inconvenience is usually dealt with by using either Metropolis–Hastings moves, rejection sampling methods, or numerical integration. However, in models with a large number of shape parameters, these existing methods are slower or more complicated than one would like, making them burdensome in practice. It turns out that the full conditional distribution of the gamma shape parameter is well approximated by a gamma distribution, even for small sample sizes, when the prior on the shape parameter is also a gamma distribution. This article introduces a quick and easy algorithm for finding a gamma distribution that approximates the full conditional distribution of the shape parameter. We empirically demonstrate the speed and accuracy of the approximation across a wide range of conditions. If exactness is required, the approximation can be used as a proposal distribution for Metropolis–Hastings. Supplementary material for this article is available online.
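For context, the Metropolis–Hastings fallback the abstract mentions can be sketched as follows (an illustrative stdlib-only sampler, not the article's algorithm): a random-walk proposal on log α targeting the full conditional of a gamma shape parameter α, assuming data from Gamma(α, rate) with known rate and a Gamma(a0, b0) prior on α. The hyperparameters and step size are placeholders.

```python
import math
import random

def log_post_shape(alpha, data, rate, a0=1.0, b0=1.0):
    """Log full conditional of a gamma shape parameter alpha (up to a
    constant), for data from Gamma(alpha, rate) with known rate and a
    Gamma(a0, b0) prior on alpha. a0, b0 are illustrative values."""
    if alpha <= 0:
        return float("-inf")
    n = len(data)
    return ((alpha - 1.0) * sum(math.log(x) for x in data)
            + n * alpha * math.log(rate)
            - n * math.lgamma(alpha)
            + (a0 - 1.0) * math.log(alpha) - b0 * alpha)

def mh_shape(data, rate, n_iter=4000, step=0.3, seed=1):
    """Random-walk Metropolis-Hastings on theta = log(alpha).
    The Jacobian of the log transform contributes the extra `theta`
    term in the acceptance ratio."""
    rng = random.Random(seed)
    theta = 0.0  # start at alpha = 1
    draws = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        cur_lp = log_post_shape(math.exp(theta), data, rate) + theta
        prop_lp = log_post_shape(math.exp(prop), data, rate) + prop
        if math.log(rng.random()) < prop_lp - cur_lp:
            theta = prop
        draws.append(math.exp(theta))
    return draws
```

With, say, 100 observations drawn from a gamma with shape 2 and rate 1, the retained draws concentrate around the true shape value of 2.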
6.
The pricing of insurance policies requires estimates of the total loss. The traditional compound model imposes an independence assumption on the number of claims and their individual sizes. Bivariate models, which model both variables jointly, eliminate this assumption. A regression approach allows policy holder characteristics and product features to be included in the model. This article presents a bivariate model that uses joint random effects across both response variables to induce dependence effects. Bayesian posterior estimation is done using Markov Chain Monte Carlo (MCMC) methods. A real data example demonstrates that our proposed model exhibits better fitting and forecasting capabilities than existing models.
7.
The study of factors affecting human fertility is an important problem affording interesting statistical and computational challenges. Analyses of human fertility rates must cope with extra variability in fecundability parameters as well as a host of covariates ranging from the obvious, such as coital frequency, to the subtle, like the smoking habits of the female’s mother. In retrospective human fecundity studies, researchers ask couples the time required to conceive. This time-to-pregnancy data often exhibits digit preference bias, among other problems. We introduce computationally intensive models with sufficient flexibility to represent such bias and other causes yielding a similar lack of monotonicity in conception probabilities.
8.
One of the issues contributing to the success of any extreme value modeling is the choice of the number of upper order statistics used for inference, or equivalently, the selection of an appropriate threshold. In this paper we propose a Bayesian predictive approach to the peaks over threshold (POT) method with the purpose of estimating extreme quantiles beyond the range of the data. In the POT method, we assume that the threshold identifies a model with a specified prior probability, from a set of possible models. For each model, the predictive distribution of a future excess over the corresponding threshold is computed, as well as a conditional estimate for the corresponding tail probability. The unconditional tail probability for a given future extreme observation from the unknown distribution is then obtained as an average of the conditional tail estimates with weights given by the posterior probability of each model.
9.
《Applied Mathematical Modelling》2014,38(5-6):1698-1709
We consider Bayesian estimation of the stress–strength reliability based on record values. The estimators are derived under the squared error loss function in the one parameter as well as two-parameter exponential distributions. The Bayes estimators are derived, in some cases in closed form, and their performance is investigated in terms of their bias and mean squared errors and compared with the maximum likelihood estimators. An illustrative example is given.
10.
A data smoothing method is described where the roughness penalty depends on a parameter that must be estimated from the data. Three levels of parameters are involved in this situation: local parameters are the coefficients of the basis function expansion defining the smooth, global parameters define the low-dimensional trend and the roughness penalty, and a complexity parameter controls the amount of roughness in the smooth. By defining local parameters as regularized functions of global parameters, and global parameters in turn as functions of the complexity parameter, we define a parameter cascade, and show that the accompanying multi-criterion optimization problem leads to good estimates of all levels of parameters and their precisions. The approach is illustrated with real and simulated data, and this application is a prototype for a wide range of problems involving nuisance or local parameters.
11.
We apply the Monte Carlo EM (MCEM) algorithm to derive a new method for parameter estimation in hierarchical linear models, resolving the difficulty of computing the integrals that arise when the EM algorithm is applied to such models. Numerical simulations comparing the estimates from the proposed method with those from the EM algorithm verify its effectiveness and feasibility.
12.
This paper uses Monte Carlo methods to obtain maximum likelihood estimates for generalized linear mixed models, and provides practical methods for assessing the convergence and precision of the parameter estimates. Simulation studies show that the fixed-effect estimates are unbiased, while the errors of the variance component estimates are close to previously reported results. As an application, small-area estimates of breast cancer mortality are obtained using a Poisson distribution.
13.
The capability of implementing a complete Bayesian analysis of experimental data has emerged over recent years due to computational advances developed within the statistical community. The objective of this paper is to provide a practical exposition of these methods in the illustrative context of a financial event study. The customary assumption of Gaussian errors underlying development of the model is later supplemented by considering Student-t errors, thus permitting a Bayesian sensitivity analysis. The supplied data analysis illustrates the advantages of the sampling-based Bayesian approach in allowing investigation of quantities beyond the scope of classical methods.
14.
The Dirichlet process and its extension, the Pitman–Yor process, are stochastic processes that take probability distributions as a parameter. These processes can be stacked up to form a hierarchical nonparametric Bayesian model. In this article, we present efficient methods for the use of these processes in this hierarchical context, and apply them to latent variable models for text analytics. In particular, we propose a general framework for designing these Bayesian models, which are called topic models in the computer science community. We then propose a specific nonparametric Bayesian topic model for modelling text from social media. We focus on tweets (posts on Twitter) in this article due to their ease of access. We find that our nonparametric model performs better than existing parametric models in both goodness of fit and real world applications.
15.
《Journal of computational and graphical statistics》2013,22(2):378-394
In this article we study penalized regression splines (P-splines), which are low-order basis splines with a penalty to avoid undersmoothing. Such P-splines are typically not spatially adaptive, and hence can have trouble when functions are varying rapidly. Our approach is to model the penalty parameter inherent in the P-spline method as a heteroscedastic regression function. We develop a full Bayesian hierarchical structure to do this and use Markov chain Monte Carlo techniques for drawing random samples from the posterior for inference. The advantage of using a Bayesian approach to P-splines is that it allows for simultaneous estimation of the smooth functions and the underlying penalty curve in addition to providing uncertainty intervals of the estimated curve. The Bayesian credible intervals obtained for the estimated curve are shown to have pointwise coverage probabilities close to nominal. The method is extended to additive models with simultaneous spline-based penalty functions for the unknown functions. In simulations, the approach achieves very competitive performance with the current best frequentist P-spline method in terms of frequentist mean squared error and coverage probabilities of the credible intervals, and performs better than some of the other Bayesian methods.
16.
Hemant Ishwaran 《Journal of computational and graphical statistics》2013,22(4):779-799
The “leapfrog” hybrid Monte Carlo algorithm is a simple and effective MCMC method for fitting Bayesian generalized linear models with canonical link. The algorithm leads to large trajectories over the posterior and a rapidly mixing Markov chain, having superior performance over conventional methods in difficult problems like logistic regression with quasicomplete separation. This method offers a very attractive solution to this common problem, providing a method for identifying datasets that are quasicomplete separated, and for identifying the covariates that are at the root of the problem. The method is also quite successful in fitting generalized linear models in which the link function is extended to include a feedforward neural network. With a large number of hidden units, however, or when the dataset becomes large, the computations required in calculating the gradient in each trajectory can become very demanding. In this case, it is best to mix the algorithm with multivariate random walk Metropolis–Hastings. Fortunately, this entails very little additional programming work.
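The leapfrog integrator at the heart of hybrid (Hamiltonian) Monte Carlo is compact enough to sketch. This is a generic textbook version, not the article's implementation:

```python
def leapfrog(q, p, grad_U, eps, n_steps):
    """One leapfrog trajectory for Hamiltonian Monte Carlo.
    q: position (list), p: momentum (list), grad_U: gradient of the
    negative log posterior U, eps: step size, n_steps: trajectory length.
    Returns the proposed (q, p); accept/reject is done by the caller."""
    q, p = q[:], p[:]
    g = grad_U(q)
    p = [pi - 0.5 * eps * gi for pi, gi in zip(p, g)]      # half momentum step
    for step in range(n_steps):
        q = [qi + eps * pi for qi, pi in zip(q, p)]        # full position step
        g = grad_U(q)
        if step < n_steps - 1:
            p = [pi - eps * gi for pi, gi in zip(p, g)]    # full momentum step
    p = [pi - 0.5 * eps * gi for pi, gi in zip(p, g)]      # final half step
    p = [-pi for pi in p]                                  # negate for reversibility
    return q, p
```

A standard sanity check is approximate energy conservation: for a standard normal target, U(q) = q²/2, the Hamiltonian U(q) + p²/2 changes only by O(eps²) over a trajectory, which is what makes the Metropolis acceptance rate high.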
17.
《Journal of computational and graphical statistics》2013,22(3):590-609
In this article we consider the sequential monitoring process in normal dynamic linear models as a Bayesian sequential decision problem. We use this approach to build a general procedure that jointly analyzes the existence of outliers, level changes, variance changes, and the development of local correlations. In addition, we study the frequentist performance of this procedure and compare it with the monitoring algorithm proposed in an earlier article.
18.
Vitaly Schetinin Jonathan E. Fieldsend Derek Partridge Wojtek J. Krzanowski Richard M. Everson Trevor C. Bailey Adolfo Hernandez 《Journal of Mathematical Modelling and Algorithms》2006,5(4):397-416
Multiple Classifier Systems (MCSs) allow evaluation of the uncertainty of classification outcomes that is of crucial importance for safety-critical applications. The uncertainty of classification is determined by a trade-off between the amount of data available for training, the classifier diversity and the required performance. The interpretability of MCSs can also give useful information for experts responsible for making reliable classifications. For this reason Decision Trees (DTs) seem to be attractive classification models for experts. The required diversity of MCSs exploiting such classification models can be achieved by using two techniques, the Bayesian model averaging and the randomised DT ensemble. Both techniques have revealed promising results when applied to real-world problems. In this paper we experimentally compare the classification uncertainty of the Bayesian model averaging with a restarting strategy and the randomised DT ensemble on a synthetic dataset and some domain problems commonly used in the machine learning community. To make the Bayesian DT averaging feasible, we use a Markov Chain Monte Carlo technique. The classification uncertainty is evaluated within an Uncertainty Envelope technique dealing with the class posterior distribution and a given confidence probability. Exploring a full posterior distribution, this technique produces realistic estimates which can be easily interpreted in statistical terms. In our experiments we found that the Bayesian DTs are superior to the randomised DT ensembles within the Uncertainty Envelope technique.
19.
Implementations of the Monte Carlo EM Algorithm
《Journal of computational and graphical statistics》2013,22(3):422-439
The Monte Carlo EM (MCEM) algorithm is a modification of the EM algorithm where the expectation in the E-step is computed numerically through Monte Carlo simulations. The most flexible and generally applicable approach to obtaining a Monte Carlo sample in each iteration of an MCEM algorithm is through Markov chain Monte Carlo (MCMC) routines such as the Gibbs and Metropolis–Hastings samplers. Although MCMC estimation presents a tractable solution to problems where the E-step is not available in closed form, two issues arise when implementing this MCEM routine: (1) how do we minimize the computational cost in obtaining an MCMC sample? and (2) how do we choose the Monte Carlo sample size? We address the first question through an application of importance sampling whereby samples drawn during previous EM iterations are recycled rather than running an MCMC sampler each MCEM iteration. The second question is addressed through an application of regenerative simulation. We obtain approximately independent and identically distributed samples by subsampling the generated MCMC sample during different renewal periods. Standard central limit theorems may thus be used to gauge Monte Carlo error. In particular, we apply an automated rule for increasing the Monte Carlo sample size when the Monte Carlo error overwhelms the EM estimate at any given iteration. We illustrate our MCEM algorithm through analyses of two datasets fit by generalized linear mixed models. As a part of these applications, we demonstrate the improvement in computational cost and efficiency of our routine over alternative MCEM strategies.
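A toy MCEM loop for a deliberately simple model, right-censored N(μ, 1) data rather than the article's generalized linear mixed models, illustrates the two moving parts the abstract discusses: a Monte Carlo E-step and a growing Monte Carlo sample size. The linear growth rule below is a crude stand-in for the article's automated, error-driven rule.

```python
import random
import statistics

def mcem_censored_normal(obs, n_cens, c, mu0=0.0, iters=30, m0=50, seed=7):
    """Toy MCEM for a N(mu, 1) model where `n_cens` observations are
    right-censored at c. The E-step expectation is replaced by Monte
    Carlo draws from the truncated normal (rejection sampling), and the
    Monte Carlo sample size m grows linearly with the iteration."""
    rng = random.Random(seed)
    mu = mu0
    for t in range(1, iters + 1):
        m = m0 * t  # crude stand-in for an error-driven sample-size rule
        # Monte Carlo E-step: draw censored values from N(mu, 1) | (c, inf)
        fills = []
        while len(fills) < m:
            z = rng.gauss(mu, 1.0)
            if z > c:
                fills.append(z)
        e_cens = statistics.fmean(fills)
        # M-step: closed-form update of mu for the completed data
        mu = (sum(obs) + n_cens * e_cens) / (len(obs) + n_cens)
    return mu
```

On data simulated from N(1, 1) with censoring at c = 2, the estimate settles near the true mean of 1, whereas naively averaging only the uncensored values would be biased downward.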
20.
《Journal of computational and graphical statistics》2013,22(2):216-229
This article aims to provide a method for approximately predetermining convergence properties of the Gibbs sampler. This is to be done by first finding an approximate rate of convergence for a normal approximation of the target distribution. The rates of convergence for different implementation strategies of the Gibbs sampler are compared to find the best one. In general, the limiting convergence properties of the Gibbs sampler on a sequence of target distributions (approaching a limit) are not the same as the convergence properties of the Gibbs sampler on the limiting target distribution. Theoretical results are given in this article to justify that under conditions, the convergence properties of the Gibbs sampler can be approximated as well. A number of practical examples are given for illustration.
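The idea of reading off a convergence rate from a normal approximation can be illustrated on the simplest case: for a standard bivariate normal with correlation ρ, the x-chain of the two-stage Gibbs sampler is an AR(1) process with lag-1 autocorrelation ρ², so the sampler converges geometrically at rate ρ². A stdlib-only sketch (my illustration, not the article's procedure):

```python
import random

def gibbs_bivariate_normal(rho, n_iter=20000, seed=3):
    """Two-stage Gibbs sampler for a standard bivariate normal with
    correlation rho. Alternates the exact full conditionals
    x | y ~ N(rho*y, 1 - rho^2) and y | x ~ N(rho*x, 1 - rho^2).
    The returned x-chain is AR(1) with lag-1 autocorrelation rho**2,
    which is the sampler's geometric convergence rate."""
    rng = random.Random(seed)
    s = (1.0 - rho * rho) ** 0.5  # conditional standard deviation
    x, y = 0.0, 0.0
    xs = []
    for _ in range(n_iter):
        x = rng.gauss(rho * y, s)
        y = rng.gauss(rho * x, s)
        xs.append(x)
    return xs
```

With ρ = 0.9 the empirical lag-1 autocorrelation of the x-chain comes out near 0.81, matching ρ² and showing why highly correlated components make the Gibbs sampler slow.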