1.

Bayesian Nonparametric Modeling for Multivariate Ordinal Regression

Maria DeYoreo Athanasios Kottas 《Journal of computational and graphical statistics》2018,27(1):71-84

Univariate or multivariate ordinal responses are often assumed to arise from a latent continuous parametric distribution, with covariate effects that enter linearly. We introduce a Bayesian nonparametric modeling approach for univariate and multivariate ordinal regression, which is based on mixture modeling for the joint distribution of latent responses and covariates. The modeling framework enables highly flexible inference for ordinal regression relationships, avoiding assumptions of linearity or additivity in the covariate effects. In standard parametric ordinal regression models, computational challenges arise from identifiability constraints and estimation of parameters requiring nonstandard inferential techniques. A key feature of the nonparametric model is that it achieves inferential flexibility, while avoiding these difficulties. In particular, we establish full support of the nonparametric mixture model under fixed cut-off points that relate through discretization the latent continuous responses with the ordinal responses. The practical utility of the modeling approach is illustrated through application to two datasets from econometrics, an example involving regression relationships for ozone concentration, and a multirater agreement problem. Supplementary materials with technical details on theoretical results and on computation are available online.
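The discretization step mentioned in this abstract, where fixed cut-off points map a latent continuous response to an ordinal category, might look like the following minimal sketch. The function name `discretize`, the cut-off values, and the convention that category j corresponds to the interval (cutoff j-1, cutoff j] are assumptions made for illustration; they are not taken from the paper.

```python
from bisect import bisect_left

def discretize(latent, cutoffs):
    """Map a latent continuous value to an ordinal category in 1..len(cutoffs)+1.

    Assumed convention: category j means cutoffs[j-2] < latent <= cutoffs[j-1],
    with the first and last categories unbounded below and above.
    """
    # bisect_left counts how many cut-offs are strictly smaller than `latent`
    return bisect_left(cutoffs, latent) + 1

# Hypothetical fixed cut-offs giving four ordinal categories
cutoffs = [-1.0, 0.0, 1.0]
categories = [discretize(z, cutoffs) for z in (-2.5, -0.3, 0.5, 3.0)]
# categories == [1, 2, 3, 4]
```

Keeping the cut-offs fixed, as the abstract describes, means that only the latent distribution has to be estimated.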

2.

Nonparametric Bayesian Modeling for Stochastic Order

Alan E. Gelfand Athanasios Kottas 《Annals of the Institute of Statistical Mathematics》2001,53(4):865-876

In comparing two populations, sometimes a model incorporating stochastic order is desired. Customarily, such modeling is done parametrically. The objective of this paper is to formulate nonparametric (possibly semiparametric) stochastic order specifications providing richer, more flexible modeling. We adopt a fully Bayesian approach using Dirichlet process mixing. An attractive feature of the Bayesian approach is that full inference is available regarding the population distributions. Prior information can conveniently be incorporated. Also, prior stochastic order is preserved to the posterior analysis. Apart from the two sample setting, the approach handles the matched pairs problem, the k-sample slippage problem, ordered ANOVA and ordered regression models. We illustrate by comparing two rather small samples, one of diabetic men, the other of diabetic women. Measurements are of androstenedione levels. Males are anticipated to produce levels which will tend to be higher than those of females.
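For reference, the usual textbook definition of stochastic order can be stated through distribution functions. The symbols $F_1$ and $F_2$ are notation assumed here, not quoted from the paper.

```latex
F_1 \le_{\mathrm{st}} F_2
\quad\Longleftrightarrow\quad
F_1(t) \ge F_2(t) \quad \text{for all } t \in \mathbb{R}
```

In the example of the abstract, the distribution of female levels would play the role of $F_1$ and that of male levels the role of $F_2$.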

3.

Modeling Variability Order: A Semiparametric Bayesian Approach

Kottas Athanasios Gelfand Alan E. 《Methodology and Computing in Applied Probability》2001,3(4):427-442

In comparing two populations, sometimes a model incorporating a certain probability order is desired. In this setting, Bayesian modeling is attractive since a probability order restriction imposed a priori on the population distributions is retained a posteriori. Extending the work in Gelfand and Kottas (2001) for stochastic order specifications, we formulate modeling for distributions ordered in variability. We work with Dirichlet process mixtures resulting in a fully Bayesian semiparametric approach. The details for simulation-based model fitting and prior specification are provided. An example, based on two small subsets of time intervals between eruptions of the Old Faithful geyser, illustrates the methodology.

4.

A Computational Approach for Full Nonparametric Bayesian Inference Under Dirichlet Process Mixture Models

《Journal of computational and graphical statistics》2013,22(2):289-305

Widely used parametric generalized linear models are, unfortunately, a somewhat limited class of specifications. Nonparametric aspects are often introduced to enrich this class, resulting in semiparametric models. Focusing on single or k-sample problems, many classical nonparametric approaches are limited to hypothesis testing. Those that allow estimation are limited to certain functionals of the underlying distributions. Moreover, the associated inference often relies upon asymptotics when nonparametric specifications are often most appealing for smaller sample sizes. Bayesian nonparametric approaches avoid asymptotics but have, to date, been limited in the range of inference. Working with Dirichlet process priors, we overcome the limitations of existing simulation-based model fitting approaches which yield inference that is confined to posterior moments of linear functionals of the population distribution. This article provides a computational approach to obtain the entire posterior distribution for more general functionals. We illustrate with three applications: investigation of extreme value distributions associated with a single population, comparison of medians in a k-sample problem, and comparison of survival times from different populations under fairly heavy censoring.

5.

Some characterizations of minimal Markov basis for sampling from discrete conditional distributions

Akimichi Takemura Satoshi Aoki 《Annals of the Institute of Statistical Mathematics》2004,56(1):1-17

In this paper we give some basic characterizations of minimal Markov basis for a connected Markov chain, which is used for performing exact tests in discrete exponential families given a sufficient statistic. We also give a necessary and sufficient condition for uniqueness of minimal Markov basis. A general algebraic algorithm for constructing a connected Markov chain was given by Diaconis and Sturmfels (1998, The Annals of Statistics, 26, 363–397). Their algorithm is based on computing a Gröbner basis for a certain ideal in a polynomial ring, which can be carried out by using available computer algebra packages. However, the structure and interpretation of the Gröbner basis produced by the packages are sometimes not clear, due to the lack of symmetry and minimality in Gröbner basis computation. Our approach clarifies the partially ordered structure of minimal Markov basis.
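As a rough illustration of what a Markov basis move looks like in the simplest setting, the sketch below walks over two-way tables with fixed row and column sums using the familiar +1/-1 moves on 2x2 minors. This is a generic textbook example, not the construction of the paper. The names `basic_move` and `margins` are hypothetical, and the walk shown is a plain symmetric random walk; sampling from the hypergeometric distribution would need an additional Metropolis-Hastings acceptance step, which is omitted here.

```python
import random

def margins(table):
    """Return (row sums, column sums) of a two-way table given as a list of rows."""
    return [sum(row) for row in table], [sum(col) for col in zip(*table)]

def basic_move(table, rng):
    """Try one +1/-1 move on a random 2x2 minor; keep the old table if a cell would go negative."""
    rows, cols = len(table), len(table[0])
    i1, i2 = rng.sample(range(rows), 2)   # two distinct rows
    j1, j2 = rng.sample(range(cols), 2)   # two distinct columns
    new = [row[:] for row in table]
    new[i1][j1] += 1
    new[i2][j2] += 1
    new[i1][j2] -= 1
    new[i2][j1] -= 1
    if new[i1][j2] < 0 or new[i2][j1] < 0:
        return table  # move would leave the set of valid tables, so stay put
    return new

# Illustrative 2x3 table (made-up counts)
rng = random.Random(0)
table = [[3, 1, 2], [0, 4, 1]]
start_margins = margins(table)
for _ in range(200):
    table = basic_move(table, rng)
```

Each move adds and subtracts one unit in every affected row and column, so the margins (the sufficient statistic in the independence model) never change.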

6.

Adaptive Bayesian Nonstationary Modeling for Large Spatial Datasets Using Covariance Approximations

Bledar A. Konomi Huiyan Sang Bani K. Mallick 《Journal of computational and graphical statistics》2013,22(3):802-829

Gaussian process models have been widely used in spatial statistics but face tremendous modeling and computational challenges for very large nonstationary spatial datasets. To address these challenges, we develop a Bayesian modeling approach using a nonstationary covariance function constructed based on adaptively selected partitions. The partitioned nonstationary class allows one to knit together local covariance parameters into a valid global nonstationary covariance for prediction, where the local covariance parameters are allowed to be estimated within each partition to reduce computational cost. To further facilitate the computations in local covariance estimation and global prediction, we use the full-scale covariance approximation (FSA) approach for the Bayesian inference of our model. One of our contributions is to model the partitions stochastically by embedding a modified treed partitioning process into the hierarchical models that leads to automated partitioning and substantial computational benefits. We illustrate the utility of our method with simulation studies and the global Total Ozone Matrix Spectrometer (TOMS) data. Supplementary materials for this article are available online.

7.

Dynamic Markov Bases

Adrian Dobra 《Journal of computational and graphical statistics》2013,22(2):496-517

This article presents a computational approach for generating Markov bases for multiway contingency tables whose cell counts might be constrained by fixed marginals and by lower and upper bounds. Our framework includes tables with structural zeros as a particular case. Instead of computing the entire Markov bases in an initial step, our framework finds sets of local moves that connect each table in the reference set with a set of neighbor tables. We construct a Markov chain on the reference set of tables that requires only a set of local moves at each iteration. The union of these sets of local moves forms a dynamic Markov basis. We illustrate the practicality of our algorithms in the estimation of exact p-values for a three-way table with structural zeros and a sparse eight-way table. This article has online supplementary materials.

8.

Parameter Expanded Algorithms for Bayesian Latent Variable Modeling of Genetic Pleiotropy Data

Lizhen Xu Radu V. Craiu Lei Sun Andrew D. Paterson 《Journal of computational and graphical statistics》2016,25(2):405-425

Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes. The models studied here can incorporate both continuous and binary responses, and can account for serial and cluster correlations. We consider Bayesian estimation for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample from the posterior distribution. We evaluate the proposed method via extensive simulations and demonstrate its utility with an application to an association study of various complication outcomes related to Type 1 diabetes. This article has supplementary material online.

9.

Inference for the Number of Topics in the Latent Dirichlet Allocation Model via Bayesian Mixture Modeling

Zhe Chen 《Journal of computational and graphical statistics》2013,22(3):567-585

In latent Dirichlet allocation, the number of topics, T, is a hyperparameter of the model that must be specified before one can fit the model. The need to specify T in advance is restrictive. One way of dealing with this problem is to put a prior on T, but unfortunately the distribution on the latent variables of the model is then a mixture of distributions on spaces of different dimensions, and estimating this mixture distribution by Markov chain Monte Carlo is very difficult. We present a variant of the Metropolis–Hastings algorithm that can be used to estimate this mixture distribution, and in particular the posterior distribution of the number of topics. We evaluate our methodology on synthetic data and compare it with procedures that are currently used in the machine learning literature. We also give an illustration on two collections of articles from Wikipedia. Supplemental materials for this article are available online.

10.

Efficient Bayesian Inference for Multivariate Probit Models With Sparse Inverse Correlation Matrices

Aline Talhouk Arnaud Doucet Kevin Murphy 《Journal of computational and graphical statistics》2013,22(3):739-757

We propose a Bayesian approach for inference in the multivariate probit model, taking into account the association structure between binary observations. We model the association through the correlation matrix of the latent Gaussian variables. Conditional independence is imposed by setting some off-diagonal elements of the inverse correlation matrix to zero and this sparsity structure is modeled using a decomposable graphical model. We propose an efficient Markov chain Monte Carlo algorithm relying on a parameter expansion scheme to sample from the resulting posterior distribution. This algorithm updates the correlation matrix within a simple Gibbs sampling framework and allows us to infer the correlation structure from the data, generalizing methods used for inference in decomposable Gaussian graphical models to multivariate binary observations. We demonstrate the performance of this model and of the Markov chain Monte Carlo algorithm on simulated and real datasets. This article has online supplementary materials.

11.

Bayesian Analysis of Compositional Data Following a Dirichlet Distribution

章栋恩《应用概率统计》2002,18(1):19-26

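This paper studies Bayesian estimation of the parameters of a Dirichlet population and of other quantities of interest. Uniform prior distributions are placed on practically meaningful functions of the parameters, and the Metropolis algorithm is applied to suitably transformed parameters to obtain Markov chain Monte Carlo posterior samples, from which Bayesian estimates of the parameters and of the other quantities of interest are obtained.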

12.

A Markov chain sampler for contingency table exact inference

Ao Yuan Yimin Yang 《Computational Statistics》2005,20(1):63-80

Summary: In inference for contingency tables, when the cell counts are not large enough for asymptotic approximation, the conditional exact method is used, but it is often computationally impractical for large tables. Instead, various sampling methods can be used. Monte Carlo sampling based on permutation may again become impractical for large tables. The existing Markov chain method samples only a few elements of the table at each iteration and is inefficient. Here we consider a Markov chain in which a sub-table of user-specified size is updated at each iteration, which achieves high sampling efficiency. Some theoretical properties of the chain and its applications to some commonly used tables are discussed. As an illustration, this method is applied to the exact test of the Hardy-Weinberg equilibrium in the population genetics context.

13.

Properties of Prior and Posterior Distributions for Multivariate Categorical Response Data Models

Ming-Hui Chen Qi-Man Shao 《Journal of multivariate analysis》1999,71(2):97

In this article, we model multivariate categorical (binary and ordinal) response data using a very rich class of scale mixture of multivariate normal (SMMVN) link functions to accommodate heavy tailed distributions. We consider both noninformative as well as informative prior distributions for SMMVN-link models. The notion of informative prior elicitation is based on available similar historical studies. The main objectives of this article are (i) to derive theoretical properties of noninformative and informative priors as well as the resulting posteriors and (ii) to develop an efficient Markov chain Monte Carlo algorithm to sample from the resulting posterior distribution. A real data example from prostate cancer studies is used to illustrate the proposed methodologies.

14.

Bayesian Case Influence Measures for Statistical Models With Missing Data

《Journal of computational and graphical statistics》2013,22(1):253-271

We examine three Bayesian case influence measures including the φ-divergence, Cook’s posterior mode distance, and Cook’s posterior mean distance for identifying a set of influential observations for a variety of statistical models with missing data including models for longitudinal data and latent variable models in the absence/presence of missing data. Since it can be computationally prohibitive to compute these Bayesian case influence measures in models with missing data, we derive simple first-order approximations to the three Bayesian case influence measures by using the Laplace approximation formula and examine the applications of these approximations to the identification of influential sets. All of the computations for the first-order approximations can be easily done using Markov chain Monte Carlo samples from the posterior distribution based on the full data. Simulated data and an AIDS dataset are analyzed to illustrate the methodology. Supplemental materials for the article are available online.

15.

Mixtures of Dirichlet Distributions and Bayesian Estimation for High-Dimensional Tables

陈拥君 张尧庭 《应用数学》1996,9(4):480-484

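This paper discusses the form of the Bayesian estimates, and the formulation of the independence conditions, for high-dimensional contingency tables under multinomial sampling when a mixture of Dirichlet distributions is used as the prior distribution. The results of references [4] and [5] are extended to high-dimensional contingency tables.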

16.

A Pivotal Allocation-Based Algorithm for Solving the Label-Switching Problem in Bayesian Mixture Models

Han Li Xiaodan Fan 《Journal of computational and graphical statistics》2016,25(1):266-283

In Bayesian analysis of mixture models, the label-switching problem occurs as a result of the posterior distribution being invariant to any permutation of cluster indices under symmetric priors. To solve this problem, we propose a novel relabeling algorithm and its variants by investigating an approximate posterior distribution of the latent allocation variables instead of dealing with the component parameters directly. We demonstrate that our relabeling algorithm can be formulated in a rigorous framework based on information theory. Under some circumstances, it is shown to resemble the classical Kullback-Leibler relabeling algorithm and include the recently proposed equivalence classes representatives relabeling algorithm as a special case. Using simulation studies and real data examples, we illustrate the efficiency of our algorithm in dealing with various label-switching phenomena. Supplemental materials for this article are available online.
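The invariance that causes label switching can be checked numerically. The sketch below assumes a two-component Gaussian mixture with a shared, known standard deviation; the function name `mixture_loglik` and the data values are made up for illustration and do not come from the paper. It shows only the source of the problem, not the relabeling algorithm proposed in the article.

```python
import math

def mixture_loglik(data, weights, means, sd=1.0):
    """Log-likelihood of a Gaussian mixture with a common standard deviation (assumed known)."""
    norm_const = sd * math.sqrt(2.0 * math.pi)
    total = 0.0
    for x in data:
        # mixture density at x: weighted sum of the component densities
        density = sum(
            w * math.exp(-0.5 * ((x - m) / sd) ** 2) / norm_const
            for w, m in zip(weights, means)
        )
        total += math.log(density)
    return total

# Illustrative data and parameters (made up)
data = [-2.1, -1.9, 0.1, 2.0, 2.2]
original = mixture_loglik(data, weights=[0.3, 0.7], means=[-2.0, 2.0])
# Swap the labels of the two components: same model, different indexing
swapped = mixture_loglik(data, weights=[0.7, 0.3], means=[2.0, -2.0])
```

Because `original` and `swapped` agree, a symmetric prior gives both labelings the same posterior density, which is why MCMC output has to be relabeled before component-specific summaries are meaningful.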

17.

Semi- and Nonparametric Modeling of Ordinal Data

《Journal of computational and graphical statistics》2013,22(1):176-196

Parametric models for categorical ordinal response variables, like the proportional odds model or the continuation ratio model, assume that the predictor is given by a linear form of covariates. In this article the parametric models are extended to include smooth components in a semiparametric or partially parametric fashion. Parts of the covariates are thereby modeled linearly while other covariates are modeled as unspecified but smooth functions. Estimation is based on a combination of local likelihood and profile likelihood and asymptotic properties of the estimates are derived. In a simulation study it is demonstrated that the profile likelihood approach is to be preferred over a backfitting procedure. Two data examples demonstrate the applicability of the models.

18.

Smooth Scalar-on-Image Regression via Spatial Bayesian Variable Selection

Jeff Goldsmith Lei Huang Ciprian M. Crainiceanu 《Journal of computational and graphical statistics》2013,22(1):46-64

We develop scalar-on-image regression models when images are registered multidimensional manifolds. We propose a fast and scalable Bayes’ inferential procedure to estimate the image coefficient. The central idea is the combination of an Ising prior distribution, which controls a latent binary indicator map, and an intrinsic Gaussian Markov random field, which controls the smoothness of the nonzero coefficients. The model is fit using a single-site Gibbs sampler, which allows fitting within minutes for hundreds of subjects with predictor images containing thousands of locations. The code is simple and is provided in the online Appendix (see the “Supplementary Materials” section). We apply this method to a neuroimaging study where cognitive outcomes are regressed on measures of white-matter microstructure at every voxel of the corpus callosum for hundreds of subjects.

19.

Bayesian Inference in Hidden Markov Random Fields for Binary Data Defined on Large Lattices

《Journal of computational and graphical statistics》2013,22(2):243-261

Hidden Markov random fields represent a complex hierarchical model, where the hidden latent process is an undirected graphical structure. Performing inference for such models is difficult primarily because the likelihood of the hidden states is often unavailable. The main contribution of this article is to present approximate methods to calculate the likelihood for large lattices based on exact methods for smaller lattices. We introduce approximate likelihood methods by relaxing some of the dependencies in the latent model, and also by extending tractable approximations to the likelihood, the so-called pseudolikelihood approximations, for a large lattice partitioned into smaller sublattices. Results are presented based on simulated data as well as inference for the temporal-spatial structure of the interaction between up- and down-regulated states within the mitochondrial chromosome of the Plasmodium falciparum organism. Supplemental material for this article is available online.

20.

Bayesian Multivariate Distributional Regression With Skewed Responses and Skewed Random Effects

Patrick Michaelis Nadja Klein Thomas Kneib 《Journal of computational and graphical statistics》2018,27(3):602-611

The normal and the t distribution are classical tools for building random effects regression models where both can be used for the specification of either the conditional response distribution or the random effects distribution. However, the underlying assumption of symmetry can be questionable in many applications. We, therefore, propose regression models where the skew-normal and skew-t distribution are considered for both the response and the random effects specification and embed these models in the framework of distributional regression such that regression predictors can be specified for all distributional parameters. The distributional regression framework also allows us to consider multivariate versions of the skew-normal and the skew-t distribution. For Bayesian inference, we adapt iteratively weighted least-square proposals within Markov chain Monte Carlo simulations such that they can also facilitate the inclusion of nonnormal random effects specifications. Model choice is based on the Watanabe–Akaike information criterion, in particular, to differentiate between skew and nonskew distributional specifications in a number of simulation studies. Finally, to illustrate their practical applicability, the developed models are applied to a study on cholesterol levels originating from the Framingham Heart Study and a dataset from the Demographic and Health Surveys on undernutrition among children in Nigeria. Supplementary material for this article is available online.