1.

A “Density-Based” Algorithm for Cluster Analysis Using Species Sampling Gaussian Mixture Models

Raffaele Argiento, Andrea Cremaschi, Alessandra Guglielmi 《Journal of computational and graphical statistics》2013,22(4):1126-1142

We propose a new model for cluster analysis in a Bayesian nonparametric framework. Our model combines two ingredients, species sampling mixture models of Gaussian distributions on one hand, and a deterministic clustering procedure (DBSCAN) on the other. Here, two observations from the underlying species sampling mixture model share the same cluster if the distance between the densities corresponding to their latent parameters is smaller than a threshold; this yields a random partition which is coarser than the one induced by the species sampling mixture. Since this procedure depends on the value of the threshold, we suggest a strategy to fix it. In addition, we discuss implementation and applications of the model; comparison with more standard clustering algorithms will be given as well. Supplementary materials for the article are available online.
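The merging rule in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' algorithm: it assumes univariate Gaussian components given as (mean, sd) pairs, uses the Hellinger distance as the distance between densities (the abstract does not say which distance is used), and links components transitively, which corresponds to a DBSCAN-style rule with a minimum neighborhood size of one. The names `hellinger` and `merge_components` are hypothetical.

```python
import math

def hellinger(c1, c2):
    """Hellinger distance between two univariate Gaussians given as (mean, sd)."""
    (m1, s1), (m2, s2) = c1, c2
    var_sum = s1 * s1 + s2 * s2
    # Bhattacharyya coefficient of two Gaussians (closed form).
    bc = math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(-((m1 - m2) ** 2) / (4.0 * var_sum))
    return math.sqrt(max(0.0, 1.0 - bc))

def merge_components(components, threshold):
    """Link components closer than `threshold`; return one cluster label per component."""
    parent = list(range(len(components)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(components)):
        for j in range(i + 1, len(components)):
            if hellinger(components[i], components[j]) < threshold:
                parent[find(i)] = find(j)

    # Relabel the union-find roots as 0, 1, 2, ... in order of first appearance.
    roots = [find(i) for i in range(len(components))]
    relabel = {}
    return [relabel.setdefault(r, len(relabel)) for r in roots]

# Assumed example: two nearly identical components and one distant component.
components = [(0.0, 1.0), (0.3, 1.0), (5.0, 1.0)]
labels = merge_components(components, threshold=0.3)
print(labels)
```

Each observation would then inherit the label of its latent component, so the resulting partition is coarser than the one induced by the mixture itself.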

2.

Posterior Simulation in the Generalized Linear Mixed Model With Semiparametric Random Effects

《Journal of computational and graphical statistics》2013,22(2):410-425

Generalized linear mixed models with semiparametric random effects are useful in a wide variety of Bayesian applications. When the random effects arise from a mixture of Dirichlet process (MDP) model with normal base measure, Gibbs sampling algorithms based on the Pólya urn scheme are often used to simulate posterior draws in conjugate models (essentially, linear regression models and models for binary outcomes). In the nonconjugate case, some common problems associated with existing simulation algorithms include convergence and mixing difficulties.

This article proposes an algorithm for MDP models with exponential family likelihoods and normal base measures. The algorithm proceeds by making a Laplace approximation to the likelihood function, thereby matching the proposal with that of the Gibbs sampler. The proposal is accepted or rejected via a Metropolis-Hastings step. For conjugate MDP models, the algorithm is identical to the Gibbs sampler. The performance of the technique is investigated using a Poisson regression model with semi-parametric random effects. The algorithm performs efficiently and reliably, even in problems where large-sample results do not guarantee the success of the Laplace approximation. This is demonstrated by a simulation study where most of the count data consist of small numbers. The technique is associated with substantial benefits relative to existing methods, both in terms of convergence properties and computational cost.
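The idea of a Laplace-based proposal corrected by a Metropolis-Hastings step can be illustrated on a much simpler problem than the MDP model of the article. The sketch below assumes a single Poisson count `y` with log link and one random effect `b` with a N(0, tau^2) prior; it builds a Gaussian proposal from the posterior mode and curvature and uses it as an independence proposal. The functions `log_target`, `laplace_fit`, and `sample` are hypothetical names, and the Pólya urn step of the full algorithm is not shown.

```python
import numpy as np

def log_target(b, y, tau):
    """Unnormalized log posterior: Poisson(y | exp(b)) likelihood times N(0, tau^2) prior."""
    return y * b - np.exp(b) - 0.5 * b * b / (tau * tau)

def laplace_fit(y, tau, iters=50):
    """Newton iterations for the posterior mode; returns (mode, precision at the mode)."""
    b = 0.0
    for _ in range(iters):
        grad = y - np.exp(b) - b / (tau * tau)
        hess = -np.exp(b) - 1.0 / (tau * tau)
        b -= grad / hess
    return b, np.exp(b) + 1.0 / (tau * tau)

def sample(y, tau, n_draws, seed=0):
    """Independence Metropolis-Hastings with the Laplace approximation as proposal."""
    rng = np.random.default_rng(seed)
    mode, prec = laplace_fit(y, tau)
    sd = 1.0 / np.sqrt(prec)

    def log_q(b):
        # Unnormalized log density of the Gaussian proposal.
        return -0.5 * prec * (b - mode) ** 2

    current = mode
    draws = np.empty(n_draws)
    accepted = 0
    for t in range(n_draws):
        proposal = rng.normal(mode, sd)
        # Independence-sampler ratio: target over proposal, new state versus current state.
        log_ratio = (log_target(proposal, y, tau) - log_q(proposal)) - (
            log_target(current, y, tau) - log_q(current)
        )
        if np.log(rng.uniform()) < log_ratio:
            current = proposal
            accepted += 1
        draws[t] = current
    return draws, accepted / n_draws

# Assumed example: a small count, where large-sample arguments for the approximation are weak.
draws, acceptance_rate = sample(y=3, tau=1.0, n_draws=2000)
print(round(acceptance_rate, 2), round(float(draws.mean()), 2))
```

When the target is exactly Gaussian, the ratio equals one and every proposal is accepted, which mirrors the statement that the algorithm reduces to the Gibbs sampler in the conjugate case.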

3.

A Bayesian nonparametric model and its application in insurance loss prediction

《Insurance: Mathematics and Economics》2020

Predicting insurance losses is an eternal focus of actuarial science in the insurance sector. Due to the existence of complicated features such as skewness, heavy tail, and multi-modality, traditional parametric models are often inadequate to describe the distribution of losses, calling for a mature application of Bayesian methods. In this study we explore a Gaussian mixture model based on Dirichlet process priors. Using three automobile insurance datasets, we employ the probit stick-breaking method to incorporate the effect of covariates into the weight of the mixture component, improve its hierarchical structure, and propose a Bayesian nonparametric model that can identify the unique regression pattern of different samples. Moreover, an advanced updating algorithm of slice sampling is integrated to apply an improved approximation to the infinite mixture model. We compare our framework with four common regression techniques: three generalized linear models and a dependent Dirichlet process ANOVA model. The empirical results show that the proposed framework flexibly characterizes the actual loss distribution in the insurance datasets and demonstrates superior performance in the accuracy of data fitting and extrapolating predictions, thus greatly extending the application of Bayesian methods in the insurance sector.

4.

Reparameterized and Marginalized Posterior and Predictive Sampling for Complex Bayesian Geostatistical Models

《Journal of computational and graphical statistics》2013,22(2):262-282

This article proposes a four-pronged approach to efficient Bayesian estimation and prediction for complex Bayesian hierarchical Gaussian models for spatial and spatiotemporal data. The method involves reparameterizing the covariance structure of the model, reformulating the means structure, marginalizing the joint posterior distribution, and applying a simplex-based slice sampling algorithm. The approach permits fusion of point-source data and areal data measured at different resolutions and accommodates nonspatial correlation and variance heterogeneity as well as spatial and/or temporal correlation. The method produces Markov chain Monte Carlo samplers with low autocorrelation in the output, so that fewer iterations are needed for Bayesian inference than would be the case with other sampling algorithms. Supplemental materials are available online.

5.

A Marginal Sampler for σ-Stable Poisson–Kingman Mixture Models

María Lomelí, Stefano Favaro, Yee Whye Teh 《Journal of computational and graphical statistics》2017,26(1):44-53

We investigate the class of σ-stable Poisson–Kingman random probability measures (RPMs) in the context of Bayesian nonparametric mixture modeling. This is a large class of discrete RPMs, which encompasses most of the popular discrete RPMs used in Bayesian nonparametrics, such as the Dirichlet process, Pitman–Yor process, the normalized inverse Gaussian process, and the normalized generalized Gamma process. We show how certain sampling properties and marginal characterizations of σ-stable Poisson–Kingman RPMs can be usefully exploited for devising a Markov chain Monte Carlo (MCMC) algorithm for performing posterior inference with a Bayesian nonparametric mixture model. Specifically, we introduce a novel and efficient MCMC sampling scheme in an augmented space that has a small number of auxiliary variables per iteration. We apply our sampling scheme to density estimation and clustering tasks with unidimensional and multidimensional datasets, and compare it against competing MCMC sampling schemes. Supplementary materials for this article are available online.
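As a concrete instance of the marginal characterizations mentioned here, the sketch below simulates the random partition of the Pitman–Yor process, one member of the class, through its well-known generalized Chinese restaurant rule: with `k` clusters of sizes `n_j` among `i` items, the next item joins cluster `j` with probability `(n_j - sigma) / (theta + i)` and opens a new cluster with probability `(theta + sigma * k) / (theta + i)`. This only illustrates the prior partition for one special case, not the sampler proposed in the article; the function name and parameter values are assumptions.

```python
import numpy as np

def pitman_yor_partition(n, sigma, theta, seed=0):
    """Simulate cluster labels for n items under a Pitman-Yor(sigma, theta) prior.

    Assumes 0 <= sigma < 1 and theta > -sigma.
    """
    rng = np.random.default_rng(seed)
    counts = []   # current cluster sizes
    labels = []   # cluster label of each item
    for _ in range(n):
        k = len(counts)
        if k == 0:
            choice = 0  # the first item always opens a new cluster
        else:
            # Unnormalized weights: existing clusters first, then a new cluster.
            weights = np.array([c - sigma for c in counts] + [theta + sigma * k], dtype=float)
            choice = int(rng.choice(k + 1, p=weights / weights.sum()))
        if choice == k:
            counts.append(1)
        else:
            counts[choice] += 1
        labels.append(choice)
    return labels, counts

# Assumed example values; sigma = 0 recovers the Dirichlet process partition.
labels, counts = pitman_yor_partition(n=500, sigma=0.5, theta=1.0)
print(len(counts), sorted(counts, reverse=True)[:5])
```

Larger values of `sigma` typically produce more clusters and a heavier tail of small clusters than the Dirichlet process case.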

6.

Posterior Simulation with Priors Specified on Functionals

Kert Viele 《Journal of computational and graphical statistics》2013,22(2):235-248

Many Bayesian analyses use Markov chain Monte Carlo (MCMC) techniques. MCMC techniques work fastest (per iteration) when the prior distribution of the parameters is chosen conveniently, such as a conjugate prior. However, this is sometimes at odds with the prior desired by the investigator. We describe two motivating examples where nonconjugate priors are preferred. One is a Dirichlet process where it is difficult to implement alternative, nonconjugate priors. We develop a method that allows computation to be done with a convenient prior but adjusts the equilibrium distribution of the Markov chain to be the posterior distribution from the desired prior. In addition to allowing more freedom in choosing prior distributions, the method enables the investigator to perform quick sensitivity analyses, even in nonparametric settings.

7.

A semiparametric Bayesian approach to the analysis of financial time series with applications to value at risk estimation

M. Concepción Ausín, Pedro Galeano, Pulak Ghosh 《European Journal of Operational Research》2014

GARCH models are commonly used for describing, estimating and predicting the dynamics of financial returns. Here, we relax the usual parametric distributional assumptions of GARCH models and develop a Bayesian semiparametric approach based on modeling the innovations using the class of scale mixtures of Gaussian distributions with a Dirichlet process prior on the mixing distribution. The proposed specification allows for greater flexibility in capturing the usual patterns observed in financial returns. It is also shown how to undertake Bayesian prediction of the Value at Risk (VaR). The performance of the proposed semiparametric method is illustrated using simulated and real data from the Hang Seng Index (HSI) and Bombay Stock Exchange index (BSE30).

8.

Slice Sampling σ-Stable Poisson-Kingman Mixture Models

Stefano Favaro, Stephen G. Walker 《Journal of computational and graphical statistics》2013,22(4):830-847

The article is concerned with the use of Markov chain Monte Carlo methods for posterior sampling in Bayesian nonparametric mixture models. In particular, we consider the problem of slice sampling mixture models for a large class of mixing measures generalizing the celebrated Dirichlet process. Such a class of measures, known in the literature as σ-stable Poisson-Kingman models, includes as special cases most of the discrete priors currently known in Bayesian nonparametrics, for example, the two-parameter Poisson-Dirichlet process and the normalized generalized Gamma process. The proposed approach is illustrated on some simulated data examples. This article has online supplementary material.

9.

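Mixtures of Dirichlet Distributions and Bayesian Estimation for High-Dimensional Tables

Chen Yongjun, Zhang Yaoting 《应用数学》 (Mathematica Applicata) 1996,9(4):480-484

This article considers high-dimensional contingency tables under multinomial sampling when a mixture of Dirichlet distributions is used as the prior. It gives the form of the Bayesian estimates and a formulation of the independence conditions, and extends the results of references [4] and [5] to high-dimensional contingency tables.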

10.

Identifying Mixtures of Mixtures Using Bayesian Estimation

Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, Bettina Grün 《Journal of computational and graphical statistics》2017,26(2):285-295

The use of a finite mixture of normal distributions in model-based clustering allows us to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior, where the hyperparameters are carefully selected such that they are reflective of the cluster structure aimed at. In addition, this prior allows us to estimate the model using standard MCMC sampling methods. In combination with a post-processing approach which resolves the label switching issue and results in an identified model, our approach allows us to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semiparametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark datasets. Supplementary materials for this article are available online.

11.

Smoothness Properties and Gradient Analysis Under Spatial Dirichlet Process Models

Michele Guindani, Alan E. Gelfand 《Methodology and Computing in Applied Probability》2006,8(2):159-189

When analyzing point-referenced spatial data, interest will be in the first order or global behavior of associated surfaces. However, in order to better understand these surfaces, we may also be interested in second order or local behavior, e.g., in the rate of change of a spatial surface at a given location in a given direction. In a Bayesian parametric setting, such smoothness analysis has been pursued by Banerjee and Gelfand (2003) and Banerjee et al. (2003). We study continuity and differentiability of random surfaces in the Bayesian nonparametric setting proposed by Gelfand et al. (2005), which is based on the formulation of a spatial Dirichlet process (SDP). We provide conditions under which the random surfaces sampled from an SDP are smooth. We also obtain complete distributional theory for the directional finite difference and derivative processes associated with those random surfaces. We present inference under a Bayesian framework and illustrate our methodology with a simulated dataset.

12.

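Relationships Among the Normalized Distance, Similarity Measure, Fuzziness, and Inclusion Measure of Interval-Valued Fuzzy Sets Based on Interval-Number Measurement

Zeng Wenyi, Zhao Yibin 《模糊系统与数学》 (Fuzzy Systems and Mathematics) 2012,26(2):81-90

The distance, similarity measure, fuzziness, and inclusion measure of interval-valued fuzzy sets, and the relationships among them, are an active topic in the study of interval-valued fuzzy sets. Given the richness of the information that interval-valued fuzzy sets represent, this article uses interval numbers rather than real numbers to describe the distance between interval-valued fuzzy sets. It first gives an axiomatic definition of the normalized distance of interval-valued fuzzy sets based on interval-number measurement. It then uses five theorems to study in detail the conversions among the normalized distance, similarity measure, fuzziness, and inclusion measure under this axiomatic definition. Finally, it gives several formulas for computing the similarity measure, fuzziness, and inclusion measure of interval-valued fuzzy sets based on interval-number measurement. These results enrich the information measures (distance, similarity measure, fuzziness, and inclusion measure) of interval-valued fuzzy sets, and they also provide new methods and theory for applications such as approximate reasoning, decision analysis, and pattern recognition.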

13.

Statistical Analysis of Mixtures and the Empirical Probability Measure

Philippe Barbe 《Acta Appl Math》1998,50(3):253-340

We consider the problem of estimating a mixture of probability measures in an abstract setting. Twelve examples are worked out, in order to show the applicability of the theory.

14.

On exit time from balls of jump-type symmetric Markov processes

Toshihiro Uemura 《数学学报(英文版)》 (Acta Mathematica Sinica, English Series) 2010,26(1):185-192

We obtain upper and lower bounds of the exit times from balls of a jump-type symmetric Markov process. The proofs are delivered separately. The upper bounds are obtained by using the Lévy system corresponding to the process, while the precise expression of the (L²-)generator of the Dirichlet form associated with the process is used to obtain the lower bounds.

15.

The moment function for the ratio of correlated generalized gamma variables

Çağatay Candan, Umut Orguner 《Statistics & probability letters》2013,83(10):2353-2356

The moment function for the ratio of correlated generalized gamma variables is expressed in terms of special functions. The expression presented generalizes the known moment expression for the integer valued moments to the real valued moments. Approximate formulas, in terms of elementary functions, are provided for low and high correlation regions and some application examples are given.

16.

Bayesian modeling of retrospective time-to-pregnancy data with digit preference bias

Karen L. Price, John W. Seaman 《Mathematical and Computer Modelling》2006,43(11-12):1424-1433

The study of factors affecting human fertility is an important problem affording interesting statistical and computational challenges. Analyses of human fertility rates must cope with extra variability in fecundability parameters as well as a host of covariates ranging from the obvious, such as coital frequency, to the subtle, like the smoking habits of the female’s mother. In retrospective human fecundity studies, researchers ask couples the time required to conceive. Such time-to-pregnancy data often exhibit digit preference bias, among other problems. We introduce computationally intensive models with sufficient flexibility to represent such bias and other causes yielding a similar lack of monotonicity in conception probabilities.

17.

On the quasi-regularity of non-sectorial Dirichlet forms by processes having the same polar sets

Lucian Beznea 《Journal of Mathematical Analysis and Applications》2011,384(1):33-48

We obtain a criterion for the quasi-regularity of generalized (non-sectorial) Dirichlet forms, which extends the result of P.J. Fitzsimmons on the quasi-regularity of (sectorial) semi-Dirichlet forms. Given the right (Markov) process associated to a semi-Dirichlet form, we present sufficient conditions for a second right process to be a standard one, having the same state space. The above mentioned quasi-regularity criterion is then an application. The conditions are expressed in terms of the associated capacities, nests of compacts, polar sets, and quasi-continuity. The second application is on the quasi-regularity of the generalized Dirichlet forms obtained by perturbing a semi-Dirichlet form with kernels.

18.

On the inconsistency of Bayesian non-parametric estimators in competing risks/multiple decrement models

Barry C. Arnold, Patrick L. Brockett, William Torrez, A. Larry Wright 《Insurance: Mathematics and Economics》1984,3(1):49-55

In the competing risks/multiple decrement model, the joint distribution is often not identifiable given only the observed time of failure and the cause of failure. The traditional approach is consequently to assume a parametric model. In this paper we shall not do this, but rather assume a Bayesian stance, take a Dirichlet process as a prior distribution, and then calculate the posterior distribution given the data. In this paper we show that in dimensions ≥ 2, the posterior mean yields an inconsistent estimator of the joint probability law, contrary to the common assumption that the prior law ‘washes out’ with large samples. For single decrement mortality tables however, the non-parametric Bayesian method allows a flexible method for adjusting a standard mortality table to reflect mortality experience, or covariate information.

19.

Computational Aspects of Nonparametric Bayesian Analysis with Applications to the Modeling of Multiple Binary Sequences

Fernando A. Quintana, Michael A. Newton 《Journal of computational and graphical statistics》2013,22(4):711-737

We consider Markov mixture models for multiple longitudinal binary sequences. Prior uncertainty in the mixing distribution is characterized by a Dirichlet process centered on a matrix beta measure. We use this setting to evaluate and compare the performance of three competing algorithms that arise more generally in Dirichlet process mixture calculations: sequential imputations, Gibbs sampling, and a predictive recursion, for which an extension of the sequential calculations is introduced. This facilitates the estimation of quantities related to clustering structure which is not available in the original formulation. A numerical comparison is carried out in three examples. Our findings suggest that the sequential imputations method is most useful for relatively small problems, and that the predictive recursion can be an efficient preliminary tool for more reliable, but computationally intensive, Gibbs sampling implementations.
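To give a sense of what a predictive recursion looks like, the sketch below implements the basic one-pass update for a mixing density on a grid: after each observation, the current estimate is averaged with its one-observation Bayes update. It is an assumption-laden illustration, not the extended recursion of the article: it uses a normal location kernel instead of the Markov model for binary sequences, a flat initial guess, and weights `1 / (i + 1)`; the function name is hypothetical.

```python
import numpy as np

def predictive_recursion(data, grid, sd=1.0):
    """One pass of the basic predictive recursion for a mixing density on `grid`."""
    dx = grid[1] - grid[0]
    # Flat initial guess, normalized so that the Riemann sum equals one.
    f = np.full(grid.shape, 1.0 / (grid.size * dx))
    for i, x in enumerate(data, start=1):
        w = 1.0 / (i + 1)  # assumed weight sequence, decreasing in i
        kernel = np.exp(-0.5 * ((x - grid) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
        numer = kernel * f
        # Convex combination of the current estimate and its Bayes update given x.
        f = (1.0 - w) * f + w * numer / (numer.sum() * dx)
    return f

# Assumed example: observations scattered around two latent locations, -3 and 3.
rng = np.random.default_rng(0)
centers = rng.choice([-3.0, 3.0], size=200)
data = centers + rng.normal(size=200)
grid = np.linspace(-8.0, 8.0, 321)
f = predictive_recursion(data, grid)
dx = grid[1] - grid[0]
print(round(float(f.sum() * dx), 6))
```

The result depends on the order of the observations, and a single pass costs one grid update per observation, which is consistent with its use as a fast preliminary tool before a Gibbs sampler.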

20.

Logarithmic Pooling of Priors Linked by a Deterministic Simulation Model

Geof H. Givens, Paul J. Roback 《Journal of computational and graphical statistics》2013,22(3):452-478

We consider Bayesian inference when priors and likelihoods are both available for inputs and outputs of a deterministic simulation model. This problem is fundamentally related to the issue of aggregating (i.e., pooling) expert opinion. We survey alternative strategies for aggregation, then describe computational approaches for implementing pooled inference for simulation models. Our approach (1) numerically transforms all priors to the same space; (2) uses log pooling to combine priors; and (3) then draws standard Bayesian inference. We use importance sampling methods, including an iterative, adaptive approach that is more flexible and has less bias in some instances than a simpler alternative. Our exploratory examples are the first steps toward extension of the approach for highly complex and even noninvertible models.
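Step (2), logarithmic pooling, combines densities on a common space as a normalized weighted geometric mean: the pooled density is proportional to the product of each density raised to its weight. The sketch below evaluates this on a grid. It assumes the priors have already been transformed to the same space (step 1 is not shown), and the two normal priors, the equal weights, and the function name `log_pool` are illustrative choices only.

```python
import numpy as np

def log_pool(densities, weights, grid):
    """Logarithmic pool of densities evaluated on an evenly spaced grid."""
    # Weighted sum of log densities, i.e. the log of the weighted geometric mean.
    log_pooled = sum(w * np.log(d) for w, d in zip(weights, densities))
    pooled = np.exp(log_pooled - log_pooled.max())  # subtract the max for numerical stability
    dx = grid[1] - grid[0]
    return pooled / (pooled.sum() * dx)  # renormalize so the Riemann sum equals one

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# Assumed example: two unit-variance normal priors centered at 0 and 2, equal weights.
grid = np.linspace(-8.0, 10.0, 1801)
dx = grid[1] - grid[0]
pooled = log_pool([normal_pdf(grid, 0.0, 1.0), normal_pdf(grid, 2.0, 1.0)], [0.5, 0.5], grid)
pooled_mean = float((grid * pooled).sum() * dx)
print(round(pooled_mean, 4))
```

For normal priors the log pool is again normal, with a precision-weighted mean; in this example that gives a N(1, 1) density, so the printed mean is close to 1.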