Similar literature
1.
In this paper, we address the problem of learning discrete Bayesian networks from noisy data. A graphical model based on a mixture of Gaussian distributions with categorical mixing structure coming from a discrete Bayesian network is considered. The network learning is formulated as a maximum likelihood estimation problem and performed by employing an EM algorithm. The proposed approach is relevant to a variety of statistical problems for which Bayesian network models are suitable, from simple regression analysis to learning gene/protein regulatory networks from microarray data.

2.
A changepoint in a time series is a time of change in the marginal distribution, autocovariance, or any other distributional structure of the series. Examples include mean level shifts and volatility (variance) changes. Climate data, for example, is replete with mean shift changepoints, occurring whenever a recording instrument is changed or the observing station is moved. Here, we consider the problem of incorporating known changepoint times into a regression model framework. Specifically, we establish consistency and asymptotic normality of ordinary least squares regression estimators that account for an arbitrary number of mean shifts in the record. In a sense, this provides an alternative to the customary infill asymptotics for regression models that assume an asymptotic infinity of data observations between all changepoint times.

3.
A Bayesian inference for a linear Gaussian random coefficient regression model with inhomogeneous within-class variances is presented. The model is motivated by an application in metrology, but it may well find interest in other fields. We consider the selection of a noninformative prior for the Bayesian inference to address applications where the available prior knowledge is either vague or is to be ignored. The noninformative prior is derived by applying the Berger and Bernardo reference prior principle with the means of the random coefficients forming the parameters of interest. We show that the resulting posterior is proper and specify conditions for the existence of first and second moments of the marginal posterior. Simulation results are presented which suggest good frequentist properties of the proposed inference. The calibration of sonic nozzle data is considered as an application from metrology. The proposed inference is applied to these data and the results are compared to those obtained by alternative approaches.

4.
Process monitoring and control requires the detection of structural changes in a data stream in real time. This article introduces an efficient sequential Monte Carlo algorithm designed for learning unknown changepoints in continuous time. The method is intuitively simple: new changepoints for the latest window of data are proposed by conditioning only on data observed since the most recent estimated changepoint, as these observations carry most of the information about the current state of the process. The proposed method shows improved performance over the current state of the art. Another advantage of the proposed algorithm is that it can be made adaptive, varying the number of particles according to the apparent local complexity of the target changepoint probability distribution. This saves valuable computing time when changes in the changepoint distribution are negligible, and enables rebalancing of the importance weights of existing particles when a significant change in the target distribution is encountered. The plain and adaptive versions of the method are illustrated using the canonical continuous time changepoint problem of inferring the intensity of an inhomogeneous Poisson process, although the method is generally applicable to any changepoint problem. Performance is demonstrated using both conjugate and nonconjugate Bayesian models for the intensity. Appendices to the article are available online, illustrating the method on other models and applications.

5.
The time-evolving precision matrix of a piecewise-constant Gaussian graphical model encodes the dynamic conditional dependency structure of a multivariate time-series. Traditionally, graphical models are estimated under the assumption that data are drawn identically from a generating distribution. Introducing sparsity and sparse-difference inducing priors, we relax these assumptions and propose a novel regularized M-estimator to jointly estimate both the graph and changepoint structure. The resulting estimator can therefore favor sparse dependency structures and/or smoothly evolving graph structures, as required. Moreover, our approach extends current methods to allow estimation of changepoints that are grouped across multiple dependencies in a system. An efficient algorithm for estimating structure is proposed. We study the empirical recovery properties in a synthetic setting. The qualitative effect of grouped changepoint estimation is then demonstrated by applying the method to a genetic time-course dataset. Supplementary material for this article is available online.

6.
A general approach to Bayesian isotonic changepoint problems is developed. Such isotonic changepoint analysis includes trends and other constraint problems and it captures linear, non-smooth as well as abrupt changes. Desired marginal posterior densities are obtained using a Markov chain Monte Carlo method. The methodology is exemplified using one simulated and two real data examples, where it is shown that our proposed Bayesian approach captures the qualitative conclusion about the shape of the trend change.

7.
This article proposes a four-pronged approach to efficient Bayesian estimation and prediction for complex Bayesian hierarchical Gaussian models for spatial and spatiotemporal data. The method involves reparameterizing the covariance structure of the model, reformulating the means structure, marginalizing the joint posterior distribution, and applying a simplex-based slice sampling algorithm. The approach permits fusion of point-source data and areal data measured at different resolutions and accommodates nonspatial correlation and variance heterogeneity as well as spatial and/or temporal correlation. The method produces Markov chain Monte Carlo samplers with low autocorrelation in the output, so that fewer iterations are needed for Bayesian inference than would be the case with other sampling algorithms. Supplemental materials are available online.

8.
We propose a Bayesian approach for inference in the multivariate probit model, taking into account the association structure between binary observations. We model the association through the correlation matrix of the latent Gaussian variables. Conditional independence is imposed by setting some off-diagonal elements of the inverse correlation matrix to zero and this sparsity structure is modeled using a decomposable graphical model. We propose an efficient Markov chain Monte Carlo algorithm relying on a parameter expansion scheme to sample from the resulting posterior distribution. This algorithm updates the correlation matrix within a simple Gibbs sampling framework and allows us to infer the correlation structure from the data, generalizing methods used for inference in decomposable Gaussian graphical models to multivariate binary observations. We demonstrate the performance of this model and of the Markov chain Monte Carlo algorithm on simulated and real datasets. This article has online supplementary materials.

9.
Bayesian networks (BNs) are widely used graphical models for drawing statistical inference about directed acyclic graphs. We present Graph_sampler, a fast, free C-language software package for structural inference on BNs. Graph_sampler uses a fully Bayesian approach in which the marginal likelihood of the data and prior information about the network structure are considered. The software can handle both continuous and discrete data, and a different model is formulated for each data type. It also provides a wide variety of structure priors, which can describe either global or local properties of the graph structure. Depending on the type of structure prior selected, we consider a wide range of possible values for the prior, making it either informative or uninformative. We propose a new and much faster jumping kernel strategy for the Metropolis–Hastings algorithm. The distributed C source code is very compact and fast, and uses little memory and disk storage. We performed several analyses on simulated data sets and on synthetic as well as real networks to assess the performance of Graph_sampler.

10.
In this paper a comparative evaluation study on popular non-homogeneous Poisson models for count data is performed. For the study the standard homogeneous Poisson model (HOM) and three non-homogeneous variants, namely a Poisson changepoint model (CPS), a Poisson free mixture model (MIX), and a Poisson hidden Markov model (HMM), are implemented in both conceptual frameworks: a frequentist and a Bayesian framework. This yields eight models in total, and the goal of the presented study is to shed some light on their relative merits and shortcomings. The first major objective is to cross-compare the performances of the four models (HOM, CPS, MIX and HMM) independently for both modelling frameworks (Bayesian and frequentist). Subsequently, a pairwise comparison between the four Bayesian and the four frequentist models is performed to elucidate to what extent the results of the two paradigms (‘Bayesian vs. frequentist’) differ. The evaluation study is performed on various synthetic Poisson data sets as well as on real-world taxi pick-up counts, extracted from the recently published New York City Taxi database.

11.
Probabilistic Decision Graphs (PDGs) are a class of graphical models that can naturally encode some context-specific independencies that cannot always be efficiently captured by other popular models, such as Bayesian Networks. Furthermore, inference can be carried out efficiently over a PDG, in time linear in the size of the model. The problem of learning PDGs from data has been studied in the literature, but only for the case of complete data. We propose an algorithm for learning PDGs in the presence of missing data. The proposed method is based on the Expectation-Maximisation principle for estimating the structure of the model as well as the parameters. We test our proposal on both artificially generated data with different rates of missing cells and real incomplete data. We also compare the PDG models learnt by our approach to the commonly used Bayesian Network (BN) model. The results indicate that the PDG model is less sensitive to the rate of missing data than the BN model. Moreover, although the BN models usually attain higher likelihood, the PDGs come close to them in likelihood and are also comparable in size, which makes the learnt PDGs preferable for probabilistic inference purposes.

12.
Increasingly large volumes of space–time data are collected everywhere by mobile computing applications, and in many of these cases, temporal data are obtained by registering events, for example, telecommunication or Web traffic data. Having both the spatial and temporal dimensions adds substantial complexity to data analysis and inference tasks. The computational complexity increases rapidly for fitting Bayesian hierarchical models, as such a task involves repeated inversion of large matrices. The primary focus of this paper is on developing space–time autoregressive models under the hierarchical Bayesian setup. To handle large data sets, a recently developed Gaussian predictive process approximation method is extended to include autoregressive terms of latent space–time processes. Specifically, a space–time autoregressive process, supported on a smaller set of knot locations, is spatially interpolated to approximate the original space–time process. The resulting model is specified within a hierarchical Bayesian framework, and Markov chain Monte Carlo techniques are used to make inference. The proposed model is applied to analyse the daily maximum 8-h average ground level ozone concentration data from 1997 to 2006 from a large study region in the Eastern United States. The developed methods allow accurate spatial prediction of a temporally aggregated ozone summary, known as the primary ozone standard, along with its uncertainty, at any unmonitored location during the study period. Trends in spatial patterns of many features of the posterior predictive distribution of the primary standard, such as the probability of noncompliance with respect to the standard, are obtained and illustrated. Copyright © 2012 John Wiley & Sons, Ltd.

13.
One of the main advantages of Bayesian approaches is that they offer principled methods of inference in models of varying dimensionality and in models of infinite dimensionality. What is less widely appreciated is how sensitive the model inference is to prior distributions, and therefore how priors should be set for real problems. In this paper prior sensitivity is considered with respect to the problem of inference in Gaussian mixture models. Two distinct Bayesian approaches have been proposed. The first is to use Bayesian model selection based upon the marginal likelihood; the second is to use an infinite mixture model which ‘side steps’ model selection. Explanations for the prior sensitivity are given in order to give practitioners guidance in setting prior distributions. In particular, the use of conditionally conjugate prior distributions instead of purely conjugate prior distributions is advocated as a method for investigating prior sensitivity of the mean and variance individually.

14.
We introduce new classes of stationary spatial processes with asymmetric, sub-Gaussian marginal distributions using the idea of expectiles. We derive theoretical properties of the proposed processes. Moreover, we use the proposed spatial processes to formulate a spatial regression model for point-referenced data where the spatially correlated errors have skewed marginal distribution. We introduce a Bayesian computational procedure for model fitting and inference for this class of spatial regression models. We compare the performance of the proposed method with the traditional Gaussian process-based spatial regression through simulation studies and by applying it to a dataset on air pollution in California.

15.
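Using gene expression data, we propose a new network model, the Bayesian network, for discovering gene interactions. A Bayesian network is a directed graphical model of a multivariate joint probability distribution that represents the conditional independence properties among variables. We first explain how Bayesian networks represent interactions between genes, and then introduce methods for learning Bayesian networks from microarray data.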

16.
This work develops a Bayesian approach to perform inference and prediction in Gaussian random fields based on spatial censored data. This type of data occurs often in the earth sciences due either to limitations of the measuring device or to particular features of the sampling process used to collect the data. Inference and prediction on the underlying Gaussian random field are performed, through data augmentation, by using Markov chain Monte Carlo methods. Previous approaches to deal with spatial censored data are reviewed, and their limitations pointed out. The proposed Bayesian approach is applied to a spatial dataset of depths of a geologic horizon that contains both left- and right-censored data, and comparisons are made between inferences based on the censored data and inferences based on “complete data” obtained by two imputation methods. It is seen that the differences in inference between the two approaches can be substantial.

17.
Variational Bayesian methods aim to address some of the weaknesses (computation time, storage costs and convergence monitoring) of mainstream Markov chain Monte Carlo based inference at the cost of a biased but more tractable approximation to the posterior distribution. We investigate the performance of variational approximations in the context of the mixed logit model, which is one of the most used models for discrete choice data. A typical treatment using the variational Bayesian methodology is hindered by the fact that the expectation of the so-called log-sum-exponential function has no explicit expression. Therefore, additional approximations are required to maintain tractability. In this paper we compare seven different possible bounds or approximations. We found that quadratic bounds are not sufficiently accurate. A recently proposed non-quadratic bound did perform well. We also found that the Taylor series approximation used in a previous study of variational Bayes for mixed logit models is only accurate for specific settings. Our proposed approximation based on quasi Monte Carlo sampling performed consistently well across all simulation settings while remaining computationally tractable.

18.
Time series are found widely in engineering and science. We study forecasting of stochastic, dynamic systems based on observations from multivariate time series. We model the domain as a dynamic multiply sectioned Bayesian network (DMSBN) and populate the domain by a set of proprietary, cooperative agents. We propose an algorithm suite that allows the agents to perform one-step forecasts with distributed probabilistic inference. We show that as long as the DMSBN is structurally time-invariant (possibly parametrically time-variant), the forecast is exact and its time complexity is exponentially more efficient than using dynamic Bayesian networks (DBNs). In comparison with independent DBN-based agents, multiagent DMSBNs produce more accurate forecasts. The effectiveness of the framework is demonstrated through experiments on a supply chain testbed.

19.
20.
We consider alternate formulations of recently proposed hierarchical nearest neighbor Gaussian process (NNGP) models for improved convergence, faster computing time, and more robust and reproducible Bayesian inference. Algorithms are defined that improve CPU memory management and exploit existing high-performance numerical linear algebra libraries. Computational and inferential benefits are assessed for alternate NNGP specifications using simulated datasets and remotely sensed light detection and ranging data collected over the U.S. Forest Service Tanana Inventory Unit (TIU) in a remote portion of Interior Alaska. The resulting data product is the first statistically robust map of forest canopy for the TIU. Supplemental materials for this article are available online.
