Similar Documents
20 similar documents found.
1.
The main goal of this paper is to describe an architecture for solving large general hybrid Bayesian networks (BNs) with deterministic conditionals for continuous variables using local computation. In the presence of deterministic conditionals for continuous variables, we have to deal with the non-existence of the joint density function for the continuous variables. We represent deterministic conditional distributions for continuous variables using Dirac delta functions. Using the properties of Dirac delta functions, we can deal with a large class of deterministic functions. The architecture we develop is an extension of the Shenoy-Shafer architecture for discrete BNs. We extend the definitions of potentials to include conditional probability density functions and deterministic conditionals for continuous variables. We keep track of the units of continuous potentials. Inference in hybrid BNs is then done in the same way as in discrete BNs but by using discrete and continuous potentials and the extended definitions of combination and marginalization. We describe several small examples to illustrate our architecture. In addition, we solve an extended version of the crop problem exactly, one that includes non-conditional linear Gaussian distributions and non-linear deterministic functions.
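A minimal sketch of the Dirac delta device (using SymPy; the N(0, 1) priors are illustrative choices, and this is not the paper's architecture): the deterministic conditional Z = X + Y is encoded as δ(z − x − y), and integrating it against the parent densities recovers the marginal of Z.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)

# Illustrative priors: X, Y ~ N(0, 1), independent
fx = sp.exp(-x**2 / 2) / sp.sqrt(2 * sp.pi)
fy = sp.exp(-y**2 / 2) / sp.sqrt(2 * sp.pi)

# Deterministic conditional Z = X + Y encoded as a Dirac delta "density"
f_z_given_xy = sp.DiracDelta(z - (x + y))

# Marginalize out Y, then X; the delta collapses one of the integrals
fz = sp.integrate(fx * fy * f_z_given_xy, (y, -sp.oo, sp.oo))
fz = sp.simplify(sp.integrate(fz, (x, -sp.oo, sp.oo)))
print(fz)  # density of N(0, 2): exp(-z**2/4)/(2*sqrt(pi))
```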

2.
We discuss two issues in using mixtures of polynomials (MOPs) for inference in hybrid Bayesian networks. MOPs were proposed by Shenoy and West for mitigating the problem of integration in inference in hybrid Bayesian networks. First, in defining MOPs for multi-dimensional functions, one requirement is that the pieces where the polynomials are defined are hypercubes. In this paper, we discuss relaxing this condition so that each piece is defined on regions called hyper-rhombuses. This relaxation means that MOPs are closed under the transformations required for multi-dimensional linear deterministic conditionals, such as Z = X + Y. It also allows us to construct MOP approximations of the probability density functions (PDFs) of multi-dimensional conditional linear Gaussian distributions using a MOP approximation of the PDF of the univariate standard normal distribution. Second, Shenoy and West suggest using the Taylor series expansion of differentiable functions for finding MOP approximations of PDFs. In this paper, we describe a new method for finding MOP approximations based on Lagrange interpolating polynomials (LIP) with Chebyshev points. We describe how the LIP method can be used to find efficient MOP approximations of PDFs. We illustrate our methods using conditional linear Gaussian PDFs in one, two, and three dimensions, and conditional log-normal PDFs in one and two dimensions. We compare the efficiency of the hyper-rhombus condition with the hypercube condition, and the LIP method with the Taylor series method.
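A small sketch of the LIP idea under illustrative choices (9 Chebyshev points, the interval [-3, 3], SciPy's `lagrange` helper), not the authors' implementation:

```python
import numpy as np
from scipy.interpolate import lagrange
from scipy.stats import norm

def chebyshev_points(n, a, b):
    """n+1 Chebyshev points (roots of T_{n+1}) mapped from [-1, 1] to [a, b]."""
    k = np.arange(n + 1)
    x = np.cos((2 * k + 1) * np.pi / (2 * (n + 1)))
    return 0.5 * (a + b) + 0.5 * (b - a) * x

a, b = -3.0, 3.0
pts = chebyshev_points(8, a, b)
poly = lagrange(pts, norm.pdf(pts))   # one polynomial "piece" on [a, b]

grid = np.linspace(a, b, 601)
print("max abs error:", np.abs(poly(grid) - norm.pdf(grid)).max())
```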

3.
Mixtures of truncated basis functions   (cited 2 times: 0 self-citations, 2 by others)
In this paper we propose a framework, called mixtures of truncated basis functions (MoTBFs), for representing general hybrid Bayesian networks. The proposed framework generalizes both the mixtures of truncated exponentials (MTEs) framework and the mixtures of polynomials (MoPs) framework. Similar to MTEs and MoPs, MoTBFs are defined so that the potentials are closed under combination and marginalization, which ensures that inference in MoTBF networks can be performed efficiently using the Shafer-Shenoy architecture. Based on a generalized Fourier series approximation, we devise a method for efficiently approximating an arbitrary density function using the MoTBF framework. The translation method is more flexible than existing MTE- or MoP-based methods, and it supports an online/anytime tradeoff between the accuracy and the complexity of the approximation. Experimental results show that the approximations obtained are either comparable to or significantly better than the approximations obtained using existing methods.
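A loose stand-in for the translation step, assuming a least-squares projection onto a truncated exponential basis in place of the paper's generalized Fourier machinery; the basis exponents and interval are arbitrary illustrative choices:

```python
import numpy as np
from scipy.stats import norm

# Fit a truncated-exponential basis {exp(k*x)} to a target density by least
# squares on a grid; a crude analogue of projecting onto a finite basis.
xs = np.linspace(-3, 3, 301)
target = norm.pdf(xs)
basis = np.column_stack([np.exp(k * xs) for k in (-1.0, -0.5, 0.0, 0.5, 1.0)])
coef, *_ = np.linalg.lstsq(basis, target, rcond=None)
approx = basis @ coef
print("max abs error:", np.abs(approx - target).max())
```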

4.
In this paper we investigate methods for learning hybrid Bayesian networks from data. First we utilize a kernel density estimate of the data in order to translate the data into a mixture of truncated basis functions (MoTBF) representation using a convex optimization technique. When utilizing a kernel density representation of the data, the estimation method relies on the specification of a kernel bandwidth. We show that in most cases the method is robust with respect to the choice of bandwidth, but for certain data sets the bandwidth has a strong impact on the result. Based on this observation, we propose an alternative learning method that relies on the cumulative distribution function of the data. Empirical results demonstrate the usefulness of the approaches: even though the methods produce estimators that are slightly poorer than the state of the art (in terms of log-likelihood), they are significantly faster, and therefore indicate that the MoTBF framework can be used for inference and learning in reasonably sized domains. Furthermore, we show how a particular sub-class of MoTBF potentials (learnable by the proposed methods) can be exploited to significantly reduce complexity during inference.
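A minimal illustration of the bandwidth sensitivity the paper starts from, using SciPy's `gaussian_kde`; the sample and bandwidth factors are arbitrary:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
data = rng.normal(size=200)

# The same sample, three bandwidth factors: the density estimate that a
# MoTBF translation would start from changes substantially.
grid = np.linspace(-4, 4, 9)
for bw in (0.1, 0.3, 1.0):
    kde = gaussian_kde(data, bw_method=bw)
    print(bw, np.round(kde(grid), 3))
```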

5.
Bayesian networks (BNs) provide a powerful graphical model for encoding the probabilistic relationships among a set of variables, and hence can naturally be used for classification. However, Bayesian network classifiers (BNCs) learned in the common way, using likelihood scores, usually achieve only mediocre classification accuracy, because these scores are less specific to classification and instead suit a general inference problem. We propose risk minimization by cross validation (RMCV) using the 0/1 loss function, which is a classification-oriented score for unrestricted BNCs. RMCV is an extension of classification-oriented scores commonly used in learning restricted BNCs and non-BN classifiers. Using small real and synthetic problems that allow learning all possible graphs, we empirically demonstrate RMCV's superiority to marginal and class-conditional likelihood-based scores with respect to classification accuracy. Experiments using twenty-two real-world datasets show that BNCs learned using an RMCV-based algorithm significantly outperform the naive Bayesian classifier (NBC), tree-augmented NBC (TAN), and other BNCs learned using marginal or conditional likelihood scores, and are on par with state-of-the-art non-BN classifiers, such as the support vector machine, neural network, and classification tree. These experiments also show that an optimized version of RMCV is faster than all unrestricted BNCs and comparable with the neural network with respect to run-time. The main conclusion from our experiments is that unrestricted BNCs, when learned properly, can be a good alternative to restricted BNCs and traditional machine-learning classifiers with respect to both accuracy and efficiency.
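A sketch of the RMCV scoring idea using scikit-learn; `GaussianNB` is a stand-in (scikit-learn has no unrestricted BNC learner), and in a structure search this score would be compared across candidate networks:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# RMCV-style score for one candidate classifier: mean 0/1 accuracy under
# k-fold cross validation, rather than a likelihood-based score.
X, y = load_iris(return_X_y=True)
acc = cross_val_score(GaussianNB(), X, y, cv=10, scoring="accuracy").mean()
print("CV 0/1 score (accuracy):", round(acc, 3))
```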

6.
A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable and this leads to significantly faster inference as compared to models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than when learning CPTs.
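PICI generalizes independence-of-causal-influence models such as the noisy-OR; a minimal noisy-OR CPT builder shows how the parameter count stays linear in the number of parents (the leak term here is an illustrative addition):

```python
import numpy as np
from itertools import product

def noisy_or_cpt(inhibitors, leak=0.01):
    """P(Y=1 | x1..xn) for a (leaky) noisy-OR: one inhibition probability
    per parent, so n+1 parameters instead of 2**n table entries."""
    n = len(inhibitors)
    cpt = np.empty([2] * n)
    for states in product((0, 1), repeat=n):
        q = 1.0 - leak
        for qi, s in zip(inhibitors, states):
            if s:                    # active parent i fails to cause Y w.p. qi
                q *= qi
        cpt[states] = 1.0 - q
    return cpt

print(noisy_or_cpt([0.2, 0.3]))      # 2 parents, 3 parameters, 4 entries
```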

7.
Variable elimination (VE) and join tree propagation (JTP) are two alternative approaches to inference in Bayesian networks (BNs). VE, which can be viewed as one-way propagation in a join tree, answers each query against the BN from scratch, meaning that computation can be repeated. On the other hand, answering a single query with JTP involves two-way propagation, some of whose computation may remain unused. In this paper, we propose marginal tree inference (MTI) as a new approach to exact inference in discrete BNs. MTI seeks to avoid recomputation, while at the same time ensuring that no constructed probability information remains unused. Thereby, MTI stakes out middle ground between VE and JTP. The usefulness of MTI is demonstrated in multiple probabilistic reasoning sessions.
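A toy VE run on a three-node chain, using numpy's `einsum` for the combine-and-marginalize steps (the network and numbers are illustrative):

```python
import numpy as np

# Variable elimination on a chain A -> B -> C, answering P(C).
# JTP would instead propagate messages both ways in a join tree.
pA   = np.array([0.6, 0.4])                    # P(A)
pB_A = np.array([[0.7, 0.3], [0.2, 0.8]])      # P(B | A), rows indexed by A
pC_B = np.array([[0.9, 0.1], [0.5, 0.5]])      # P(C | B), rows indexed by B

pB = np.einsum("a,ab->b", pA, pB_A)            # eliminate A
pC = np.einsum("b,bc->c", pB, pC_B)            # eliminate B
print(pC)                                      # P(C) = [0.7, 0.3]
```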

8.
Bayesian Networks (BNs) are probabilistic inference engines that support reasoning under uncertainty. This article presents a methodology for building an information technology (IT) implementation BN from client–server survey data. The article also demonstrates how to use the BN to predict the attainment of IT benefits, given specific implementation characteristics (e.g., application complexity) and activities (e.g., reengineering). The BN is an outcome of a machine learning process that finds the network’s structure and its associated parameters, which best fit the data. The article will be of interest to academicians who want to learn more about building BNs from real data and practitioners who are interested in IT implementation models that make probabilistic statements about certain implementation decisions.

9.
The specification of conditional probability tables (CPTs) is a difficult task in the construction of probabilistic graphical models. Several types of canonical models have been proposed to ease that difficulty. Noisy-threshold models generalize the two most popular canonical models: the noisy-or and the noisy-and. When using standard inference techniques, the inference complexity is exponential with respect to the number of parents of a variable. More efficient inference techniques can be employed for CPTs that take a special form. CPTs can be viewed as tensors. Tensors can be decomposed into linear combinations of rank-one tensors, where a rank-one tensor is an outer product of vectors. Such a decomposition is referred to as a Canonical Polyadic (CP) or CANDECOMP/PARAFAC decomposition. The tensor decomposition offers a compact representation of CPTs which can be efficiently utilized in probabilistic inference. In this paper we propose a CP decomposition of tensors corresponding to CPTs of threshold functions, exactly ℓ-out-of-k functions, and their noisy counterparts. We prove results about the symmetric rank of these tensors in the real and complex domains. The proofs are constructive and provide methods for CP decomposition of these tensors. An analytical and experimental comparison with the parent-divorcing method (which also has polynomial complexity) shows the superiority of the CP decomposition-based method. The experiments were performed on subnetworks of the well-known QMR-DT network, generalized by replacing noisy-or with noisy-threshold models.
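A tiny concrete instance of the idea: the CPT of Y = OR(X1, X2), viewed as a tensor, written as a linear combination of two rank-one (outer-product) tensors; the paper's constructions do the analogous thing for (noisy) threshold functions:

```python
import numpy as np

# T[x1, x2] = P(Y=1 | x1, x2) for Y = OR(X1, X2), as a CP decomposition:
# all-ones rank-one tensor minus the rank-one indicator of "both parents off".
ones = np.array([1.0, 1.0])
e0   = np.array([1.0, 0.0])        # indicator of a parent being off
T = np.multiply.outer(ones, ones) - np.multiply.outer(e0, e0)
print(T)                           # [[0, 1], [1, 1]]
```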

10.
Non-parametric density estimation is an important technique in probabilistic modeling and reasoning with uncertainty. We present a method for learning mixtures of polynomials (MoPs) approximations of one-dimensional and multidimensional probability densities from data. The method is based on basis spline (B-spline) interpolation, where a density is approximated as a linear combination of basis splines. We compute maximum likelihood estimators of the mixing coefficients of the linear combination. The Bayesian information criterion is used as the score function to select the order of the polynomials and the number of pieces of the MoP. The method is evaluated in two ways. First, we test the approximation fitting: we sample artificial datasets from known one-dimensional and multidimensional densities and learn MoP approximations from the datasets. The quality of the approximations is analyzed according to different criteria, and the new proposal is compared with MoPs learned with Lagrange interpolation and with mixtures of truncated basis functions. Second, the proposed method is used as a non-parametric density estimation technique in Bayesian classifiers. Two of the most widely studied Bayesian classifiers, the naive Bayes and tree-augmented naive Bayes classifiers, are implemented and compared. Results on real datasets show that the non-parametric Bayesian classifiers using MoPs are comparable to kernel density-based Bayesian classifiers. We provide a free R package implementing the proposed methods.
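A rough stand-in using SciPy: a smoothing cubic spline fitted to a histogram density estimate (the paper instead estimates maximum likelihood mixing coefficients over a B-spline basis and selects order and pieces by BIC):

```python
import numpy as np
from scipy.interpolate import splrep, BSpline

rng = np.random.default_rng(1)
data = rng.normal(size=2000)

# Crude density target: a normalized histogram; then a cubic smoothing
# B-spline through the bin centers.
dens, edges = np.histogram(data, bins=30, range=(-3.5, 3.5), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
tck = splrep(centers, dens, k=3, s=0.002)
spline = BSpline(*tck)
print(spline(0.0))   # should be near the N(0, 1) density at 0, ~0.399
```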

11.
Bayesian networks (BNs) have attained widespread use in data analysis and decision making. Well-studied topics include efficient inference, evidence propagation, parameter learning from data for complete and incomplete data scenarios, expert elicitation for calibrating BN probabilities, and structure learning. It is common for the researcher to assume the structure of the BN or to glean the structure from expert elicitation or domain knowledge. In this scenario, the model may be calibrated through learning the parameters from relevant data. There is a lack of work on model diagnostics for fitted BNs; this is the contribution of this article. We key on the definition of (conditional) independence to develop a graphical diagnostic that indicates whether the conditional independence assumptions imposed by the assumed BN structure are supported by the data. We develop the approach theoretically and describe a Monte Carlo method to generate uncertainty measures for the consistency of the data with conditional independence assumptions under the model structure. We describe how this theoretical information and the data are presented in a graphical diagnostic tool. We demonstrate the approach using data simulated from BNs under different conditional independence assumptions. We also apply the diagnostic to a real-world dataset. The results presented in this article show that this approach is most feasible for smaller BNs; this is not peculiar to the proposed diagnostic graphic, but rather is related to the general difficulty of combining large BNs with data in any manner (such as through parameter estimation). It is the authors' hope that this article helps highlight the need for more research into BN model diagnostics. This article has supplementary materials online.
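A crude numerical analogue of the diagnostic's target question, checking X ⟂ Y | Z by testing X-Y independence within each stratum of Z; the article's tool is a graphical, Monte Carlo-calibrated version of this idea, and the data below are synthetic:

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)

# Generate data where X and Y depend on Z but are independent given Z.
n = 5000
z = rng.integers(0, 2, n)
x = (rng.random(n) < np.where(z == 1, 0.7, 0.3)).astype(int)
y = (rng.random(n) < np.where(z == 1, 0.6, 0.2)).astype(int)

for zv in (0, 1):
    table = np.histogram2d(x[z == zv], y[z == zv], bins=2)[0]
    chi2, p, *_ = chi2_contingency(table)
    print(f"Z={zv}: p-value {p:.3f}")   # large p-values: no evidence against CI
```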

12.
Mixtures of truncated exponentials (MTE) potentials are an alternative to discretization for solving hybrid Bayesian networks. Any probability density function (PDF) can be approximated with an MTE potential, which can always be marginalized in closed form. This allows propagation to be done exactly using the Shenoy–Shafer architecture for computing marginals, with no restrictions on the construction of a join tree. This paper presents MTE potentials that approximate an arbitrary normal PDF with any mean and a positive variance. The properties of these MTE potentials are presented, along with examples that demonstrate their use in solving hybrid Bayesian networks. Assuming that the joint density exists, MTE potentials can be used for inference in hybrid Bayesian networks that do not fit the restrictive assumptions of the conditional linear Gaussian (CLG) model, such as networks containing discrete nodes with continuous parents.
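A sketch of fitting one MTE piece, a + b·exp(cx) + d·exp(ex), to the standard normal PDF by least squares; the paper derives its MTE parameters differently, and the initial values and bounds here are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def mte(x, a, b, c, d, e):
    # One MTE piece: constant plus two exponential terms
    return a + b * np.exp(c * x) + d * np.exp(e * x)

xs = np.linspace(0.0, 3.0, 300)       # right half; mirror for the left piece
p, _ = curve_fit(mte, xs, norm.pdf(xs),
                 p0=[0.0, 0.5, -1.0, -0.1, -3.0],
                 bounds=([-1, -10, -10, -10, -10], [1, 10, 0, 10, 0]),
                 maxfev=20000)
print("max abs error:", np.abs(mte(xs, *p) - norm.pdf(xs)).max())
```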

13.
Approximate inference in Bayesian networks using binary probability trees   (cited 2 times: 0 self-citations, 2 by others)
The present paper introduces a new kind of representation for the potentials in a Bayesian network: binary probability trees. They enable the representation of context-specific independences in more detail than probability trees. This enhanced capability leads to more efficient inference algorithms for some types of Bayesian networks. This paper explains the procedure for building a binary probability tree from a given potential, which is similar to the one employed for building standard probability trees. It also offers a way of pruning a binary tree in order to reduce its size, which allows us to obtain exact or approximate results in inference depending on an input threshold. This paper also provides detailed algorithms for performing the basic operations on potentials (restriction, combination and marginalization) directly on binary trees. Finally, some experiments are described where binary trees are used with the variable elimination algorithm to compare the performance with that obtained for standard probability trees.
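A minimal sketch of threshold-based pruning on a simplified binary tree (leaves are weighted equally when merged, which the real operation need not assume):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BTree:
    """Node of a simplified binary probability tree: a leaf holds a potential
    value; an internal node splits one variable's states into two groups."""
    value: Optional[float] = None          # set on leaves only
    low: "Optional[BTree]" = None
    high: "Optional[BTree]" = None

def prune(t: BTree, eps: float) -> BTree:
    """Collapse subtrees whose leaf values differ by less than eps into a
    single averaged leaf: approximate inference under an input threshold."""
    if t.value is not None:
        return t
    lo, hi = prune(t.low, eps), prune(t.high, eps)
    if lo.value is not None and hi.value is not None \
            and abs(lo.value - hi.value) < eps:
        return BTree(value=0.5 * (lo.value + hi.value))
    return BTree(low=lo, high=hi)

t = BTree(low=BTree(value=0.30),
          high=BTree(low=BTree(value=0.31), high=BTree(value=0.33)))
print(prune(t, 0.05))   # collapses to a single leaf near 0.31
```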

14.
Probabilistic inference is among the main topics in reasoning under uncertainty in AI. For this purpose, Bayesian networks (BNs) are one of the most successful and efficient probabilistic graphical models (PGMs) so far. Since the mid-90s, a growing number of BN extensions have been proposed. Object-oriented, entity-relationship and first-order logic are the main representation paradigms used to extend BNs. While entity-relationship and first-order models have been successfully used in machine learning for defining lifted probabilistic inference, object-oriented models have been mostly underused. Structured inference, which exploits the structural knowledge encoded in an object-oriented PGM, is a surprisingly understudied technique. In this paper we propose a full object-oriented framework for probabilistic relational models (PRMs) and propose two extensions of the state-of-the-art structured inference algorithm: SPI, which removes the major flaws of existing algorithms, and SPISBB, which largely enhances SPI by using d-separation.

15.
Rogers and Shi (1995) have used the technique of conditional expectations to derive approximations for the distribution of a sum of lognormals. In this paper we extend their results to more general sums of random variables. In particular we study sums of functions of dependent random variables that are multivariate normally distributed and also derive results for sums of functions of dependent random variables from the additive exponential dispersion family. The usefulness of our results for practical applications is also discussed.
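A Monte Carlo sketch of the conditional-expectation device for i.i.d. lognormal summands, conditioning on Λ = ΣZ_i in the spirit of Rogers and Shi; E[S | Λ] matches S in mean but has smaller spread (it is a convex-order lower bound), and all parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, sig, N = 5, 0.0, 0.5, 200_000

# S = sum(exp(Z_i)) with Z_i i.i.d. N(mu, sig^2); condition on Lambda = sum(Z_i)
var_L = n * sig**2
cov_iL = sig**2                               # Cov(Z_i, Lambda)
lam = rng.normal(n * mu, np.sqrt(var_L), N)   # draws of Lambda
m = mu + (cov_iL / var_L) * (lam - n * mu)    # E[Z_i | Lambda]
v = sig**2 - cov_iL**2 / var_L                # Var(Z_i | Lambda)
S_low = n * np.exp(m + v / 2)                 # E[S | Lambda], lognormal mean
S = np.exp(rng.normal(mu, sig, (N, n))).sum(axis=1)

print(S_low.mean(), S.mean())                 # equal in expectation
print(S_low.std(), S.std())                   # the bound has smaller spread
```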

16.
Approximate Bayesian inference by importance sampling derives probabilistic statements from a Bayesian network, an essential part of evidential reasoning with the network and an important aspect of many Bayesian methods. A critical problem in importance sampling on Bayesian networks is the selection of a good importance function for sampling a network's prior and posterior probability distributions. Initially optimal importance functions eventually start deviating from the optimal function when sampling a network's posterior distribution given evidence, even when adaptive methods are used that adjust an importance function to the evidence by learning. In this article we propose a new family of Refractor Importance Sampling (RIS) algorithms for adaptive importance sampling under evidential reasoning. RIS applies "arc refractors" to a Bayesian network by adding new arcs and refining the conditional probability tables. The goal of RIS is to optimize the importance function for the posterior distribution and reduce the error variance of sampling. Our experimental results show a significant improvement of RIS over state-of-the-art adaptive importance sampling algorithms.
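For contrast with adaptive schemes such as RIS, a minimal likelihood-weighting sketch on a two-node network, using the prior as the (non-adaptive) importance function; the network and numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny BN: A -> B, with evidence B = 1; estimate P(A=1 | B=1).
pA = 0.3
pB1_given_A = np.array([0.1, 0.8])            # P(B=1 | A=0), P(B=1 | A=1)

N = 100_000
a = (rng.random(N) < pA).astype(int)          # sample A from its prior
w = pB1_given_A[a]                            # weight = P(evidence | sample)
print((w * a).sum() / w.sum())                # exact: 0.24/0.31 ≈ 0.774
```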

17.
This paper introduces a new probabilistic graphical model called the gated Bayesian network (GBN). This model evolved from the need to represent processes that include several distinct phases. In essence, a GBN is a model that combines several Bayesian networks (BNs) in such a manner that they may be active or inactive during queries to the model. We use objects called gates to combine BNs, and to activate and deactivate them when predefined logical statements are satisfied. In this paper we also present an algorithm for semi-automatic learning of GBNs. We use the algorithm to learn GBNs that output buy and sell decisions for use in algorithmic trading systems. We show how the learnt GBNs can substantially lower the risk to invested capital while generating rewards similar to or better than the benchmark buy-and-hold investment strategy. We also explore some differences and similarities between GBNs and other related formalisms.

18.
The objectives of this paper are to apply the theory and numerical algorithms of Bayesian networks to risk scoring, and compare the results with traditional methods for computing scores and posterior predictions of performance variables. Model identification, inference, and prediction of random variables using Bayesian networks have been successfully applied in a number of areas, including medical diagnosis, equipment failure, information retrieval, rare-event prediction, and pattern recognition. The ability to graphically represent conditional dependencies and independencies among random variables may also be useful in credit scoring. Although several papers have already appeared in the literature which use graphical models for model identification, as far as we know there have been no explicit experimental results that compare a traditionally computed risk score with predictions based on Bayesian learning algorithms. In this paper, we examine a database of credit-card applicants and attempt to "learn" the graphical structure of the characteristics or variables that make up the database. We identify representative Bayesian networks in a development sample as well as the associated Markov blankets and clique structures within the Markov blanket. Once we obtain the structure of the underlying conditional independencies, we are able to estimate the probabilities of each node conditional on its direct predecessor node(s). We then calculate the posterior probabilities and scores of a performance variable for the development sample. Finally, we calculate the receiver operating characteristic (ROC) curves and relative profitability of scorecards based on these identifications. The results of the different models and methods are compared with both development and validation samples. Finally, we report on a statistical entropy calculation that measures the degree to which cliques identified in the Bayesian network are independent of one another.
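A sketch of the ROC comparison step with scikit-learn on synthetic scores (the actual scores in the paper come from the learned BN and a traditional scorecard, not from the simulated numbers below):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 2000)                  # 1 = "bad" account (synthetic)

# Synthetic stand-ins: two scorecards with different separating power
score_bn   = rng.normal(np.where(y == 1, 0.65, 0.35), 0.20)
score_trad = rng.normal(np.where(y == 1, 0.60, 0.40), 0.25)

for name, s in (("BN posterior", score_bn), ("traditional score", score_trad)):
    print(name, "AUC:", round(roc_auc_score(y, s), 3))
```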

19.
The paper treats the problem of obtaining numerical solutions to the Fokker-Planck equation for the density of a diffusion, and for the conditional density given certain "white noise"-corrupted observations. These equations generally have a meaning only in the weak sense; the basic assumptions on the diffusion are that the coefficients are bounded and uniformly continuous, and that the diffusion has a unique solution in the sense of multivariate distributions. It is shown that, if the finite difference approximations are carefully (but naturally) chosen, then the finite difference solutions to the formal adjoints immediately yield a sequence of approximations that converge weakly to the weak-sense solution of the Fokker-Planck equation (conditional or not) as the difference intervals go to zero. The approximations seem very natural for this type of problem. They are related to the transition functions of a sequence of Markov chains, the measures of whose continuous-time interpolations converge weakly to the (measure of the) diffusion as the difference intervals go to zero, and hence seem to have more physical significance than the usual (formal or not) approximations. The method is purely probabilistic and relies heavily on results on the weak convergence of measures on abstract spaces.
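A sketch of the Markov-chain flavor of such approximations for a 1-D Ornstein-Uhlenbeck diffusion, with Kushner-style transition probabilities; the step sizes are illustrative and the wrap-around boundary is a simplification:

```python
import numpy as np

# Finite-difference / Markov-chain approximation to the Fokker-Planck
# equation of dX = -X dt + dW: the transition probabilities below define a
# chain whose interpolation converges weakly to the diffusion as h, dt -> 0.
h, dt = 0.1, 0.004
xs = np.arange(-4.0, 4.0 + h, h)
b, a = -xs, 1.0                                   # drift b(x) = -x, sigma^2 = 1

p_up   = (a / 2 + h * np.maximum(b, 0.0)) * dt / h**2
p_down = (a / 2 + h * np.maximum(-b, 0.0)) * dt / h**2
p_stay = 1.0 - p_up - p_down
assert (p_stay >= 0).all()                        # stability condition

dens = np.zeros_like(xs)
dens[len(xs) // 2] = 1.0                          # start at x = 0
for _ in range(5000):                             # evolve to t = 20
    dens = dens * p_stay + np.roll(dens * p_up, 1) + np.roll(dens * p_down, -1)
print("variance:", (dens * xs**2).sum())          # stationary variance ~ 0.5
```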

20.
The current paper addresses two problems observed in structure learning applications to computational biology. The first is dealing with mixed data. Most optimization criteria for learning algorithms are applicable to either discrete or continuous data. Mixed datasets are usually handled by discretization of continuous data, which often leads to a loss of information. In order to address this problem, we adapted discrete scoring functions to continuous data. Consequently, the same score is used for both types of variables, and the network structure may be learned from mixed data directly. The second problem is the control of the type I error level. Usually, learning algorithms output a network that is the best according to some optimization criterion, but the reliability of particular relationships represented by this network is unknown. We address this problem by allowing the user to specify the expected error level and adjusting the parameters of the scoring criteria to this level.
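A sketch of the "same score for both variable types" idea in the parent-free case: BIC-form scores for a Gaussian and a discrete node share the log-likelihood-minus-penalty shape (simplified; the paper's criteria also handle parents):

```python
import numpy as np
from scipy.stats import norm

def bic_gaussian(x):
    # Gaussian node: log-likelihood at the MLE minus (k/2) log N, k = 2 params
    mu, sd = x.mean(), x.std()
    return norm.logpdf(x, mu, sd).sum() - 0.5 * 2 * np.log(len(x))

def bic_discrete(x, r):
    # Discrete node with r states: same BIC form, k = r - 1 params
    counts = np.bincount(x, minlength=r)
    nz = counts[counts > 0]
    return (nz * np.log(nz / len(x))).sum() - 0.5 * (r - 1) * np.log(len(x))

rng = np.random.default_rng(0)
print(bic_gaussian(rng.normal(size=500)))
print(bic_discrete(rng.integers(0, 3, 500), 3))
```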
