Similar Documents
 20 similar documents found (search time: 8 ms)
1.
Streaming data are relevant to finance, computer science, and engineering, and are becoming increasingly important in medicine and biology. Continuous time Bayesian network classifiers are designed for analyzing multivariate streaming data in which the duration of events matters. Structural and parametric learning for the class of continuous time Bayesian network classifiers is considered in the case where complete data are available. Conditional log-likelihood scoring is developed for structural learning of continuous time Bayesian network classifiers. The performance of continuous time Bayesian network classifiers learned by combining conditional log-likelihood scoring with Bayesian parameter estimation is compared with that achieved by continuous time Bayesian network classifiers learned with marginal log-likelihood scoring, and with that achieved by dynamic Bayesian network classifiers. Classifiers are compared in terms of accuracy and computation time, based on numerical experiments with synthetic and real data. Results show that conditional log-likelihood scoring combined with Bayesian parameter estimation outperforms marginal log-likelihood scoring, and becomes even more effective when the amount of available data is limited. Continuous time Bayesian network classifiers outperform dynamic Bayesian network classifiers in terms of computation time and accuracy on both synthetic and real data sets.
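To make the distinction between the two scores concrete, here is a minimal sketch of conditional versus marginal log-likelihood scoring on a toy two-class naive Bayes model with one binary feature. The probability tables and data are invented for illustration; the continuous-time machinery of the paper is not modeled.

```python
# Toy illustration: marginal log-likelihood sums log P(c, x), while
# conditional log-likelihood sums log P(c | x), scoring only the term
# that matters for classification. All numbers are illustrative.
import math

# assumed class prior P(C) and likelihood P(X | C) for binary C and X
p_c = {0: 0.6, 1: 0.4}
p_x_given_c = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}

def joint(c, x):
    return p_c[c] * p_x_given_c[c][x]

def marginal_ll(data):
    # marginal log-likelihood: sum of log P(c, x) over the data
    return sum(math.log(joint(c, x)) for c, x in data)

def conditional_ll(data):
    # conditional log-likelihood: sum of log P(c | x)
    total = 0.0
    for c, x in data:
        evidence = sum(joint(k, x) for k in p_c)
        total += math.log(joint(c, x) / evidence)
    return total

data = [(0, 0), (0, 0), (1, 1), (0, 1), (1, 0)]
mll = marginal_ll(data)
cll = conditional_ll(data)
```

Because log P(c, x) = log P(c | x) + log P(x), the marginal score always lies below the conditional one; a structure search driven by the conditional score therefore favors discrimination over fitting the feature distribution.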

2.
Bayesian networks (BNs) provide a powerful graphical model for encoding the probabilistic relationships among a set of variables, and hence can naturally be used for classification. However, Bayesian network classifiers (BNCs) learned in the common way using likelihood scores usually achieve only mediocre classification accuracy, because these scores are less specific to classification and better suit a general inference problem. We propose risk minimization by cross validation (RMCV) using the 0/1 loss function, a classification-oriented score for unrestricted BNCs. RMCV is an extension of classification-oriented scores commonly used in learning restricted BNCs and non-BN classifiers. Using small real and synthetic problems, which allow learning over all possible graphs, we empirically demonstrate RMCV's superiority to marginal and class-conditional likelihood-based scores with respect to classification accuracy. Experiments using twenty-two real-world datasets show that BNCs learned using an RMCV-based algorithm significantly outperform the naive Bayesian classifier (NBC), tree-augmented NBC (TAN), and other BNCs learned using marginal or conditional likelihood scores, and are on par with state-of-the-art non-BN classifiers such as the support vector machine, neural network, and classification tree. These experiments also show that an optimized version of RMCV is faster than all unrestricted BNCs and comparable to the neural network with respect to run-time. The main conclusion from our experiments is that unrestricted BNCs, when learned properly, can be a good alternative to restricted BNCs and traditional machine-learning classifiers with respect to both accuracy and efficiency.
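The core of the RMCV idea is to score each candidate model by its cross-validated 0/1 loss and pick the minimizer. The sketch below does exactly that for two stand-in "models" (a majority-class predictor and a feature lookup table) rather than BN structures; the data and candidates are invented for illustration.

```python
# Hedged sketch: model selection by cross-validated 0/1 loss.
# `fit(train)` returns a predict function; the model with the lowest
# k-fold 0/1 loss is selected.

def zero_one_cv(data, fit, k=5):
    """Average 0/1 loss of fit(train) over k interleaved folds."""
    folds = [data[i::k] for i in range(k)]
    errors, n = 0, 0
    for i in range(k):
        train = [row for j, f in enumerate(folds) if j != i for row in f]
        predict = fit(train)
        for x, y in folds[i]:
            errors += int(predict(x) != y)
            n += 1
    return errors / n

# candidate 1: always predict the majority class, ignoring the feature
def fit_majority(train):
    ys = [y for _, y in train]
    maj = max(set(ys), key=ys.count)
    return lambda x: maj

# candidate 2: predict the majority class among rows with the same feature value
def fit_lookup(train):
    table = {}
    for x, y in train:
        table.setdefault(x, []).append(y)
    default = fit_majority(train)
    return lambda x: (max(set(table[x]), key=table[x].count)
                      if x in table else default(x))

# toy data in which the feature is informative, with two noise rows
data = [(0, 0)] * 8 + [(1, 1)] * 8 + [(0, 1), (1, 0)]
loss_majority = zero_one_cv(data, fit_majority)
loss_lookup = zero_one_cv(data, fit_lookup)
best = min([("majority", loss_majority), ("lookup", loss_lookup)],
           key=lambda t: t[1])[0]
```

Replacing the two candidates with unrestricted BN structures (and the lookup fit with BN parameter estimation) recovers the shape of the RMCV procedure described above.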

3.
This paper presents a Bayesian decision-theoretic foundation for the selection of a Bayesian network from data. We introduce the class of disintegrable loss functions to diversify the loss incurred in choosing different models. Disintegrable loss functions can be built iteratively from simple 0-L loss functions over pairwise model comparisons, and they decompose the search for the minimum-risk model into a sequence of local searches, thus retaining the modularity of model selection procedures for Bayesian networks.

4.
Variable elimination (VE) and join tree propagation (JTP) are two alternative approaches to inference in Bayesian networks (BNs). VE, which can be viewed as one-way propagation in a join tree, answers each query against the BN separately, meaning that computation may be repeated across queries. On the other hand, answering even a single query with JTP involves two-way propagation, of which some computation may remain unused. In this paper, we propose marginal tree inference (MTI) as a new approach to exact inference in discrete BNs. MTI seeks to avoid recomputation while at the same time ensuring that no constructed probability information remains unused, thereby staking out middle ground between VE and JTP. The usefulness of MTI is demonstrated in multiple probabilistic reasoning sessions.
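For readers unfamiliar with VE, here is a minimal sketch of the one-way elimination it performs, on a toy chain A → B → C. The CPT numbers are invented for illustration; eliminating A and then B yields the query marginal P(C).

```python
# Hedged sketch of variable elimination on the chain A -> B -> C.
# Each elimination step multiplies the factors mentioning the variable
# and sums it out, producing an intermediate factor.

p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
p_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}

# eliminate A: phi_b(b) = sum_a P(a) * P(b | a)
phi_b = {b: sum(p_a[a] * p_b_given_a[a][b] for a in p_a) for b in (0, 1)}

# eliminate B: P(c) = sum_b phi_b(b) * P(c | b)
p_c = {c: sum(phi_b[b] * p_c_given_b[b][c] for b in phi_b) for c in (0, 1)}
```

A later query such as P(B) would rebuild phi_b from scratch under plain VE; caching intermediate factors like phi_b for reuse is the kind of recomputation MTI is designed to avoid.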

5.
6.
One of the hardest challenges in building a realistic Bayesian Network (BN) model is constructing the node probability tables (NPTs). Even with a fixed predefined model structure and very large amounts of relevant data, machine learning methods do not consistently achieve high accuracy relative to the ground truth when learning the NPT entries (parameters). Hence, it is widely believed that incorporating expert judgments can improve the learning process. We present a multinomial parameter learning method that can easily incorporate both expert judgments and data during the parameter learning process. This method uses an auxiliary BN model to learn the parameters of a given BN. The auxiliary BN contains continuous variables, and parameter estimation amounts to updating these variables using an iterative discretization technique. The expert judgments are provided in the form of constraints on parameters, divided into two categories: linear inequality constraints and approximate equality constraints. The method is evaluated with experiments based on a number of well-known sample BN models (such as Asia, Alarm and Hailfinder) as well as a real-world software defects prediction BN model. Empirically, the new method achieves much greater learning accuracy (compared to both state-of-the-art machine learning techniques and directly competing methods) with much less data. For example, in the software defects BN with a sample size of 20 (larger samples would be difficult to collect in practice), when a small number of real expert constraints are provided, our method achieves a level of accuracy in parameter estimation that other methods can only match with much larger sample sizes (320 samples for the standard machine learning method, and 105 for the directly competing constrained method).
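The following sketch conveys the spirit of combining data with an expert inequality constraint when estimating multinomial parameters. The paper's auxiliary-BN and iterative discretization machinery is replaced here by a crude clip-and-renormalize step; the counts and the constraint are invented for illustration.

```python
# Hedged sketch: Bayesian multinomial estimate from sparse data,
# adjusted to satisfy an expert lower-bound constraint on one state.

def posterior_mean(counts, prior=1.0):
    # Dirichlet(prior) posterior mean from observed counts
    total = sum(counts) + prior * len(counts)
    return [(c + prior) / total for c in counts]

def apply_min_constraint(theta, index, lower):
    # expert constraint theta[index] >= lower: clip the entry, then
    # rescale the remaining entries so the distribution sums to one
    if theta[index] >= lower:
        return list(theta)
    rest = sum(t for i, t in enumerate(theta) if i != index)
    return [lower if i == index else t * (1.0 - lower) / rest
            for i, t in enumerate(theta)]

counts = [1, 0, 19]              # tiny sample, as in the small-data setting
theta = posterior_mean(counts)
# hypothetical expert judgment: the middle state has probability >= 0.1
theta_c = apply_min_constraint(theta, 1, 0.10)
```

With only 20 observations the unconstrained estimate assigns the unseen middle state very little mass; a single expert constraint corrects this while keeping a valid distribution, which is the effect the paper exploits at small sample sizes.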

7.
A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables, which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions, which usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim to lower the number of parameters required to specify local probability distributions while remaining capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable, which leads to significantly faster inference than with models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than learning full CPTs.
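The best-known independence-of-causal-influence model, which the PICI family generalizes, is the noisy-OR gate. A minimal sketch, with illustrative link probabilities, shows why the parameter count stays linear: n parents need only n link probabilities plus a leak, instead of a CPT with 2^n rows.

```python
# Hedged sketch: a leaky noisy-OR gate. Each active cause i fails to
# produce the effect independently with probability 1 - link_probs[i].

def noisy_or(active_parents, link_probs, leak=0.0):
    """P(effect = true) given the indices of the active parents."""
    p_all_fail = 1.0 - leak
    for i in active_parents:
        p_all_fail *= 1.0 - link_probs[i]
    return 1.0 - p_all_fail

link = [0.8, 0.6, 0.5]                 # illustrative causal strengths
p_none = noisy_or([], link, leak=0.05)  # only the leak can fire
p_two = noisy_or([0, 1], link, leak=0.05)
```

Decomposability follows from the same product form: the failure probabilities can be multiplied in pairwise, which is what enables the faster inference noted in the abstract.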

8.
Using domain/expert knowledge when learning Bayesian networks from data has been considered a promising idea since the very beginning of the field. However, in most previously proposed approaches, human experts do not play an active role in the learning process: once their knowledge is elicited, they do not participate any further. The interactive approach for integrating domain/expert knowledge proposed in this work aims to be more efficient and effective. In contrast to previous approaches, our method actively interacts with the expert in order to guide the search-based learning process. It relies on identifying the edges of the graph structure that are most unreliable given the information present in the learning data. Another contribution of our approach is the integration of domain/expert knowledge at different stages of the learning process of a Bayesian network: while learning the skeleton and while directing the edges of the directed acyclic graph structure.

9.
10.
This paper considers a Bayesian approach to selecting a primary resolution and wavelet basis functions. Most papers on wavelet shrinkage have focused on thresholding of wavelet coefficients, given a primary resolution that is usually determined by the sample size. However, it turns out that a proper primary resolution is affected much more by the shape of the unknown function than by the sample size. In particular, Bayesian approaches to wavelet series suffer from computational burdens if the chosen primary resolution is too high, and a surplus primary resolution may also result in a poor estimate. In this paper, we propose a simple Bayesian method to determine the primary resolution and wavelet basis functions independently of the sample size. Results from a simulation study demonstrate the promising empirical properties of the proposed approach.

11.
Bayesian networks model conditional dependencies among the domain variables, providing a way to deduce their interrelationships as well as a method for classifying new instances. One of the most challenging problems in using Bayesian networks, in the absence of a domain expert who can dictate the model, is inducing the structure of the network from a large, multivariate data set. We propose a new methodology for designing the structure of a Bayesian network based on concepts of graph theory and nonlinear integer optimization techniques.

12.
In this paper, a stochastic optimization algorithm is proposed as a model search tool for the Bayesian variable selection problem in generalized linear models. Combining aspects of three well-known stochastic optimization algorithms, namely simulated annealing, the genetic algorithm, and tabu search, a powerful model search algorithm is produced. After choosing suitable priors, the posterior model probability is used as the criterion function for the algorithm; in cases where it is not analytically tractable, a Laplace approximation is used. The proposed algorithm is illustrated on normal linear and logistic regression models, for simulated and real-life examples, and it is shown that, at very low computational cost, it achieves improved performance compared with popular MCMC algorithms, such as MCMC model composition, as well as with "vanilla" versions of simulated annealing, the genetic algorithm, and tabu search.
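As a flavor of this kind of model search, the sketch below runs plain simulated annealing over variable subsets. The paper's criterion (posterior model probability, via a Laplace approximation when needed) is replaced by a toy score that rewards two hypothetical "true" predictors and penalizes model size; everything here is illustrative.

```python
# Hedged sketch: simulated-annealing search over subsets of P
# candidate predictors, maximizing a toy stand-in for the log
# posterior model probability.
import math
import random

P = 8                      # number of candidate predictors
TRUE = {1, 4}              # hypothetical ground-truth predictors

def score(model):
    # toy criterion: reward true variables, penalize model size
    return 3.0 * len(model & TRUE) - 1.0 * len(model)

def anneal(steps=2000, t0=2.0, seed=0):
    rng = random.Random(seed)
    model = set()
    best, best_score = set(model), score(model)
    for step in range(steps):
        t = t0 * (1.0 - step / steps) + 1e-3   # cooling schedule
        j = rng.randrange(P)                   # flip one variable in/out
        cand = model ^ {j}
        delta = score(cand) - score(model)
        if delta >= 0 or rng.random() < math.exp(delta / t):
            model = cand
        if score(model) > best_score:
            best, best_score = set(model), score(model)
    return best, best_score

best_model, best_val = anneal()
```

The hybrid algorithm in the paper augments moves like this flip with genetic-algorithm crossover and a tabu list; the acceptance rule and cooling schedule shown here are the simulated-annealing ingredient.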

13.
For a number of situations, a Bayesian network can be split into a core network consisting of a set of latent variables describing the status of a system, and a set of fragments relating the status variables to observable evidence that could be collected about the system state. This situation arises frequently in educational testing, where the status variables represent student proficiency and the evidence models (graph fragments linking competency variables to observable outcomes) relate to assessment tasks that can be used to assess that proficiency. The traditional approach to knowledge engineering in this situation would be to maintain a library of fragments, where the graphical structure is specified using a graphical editor and the probabilities are then entered using a separate spreadsheet for each node. If many evidence model fragments employ the same design pattern, a lot of repetitive data entry is required. As the parameter values that determine the strength of the evidence can be buried on interior screens of an interface, it can be difficult for a design team to get an impression of the total evidence provided by a collection of evidence models for the system variables, and to identify holes in the data collection scheme. A Q-matrix, an incidence matrix whose rows represent observable outcomes from assessment tasks and whose columns represent competency variables, provides the graphical structure of the evidence models. The Q-matrix can be augmented to provide details of relationship strengths and a high-level overview of the kind of evidence available. The relationships among the status variables can be represented with an inverse covariance matrix; this is particularly useful in models from the social sciences, as the domain experts' knowledge about the system states often comes from factor analyses and similar procedures that naturally produce covariance matrices.
The representation of the model using matrices means that the bulk of the specification work can be done in a desktop spreadsheet program and does not require specialized software, facilitating collaboration with external experts. The design idea is illustrated with examples from prior assessment design projects.
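A small sketch makes the Q-matrix reading concrete: each 1 in the matrix becomes an edge from a competency variable to an observable outcome in the evidence-model fragments. The skills, tasks, and matrix entries below are invented for illustration.

```python
# Hedged sketch: reading a Q-matrix (rows = observable outcomes,
# columns = competency variables) as the bipartite graph structure of
# the evidence models.

skills = ["algebra", "geometry", "reading"]
outcomes = ["task1", "task2", "task3"]
Q = [
    [1, 0, 1],   # task1 draws on algebra and reading
    [0, 1, 0],   # task2 draws on geometry only
    [1, 1, 0],   # task3 draws on algebra and geometry
]

# edges of the evidence-model fragments: competency -> observable outcome
edges = [(skills[j], outcomes[i])
         for i, row in enumerate(Q)
         for j, q in enumerate(row) if q]

# parent set of each observable, i.e. the fragment structure per task
parents = {o: [s for s, o2 in edges if o2 == o] for o in outcomes}
```

Since the whole structure lives in one matrix, a design team can scan the columns to see how much evidence each competency receives, which is exactly the overview the abstract argues a screen-by-screen editor obscures.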

14.
One of the main problems in the empirical sciences is uncertainty about the relevance of variables. In the debate on which variables provide a systematic and robust explanation of the share of employees who are members of trade unions, i.e. of trade union density, this problem is striking: depending on the chosen combination of regressors, regression analyses identify different sets of relevant variables. To analyze systematically which variables are relevant, the literature suggests model averaging and model selection strategies. Since both strategies have advantages and disadvantages, the aim of this paper is to apply both. Based on a characteristic cross-country panel data set, we find differences and similarities between the two evaluations and ask whether a methodological triangulation is possible.

15.
Feature selection for high-dimensional data
This paper focuses on feature selection for problems dealing with high-dimensional data. We discuss the benefits of adopting a regularized approach with L1 or L1-L2 penalties in two different applications: microarray data analysis in computational biology and object detection in computer vision. We describe general algorithmic aspects as well as architecture issues specific to the two domains. The very promising results obtained show how the proposed approach can be useful in quite different fields of application.
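The mechanism by which an L1 penalty performs feature selection can be shown in a few lines: under an orthonormal design, the lasso solution is simply a soft-thresholding of the least-squares coefficients, which zeroes out weak ones. The coefficient values and threshold below are invented for illustration.

```python
# Hedged sketch: soft-thresholding, the closed-form lasso solution per
# coefficient under an orthonormal design. Coefficients whose
# magnitude falls below the penalty lam are set exactly to zero.

def soft_threshold(b, lam):
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

ols = [2.5, -0.3, 0.1, -1.8]     # hypothetical least-squares coefficients
lam = 0.5
lasso = [soft_threshold(b, lam) for b in ols]
selected = [i for i, b in enumerate(lasso) if b != 0.0]
```

An L1-L2 (elastic net) penalty adds a quadratic term that additionally shrinks the surviving coefficients, which helps when features are correlated, as in microarray data.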

16.
The feature selection problem is an interesting and important topic relevant to a variety of database applications. This paper utilizes the tabu search metaheuristic to implement a feature subset selection procedure, while the nearest neighbor method is used for the classification task. Tabu search is a general metaheuristic procedure used to guide the search toward good solutions in complex solution spaces. Several metrics are used in the nearest neighbor classifier, such as the Euclidean distance, the standardized Euclidean distance, the Mahalanobis distance, the city block metric, the cosine distance, and the correlation distance, in order to identify the most significant metric for the nearest neighbor classifier. The performance of the proposed algorithms is tested on various benchmark datasets from the UCI Machine Learning Repository.
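A compact sketch of the combination described above: tabu search over feature subsets, scored by leave-one-out accuracy of a Euclidean 1-NN classifier. The six-point data set, tabu tenure, and iteration count are invented for illustration.

```python
# Hedged sketch: tabu-search feature selection with a Euclidean 1-NN
# wrapper. Feature 0 separates the two classes; feature 1 is noise.
import math

DATA = [([0.0, 5.0], 0), ([0.2, -3.0], 0), ([0.1, 4.0], 0),
        ([1.0, -4.0], 1), ([0.9, 3.5], 1), ([1.1, -5.0], 1)]

def dist(a, b, feats):
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in feats))

def loo_accuracy(feats):
    # leave-one-out accuracy of 1-NN restricted to the given features
    if not feats:
        return 0.0
    hits = 0
    for i, (x, y) in enumerate(DATA):
        nbr = min((d for d in range(len(DATA)) if d != i),
                  key=lambda d: dist(x, DATA[d][0], feats))
        hits += int(DATA[nbr][1] == y)
    return hits / len(DATA)

def tabu_search(n_feats=2, iters=10, tenure=2):
    current = frozenset(range(n_feats))        # start with all features
    best, best_acc = current, loo_accuracy(current)
    tabu = []
    for _ in range(iters):
        # neighborhood: flip one feature, skipping recently visited subsets
        moves = [current ^ {f} for f in range(n_feats)
                 if current ^ {f} not in tabu]
        if not moves:
            break
        current = max(moves, key=loo_accuracy)
        tabu = (tabu + [current])[-tenure:]    # short-term memory
        if loo_accuracy(current) > best_acc:
            best, best_acc = current, loo_accuracy(current)
    return set(best), best_acc

best_feats, best_acc = tabu_search()
```

Dropping the noisy feature raises the 1-NN accuracy, and the tabu list lets the search keep moving (even through worse subsets) without immediately revisiting them.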

17.
System failures, for example in electrical power systems, can have catastrophic impact on human life and on high-cost missions. Due to an electrical fire in Swissair flight 111 on September 2, 1998, all 229 passengers and crew on board sadly lost their lives. A battery failure most likely took place on the Mars Global Surveyor, which last communicated with Earth, ending its mission, on November 2, 2006. In this article, we develop fault diagnosis techniques that seek to prevent similar accidents in the future. We present comprehensive fault diagnosis methods for dynamic and hybrid domains with uncertainty, and validate them using electrical power system data. Our approach relies on Bayesian networks, which model the electrical power system and are compiled to arithmetic circuits. We handle in an integrated way varying fault dynamics (both persistent and intermittent faults), fault progression (both abrupt and drift faults), and fault behavior cardinality (both discrete and continuous behaviors). Our work has resulted in a software system for fault diagnosis, ProDiagnose, which has been the top performer in three of the four international diagnostics competitions in which it participated. In this paper we comprehensively present our methods, together with novel and extensive experimental results on data from a NASA electrical power system.

18.
This study provides operational guidance for building naïve Bayes Bayesian network (BN) models for bankruptcy prediction. First, we suggest a heuristic method that guides the selection of bankruptcy predictors. Based on the correlations and partial correlations among variables, the method aims to eliminate redundant and less relevant variables. A naïve Bayes model is developed using the proposed heuristic method and is found to perform well in a 10-fold validation analysis. The developed naïve Bayes model consists of eight first-order variables, six of which are continuous. We also provide guidance on building a cascaded model by selecting second-order variables to compensate for missing values of first-order variables. Second, we analyze whether the number of states into which the six continuous variables are discretized has an impact on the model's performance. Our results show that the model performs best when the number of states is either two or three; from four states onward, performance starts to deteriorate, probably due to over-fitting. Finally, we examine whether modeling the continuous variables with continuous distributions, instead of discretizing them, can improve the model's performance. Our findings suggest that it does not; one possible reason is that the continuous distributions tested in this study do not represent the underlying empirical distributions well. The results of this study may also be applicable to business decision-making contexts other than bankruptcy prediction.
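The discretization step whose state count the study tunes can be sketched in a few lines. Below is simple equal-width binning of a continuous predictor into a chosen number of states; the ratio values are invented for illustration, and the study may well have used a different binning scheme.

```python
# Hedged sketch: equal-width discretization of a continuous variable
# into n_states bins, the preprocessing step whose bin count (2, 3,
# 4, ...) the study compares.

def discretize(values, n_states):
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_states
    states = []
    for v in values:
        s = int((v - lo) / width) if width else 0
        states.append(min(s, n_states - 1))   # clamp the maximum value
    return states

current_ratio = [0.4, 0.9, 1.1, 1.6, 2.4, 3.1]   # hypothetical predictor
two_states = discretize(current_ratio, 2)
three_states = discretize(current_ratio, 3)
```

With more states, each bin's conditional probabilities are estimated from fewer cases, which is the over-fitting mechanism the abstract suggests sets in from four states onward.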

19.
An inverse sampling procedure R is proposed for selecting a random-size subset that contains the least probable cell (i.e., the cell with the smallest cell probability) from a multinomial distribution with k cells. Type 2 Dirichlet integrals are used (i) to express the probability of a correct selection in terms of integrals with parameters only in the limits of integration, (ii) to prove that the least favorable configuration under R is the so-called slippage configuration with k equal cell probabilities, and (iii) to express exactly the expectation of the total number of observations required and the expectation of the subset size under the procedure R.

20.
In this paper, we present a new formulation for the local access network expansion problem. We have previously shown that this problem can be seen as an extension of the well-known Capacitated Minimum Spanning Tree Problem, and have presented and tested two flow-based models. By including additional information in the definition of the variables, we propose a new flow-based model that permits us to effectively apply variable elimination tests as well as coefficient reduction on some of the constraints. We present computational results for instances with up to 500 nodes to show the advantages of the new model over the others.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号