Similar literature
20 similar documents found (search time: 328 ms)
1.
《TOP》1986,1(1):127-138
Summary. Many estimating procedures are carried out with incomplete data by means of different types of EM algorithms. They allow us to obtain maximum likelihood parameter estimates in classical inference, and also estimates based on the posterior mode in Bayesian inference. This paper analyzes in detail the spectral radii of the algorithms' Jacobian matrices as a way to evaluate convergence rates. The eigenvalues of these matrices are obtained explicitly in some cases and, in all of them, at least a geometric convergence rate is guaranteed near the optimum. Finally, a comparison of the leading eigenvalues of the EM and of direct and approximate EM-Bayes algorithms suggests the relative efficiency of each.
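A minimal sketch of the idea behind this abstract, under illustrative assumptions (a two-component Gaussian mixture with known equal weights and unit variances, only the means estimated): near the optimum, EM converges linearly at a rate equal to the spectral radius of the Jacobian of the EM map, which can be estimated empirically as the limiting ratio of successive parameter changes. This is not the paper's derivation, just a numerical illustration.

```python
import numpy as np

# Simulate a two-component Gaussian mixture (means -1 and 1, equal weights).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1, 1, 300), rng.normal(1, 1, 300)])

def em_step(mu1, mu2):
    # E-step: responsibilities of component 1 (equal weights, unit variances)
    r = 1.0 / (1.0 + np.exp(-0.5 * ((x - mu2) ** 2 - (x - mu1) ** 2)))
    # M-step: responsibility-weighted means
    return np.array([(r @ x) / r.sum(), ((1 - r) @ x) / (1 - r).sum()])

theta, diffs = np.array([-0.3, 0.3]), []
for _ in range(500):
    new = em_step(*theta)
    d = np.linalg.norm(new - theta)
    theta = new
    if d < 1e-12:
        break
    diffs.append(d)

# Ratio of successive changes estimates the spectral radius of the EM Jacobian.
rate = diffs[-1] / diffs[-2]
print(theta, rate)
```

The estimated `rate` lies strictly between 0 and 1, which is the geometric convergence guarantee the abstract refers to.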

3.
Maximum likelihood estimation in finite mixture distributions is typically approached as an incomplete data problem to allow application of the expectation-maximization (EM) algorithm. In its general formulation, the EM algorithm involves the notion of a complete data space, in which the observed measurements and incomplete data are embedded. An advantage is that many difficult estimation problems are facilitated when viewed in this way. One drawback is that the simultaneous update used by standard EM requires overly informative complete data spaces, which leads to slow convergence in some situations. In the incomplete data context, it has been shown that the use of less informative complete data spaces, or equivalently smaller missing data spaces, can lead to faster convergence without sacrificing simplicity. However, in the mixture case, little progress has been made in speeding up EM. In this article we propose a component-wise EM for mixtures. It uses, at each iteration, the smallest admissible missing data space by intrinsically decoupling the parameter updates. Monotonicity is maintained, although the estimated proportions may not sum to one during the course of the iteration. However, we prove that the mixing proportions will satisfy this constraint upon convergence. Our proof of convergence relies on the interpretation of our procedure as a proximal point algorithm. For performance comparison, we consider standard EM as well as two other algorithms based on missing data space reduction, namely the SAGE and AECME algorithms. We provide adaptations of these general procedures to the mixture case. We also consider the ECME algorithm, which is not a data augmentation scheme but still aims at accelerating EM. Our numerical experiments illustrate the advantages of the component-wise EM algorithm relative to these other methods.

4.
The family of expectation–maximization (EM) algorithms provides a general approach to fitting flexible models for large and complex data. The expectation (E) step of EM-type algorithms is time-consuming in massive data applications because it requires multiple passes through the full data. We address this problem by proposing an asynchronous and distributed generalization of EM called the distributed EM (DEM). Using DEM, existing EM-type algorithms are easily extended to massive data settings by exploiting the divide-and-conquer technique and widely available computing power, such as grid computing. The DEM algorithm reserves two groups of computing processes, called workers and managers, for performing the E step and the maximization step (M step), respectively. The samples are randomly partitioned into a large number of disjoint subsets and are stored on the worker processes. The E step of the DEM algorithm is performed in parallel on all the workers, and every worker communicates its results to the managers at the end of its local E step. The managers perform the M step after they have received results from a γ-fraction of the workers, where γ is a fixed constant in (0, 1]. The sequence of parameter estimates generated by the DEM algorithm retains the attractive properties of EM: convergence of the sequence of parameter estimates to a local mode and a linear global rate of convergence. Across diverse simulations focused on linear mixed-effects models, the DEM algorithm is significantly faster than competing EM-type algorithms while having similar accuracy. The DEM algorithm maintains its superior empirical performance on a movie ratings database consisting of 10 million ratings. Supplementary material for this article is available online.
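The divide-and-conquer E step can be sketched in a synchronous simplification (γ = 1, all names illustrative, not the paper's API): data shards live on "workers" that compute local sufficient statistics in parallel, and a "manager" aggregates them and performs the M step. The model here is a two-component Gaussian mixture with known equal weights and unit variances.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Simulate data and randomly partition it into disjoint shards ("workers").
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 500), rng.normal(4, 1, 500)])
shards = np.array_split(rng.permutation(x), 8)

def local_e_step(shard, mu1, mu2):
    # Responsibilities of component 1 plus local sufficient statistics.
    r = 1.0 / (1.0 + np.exp(-0.5 * ((shard - mu2) ** 2 - (shard - mu1) ** 2)))
    return r.sum(), r @ shard, (1 - r).sum(), (1 - r) @ shard

mu1, mu2 = -1.0, 1.0
with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(30):
        # Workers run the E step in parallel on their shards.
        stats = pool.map(lambda s: local_e_step(s, mu1, mu2), shards)
        # Manager aggregates the sufficient statistics and does the M step.
        n1, s1, n2, s2 = map(sum, zip(*stats))
        mu1, mu2 = s1 / n1, s2 / n2
print(mu1, mu2)
```

The real DEM is asynchronous (the manager proceeds after a γ-fraction of workers report); this sketch only shows why the E step decomposes additively over shards.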

5.
This article presents an algorithm for accommodating missing data in situations where a natural set of estimating equations exists for the complete data setting. The complete data estimating equations can correspond to the score functions from a standard, partial, or quasi-likelihood, or they can be generalized estimating equations (GEEs). In analogy to the EM, which is a special case, the method is called the ES algorithm, because it iterates between an E-Step wherein functions of the complete data are replaced by their expected values, and an S-Step where these expected values are substituted into the complete-data estimating equation, which is then solved. Convergence properties of the algorithm are established by appealing to general theory for iterative solutions to nonlinear equations. In particular, both the ES algorithm and the EM are shown to be examples of nonlinear Gauss-Seidel algorithms. An added advantage of the approach is that it yields a computationally simple method for estimating the variance of the resulting parameter estimates.

6.
Bayes-adaptive POMDPs (BAPOMDPs) are partially observable Markov decision problems in which uncertainty in the state-transition and observation-emission probabilities can be captured by a prior distribution over the model parameters. Existing approaches to solving BAPOMDPs rely on model and trajectory sampling to guide exploration and, because of the curse of dimensionality, do not scale well when the degree of model uncertainty is large. In this paper, we begin by presenting two expectation-maximization (EM) approaches to solving BAPOMDPs via finite-state controller (FSC) optimization, which at their foundation are extensions of existing EM algorithms for BAMDPs to the more general BAPOMDP setting. The first is a sampling-based EM algorithm that optimizes over a finite number of models drawn from the BAPOMDP prior, and as such is only appropriate for smaller problems with limited model uncertainty; the second approach leverages variational Bayesian methods to ensure tractability without sampling, and is most appropriate for larger domains with greater model uncertainty. Our primary novel contribution is the derivation of the constrained VB-EM algorithm, which addresses an unfavourable preference that often arises towards a certain class of policies when applying the standard VB-EM algorithm. Through an empirical study we show that the sampling-based EM algorithm is competitive with more conventional sampling-based approaches in smaller domains, and that our novel constrained VB-EM algorithm can generate quality solutions in larger domains where sampling-based approaches are no longer viable.

7.
Algorithms inspired by swarm intelligence have been used for many optimization problems and their effectiveness has been proven in many fields. We propose a new swarm intelligence algorithm for structural learning of Bayesian networks, BFO-B, based on bacterial foraging optimization. In the BFO-B algorithm, each bacterium corresponds to a candidate solution that represents a Bayesian network structure, and the algorithm operates under three principal mechanisms: chemotaxis, reproduction, and elimination and dispersal. The chemotaxis mechanism uses four operators to randomly and greedily optimize each solution in a bacterial population, then the reproduction mechanism simulates survival of the fittest to exploit superior solutions and speed convergence of the optimization. Finally, an elimination and dispersal mechanism controls the exploration process and jumps out of local optima with a certain probability. We tested the individual contributions of the four algorithm operators and compared BFO-B with two state-of-the-art swarm-intelligence-based algorithms and seven other well-known algorithms on many benchmark networks. The experimental results verify that the proposed BFO-B algorithm is a viable alternative for learning the structures of Bayesian networks, and is also highly competitive with state-of-the-art algorithms.

8.
Bayesian networks are graphical models that represent the joint distribution of a set of variables using directed acyclic graphs. The graph can be manually built by domain experts according to their knowledge. However, when the dependence structure is unknown (or partially known), the network has to be estimated from data by using suitable learning algorithms. In this paper, we deal with a constraint-based method to perform Bayesian network structural learning in the presence of ordinal variables. We propose an alternative version of the PC algorithm, which is one of the best-known procedures, with the aim of inferring the network while accounting for additional information inherent to ordinal data. The proposal is based on a nonparametric test appropriate for ordinal variables. A comparative study shows that, in some situations, the proposal discussed here is a slightly more efficient solution than the PC algorithm.

9.
A finite mixture model has been used to fit data from heterogeneous populations in many applications. The Expectation-Maximization (EM) algorithm is the most popular method for estimating the parameters of a finite mixture model; a Bayesian approach is another. However, the EM algorithm often converges to a local maximum and is sensitive to the choice of starting points, while in the Bayesian approach the Markov Chain Monte Carlo (MCMC) sampler sometimes converges to a local mode and has difficulty moving to another. Hence, in this paper we propose a new method that addresses this limitation of the EM algorithm, so that it can estimate the parameters in the global maximum region, and a more effective Bayesian approach, so that the MCMC chain moves from one mode to another more easily in the mixture model. Our approach combines simulated annealing (SA) with adaptive rejection Metropolis sampling (ARMS). Although SA is a well-known approach for detecting distinct modes, its limitation is the difficulty of choosing a sequence of proper proposal distributions for a target distribution. Since ARMS uses a piecewise linear envelope function as a proposal distribution, we incorporate ARMS into the SA approach so that we can start from a more appropriate proposal distribution and detect separate modes. As a result, we can detect the global maximum region and estimate the parameters in it. We refer to this approach as ARMS annealing. Combining ARMS annealing with the EM algorithm and with the Bayesian approach, respectively, yields two approaches: an EM-ARMS annealing algorithm and a Bayesian-ARMS annealing approach. We compare our two approaches with the traditional EM algorithm alone and the Bayesian approach alone using simulation, showing that the two approaches are comparable to each other but perform better than either the EM algorithm alone or the Bayesian approach alone.
Both approaches detect the global maximum region well and estimate the parameters in this region. We demonstrate their advantage using a mixture of two Poisson regression models, applied to survey data on the number of charitable donations.

10.
Mixture models in reliability bring a useful compromise between parametric and nonparametric models when several failure modes are suspected. The classical methods for estimation in mixture models rarely handle the additional difficulty that lifetime data are often censored, in a deterministic or random way. We present in this paper several iterative methods based on EM and Stochastic EM methodologies that allow us to estimate parametric or semiparametric mixture models for randomly right-censored lifetime data, provided they are identifiable. We consider different levels of completion for the (incomplete) observed data, and provide genuine or EM-like algorithms for several situations. In particular, we show that simulating the missing data coming from the mixture makes it possible to plug a standard R package for survival data analysis into an EM algorithm's M-step. Moreover, in censored semiparametric situations, a stochastic step is the only practical solution allowing computation of nonparametric estimates of the unknown survival function. The effectiveness of the new algorithms is demonstrated in simulation studies and on an actual dataset from the aeronautics industry.

11.
Single-index models have found applications in econometrics and biometrics, where multidimensional regression models are often encountered. This article proposes a nonparametric estimation approach that combines wavelet methods for nonequispaced designs with Bayesian models. We consider a wavelet series expansion of the unknown regression function and set prior distributions for the wavelet coefficients and the other model parameters. To ensure model identifiability, the direction parameter is represented via its polar coordinates. We employ ad hoc hierarchical mixture priors that perform shrinkage on wavelet coefficients and use Markov chain Monte Carlo methods for a posteriori inference. We investigate an independence-type Metropolis-Hastings algorithm to produce samples for the direction parameter. Our method leads to simultaneous estimates of the link function and of the index parameters. We present results on both simulated and real data, where we look at comparisons with other methods.

12.
The parameters of a hidden Markov model (HMM) can be estimated by numerical maximization of the log-likelihood function or, more popularly, using the expectation–maximization (EM) algorithm. In its standard implementation the latter is unsuitable for fitting stationary HMMs. We show how it can be modified to achieve this. We propose a hybrid algorithm that is designed to combine the advantageous features of the two algorithms, and compare the performance of the three algorithms using simulated data from a designed experiment and a real data set. The properties investigated are speed of convergence, stability, dependence on initial values, and behaviour under different parameterizations. We also describe the results of an experiment to assess the true coverage probability of bootstrap-based confidence intervals for the parameters.
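The first fitting strategy the abstract mentions, direct numerical maximization of the HMM log-likelihood, can be sketched for a stationary two-state Gaussian HMM. All names and the logit parameterization are illustrative assumptions, not the paper's; the point is that the stationary initial distribution is derived from the transition matrix inside the likelihood, which is what standard EM cannot impose directly.

```python
import numpy as np
from scipy.optimize import minimize

# Simulate T observations from a 2-state HMM with unit-variance Gaussian
# emissions, means -2 and 2, and self-transition probability 0.9.
rng = np.random.default_rng(3)
T, means, p_stay = 400, np.array([-2.0, 2.0]), 0.9
states = [0]
for _ in range(T - 1):
    s = states[-1]
    states.append(s if rng.random() < p_stay else 1 - s)
y = means[np.array(states)] + rng.normal(0, 1, T)

def neg_loglik(theta):
    m1, m2, a, b = theta
    p11, p22 = 1 / (1 + np.exp(-a)), 1 / (1 + np.exp(-b))  # logit-transformed
    P = np.array([[p11, 1 - p11], [1 - p22, p22]])
    pi = np.array([1 - p22, 1 - p11])
    pi = pi / pi.sum()                       # stationary distribution of P
    dens = np.exp(-0.5 * (y[:, None] - np.array([m1, m2])) ** 2) / np.sqrt(2 * np.pi)
    alpha, ll = pi * dens[0], 0.0
    for t in range(1, T):                    # scaled forward recursion
        c = alpha.sum()
        ll += np.log(c)
        alpha = (alpha / c) @ P * dens[t]
    return -(ll + np.log(alpha.sum()))

fit = minimize(neg_loglik, x0=[-1.0, 1.0, 1.0, 1.0], method="Nelder-Mead")
m1_hat, m2_hat = fit.x[:2]
print(m1_hat, m2_hat)
```

Transforming the transition probabilities to the real line keeps the optimization unconstrained, one of the practical points such comparisons turn on.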

13.
The expectation–maximization (EM) algorithm is a very general and popular iterative computational method for finding maximum likelihood estimates from incomplete data, and is broadly used in statistical analysis with missing data because of its stability, flexibility and simplicity. However, the EM algorithm is often criticized for slow convergence, and various algorithms have been proposed to accelerate it. The vector ε algorithm of Wynn (Math Comp 16:301–322, 1962) is used to accelerate the convergence of the EM algorithm in Kuroda and Sakakihara (Comput Stat Data Anal 51:1549–1561, 2006). In this paper, we provide a theoretical evaluation of the convergence of the ε-accelerated EM algorithm. The ε-accelerated EM algorithm does not use the information matrix but only the sequence of estimates obtained from iterations of the EM algorithm, and thus it retains the flexibility and simplicity of EM.
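A hedged sketch of the idea (not the paper's implementation), using the classic multinomial linkage data of Dempster, Laird and Rubin (1977) with counts y = (125, 18, 20, 34): in the scalar case the ε algorithm reduces to Aitken's delta-squared, and it uses only successive EM iterates, never the information matrix.

```python
def em_step(theta):
    # E-step: expected split of the first cell count; M-step: closed form.
    x5 = 125.0 * theta / (2.0 + theta)
    return (x5 + 34.0) / (x5 + 18.0 + 20.0 + 34.0)

def eps_accelerate(t0, t1, t2):
    # Scalar epsilon-algorithm step: t1 + ((t2-t1)^{-1} - (t1-t0)^{-1})^{-1},
    # which for a scalar sequence coincides with Aitken's delta-squared.
    return t1 + 1.0 / (1.0 / (t2 - t1) - 1.0 / (t1 - t0))

iterates = [0.5]
for _ in range(3):
    iterates.append(em_step(iterates[-1]))

accel = eps_accelerate(*iterates[1:4])  # built from EM iterates 1, 2, 3 only
print(iterates[3], accel)
```

The accelerated value lies markedly closer to the MLE (approximately 0.6268) than the plain EM iterate it was built from, illustrating the speed-up without any extra model information.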

14.
Credal networks generalize Bayesian networks by relaxing the requirement of precision of probabilities. Credal networks are considerably more expressive than Bayesian networks, but this makes belief updating NP-hard even on polytrees. We develop a new efficient algorithm for approximate belief updating in credal networks. The algorithm is based on an important representation result we prove for general credal networks: that any credal network can be equivalently reformulated as a credal network with binary variables; moreover, the transformation, which is considerably more complex than in the Bayesian case, can be implemented in polynomial time. The equivalent binary credal network is then updated by L2U, a loopy approximate algorithm for binary credal networks. Overall, we generalize L2U to non-binary credal networks, obtaining a scalable algorithm for the general case, which is approximate only because of its loopy nature. The accuracy of the inferences with respect to other state-of-the-art algorithms is evaluated by extensive numerical tests.

15.
It is well known that the maximum likelihood estimates (MLEs) of a multivariate normal distribution from incomplete data with a monotone pattern have closed-form expressions and that the MLEs from incomplete data with a general missing-data pattern can be obtained using the Expectation-Maximization (EM) algorithm. This article gives closed-form expressions, analogous to the extension of the Bartlett decomposition, for both the MLEs of the parameters and the associated Fisher information matrix from incomplete data with a monotone missing-data pattern. For MLEs of the parameters from incomplete data with a general missing-data pattern, we implement EM and Expectation-Conditional-Maximization-Either (ECME), by augmenting the observed data into a complete monotone sample. We also provide a numerical example, which shows that the monotone EM (MEM) and monotone ECME (MECME) algorithms converge much faster than the EM algorithm.
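The closed-form MLEs for a monotone pattern can be sketched in the simplest bivariate case (an illustrative example, not the article's general formulas): with x1 fully observed and x2 observed only for the first m cases, the likelihood factors as f(x1) · f(x2 | x1), so the joint-normal MLEs follow from a marginal fit and a complete-case regression.

```python
import numpy as np

# Simulate a bivariate normal with a monotone missing-data pattern.
rng = np.random.default_rng(2)
n, m = 1000, 600
x1 = rng.normal(1.0, 1.0, n)
x2 = 0.5 + 0.8 * x1 + rng.normal(0.0, 0.5, n)
x2[m:] = np.nan  # x2 missing for the last n - m cases (monotone pattern)

# f(x1): marginal MLEs use all n observations.
mu1, s11 = x1.mean(), x1.var()

# f(x2 | x1): regression of x2 on x1 from the m complete cases.
a = np.vstack([np.ones(m), x1[:m]]).T
beta, res, *_ = np.linalg.lstsq(a, x2[:m], rcond=None)
s22_1 = res[0] / m  # MLE of the residual variance (divides by m)

# Recover the joint-normal MLEs from the factored likelihood.
mu2 = beta[0] + beta[1] * mu1
s12 = beta[1] * s11
s22 = s22_1 + beta[1] ** 2 * s11
print(mu1, mu2, s12, s22)
```

No iteration is needed; this is exactly the kind of closed form that the monotone augmentation in MEM/MECME exploits within each cycle.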

16.
This article presents methodology that allows a computer to play the role of musical accompanist in a nonimprovised musical composition for soloist and accompaniment. The modeling of the accompaniment incorporates a number of distinct knowledge sources including timing information extracted in real-time from the soloist's acoustic signal, an understanding of the soloist's interpretation learned from rehearsals, and prior knowledge that guides the accompaniment toward musically plausible renditions. The solo and accompaniment parts are represented collectively as a large number of Gaussian random variables with a specified conditional independence structure—a Bayesian belief network. Within this framework a principled and computationally feasible method for generating real-time accompaniment is presented that incorporates the relevant knowledge sources. The EM algorithm is used to adapt the accompaniment to the soloist's interpretation through a series of rehearsals. A demonstration is provided from J.S. Bach's Cantata 12.

17.
In this paper, we address the problem of learning discrete Bayesian networks from noisy data. A graphical model based on a mixture of Gaussian distributions with categorical mixing structure coming from a discrete Bayesian network is considered. The network learning is formulated as a maximum likelihood estimation problem and performed by employing an EM algorithm. The proposed approach is relevant to a variety of statistical problems for which Bayesian network models are suitable—from simple regression analysis to learning gene/protein regulatory networks from microarray data.

18.
Yu Yan, Xu Qinfeng, Sun Pengfei. 《应用数学》 (Applied Mathematics), 2006, 19(3):600-605
Based on a finite mixture of Dirichlet distributions, this paper proposes a Bayesian clustering method for compositional data. The EM algorithm is used to estimate the model parameters, the BIC criterion is used to select the number of clusters, and each observation is classified by a method similar to Bayes discriminant analysis. The computational formulas are derived and a program is implemented. Simulation results show that the proposed method achieves good clustering performance.

19.
Approximate inference in Bayesian networks using binary probability trees
The present paper introduces a new kind of representation for the potentials in a Bayesian network: binary probability trees. They enable the representation of context-specific independences in more detail than probability trees. This enhanced capability leads to more efficient inference algorithms for some types of Bayesian networks. This paper explains the procedure for building a binary probability tree from a given potential, which is similar to the one employed for building standard probability trees. It also offers a way of pruning a binary tree in order to reduce its size. This allows us to obtain exact or approximate results in inference depending on an input threshold. This paper also provides detailed algorithms for performing the basic operations on potentials (restriction, combination and marginalization) directly on binary trees. Finally, some experiments are described where binary trees are used with the variable elimination algorithm to compare the performance with that obtained for standard probability trees.

20.
The aim of this paper is to model lifetime data for systems that have multiple failure modes by using a finite mixture of Weibull distributions. This involves estimating the unknown parameters, an important task in statistics, especially in life testing and reliability analysis. The proposed approach relies on different methods to develop the estimates, such as MLE through the EM algorithm. In addition, Bayesian estimation is investigated, and other extensions such as graphical methods, non-linear median rank regression and Monte Carlo simulation can be used to model the system under consideration. A numerical application illustrates the proposed approach. The paper also compares the fitted probability density functions, reliability functions and hazard functions of the 3-parameter Weibull and Weibull mixture distributions obtained with the proposed approach and with other conventional methods that characterize the distribution of failure times for the system components. Goodness-of-fit measures are used to determine the best distribution for modeling lifetime data, with preference given to the proposed approach when it yields more accurate parameter estimates.
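The EM step for a lifetime mixture can be sketched in a deliberately simplified setting: a two-component exponential mixture, i.e. a Weibull mixture with both shape parameters fixed at 1, so that the M-step has closed form. This is an illustrative stand-in for the paper's Weibull-mixture MLE, not its actual procedure (general Weibull shapes require a numerical M-step).

```python
import numpy as np

# Simulate lifetimes from two failure modes: mean lifetimes 1 and 5.
rng = np.random.default_rng(4)
t = np.concatenate([rng.exponential(1.0, 400), rng.exponential(5.0, 400)])

w, lam1, lam2 = 0.5, 2.0, 0.2  # mixing weight and component failure rates
for _ in range(500):
    # E-step: responsibilities of component 1 under the current parameters.
    d1 = w * lam1 * np.exp(-lam1 * t)
    d2 = (1 - w) * lam2 * np.exp(-lam2 * t)
    r = d1 / (d1 + d2)
    # M-step: closed-form updates for weight and rates.
    w = r.mean()
    lam1 = r.sum() / (r @ t)
    lam2 = (1 - r).sum() / ((1 - r) @ t)
print(w, lam1, lam2)
```

With censoring or free Weibull shapes the M-step loses this closed form, which is where the EM-based and Bayesian machinery discussed in the abstract comes in.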

