Similar Documents

20 similar documents found.
1.
An Accelerated Monte Carlo EM Algorithm   (Total citations: 6; self-citations: 0; others: 6)
Luo Ji, Chinese Journal of Applied Probability and Statistics, 2008, 24(3): 312-318
The EM algorithm is a data augmentation algorithm widely used in recent years for estimating the posterior mode. However, a closed-form expression for the integral in its E-step is sometimes difficult or even impossible to obtain, which limits the breadth of its application. The Monte Carlo EM algorithm solves this problem well: the E-step integral is carried out by Monte Carlo simulation, which greatly broadens the algorithm's applicability. But both the EM algorithm and the Monte Carlo EM algorithm converge only linearly, at a rate governed by the fraction of missing information, so when the proportion of missing data is high, convergence becomes very slow. The Newton-Raphson algorithm, by contrast, converges quadratically in a neighborhood of the posterior mode. This paper proposes an accelerated Monte Carlo EM algorithm that combines the Monte Carlo EM algorithm with the Newton-Raphson algorithm: the E-step is still carried out by Monte Carlo simulation, and the algorithm is proved to converge quadratically near the posterior mode. It thus retains the advantages of the Monte Carlo EM algorithm while improving its convergence speed. Numerical examples comparing the accelerated Monte Carlo EM algorithm with the EM and Monte Carlo EM algorithms further demonstrate its merits.
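The combination described above (a Monte Carlo E-step followed by Newton-Raphson steps near the mode) can be sketched on a toy two-component normal mixture with known component densities and an unknown weight p. This is an illustration only, not the paper's algorithm: the E-step here is actually tractable, and the Monte Carlo draws and the fixed switch from 5 MCEM sweeps to 3 Newton steps are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: x_i ~ p*N(0,1) + (1-p)*N(4,1); estimate the weight p.
n, p_true = 2000, 0.6
z = rng.random(n) < p_true
x = np.where(z, rng.normal(0.0, 1.0, n), rng.normal(4.0, 1.0, n))

phi = lambda x, mu: np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)
f0, f1 = phi(x, 0.0), phi(x, 4.0)

def mcem_step(p, m=200):
    """E-step by Monte Carlo: simulate latent labels, then the M-step
    for the weight is just the average simulated label."""
    r = p * f0 / (p * f0 + (1 - p) * f1)   # responsibilities
    draws = rng.random((m, n)) < r         # m simulated label vectors
    return draws.mean()

def newton_step(p):
    """Newton-Raphson on the observed log-likelihood, which is concave
    in p, so this converges quadratically near the mode."""
    d = p * f0 + (1 - p) * f1
    score = np.sum((f0 - f1) / d)
    hess = -np.sum(((f0 - f1) / d) ** 2)
    return p - score / hess

p = 0.3
for _ in range(5):        # cheap MCEM sweeps to get near the mode...
    p = mcem_step(p)
for _ in range(3):        # ...then Newton steps for fast local convergence
    p = newton_step(p)
```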

2.
The expectation–maximization (EM) algorithm is a very general and popular iterative computational algorithm for finding maximum likelihood estimates from incomplete data, and is broadly used in statistical analysis with missing data because of its stability, flexibility and simplicity. However, the EM algorithm is often criticized for its slow convergence, and various algorithms have been proposed to accelerate it. The vector ε algorithm of Wynn (Math Comp 16:301–322, 1962) is used to accelerate the convergence of the EM algorithm in Kuroda and Sakakihara (Comput Stat Data Anal 51:1549–1561, 2006). In this paper, we provide a theoretical evaluation of the convergence of the ε-accelerated EM algorithm. The ε-accelerated EM algorithm does not use the information matrix but only the sequence of estimates obtained from iterations of the EM algorithm, and thus it keeps the flexibility and simplicity of the EM algorithm.
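In the scalar case the vector ε algorithm reduces to Aitken's Δ² process, which is enough to show the mechanism: the accelerated sequence is built purely from EM iterates, with no information matrix. The sketch below applies restarted (Steffensen-type) Δ² cycles to the classical multinomial genetic-linkage EM map; the restarting scheme is an illustrative choice, not the paper's procedure.

```python
# Classical genetic-linkage example: counts (125, 18, 20, 34) with cell
# probabilities (1/2 + pi/4, (1-pi)/4, (1-pi)/4, pi/4); the MLE solves
# 197*pi^2 - 15*pi - 68 = 0, i.e. pi ~ 0.626821.
def em(pi):
    x = 125 * pi / (2 + pi)        # E-step: expected split of the 125 cell
    return (x + 34) / (x + 72)     # M-step: binomial MLE for pi

def aitken_cycle(pi):
    """One acceleration cycle: two EM steps, then Delta^2 extrapolation
    built only from the iterates themselves."""
    p1, p2 = em(pi), em(em(pi))
    d1, d2 = p1 - pi, p2 - 2 * p1 + pi
    return pi - d1 * d1 / d2 if d2 != 0 else p2

pi_acc = 0.5
for _ in range(3):                 # restarted (Steffensen-type) cycles
    pi_acc = aitken_cycle(pi_acc)

pi_em = 0.5
for _ in range(6):                 # same number of EM map evaluations
    pi_em = em(pi_em)
```

With the same budget of EM map evaluations, the extrapolated iterate lands much closer to the MLE than the raw EM sequence.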

3.
Based on the Vector Aitken (VA) method, we propose an accelerated Expectation-Maximization (EM) algorithm, the VA-accelerated EM algorithm, whose convergence is faster than that of the EM algorithm. The VA-accelerated EM algorithm does not use the information matrix but only the sequence of estimates obtained from iterations of the EM algorithm, so it keeps the flexibility and simplicity of the EM algorithm. Drawing on the Steffensen iterative process, we also give a Steffensen form of the VA-accelerated EM algorithm and prove that this reformulated process converges quadratically. Numerical analyses illustrate that the proposed methods are efficient and faster than the EM algorithm.

4.
Using the accelerated Monte Carlo EM algorithm, this paper addresses parameter estimation for the mixed exponential distribution under constant accelerated stress levels with type-II censoring. Simulation experiments show that estimating the mixed exponential distribution with the accelerated Monte Carlo EM algorithm is more effective and converges faster than the EM algorithm.

5.
The Expectation-Maximization (EM) algorithm is widely used, also in industry, for parameter estimation within a Maximum Likelihood (ML) framework in the case of missing data. It is well known that EM shows good convergence in several cases of practical interest. To the best of our knowledge, results showing under which conditions EM converges fast are only available for specific cases. In this paper, we analyze the connection of the EM algorithm to other ascent methods as well as the convergence rates of the EM algorithm in general, including nonlinear models, and apply this to the PMHT model. We compare EM with other known iterative schemes such as gradient and Newton-type methods. It is shown that EM reaches Newton convergence in the case of well-separated objects, and a Newton-EM combination turns out to be robust and efficient even in cases of closely spaced targets.

6.
The EM algorithm is widely used in incomplete-data problems (and some complete-data problems) for parameter estimation. One limitation of the EM algorithm is that, upon termination, it is not always near a global optimum. As reported by Wu (1982), when several stationary points exist, convergence to a particular stationary point depends on the choice of starting point. Furthermore, convergence to a saddle point or local minimum is also possible. In the EM algorithm, although the log-likelihood is unknown, an interval containing the gradient of the EM Q-function can be computed at individual points using interval analysis methods. By using interval analysis to enclose the gradient of the EM Q-function (and, consequently, of the log-likelihood), an algorithm is developed that is able to locate all stationary points of the log-likelihood within any designated region of the parameter space. The algorithm is applied to several examples. In one example involving the t distribution, the algorithm successfully locates all seven stationary points of the log-likelihood.
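Enclosing a gradient with interval arithmetic can be illustrated by a tiny branch-and-prune loop. The code below is generic, not the paper's algorithm: a polynomial score g(x) = x^3 - 3x stands in for a log-likelihood gradient, the interval arithmetic is hand-rolled for this one function, and boxes whose gradient enclosure excludes zero are pruned while the rest are bisected.

```python
def score_iv(lo, hi):
    """Interval enclosure of g(x) = x^3 - 3x on [lo, hi]:
    x^3 is monotone, so it maps to [lo^3, hi^3]; -3x maps to [-3*hi, -3*lo]."""
    return (lo ** 3 - 3 * hi, hi ** 3 - 3 * lo)

def stationary_boxes(lo, hi, tol=1e-6):
    """All sub-boxes of [lo, hi] that may contain a zero of the score."""
    out, work = [], [(lo, hi)]
    while work:
        a, b = work.pop()
        g_lo, g_hi = score_iv(a, b)
        if g_lo > 0 or g_hi < 0:       # 0 not enclosed: prune this box
            continue
        if b - a < tol:                # small enough: keep as a candidate
            out.append((a, b))
        else:                          # otherwise bisect and recurse
            m = 0.5 * (a + b)
            work += [(a, m), (m, b)]
    return sorted(out)

# g has exactly three zeros on [-3, 3]: -sqrt(3), 0, sqrt(3).
boxes = stationary_boxes(-3.0, 3.0)
```

Because an interval enclosure always contains the true range, every stationary point is guaranteed to lie inside some surviving box; the pruning only discards regions that provably contain none.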

7.
Maximum likelihood estimation in finite mixture distributions is typically approached as an incomplete data problem to allow application of the expectation-maximization (EM) algorithm. In its general formulation, the EM algorithm involves the notion of a complete data space, in which the observed measurements and incomplete data are embedded. An advantage is that many difficult estimation problems are facilitated when viewed in this way. One drawback is that the simultaneous update used by standard EM requires overly informative complete data spaces, which leads to slow convergence in some situations. In the incomplete data context, it has been shown that the use of less informative complete data spaces, or equivalently smaller missing data spaces, can lead to faster convergence without sacrificing simplicity. However, in the mixture case, little progress has been made in speeding up EM. In this article we propose a component-wise EM for mixtures. It uses, at each iteration, the smallest admissible missing data space by intrinsically decoupling the parameter updates. Monotonicity is maintained, although the estimated proportions may not sum to one during the course of the iteration. However, we prove that the mixing proportions satisfy this constraint upon convergence. Our proof of convergence relies on the interpretation of our procedure as a proximal point algorithm. For performance comparison, we consider standard EM as well as two other algorithms based on missing data space reduction, namely the SAGE and AECME algorithms. We provide adaptations of these general procedures to the mixture case. We also consider the ECME algorithm, which is not a data augmentation scheme but still aims at accelerating EM. Our numerical experiments illustrate the advantages of the component-wise EM algorithm relative to these other methods.
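A minimal sketch of the component-wise idea for a univariate two-component Gaussian mixture (an illustration, not the article's general formulation): each iteration updates the weight, mean, and variance of a single component, recomputing responsibilities in between, so the weights need not sum to one mid-iteration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two well-separated clusters around 0 and 5.
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(5.0, 1.0, 300)])

# Parameters per component: weight w, mean m, variance v.
w = np.array([0.5, 0.5])
m = np.array([x.min(), x.max()])
v = np.array([1.0, 1.0])

def resp(j):
    """Responsibility of component j under the current parameters."""
    dens = w * np.exp(-0.5 * (x[:, None] - m) ** 2 / v) / np.sqrt(2 * np.pi * v)
    return dens[:, j] / dens.sum(axis=1)

for k in range(100):
    j = k % 2            # component-wise: touch one component per iteration
    r = resp(j)
    w[j] = r.mean()      # weights are not renormalized mid-iteration
    m[j] = (r @ x) / r.sum()
    v[j] = (r @ (x - m[j]) ** 2) / r.sum()
```

At convergence the weights again sum (approximately) to one, matching the constraint-at-convergence property the abstract describes.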

8.
One of the most powerful algorithms for obtaining maximum likelihood estimates in many incomplete-data problems is the EM algorithm. However, when the parameters satisfy a set of nonlinear restrictions, it is difficult to apply the EM algorithm directly. In this paper, we propose an asymptotic maximum likelihood estimation procedure under a set of nonlinear inequality restrictions on the parameters, in which the EM algorithm can be used. Essentially, this kind of estimation problem is a stochastic optimization problem in the M-step. We make use of methods in stochastic optimization to overcome the difficulty caused by the nonlinearity of the given constraints.

9.
Model Misspecification: Finite Mixture or Homogeneous?   (Total citations: 1; self-citations: 0; others: 1)
A common problem in statistical modelling is to distinguish between a finite mixture distribution and a homogeneous non-mixture distribution. Finite mixture models are widely used in practice, and mixtures of normal densities are often indistinguishable from homogeneous non-normal densities. This paper illustrates what happens when the EM algorithm for normal mixtures is applied to a distribution that is a homogeneous non-mixture. In particular, a population-based EM algorithm for finite mixtures is introduced and applied directly to density functions instead of sample data. The population-based EM algorithm is used to find finite mixture approximations to common homogeneous distributions. An example regarding the nature of a placebo response in drug-treated depressed subjects is used to illustrate the ideas.

10.
Finite mixture models have been used to fit data from heterogeneous populations in many applications. The Expectation-Maximization (EM) algorithm is the most popular method for estimating the parameters of a finite mixture model; a Bayesian approach is another. However, the EM algorithm often converges to a local maximum and is sensitive to the choice of starting points, while in the Bayesian approach the Markov chain Monte Carlo (MCMC) sampler sometimes converges to a local mode and has difficulty moving to another. Hence, in this paper we propose a new method that addresses this limitation of the EM algorithm, so that EM can estimate the parameters in the global maximum region, and a more effective Bayesian approach, so that the MCMC chain moves more easily from one mode to another in the mixture model. Our approach combines simulated annealing (SA) with adaptive rejection Metropolis sampling (ARMS). Although SA is a well-known approach for detecting distinct modes, its limitation is the difficulty of choosing sequences of proper proposal distributions for a target distribution. Since ARMS uses a piecewise linear envelope function as a proposal distribution, we incorporate ARMS into SA so that we can start from a more proper proposal distribution and detect separate modes. As a result, we can detect the global maximum region and estimate the parameters in this region. We refer to this approach as ARMS annealing. By combining ARMS annealing with the EM algorithm and with the Bayesian approach, respectively, we obtain two approaches: an EM-ARMS annealing algorithm and a Bayesian-ARMS annealing approach. We compare our two approaches with the traditional EM algorithm alone and the Bayesian approach alone using simulation, showing that the two are comparable to each other but perform better than either baseline. Both detect the global maximum region well and estimate the parameters in this region. We demonstrate the advantage of our approaches using a mixture of two Poisson regression models, applied to survey data on the number of charitable donations.

11.
A New Algorithm for Maximum Likelihood Estimation of the Exponentiated Weibull Distribution Based on Censored Data   (Total citations: 1; self-citations: 0; others: 1)
This paper discusses maximum likelihood estimation of the parameters of the exponentiated Weibull distribution when the observations are censored. Since censored data are incomplete data, we use the EM algorithm to compute approximate maximum likelihood estimates. Owing to the computational complexity of the EM algorithm, its efficiency is not satisfactory. To overcome the limitations of the Newton-Raphson algorithm and the EM algorithm, we propose a new method that combines the transformation from the exponentiated Weibull distribution to the exponential distribution with the technique of equivalent lifetime data, and is more practicable than either the Newton-Raphson or the EM algorithm. Simulations demonstrate the feasibility of this method, and an analysis of real lifetime data is provided as an illustration.

12.
The maximum-likelihood expectation-maximization (EM) algorithm has attracted considerable interest in single-photon emission computed tomography because it produces superior images in addition to being flexible, simple, and open to a physical interpretation. However, it often requires a large number of calculations because of the algorithm's slow rate of convergence, and there is therefore a large body of literature on accelerating the EM algorithm. One acceleration approach increases an overrelaxation parameter, but we have not found any analysis of this method that provides an immediate answer to questions about its convergence. In this paper, our main focus is on the continuous version of an accelerated EM algorithm based on Lewitt and Muehllehner. We extend their conclusions to infinite-dimensional space, and we interpret and analyze the convergence of the accelerated EM algorithm. We also obtain some new properties of the modified algorithm. Copyright © 2015 John Wiley & Sons, Ltd.
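The overrelaxation idea can be sketched in finite dimensions by raising the multiplicative MLEM correction to a power ω > 1. This is a sketch in the spirit of the accelerated scheme discussed; the matrix sizes, noise-free data, and ω = 1.5 are our assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((40, 10))            # nonnegative system matrix
lam_true = rng.random(10) + 0.5     # true emission intensities
y = A @ lam_true                    # noise-free projection data

def mlem(omega, iters=200):
    """MLEM with overrelaxation: the multiplicative correction factor is
    raised to the power omega; omega = 1 recovers plain MLEM."""
    lam = np.ones(10)
    sens = A.sum(axis=0)            # sensitivity image A^T 1
    for _ in range(iters):
        ratio = (A.T @ (y / (A @ lam))) / sens
        lam = lam * ratio ** omega
    return lam

plain = mlem(1.0)
accel = mlem(1.5)
```

The power form keeps every iterate nonnegative, which is one reason multiplicative overrelaxation is attractive in emission tomography.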

13.
In model-based cluster analysis, the expectation-maximization (EM) algorithm has a number of desirable properties, but in some situations it can be slow to converge. Several variants have been proposed to speed up EM by reducing the time spent in the E-step in the case of Gaussian mixtures. The main aims of such methods are, first, to speed up the convergence of EM and, second, to yield the same results (or nearly the same) as EM itself. In this paper, we compare these methods on categorical data under the latent class model, and we propose a new variant that achieves better results on synthetic and real data sets in terms of convergence speed-up and number of misclassified objects.

14.
This paper proposes a new step, called the P-step, to handle linear or nonlinear equality constraints in addition to the conventional EM algorithm. This new step is easy to implement, first because only the first derivatives of the objective function and the constraint function are necessary, and second because the P-step is carried out after the conventional EM algorithm. The estimate sequence produced by our method enjoys a monotonic increase in the observed likelihood function. We apply the P-step, in addition to the conventional EM algorithm, to two illustrative examples: the first has a linear constraint function, the second a nonlinear one. We show finally that there exists a Kuhn–Tucker vector at the limit point produced by our method.

15.
Maximum likelihood estimation of the multivariate t distribution, especially with unknown degrees of freedom, has been an interesting topic in the development of the EM algorithm. After a brief review of the EM algorithm and its application to finding the maximum likelihood estimates of the parameters of the t distribution, this paper provides new versions of the ECME algorithm for maximum likelihood estimation of the multivariate t distribution from data with possibly missing values. The results show that the new versions of the ECME algorithm converge faster than the previous procedures. Most importantly, the idea of this new implementation is quite general and useful for the development of the EM algorithm. Comparisons of different methods based on two datasets are presented.

16.
The stochastic approximation EM (SAEM) algorithm is a simulation-based alternative to the expectation-maximization (EM) algorithm for situations when the E-step is hard or impossible. One of the appeals of SAEM is that, unlike other Monte Carlo versions of EM, it converges with a fixed (and typically small) simulation size. Another appeal is that, in practice, the only decision that has to be made is the choice of the step size, which is a one-time decision usually made before starting the method. The downside of SAEM is that there are no data-driven or model-driven recommendations for the magnitude of this step size. We argue in this article that a challenging model/data combination coupled with an unlucky step size can lead to very poor algorithmic performance and, in particular, to a premature stop of the method. This article proposes a new heuristic for SAEM's step size selection based on the underlying EM rate of convergence. We also use the much-appreciated EM likelihood-ascent property to derive a new and flexible way of monitoring the progress of the SAEM algorithm. The method is applied to a challenging geostatistical model of online retailing.
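A minimal SAEM sketch on a toy mixture-weight problem (the model, the burn-in length K0, and the schedule gamma_k = 1 for k <= K0 and then 1/(k - K0) are illustrative choices, not the article's heuristic): a single simulated latent draw per iteration, with the sufficient statistic updated by stochastic approximation.

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy model: x_i ~ p*N(0,1) + (1-p)*N(4,1); estimate the weight p.
n, p_true = 1000, 0.3
z_true = rng.random(n) < p_true
x = np.where(z_true, rng.normal(0.0, 1.0, n), rng.normal(4.0, 1.0, n))

phi = lambda x, mu: np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)
f0, f1 = phi(x, 0.0), phi(x, 4.0)

K0, K = 50, 500     # burn-in length and total iterations
p, s = 0.5, 0.5     # parameter and running sufficient statistic
for k in range(1, K + 1):
    r = p * f0 / (p * f0 + (1 - p) * f1)
    z = rng.random(n) < r                  # one simulated E-step draw
    gamma = 1.0 if k <= K0 else 1.0 / (k - K0)
    s += gamma * (z.mean() - s)            # stochastic approximation update
    p = s                                  # M-step: p maximizes given s
```

Note the fixed simulation size (one draw per iteration): the averaging induced by the decreasing step sizes, not a growing Monte Carlo sample, is what removes the simulation noise.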

17.
The primary model for cluster analysis is the latent class model. This model yields the mixture likelihood. Due to numerous local maxima, the success of the EM algorithm in maximizing the mixture likelihood depends on the initial starting point of the algorithm. In this article, good starting points for the EM algorithm are obtained by applying classification methods to randomly selected subsamples of the data. The performance of the resulting two-step algorithm, classification followed by EM, is compared to, and found superior to, the baseline algorithm of EM started from a random partition of the data. Though the algorithm is not complicated, comparing it to the baseline algorithm and assessing its performance with several classification methods is nontrivial. The strategy employed for comparing the algorithms is to identify canonical forms for the easiest and most difficult datasets to cluster within a large collection of cluster datasets and then to compare the performance of the two algorithms on these datasets. This has led to the discovery that, in the case of three homogeneous clusters, the most difficult datasets to cluster are those in which the clusters are arranged on a line and the easiest are those in which the clusters are arranged on an equilateral triangle. The performance of the two-step algorithm is assessed using several classification methods and is shown to be able to cluster large, difficult datasets consisting of three highly overlapping clusters arranged on a line with 10,000 observations and 8 variables.

18.
Changepoint models are widely used to model the heterogeneity of sequential data. We present a novel sequential Monte Carlo (SMC) online expectation–maximization (EM) algorithm for estimating the static parameters of such models. The SMC online EM algorithm has a cost per time which is linear in the number of particles and could be particularly important when the data is representable as a long sequence of observations, since it drastically reduces the computational requirements for implementation. We present an asymptotic analysis for the stability of the SMC estimates used in the online EM algorithm and demonstrate the performance of this scheme by using both simulated and real data originating from DNA analysis. The supplementary materials for the article are available online.

19.
20.
The Beta-Binomial model is widely applied in many problems. Based on the EM algorithm, this paper studies parameter estimation for the Beta-Binomial model and applies it to a real case. The results show that the proposed method is computationally convenient and gives a more reasonable and scientific interpretation of the specific problem.
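One standard way to set up EM for the Beta-Binomial model treats each success probability p_i as missing data: its posterior given x_i is Beta(a + x_i, b + n - x_i), so the E-step needs only digamma functions. The sketch below is a generic illustration of this construction; the numerical M-step via scipy.optimize.minimize is our choice and not necessarily the paper's method.

```python
import numpy as np
from scipy.special import betaln, digamma
from scipy.optimize import minimize

rng = np.random.default_rng(4)
N, n, a_true, b_true = 400, 25, 2.0, 3.0
p = rng.beta(a_true, b_true, N)
x = rng.binomial(n, p)                   # Beta-Binomial counts

def obs_loglik(a, b):
    """Observed log-likelihood (up to the binomial coefficient constant)."""
    return np.sum(betaln(a + x, b + n - x) - betaln(a, b))

a, b = 1.0, 1.0
for _ in range(50):
    # E-step: under the posterior Beta(a + x_i, b + n - x_i),
    # E[log p_i] and E[log(1 - p_i)] have digamma closed forms.
    s1 = np.mean(digamma(a + x) - digamma(a + b + n))
    s2 = np.mean(digamma(b + n - x) - digamma(a + b + n))
    # M-step: maximize the expected complete-data log-likelihood in (a, b).
    neg_q = lambda t: betaln(t[0], t[1]) - (t[0] - 1) * s1 - (t[1] - 1) * s2
    a, b = minimize(neg_q, [a, b], bounds=[(1e-6, None)] * 2).x
```

Because this is a genuine EM scheme, each iteration cannot decrease the observed log-likelihood, which gives a simple sanity check on the implementation.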
