Improving the EM algorithm for maximum likelihood inference

Yunxiao He, Purdue University

Abstract

Thirty years after the publication of Dempster et al. (1977), the EM algorithm continues to expand its application scope across many areas thanks to its widely recognized simplicity and stability. At the same time, its two main disadvantages, slow convergence and the lack of a direct estimate of the asymptotic variance-covariance matrix of the maximum likelihood estimator (MLE), have often been criticized. In this dissertation, after the necessary background is reviewed in Part I, Parts II and III each tackle one of these two disadvantages. Part II focuses on improving the efficiency of EM without sacrificing much of its simplicity and stability. The ECME algorithm is extended to the Dynamic ECME (DECME) algorithm by dynamically constructing low-dimensional acceleration subspaces over which the actual (constrained) likelihood is maximized. DECME shares with ECME the simplicity and stability of EM, but often converges dramatically faster. DECME includes as a special case the classical Successive Overrelaxation (SOR) method, a well-known but theoretically under-justified acceleration tool for EM. Close scrutiny of SOR leads to a new realization of DECME, denoted DECME_v1, and two simplified versions of DECME_v1, denoted DECME_v2 and DECME_v3. It is shown that DECME_v1 is equivalent to a conjugate direction method. Numerical results for the linear mixed-effects model, the factor analysis model, the multivariate t distribution with missing data, the Gaussian mixture model, and the probit latent class model show that DECME is simple, stable, and widely applicable, and that when EM is very slow it often converges faster than EM by a factor of one hundred or more in number of iterations and a factor of thirty or more in CPU time.
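To make the baseline concrete, the following is a minimal sketch (not code from the dissertation) of plain EM for one of the models listed above, a two-component univariate Gaussian mixture; the function name and starting values are illustrative assumptions. Each iteration alternates an E-step (posterior responsibilities) with an M-step (weighted parameter updates), the fixed-point map whose slow convergence DECME is designed to accelerate.

```python
import numpy as np

def em_gmm(x, pi0=0.5, mu=(0.0, 1.0), sigma2=(1.0, 1.0), n_iter=200):
    """Plain EM for a two-component univariate Gaussian mixture (illustrative)."""
    pi, (mu1, mu2), (s1, s2) = pi0, mu, sigma2
    for _ in range(n_iter):
        # E-step: posterior responsibility of component 1 for each point
        d1 = pi * np.exp(-(x - mu1) ** 2 / (2 * s1)) / np.sqrt(2 * np.pi * s1)
        d2 = (1 - pi) * np.exp(-(x - mu2) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
        r = d1 / (d1 + d2)
        # M-step: responsibility-weighted updates of weight, means, variances
        pi = r.mean()
        mu1 = np.sum(r * x) / np.sum(r)
        mu2 = np.sum((1 - r) * x) / np.sum(1 - r)
        s1 = np.sum(r * (x - mu1) ** 2) / np.sum(r)
        s2 = np.sum((1 - r) * (x - mu2) ** 2) / np.sum(1 - r)
    return pi, mu1, mu2, s1, s2
```

When the components overlap heavily, the fraction of missing information is large and this map can take thousands of iterations, which is the regime where subspace acceleration pays off most.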
Part III deals with the problem of computing the asymptotic variance-covariance matrix of the MLE, an important supplement when EM is used for optimization and hence for inference about the unknown parameters. The Conditional Normal Approximation (CNA) algorithm proposed by Liu (1998) is reviewed, simplified, and extended. CNA and the newly developed extensions are also compared with other available methods, including the method of Louis (1982) and the SEM algorithm of Meng and Rubin (1991), in terms of generality, automatic applicability, efficiency, and stability. The results show that CNA and its extensions generally outperform the other approaches. Part IV concludes the dissertation with a discussion of some possible further developments.
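As a point of reference for the methods compared above, a common generic baseline (not the CNA, Louis, or SEM method themselves) estimates the asymptotic variance by finite-differencing the observed log-likelihood at the MLE; the helper below is an illustrative sketch for a scalar parameter, shown on an exponential sample where the answer is known analytically.

```python
import numpy as np

def observed_information(loglik, theta_hat, h=1e-4):
    # Central finite-difference approximation to -d^2 loglik / d theta^2 at the MLE
    return -(loglik(theta_hat + h) - 2 * loglik(theta_hat)
             + loglik(theta_hat - h)) / h ** 2

# Illustrative example: exponential sample, where the MLE of the rate is 1/mean(x)
rng = np.random.default_rng(1)
x = rng.exponential(scale=0.5, size=1000)          # true rate = 2
lam_hat = 1.0 / x.mean()
loglik = lambda lam: len(x) * np.log(lam) - lam * x.sum()
var_hat = 1.0 / observed_information(loglik, lam_hat)
```

For this model the observed information is n / lam_hat**2, so var_hat should match lam_hat**2 / n; in multiparameter EM problems the log-likelihood itself is often unavailable in closed form, which is exactly the difficulty the dissertation's methods address.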

Degree

Ph.D.

Advisors

Liu, Purdue University.

Subject Area

Statistics
