
Global Convergence of the EM Algorithm for Unconstrained Latent Variable Models with Categorical Indicators

Published online by Cambridge University Press: 01 January 2025

Alexander Weissman
Affiliation: Psychometric Research, Law School Admission Council
Requests for reprints should be sent to Alexander Weissman, Psychometric Research, Law School Admission Council, 662 Penn Street, Box 40, Newtown, PA 18940, USA. E-mail: [email protected]

Abstract

Convergence of the expectation-maximization (EM) algorithm to a global optimum of the marginal log likelihood function for unconstrained latent variable models with categorical indicators is presented. Sufficient conditions under which global convergence of the EM algorithm is attainable are provided in an information-theoretic context by interpreting the EM algorithm as alternating minimization of the Kullback–Leibler divergence between two convex sets of probability distributions. It is shown that these conditions are satisfied by an unconstrained latent class model, yielding an optimal bound against which more highly constrained models may be compared.
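To make the alternating-minimization reading of EM concrete, the sketch below runs EM for an unconstrained latent class model with binary categorical indicators: the E-step computes posterior class responsibilities, which is the distribution over the latent class that minimizes the Kullback–Leibler divergence to the model's conditional distribution, and the M-step re-estimates the class proportions and item conditional probabilities. This is a minimal illustration of the general technique the abstract describes, not the paper's implementation; the function name em_latent_class and all variable names are assumptions introduced here, and only NumPy is assumed.

```python
import numpy as np

def em_latent_class(X, K, n_iter=200, tol=1e-8, seed=0):
    """Minimal EM sketch for an unconstrained latent class model with
    binary indicators. Illustrative only; not the paper's implementation."""
    X = np.asarray(X, dtype=float)           # N persons x J binary items
    N, J = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)                 # class proportions
    theta = rng.uniform(0.25, 0.75, (K, J))  # P(item j = 1 | class k)
    ll_old = -np.inf
    for _ in range(n_iter):
        # E-step: log joint of each response pattern with each class, then
        # posterior responsibilities (the KL-minimizing q over classes).
        log_joint = (X @ np.log(theta).T
                     + (1.0 - X) @ np.log(1.0 - theta).T
                     + np.log(pi))                                # (N, K)
        shift = log_joint.max(axis=1, keepdims=True)
        unnorm = np.exp(log_joint - shift)
        ll = np.sum(shift.ravel() + np.log(unnorm.sum(axis=1)))   # marginal log likelihood
        resp = unnorm / unnorm.sum(axis=1, keepdims=True)
        # M-step: maximize the expected complete-data log likelihood.
        nk = resp.sum(axis=0)
        pi = nk / N
        theta = np.clip((resp.T @ X) / nk[:, None], 1e-10, 1.0 - 1e-10)
        if ll - ll_old < tol:                 # log likelihood increases monotonically
            break
        ll_old = ll
    return pi, theta, ll
```

The E-step works in log space with a max shift so that the marginal log likelihood, whose monotone increase is the quantity at issue in the convergence argument, can be tracked stably across iterations.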

Type: Original Paper
Copyright: © The Psychometric Society

