
On Structural Equation Modeling with Data that are not Missing Completely at Random

Published online by Cambridge University Press:  01 January 2025

Bengt Muthén*
Affiliation:
Graduate School of Education
David Kaplan
Affiliation:
Graduate School of Education
Michael Hollis
Affiliation:
Graduate School of Architecture and Urban Planning, University of California, Los Angeles
* Requests for reprints should be sent to Bengt Muthén, Graduate School of Education, University of California, Los Angeles, CA 90024.

Abstract

A general latent variable model is given which includes the specification of a missing data mechanism. This framework allows for an elucidating discussion of existing general multivariate theory bearing on maximum likelihood estimation with missing data. Here, missing completely at random is not a prerequisite for unbiased estimation in large samples, as when using the traditional listwise or pairwise present data approaches. The theory is connected with old and new results in the area of selection and factorial invariance. It is pointed out that in many applications, maximum likelihood estimation with missing data may be carried out by existing structural equation modeling software, such as LISREL and LISCOMP. Several sets of artificial data are generated within the general model framework. The proposed estimator is compared to the two traditional ones and found superior.
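The contrast between listwise deletion and maximum likelihood when data are not missing completely at random can be illustrated with a small simulation. The sketch below is not the general latent variable model of the paper. It assumes a much simpler setting: two standard normal variables x and y with correlation `rho`, where y is hidden whenever the always-observed x exceeds `cutoff`. For this two-variable monotone pattern, the maximum likelihood estimate of the mean of y has a closed form based on the regression of y on x among complete cases. The function names, parameters, and default values are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate(n=20000, rho=0.6, cutoff=0.0, seed=1):
    """Draw (x, y) pairs from a standard bivariate normal with correlation rho.

    y is replaced by None whenever x > cutoff, so missingness depends only
    on the observed x (assumed mechanism, chosen for illustration).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        data.append((x, y if x <= cutoff else None))
    return data


def listwise_mean_y(data):
    """Mean of y using only cases where y is observed (listwise deletion)."""
    return statistics.fmean([y for _, y in data if y is not None])


def ml_mean_y(data):
    """Closed-form ML estimate of the mean of y for the monotone pattern.

    The complete-case mean of y is adjusted by the complete-case regression
    slope of y on x times the gap between the full-sample and complete-case
    means of x.
    """
    complete = [(x, y) for x, y in data if y is not None]
    xbar_all = statistics.fmean([x for x, _ in data])
    xbar_c = statistics.fmean([x for x, _ in complete])
    ybar_c = statistics.fmean([y for _, y in complete])
    sxy = sum((x - xbar_c) * (y - ybar_c) for x, y in complete)
    sxx = sum((x - xbar_c) ** 2 for x, _ in complete)
    slope = sxy / sxx
    return ybar_c + slope * (xbar_all - xbar_c)


if __name__ == "__main__":
    sample = simulate()
    print("true mean of y:      0.0")
    print("listwise estimate:  ", round(listwise_mean_y(sample), 3))
    print("ML estimate:        ", round(ml_mean_y(sample), 3))
```

Under these assumptions the listwise estimate is pulled well below the true mean of zero, because only cases with low x are retained, while the likelihood-based estimate stays close to zero in large samples.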

Type
Original Paper
Copyright
Copyright © 1987 The Psychometric Society


Footnotes

The research of the first author was supported by grant No. SES-8312583 from the National Science Foundation and by a Spencer Foundation grant. We wish to thank Chuen-Rong Chan for drawing the path diagram.
