Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances

Ke-Hai Yuan; Mortaza Jamshidian; Yutaka Kano

doi:10.1007/s11336-018-9609-x

Published online by Cambridge University Press: 01 January 2025

Ke-Hai Yuan*: Nanjing University of Posts and Telecommunications; University of Notre Dame
Mortaza Jamshidian: California State University, Fullerton
Yutaka Kano: Osaka University
*: Correspondence should be made to Ke-Hai Yuan, University of Notre Dame, Notre Dame, USA. Email: [email protected]

Abstract

Unless data are missing completely at random (MCAR), proper methodology is crucial for the analysis of incomplete data. Consequently, methods for effectively testing the MCAR mechanism become important, and procedures were developed via testing the homogeneity of means and variances–covariances across the observed patterns (e.g., Kim & Bentler in Psychometrika 67:609–624, 2002; Little in J Am Stat Assoc 83:1198–1202, 1988). The current article shows that the population counterparts of the sample means and covariances of a given pattern of the observed data depend on the underlying structure that generates the data, and the normal-distribution-based maximum likelihood estimates for different patterns of the observed sample can converge to the same values even when data are missing at random or missing not at random, although the values may not equal those of the underlying population distribution. The results imply that statistics developed for testing the homogeneity of means and covariances cannot be safely used for testing the MCAR mechanism even when the population distribution is multivariate normal.
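The mean-homogeneity part of this claim can be illustrated with a small simulation. The sketch below is not the authors' Monte Carlo design; it is a minimal example under assumed settings: a standard bivariate normal pair (x, y) with correlation `rho`, where y is deleted whenever |x| exceeds `cutoff`. Missingness depends only on the always-observed x, so the mechanism is MAR rather than MCAR, yet by symmetry the mean of x is the same in both missing-data patterns. The function name `simulate_mar_patterns` and all parameter values are hypothetical.

```python
import numpy as np


def simulate_mar_patterns(n=200_000, rho=0.5, cutoff=1.0, seed=0):
    """Summarize the two missing-data patterns under a symmetric MAR rule.

    Assumed setup (illustrative, not taken from the article): (x, y) is
    standard bivariate normal with correlation rho, and y is missing
    whenever |x| > cutoff.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    data = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    x, y = data[:, 0], data[:, 1]

    # MAR: the missingness indicator is a function of the observed x only.
    y_missing = np.abs(x) > cutoff
    complete = ~y_missing

    return {
        "mean_x_complete": x[complete].mean(),
        "mean_x_incomplete": x[y_missing].mean(),
        "var_x_complete": x[complete].var(ddof=1),
        "var_x_incomplete": x[y_missing].var(ddof=1),
        "var_y_complete": y[complete].var(ddof=1),
    }


summary = simulate_mar_patterns()
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this particular construction the two pattern means of x agree, so a test that compares only means has little to detect, while the complete-case variances of x and y fall below the population value of 1. The variances of x do differ across patterns here, so this example covers only the mean comparison; the article's results concern both means and covariances.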

Keywords

maximum likelihood; missing data; Monte Carlo; test statistics

Type: Original Paper
Information: Psychometrika, Volume 83, Issue 2, June 2018, pp. 425–442

DOI: https://doi.org/10.1007/s11336-018-9609-x
Copyright: © 2018 The Psychometric Society

Footnotes

The research was supported by the National Science Foundation under Grant No. SES-1461355.

References

Anderson, T.W. (1957). Maximum likelihood estimates for the multivariate normal distribution when some observations are missing, Journal of the American Statistical Association, 52, 200–203.

Bentler, P.M. (2006). EQS 6 structural equations program manual. Encino, CA: Multivariate Software.

Blanca, M.J., Arnau, J., López-Montiel, D., Bono, R., Bendayan, R. (2015). Skewness and kurtosis in real data samples, Methodology, 9, 78–84.

Bradley, J.V. (1978). Robustness?, British Journal of Mathematical and Statistical Psychology, 31, 144–152.

Chen, H.Y., Little, R. (1999). A test of missing completely at random for generalised estimating equations with missing data, Biometrika, 86, 1–13.

Enders, C.K. (2010). Applied missing data analysis. New York: Guilford.

Galati, J.C., Seaton, K.A. (2016). MCAR is not necessary for the complete cases to constitute a simple random subsample of the target sample, Statistical Methods in Medical Research, 25, 1527–1534.

Hawkins, D.M. (1981). A new test for multivariate normality and homoscedasticity, Technometrics, 23, 105–110.

Jamshidian, M., Jalal, S. (2010). Tests of homoscedasticity, normality and missing completely at random for incomplete multivariate data, Psychometrika, 75, 649–674.

Jamshidian, M., Jalal, S., Jansen, C. (2014). MissMech: An R Package for testing homoscedasticity, multivariate normality, and missing completely at random (MCAR), Journal of Statistical Software, 56, 1–31.

Jöreskog, K.G. (1971). Simultaneous factor analysis in several populations, Psychometrika, 36, 409–426.

Kano, Y., Takai, K. (2011). Analysis of NMAR missing data without specifying missing-data mechanisms in a linear latent variate model, Journal of Multivariate Analysis, 102, 1241–1255.

Kim, K.H., Bentler, P.M. (2002). Tests of homogeneity of means and covariance matrices for multivariate incomplete data, Psychometrika, 67, 609–624.

Li, J., Yu, Y. (2015). A nonparametric test of missing completely at random for incomplete multivariate data, Psychometrika, 80, 707–726.

Little, R.J.A. (1988). A test of missing completely at random for multivariate data with missing values, Journal of the American Statistical Association, 83, 1198–1202.

Little, R.J.A., Rubin, D.B. (2002). Statistical analysis with missing data (2nd ed.). New York: Wiley.

Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures, Psychological Bulletin, 105, 156–166.

Park, T., Davis, C.S. (1993). A test of the missing data mechanism for repeated categorical data, Biometrics, 49, 631–638.

Park, T., Lee, S.-Y. (1997). A test of missing completely at random for longitudinal data with missing observations, Statistics in Medicine, 16, 1859–1871.

Qu, A., Song, P.X.K. (2002). Testing ignorable missingness in estimating equation approaches for longitudinal data, Biometrika, 89, 841–850.

Rubin, D.B. (1976). Inference and missing data (with discussions), Biometrika, 63, 581–592.

Sörbom, D. (1974). A general method for studying differences in factor means and factor structures between groups, British Journal of Mathematical and Statistical Psychology, 27, 229–239.

Tang, M., Bentler, P.M. (1998). Theory and method for constrained estimation in structural equation models with incomplete data, Computational Statistics and Data Analysis, 27, 257–270.

Thoemmes, F., Enders, C.K. (2007). A structural equation model for testing whether data are missing completely at random. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.

Yuan, K.-H. (2009). Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis, Journal of Multivariate Analysis, 100, 1900–1918.

Yuan, K.-H., Chan, W., Tian, Y. (2016). Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data, Annals of the Institute of Statistical Mathematics, 68, 329–351.
