Robust Structural Equation Modeling with Missing Data and Auxiliary Variables

Published online by Cambridge University Press:  01 January 2025

Ke-Hai Yuan* (University of Notre Dame)
Zhiyong Zhang (University of Notre Dame)

* Requests for reprints should be sent to Ke-Hai Yuan, Department of Psychology, University of Notre Dame, Notre Dame, IN 46556, USA. E-mail: [email protected]

Abstract

The paper develops a two-stage robust procedure for structural equation modeling (SEM) and an R package, rsem, to facilitate the use of the procedure by applied researchers. In the first stage, M-estimates of the saturated mean vector and covariance matrix of all variables are obtained. In the second stage, the estimates corresponding to the substantive variables are fitted to the structural model. A sandwich-type covariance matrix is used to obtain consistent standard errors (SEs) of the structural parameter estimates. Rescaled, adjusted, corrected, and F-statistics are proposed for overall model evaluation. Using R and EQS, the rsem package combines the two stages and generates all the test statistics and consistent SEs. Following the robust analysis, multiple model fit indices and standardized solutions are provided in the corresponding EQS output. An example with open/closed book examination data illustrates the proper use of the package. The method is further applied to a data set from the National Longitudinal Survey of Youth 1997 cohort. The results show that the developed procedure not only gives a better endorsement of the substantive models but also yields estimates with uniformly smaller standard errors than normal-distribution-based maximum likelihood.
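
As a rough illustration of the first stage only, the sketch below computes M-estimates of a mean vector and covariance matrix by iterative reweighting. It rests on several assumptions that are not taken from the paper: the data are complete (no missing values or auxiliary variables), there are exactly two variables, and the M-estimator uses Huber-type weights, a common choice that the abstract does not specify. The function name `huber_m_estimate_2d` and the tuning parameter `phi` are hypothetical; this is not the rsem package or its algorithm.

```python
import math

def huber_m_estimate_2d(data, phi=0.1, tol=1e-8, max_iter=200):
    """Huber-type M-estimates of a bivariate mean and covariance (complete data only).

    Assumption: phi is the nominal proportion of cases to downweight under normality.
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    n = len(data)
    # With 2 variables, the (1 - phi) chi-square quantile is -2*log(phi) in closed form.
    rho = math.sqrt(-2.0 * math.log(phi))
    # Consistency constant for normal data: E[min(chi2_2, rho^2)] / 2 = 1 - phi.
    kappa = 1.0 - phi

    def weighted_moments(w1, w2):
        # Mean uses weights w1; covariance uses weights w2 and divides by n.
        sw = sum(w1)
        mu = [sum(w * x[j] for w, x in zip(w1, data)) / sw for j in range(2)]
        cov = [[sum(w * (x[j] - mu[j]) * (x[k] - mu[k]) for w, x in zip(w2, data)) / n
                for k in range(2)] for j in range(2)]
        return mu, cov

    weights = [1.0] * n
    mu, cov = weighted_moments(weights, weights)  # start from the sample moments
    for _ in range(max_iter):
        # Assumes the current covariance is nonsingular (det != 0).
        det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
        weights = []
        for x in data:
            r0, r1 = x[0] - mu[0], x[1] - mu[1]
            # Squared Mahalanobis distance via the explicit 2x2 inverse.
            d2 = (cov[1][1] * r0 * r0 - 2.0 * cov[0][1] * r0 * r1
                  + cov[0][0] * r1 * r1) / det
            d = math.sqrt(max(d2, 0.0))
            # Cases within rho keep full weight; distant cases are downweighted.
            weights.append(1.0 if d <= rho else rho / d)
        new_mu, new_cov = weighted_moments(weights, [w * w / kappa for w in weights])
        change = max(abs(new_mu[j] - mu[j]) for j in range(2))
        change = max(change, max(abs(new_cov[j][k] - cov[j][k])
                                 for j in range(2) for k in range(2)))
        mu, cov = new_mu, new_cov
        if change < tol:
            break
    return mu, cov, weights

# Illustrative data (made up): 40 points on two rings around the origin plus one outlier.
bulk = [((0.5 if i % 2 == 0 else 1.5) * math.cos(2 * math.pi * i / 40),
         (0.5 if i % 2 == 0 else 1.5) * math.sin(2 * math.pi * i / 40))
        for i in range(40)]
data = bulk + [(20.0, 20.0)]
mu, cov, weights = huber_m_estimate_2d(data, phi=0.1)
sample_mean_x = sum(x[0] for x in data) / len(data)
print("sample mean (x):", sample_mean_x)
print("robust mean:", mu)
print("weight of the outlying case:", weights[-1])
```

In the two-stage procedure described above, estimates of this kind for the substantive variables would then be passed to a second-stage fit of the structural model; that step, and the sandwich-type standard errors, are not attempted in this sketch.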

Type
Original Paper
Copyright
Copyright © The Psychometric Society 2012

Footnotes

The research was supported by Grants DA00017 and DA01070 from the National Institute on Drug Abuse.
