
Methods for Mediation Analysis with Missing Data

Published online by Cambridge University Press:  01 January 2025

Zhiyong Zhang*
Affiliation:
University of Notre Dame
Lijuan Wang
Affiliation:
University of Notre Dame
* Requests for reprints should be sent to Zhiyong Zhang, University of Notre Dame, Notre Dame, IN, USA. E-mail: [email protected]

Abstract

Despite the wide application of both mediation models and missing data techniques, formal discussion of mediation analysis with missing data is still rare. We introduce and compare four approaches to dealing with missing data in mediation analysis: listwise deletion, pairwise deletion, multiple imputation (MI), and a two-stage maximum likelihood (TS-ML) method. An R package, bmem, is developed to implement the four methods for mediation analysis with missing data in the structural equation modeling framework, and two real examples are used to illustrate their application. The four methods are evaluated and compared under MCAR, MAR, and MNAR missing data mechanisms through simulation studies. Both MI and TS-ML perform well for MCAR and MAR data regardless of the inclusion of auxiliary variables and for AV-MNAR data with auxiliary variables. Although listwise deletion and pairwise deletion have low power and large parameter estimation bias in many studied conditions, they may provide useful information for exploring missing data mechanisms.
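To make the difference between the two deletion approaches concrete, the following is a minimal Python sketch, not the bmem package and not the authors' implementation. It assumes a simple single-mediator model (X -> M -> Y with a direct path), simulated data with MCAR missingness, and illustrative path values chosen here (a = 0.5, b = 0.5); all function and variable names are hypothetical.

```python
import random
from statistics import mean


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def pair_cov(rows, i, j):
    """Covariance of columns i and j, using only rows where both are observed."""
    kept = [(r[i], r[j]) for r in rows if r[i] is not None and r[j] is not None]
    return cov([p[0] for p in kept], [p[1] for p in kept])


def indirect_effect(rows):
    """Estimate a*b for X -> M -> Y from (x, m, y) rows.

    a is the slope of M on X; b is the partial slope of Y on M given X.
    Each covariance uses whatever cases are available for that pair.
    """
    s_xx, s_mm = pair_cov(rows, 0, 0), pair_cov(rows, 1, 1)
    s_xm, s_xy, s_my = pair_cov(rows, 0, 1), pair_cov(rows, 0, 2), pair_cov(rows, 1, 2)
    a = s_xm / s_xx
    b = (s_xx * s_my - s_xm * s_xy) / (s_xx * s_mm - s_xm ** 2)
    return a * b


# Simulated data (assumed values): a = 0.5, b = 0.5, direct effect = 0.2.
rng = random.Random(2013)
rows = []
for _ in range(5000):
    x = rng.gauss(0, 1)
    m = 0.5 * x + rng.gauss(0, 1)
    y = 0.5 * m + 0.2 * x + rng.gauss(0, 1)
    # MCAR: M and Y are each deleted with probability 0.2, independent of the data.
    if rng.random() < 0.2:
        m = None
    if rng.random() < 0.2:
        y = None
    rows.append((x, m, y))

# Listwise deletion: keep only fully observed cases, then estimate.
complete = [r for r in rows if None not in r]
ab_listwise = indirect_effect(complete)

# Pairwise deletion: each covariance uses all cases observed on that pair.
ab_pairwise = indirect_effect(rows)

print("listwise a*b:", round(ab_listwise, 3))
print("pairwise a*b:", round(ab_pairwise, 3))
```

Under MCAR both estimates should land near the simulated indirect effect of 0.25. The sketch does not cover MI or TS-ML, and it says nothing about behavior under MAR or MNAR, where the abstract reports that the deletion methods can be badly biased.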

Type
Original Paper
Copyright
Copyright © The Psychometric Society
