On the Asymptotic Relative Efficiency of Planned Missingness Designs

Published online by Cambridge University Press: 01 January 2025

Mijke Rhemtulla*
Affiliation:
University of Amsterdam
Victoria Savalei
Affiliation:
University of British Columbia
Todd D. Little
Affiliation:
Texas Tech University
* Correspondence should be sent to Mijke Rhemtulla, Programme Group Psychological Methods, Department of Psychology, University of Amsterdam, Weesperplein 4, Room 208, 1018XA Amsterdam, Netherlands. Email: [email protected]

Abstract

In planned missingness (PM) designs, certain data are set a priori to be missing. PM designs can increase validity and reduce cost; however, little is known about the loss of efficiency that accompanies these designs. The present paper compares PM designs to reduced sample (RN) designs that have the same total number of data points concentrated in fewer participants. In four studies, we consider models for both observed and latent variables, designs that do or do not include an “X set” of variables with complete data, and a full range of between- and within-set correlation values. All results are obtained using asymptotic relative efficiency formulas, and thus no data are generated; this novel approach allows us to examine whether PM designs have theoretical advantages over RN designs once the impact of sampling error is removed. Our primary findings are that (a) in manifest variable regression models, estimates of regression coefficients have much lower relative efficiency in PM designs than in RN designs, (b) relative efficiency of factor correlation or latent regression coefficient estimates is maximized when the indicators of each latent variable come from different sets, and (c) the addition of an X set improves efficiency in manifest variable regression models only for the parameters that directly involve the X-set variables, but it substantially improves efficiency of most parameters in latent variable models. We conclude that PM designs can be beneficial when the model of interest is a latent variable model; recommendations are made for how to optimize such a design.

Type
Original paper
Copyright
Copyright © 2014 The Psychometric Society
