Analysis of Structural Equation Model with Ignorable Missing Continuous and Polytomous Data

Published online by Cambridge University Press: 01 January 2025

Xin-Yuan Song
Affiliation:
Department of Statistics, The Chinese University of Hong Kong
Sik-Yum Lee*
Affiliation:
Department of Statistics, The Chinese University of Hong Kong
* Requests for reprints should be sent to Sik-Yum Lee, Department of Statistics, Chinese University of Hong Kong, Shatin, N.T., HONG KONG. E-Mail: [email protected]

Abstract

The main purpose of this article is to develop a Bayesian approach for structural equation models with ignorable missing continuous and polytomous data. Joint Bayesian estimates of thresholds, structural parameters and latent factor scores are obtained simultaneously. The idea of data augmentation is used to solve the computational difficulties involved. In the posterior analysis, in addition to the real missing data, latent variables and latent continuous measurements underlying the polytomous data are treated as hypothetical missing data. An algorithm that embeds the Metropolis-Hastings algorithm within the Gibbs sampler is implemented to produce the Bayesian estimates. A goodness-of-fit statistic for testing the posited model is presented. It is shown that the proposed approach is not sensitive to prior distributions and can handle situations with a large number of missing patterns whose underlying sample sizes may be small. Computational efficiency of the proposed procedure is illustrated by simulation studies and a real example.
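
The two computational ideas named in the abstract, data augmentation and a Metropolis-Hastings step embedded in a Gibbs sampler, can be illustrated on a much simpler problem than a structural equation model. The sketch below is not the authors' algorithm. It assumes a toy model (i.i.d. normal data with some values missing completely at random, a flat prior on the mean and a prior proportional to 1/sigma on the standard deviation), and the names `mh_within_gibbs`, `log_cond_sigma`, `step`, and `burn_in` are hypothetical. Each sweep imputes the missing values, draws the mean from its normal full conditional, and updates the standard deviation with a random-walk Metropolis-Hastings step on the log scale.

```python
import math
import random


def log_cond_sigma(log_sigma, y, mu):
    """Log full conditional of log(sigma) given complete data y and mean mu.

    Assumption: prior p(sigma) proportional to 1/sigma, which is flat on log(sigma).
    """
    sigma = math.exp(log_sigma)
    ss = sum((v - mu) ** 2 for v in y)
    return -len(y) * log_sigma - ss / (2.0 * sigma ** 2)


def mh_within_gibbs(y_obs, n_missing, n_iter=4000, burn_in=1000, step=0.3, seed=0):
    """Toy Metropolis-within-Gibbs sampler with data augmentation.

    Returns a list of (mu, sigma) draws collected after burn-in.
    """
    rng = random.Random(seed)
    n = len(y_obs) + n_missing
    mu = sum(y_obs) / len(y_obs)   # start at the observed mean
    log_sigma = 0.0                # start at sigma = 1
    draws = []
    for it in range(n_iter):
        sigma = math.exp(log_sigma)

        # Data augmentation: treat missing values as unknowns and impute them
        # from their conditional distribution given the current parameters.
        y = list(y_obs) + [rng.gauss(mu, sigma) for _ in range(n_missing)]

        # Gibbs step: mu | y, sigma is normal under a flat prior on mu.
        ybar = sum(y) / n
        mu = rng.gauss(ybar, sigma / math.sqrt(n))

        # Metropolis-Hastings step: symmetric random walk on log(sigma).
        proposal = log_sigma + rng.gauss(0.0, step)
        log_ratio = log_cond_sigma(proposal, y, mu) - log_cond_sigma(log_sigma, y, mu)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            log_sigma = proposal

        if it >= burn_in:
            draws.append((mu, math.exp(log_sigma)))
    return draws


if __name__ == "__main__":
    data_rng = random.Random(1)
    observed = [data_rng.gauss(2.0, 1.5) for _ in range(60)]
    sample = mh_within_gibbs(observed, n_missing=20)
    print("posterior mean of mu:", sum(m for m, _ in sample) / len(sample))
    print("posterior mean of sigma:", sum(s for _, s in sample) / len(sample))
```

Because the missing values are ignorable in this toy setting, the imputed values carry no extra information, so the posterior mean of `mu` should stay close to the mean of the observed values. In the article's setting the augmented quantities would also include latent factor scores and the latent continuous measurements underlying the polytomous responses, which this sketch does not attempt.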

Type
Articles
Copyright
Copyright © 2002 The Psychometric Society

Footnotes

The work described in this paper was fully supported by a grant from the Research Grants Council of the HKSAR (Project No. CUHK 4088/99H). The authors are greatly indebted to the Editor and anonymous reviewers for valuable comments that improved the paper, and to D.E. Morisky and J.A. Stein for the use of their AIDS data set.
