
Finite Mixture Multilevel Multidimensional Ordinal IRT Models for Large Scale Cross-Cultural Research

Published online by Cambridge University Press: 01 January 2025

Martijn G. de Jong*
Affiliation: RSM Erasmus University

Jan-Benedict E. M. Steenkamp
Affiliation: University of North Carolina at Chapel Hill

* Requests for reprints should be sent to Martijn G. de Jong, Department of Marketing Management, RSM Erasmus University, Room T10-17, Burgemeester Oudlaan 50, Rotterdam 3062 PA, The Netherlands. E-mail: [email protected]

Abstract

We present a class of finite mixture multilevel multidimensional ordinal IRT models for large-scale cross-cultural research. The model is intended for confirmatory research settings. The prior for the item parameters is a mixture distribution, which accommodates situations where different groups of countries have different measurement operations while countries within a group are still allowed to be heterogeneous. A simulation study shows that all parameters can be recovered. We also apply the model to real data on the two components of affective subjective well-being: positive affect and negative affect. The psychometric behavior of these two scales is studied in 28 countries across four continents.
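The idea of a mixture prior on item parameters can be illustrated with a small simulation. The sketch below is not the authors' specification: the function names, the normal-ogive graded response form, the lognormal draw for discriminations, the shared within-class standard deviation, and all numeric values are assumptions chosen only to show how countries can be grouped into latent classes while still differing within a class.

```python
import math
import random


def sample_country_item_params(n_countries, class_weights, class_means, within_sd, rng):
    """Draw a (class, discrimination, difficulty) triple per country for one item.

    Hypothetical helper: each country is first assigned to a latent class, then
    its item parameters vary around that class's mean, so countries in the same
    class are similar but not identical.
    """
    params = []
    for _ in range(n_countries):
        # Latent class membership of the country (the mixture component).
        k = rng.choices(range(len(class_weights)), weights=class_weights)[0]
        mean_a, mean_b = class_means[k]
        # Assumption: lognormal variation keeps the discrimination positive.
        a = math.exp(rng.gauss(math.log(mean_a), within_sd))
        b = rng.gauss(mean_b, within_sd)
        params.append((k, a, b))
    return params


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probs(theta, a, b, thresholds):
    """Ordinal category probabilities for a respondent with latent trait theta.

    Assumes a normal-ogive graded response form with increasing thresholds:
    cum[c] is P(Y >= c), and each category probability is the difference of
    two adjacent cumulative probabilities.
    """
    cum = [1.0] + [normal_cdf(a * (theta - b) - t) for t in thresholds] + [0.0]
    return [cum[c] - cum[c + 1] for c in range(len(cum) - 1)]


# Illustrative values only: 28 countries, two latent classes of countries.
rng = random.Random(0)
country_params = sample_country_item_params(
    n_countries=28,
    class_weights=[0.6, 0.4],
    class_means=[(1.0, -0.5), (1.5, 0.5)],
    within_sd=0.2,
    rng=rng,
)
_, a0, b0 = country_params[0]
probs = category_probs(theta=0.3, a=a0, b=b0, thresholds=[-1.0, 0.0, 1.0, 2.0])
print(country_params[0], probs)
```

In this sketch the class means capture differences in measurement between groups of countries, and the within-class spread captures the remaining country-level heterogeneity; a full model would also place priors on the class weights and estimate everything jointly.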

Type
Theory and Methods
Creative Commons
This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Copyright
Copyright © 2009 The Psychometric Society

Footnotes

We thank AiMark for providing the data, and Roger Millsap, Bengt Muthén, and the anonymous reviewers for extremely valuable comments.
