Algorithms for Measurement Invariance Testing

Contrasts and Connections

Published online by Cambridge University Press:  02 December 2023

Veronica Cole
Affiliation:
Wake Forest University, North Carolina
Conor H. Lacey
Affiliation:
Wake Forest University, North Carolina

Summary

Latent variable models are a powerful tool for measuring many of the phenomena that interest developmental psychologists. If these phenomena are not measured equally well for all participants, inferences about how they unfold throughout development can be biased. When no such bias is present, measurement invariance holds; when it is present, differential item functioning (DIF) occurs. This Element introduces the testing of measurement invariance and DIF through nonlinear factor analysis. After introducing the models used to study these questions, the Element uses them to formulate different definitions of measurement invariance and DIF. It then examines different procedures for locating and quantifying these effects. Finally, the Element provides recommendations for researchers on how to navigate these options and make valid inferences about measurement in their own data.
Type: Element
Online ISBN: 9781009303408
Publisher: Cambridge University Press
Print publication: 21 December 2023
