Statistical Tests of Conditional Independence Between Responses and/or Response Times on Test Items

Published online by Cambridge University Press: 01 January 2025

Wim J. van der Linden*
Affiliation:
University of Twente and CTB/McGraw-Hill
Cees A. W. Glas
Affiliation:
University of Twente
* Requests for reprints should be sent to Wim J. van der Linden, CTB/McGraw-Hill, 20 Ryan Ranch Road, Monterey, CA 93940, USA. E-mail: [email protected]

Abstract

Three plausible assumptions of conditional independence in a hierarchical model for responses and response times on test items are identified. For each of the assumptions, a Lagrange multiplier test of the null hypothesis of conditional independence against a parametric alternative is derived. The tests have closed-form statistics that are easy to calculate from the standard estimates of the person parameters in the model. In addition, simple closed-form estimators of the parameters under the alternatives of conditional dependence are presented, which can be used to explore model modification. The tests were applied to a data set from a large-scale computerized exam and showed excellent power to detect even minor violations of conditional independence.

Type
Theory and Methods
Creative Commons
This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Copyright
Copyright © 2009 The Psychometric Society

Footnotes

This study received funding from the Law School Admission Council (LSAC). The opinions and conclusions contained in this paper are those of the authors and do not necessarily reflect the policy and position of LSAC.

Wim J. van der Linden is now at CTB/McGraw-Hill.
