
A Generalized Speed–Accuracy Response Model for Dichotomous Items

Published online by Cambridge University Press:  01 January 2025

Peter W. van Rijn*
Affiliation: ETS Global
Usama S. Ali
Affiliation: Educational Testing Service; South Valley University

*Correspondence should be made to Peter W. van Rijn, ETS Global, Amsterdam, The Netherlands. Email: [email protected]; [email protected]

Abstract

We propose a generalization of the speed–accuracy response model (SARM) introduced by Maris and van der Maas (Psychometrika 77:615–633, 2012). In these models, the scores that result from a scoring rule that incorporates both the speed and accuracy of item responses are modeled. Our generalization is similar to that of the one-parameter logistic (or Rasch) model to the two-parameter logistic (or Birnbaum) model in item response theory. An expectation–maximization (EM) algorithm for estimating model parameters and standard errors was developed. Furthermore, methods to assess model fit are provided in the form of generalized residuals for item score functions and saddlepoint approximations to the density of the sum score. The presented methods were evaluated in a small simulation study, the results of which indicated good parameter recovery and reasonable type I error rates for the residuals. Finally, the methods were applied to two real data sets. It was found that the two-parameter SARM showed improved fit compared to the one-parameter SARM in both data sets.
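The scoring rule referred to in the abstract is the signed residual time rule of Maris and van der Maas (2012): a response scores the time remaining until the limit, with positive sign if correct and negative sign if incorrect, so fast correct answers earn the most and fast incorrect answers lose the most. A minimal sketch of that rule (the function name and the input checks are illustrative, not from the paper):

```python
def srt_score(correct: bool, response_time: float, time_limit: float) -> float:
    """Signed residual time score s = (2x - 1) * (d - t),
    where x is accuracy (0/1), t the response time, d the time limit."""
    if not (0.0 <= response_time <= time_limit):
        raise ValueError("response time must lie in [0, time_limit]")
    return (2 * int(correct) - 1) * (time_limit - response_time)

# With a 10-second limit, a correct answer at 3 s scores +7,
# the same answer given incorrectly scores -7, and any response
# at the limit scores 0 regardless of accuracy.
```

The two-parameter generalization proposed in the paper is analogous to moving from the Rasch model to the Birnbaum model, i.e., it introduces an item-specific discrimination parameter; the exact parameterization is given in the full text.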

Type
Original Paper
Copyright
Copyright © 2017 The Psychometric Society


References

Andersen, E. B. (1973). Conditional inference and multiple choice questionnaires. British Journal of Mathematical and Statistical Psychology, 26, 31–44.
Biehler, M., Holling, H., & Doebler, P. (2015). Saddlepoint approximations of the distribution of the person parameter in the two parameter logistic model. Psychometrika, 80, 665–688. https://doi.org/10.1007/s11336-014-9405-1
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397–472). Reading, MA: Addison-Wesley.
Butler, R. W. (2007). Saddlepoint approximations with applications. Cambridge: Cambridge University Press.
De Boeck, P., Chen, H., & Davison, M. (2017). Spontaneous and imposed speed of cognitive test responses. British Journal of Mathematical and Statistical Psychology, 70, 225–237.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39, 1–38.
Fahrmeir, L., & Tutz, G. (2001). Multivariate statistical modelling based on generalized linear models (2nd ed.). Berlin: Springer. https://doi.org/10.1007/978-1-4757-3454-6
Goldhammer, F. (2015). Measuring ability, speed, or both? Challenges, psychometric solutions, and what can be gained from experimental control. Measurement, 13, 133–164.
Haberman, S. J. (2006). Joint and conditional estimation for implicit models for tests with polytomous item scores (ETS Research Report RR-06-03). Princeton, NJ: Educational Testing Service. https://doi.org/10.1002/j.2333-8504.2006.tb02009.x
Haberman, S. J. (2013). A general program for item-response analysis that employs the stabilized Newton–Raphson algorithm (ETS Research Report RR-13-32). Princeton, NJ: Educational Testing Service. https://doi.org/10.1002/j.2333-8504.2013.tb02339.x
Haberman, S. J. (2016). Exponential family distributions relevant to IRT. In W. J. van der Linden (Ed.), Handbook of item response theory, volume two: Statistical tools (pp. 47–70). Boca Raton, FL: CRC Press.
Haberman, S. J., & Sinharay, S. (2013). Generalized residuals for general models for contingency tables with application to item response theory. Journal of the American Statistical Association, 108, 1435–1444.
Haberman, S. J., Sinharay, S., & Chon, K. H. (2013). Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions. Psychometrika, 78, 417–440.
Hooker, G., Finkelman, M., & Schwartzman, A. (2009). Paradoxical results in multidimensional item response theory. Psychometrika, 74, 419–442.
Kim, S. (2012). A note on the reliability coefficients for item response model-based ability estimates. Psychometrika, 77, 153–162.
Kim, S. (2013). Generalization of the Lord–Wingersky algorithm to computing the distributions of summed test scores based on real-number item scores. Journal of Educational Measurement, 50, 381–389.
Lee, Y.-H., & Chen, H. (2011). A review of recent response-time analyses in educational testing. Psychological Test and Assessment Modeling, 3, 359–379.
Lord, F. M. (1975). Formula scoring and number right scoring. Journal of Educational Measurement, 12, 7–11.
Lord, F. M., & Wingersky, M. S. (1984). Comparison of IRT true-score and equipercentile observed-score equatings. Applied Psychological Measurement, 8, 453–461.
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 44, 226–233.
Luce, R. D. (1986). Response times. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195070019.001.0001
Maris, G., & van der Maas, H. L. J. (2012). Speed-accuracy response models: Scoring rules based on response time and accuracy. Psychometrika, 77, 615–633.
Marsman, M. (2014). Plausible values in statistical inference (Doctoral dissertation). University of Twente, Enschede.
Meng, X.-L., & Rubin, D. B. (1991). Using EM to obtain asymptotic variance-covariance matrices: The SEM algorithm. Journal of the American Statistical Association, 86, 899–909.
Naylor, J. C., & Smith, A. F. M. (1982). Applications of a method for the efficient computation of posterior distributions. Applied Statistics, 31, 214–225.
Ranger, J., & Kuhn, J.-T. (2012). A flexible latent trait model for response times in tests. Psychometrika, 77, 31–47.
Ranger, J., Kuhn, J.-T., & Gaviria, J.-L. (2015). A race model for responses and response times in tests. Psychometrika, 80, 791–810.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.
Roskam, E. E. (1997). Models for speed and time-limit tests. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 187–208). New York: Springer.
Rouder, J. N., Sun, D., Speckman, P. L., Lu, J., & Zhou, D. (2003). A hierarchical Bayesian statistical framework for response time distributions. Psychometrika, 68, 589–606.
Spearman, C. (1927). The abilities of man. London: Macmillan.
Thurstone, L. L. (1919). A scoring method for mental tests. Psychological Bulletin, 16, 235–240.
Thurstone, L. L. (1937). Ability, motivation, and speed. Psychometrika, 2, 249–254.
Tuerlinckx, F., & De Boeck, P. (2005). Two interpretations of the discrimination parameter. Psychometrika, 70, 629–650.
Tuerlinckx, F., Molenaar, D., & van der Maas, H. L. J. (2016). Diffusion-based item response modeling. In W. J. van der Linden (Ed.), Handbook of item response theory (pp. 283–302). Boca Raton, FL: Chapman & Hall/CRC Press.
van der Linden, W. J. (2007). A hierarchical framework for modeling speed and accuracy on test items. Psychometrika, 72, 287–308.
van der Linden, W. J. (2008). Using response times for item selection in adaptive testing. Journal of Educational and Behavioral Statistics, 33, 5–20.
van der Linden, W. J. (2009). Conceptual issues in response-time modeling. Journal of Educational Measurement, 46, 247–272.
van der Maas, H. L. J., & Wagenmakers, E.-J. (2005). A psychometric analysis of chess expertise. American Journal of Psychology, 118, 29–60.
van Rijn, P. W., & Ali, U. S. (2017). A comparison of item response models for accuracy and speed of item responses with applications to adaptive testing. British Journal of Mathematical and Statistical Psychology, 70, 317–345.
van Rijn, P. W., & Ali, U. S. (in press). SARM: A computer program for estimating speed-accuracy response models (ETS Research Report). Princeton, NJ: Educational Testing Service.
van Rijn, P. W., & Rijmen, F. (2015). On the explaining-away phenomenon in multivariate latent variable models. British Journal of Mathematical and Statistical Psychology, 68, 1–22.
Yuan, K.-H., Cheng, Y., & Patton, J. (2014). Information matrices and standard errors for MLEs of item parameters in IRT. Psychometrika, 79, 232–254.