Optimal and Most Exact Confidence Intervals for Person Parameters in Item Response Theory Models

Published online by Cambridge University Press:  01 January 2025

Anna Doebler*
Affiliation:
Fachbereich Psychologie und Sportwissenschaft (FB 7), Institut für Psychologie, Westfälische Wilhelms-Universität
Philipp Doebler
Affiliation:
Fachbereich Psychologie und Sportwissenschaft (FB 7), Institut für Psychologie, Westfälische Wilhelms-Universität
Heinz Holling
Affiliation:
Fachbereich Psychologie und Sportwissenschaft (FB 7), Institut für Psychologie, Westfälische Wilhelms-Universität
* Requests for reprints should be sent to Anna Doebler, Fachbereich Psychologie und Sportwissenschaft (FB 7), Institut für Psychologie, Westfälische Wilhelms-Universität, Fliednerstr. 21, 48149 Münster, Germany. E-mail: [email protected]

Abstract

Confidence intervals for person parameters in item response theory models are commonly calculated by assuming that the standardized maximum likelihood estimator of the person parameter θ is normally distributed. However, this approximation is often inadequate for short and medium test lengths. As a result, the coverage probabilities fall below the nominal confidence level in many cases, and the corresponding intervals are then no longer confidence intervals in the strict sense of the definition. In the present work, confidence intervals are defined more precisely by utilizing the relationship between confidence intervals and hypothesis testing. Two approaches to confidence interval construction are explored that are optimal with respect to criteria of smallness and consistency with the standard approach.
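The coverage problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a Rasch model with known item difficulties, builds the standard normal-approximation (Wald) interval from the maximum likelihood estimate and the test information, and computes the exact coverage probability at a given true θ by enumerating the sum-score distribution. All function names are hypothetical. Extreme scores (0 or all items correct) have no finite maximum likelihood estimate; as an assumption made here, they are assigned the whole real line, which can only raise coverage.

```python
import math
from statistics import NormalDist


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def mle_theta(score, difficulties, lo=-30.0, hi=30.0, tol=1e-10):
    """ML estimate of theta for an interior sum score, found by bisection."""
    # The expected sum score is strictly increasing in theta,
    # so the likelihood equation has a unique root for 0 < score < n.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        expected = sum(rasch_prob(mid, b) for b in difficulties)
        if expected < score:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def wald_interval(score, difficulties, level=0.95):
    """Normal-approximation interval: theta_hat +/- z / sqrt(information)."""
    n = len(difficulties)
    if score == 0 or score == n:
        # Assumption of this sketch: no finite MLE, so return the whole line.
        return (-math.inf, math.inf)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    theta_hat = mle_theta(score, difficulties)
    info = sum(
        rasch_prob(theta_hat, b) * (1.0 - rasch_prob(theta_hat, b))
        for b in difficulties
    )
    se = 1.0 / math.sqrt(info)
    return (theta_hat - z * se, theta_hat + z * se)


def score_distribution(theta, difficulties):
    """Exact distribution of the sum score, built by adding one item at a time."""
    dist = [1.0]
    for b in difficulties:
        p = rasch_prob(theta, b)
        new = [0.0] * (len(dist) + 1)
        for r, mass in enumerate(dist):
            new[r] += mass * (1.0 - p)
            new[r + 1] += mass * p
        dist = new
    return dist


def coverage(theta, difficulties, level=0.95):
    """Exact probability that the Wald interval contains the true theta."""
    total = 0.0
    for r, mass in enumerate(score_distribution(theta, difficulties)):
        lower, upper = wald_interval(r, difficulties, level)
        if lower <= theta <= upper:
            total += mass
    return total


if __name__ == "__main__":
    items = [0.0] * 10  # hypothetical short test: ten items of difficulty 0
    for theta in (0.0, 1.0, 2.21):
        print(theta, round(coverage(theta, items), 4))
```

Because the sum score is discrete, the coverage computed this way jumps as θ moves past interval endpoints. Evaluating `coverage` on a fine grid of θ values and comparing the result with the nominal level shows where a short test falls short of it.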

Type
Original Paper
Copyright
Copyright © The Psychometric Society

Footnotes

This work was supported by a Grant of the Studienstiftung des Deutschen Volkes.
