Some Neglected Problems in IRT

Gerhard H. Fischer

doi:10.1007/BF02294324

Published online by Cambridge University Press: 01 January 2025

Affiliation: University of Vienna
Requests for reprints should be sent to Gerhard H. Fischer, Institut für Psychologie, Universität Wien, Liebiggasse 5, A-1010 Wien, AUSTRIA.

Abstract

The paper addresses three neglected questions in IRT. Section 1 discusses the properties of the “measurement” of ability or trait parameters and item difficulty parameters in the Rasch model. It is shown that the solution to this problem is rather complex and depends both on general assumptions about properties of the item response functions and on assumptions about the available item universe. Section 2 deals with the measurement of individual change or “modifiability” based on a Rasch test. A conditional likelihood approach is presented that (a) yields an ML estimator of modifiability for given item parameters, (b) allows one to test hypotheses about change by means of a Clopper-Pearson confidence interval for the modifiability parameter, and (c) allows modifiability to be estimated jointly with the item parameters. Uniqueness results for all three methods are also presented. In section 3, the Mantel-Haenszel method for detecting DIF is discussed from a novel perspective: What is the most general framework within which the Mantel-Haenszel method correctly detects DIF of a studied item? The answer is a 2PL model in which, however, all discrimination parameters are known and the studied item has the same discrimination in both populations. Since these requirements would hardly be satisfied in practical applications, the case of constant discrimination parameters, that is, the Rasch model, is the only realistic framework. A simple Pearson χ² test for DIF of one studied item is proposed as an alternative to the Mantel-Haenszel test; moreover, this test is generalized to the case of two items simultaneously studied for DIF.
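
For orientation, the standard Mantel-Haenszel procedure discussed in section 3 might be sketched as follows. This is only a minimal illustration of the textbook statistic, not the Pearson χ² test proposed in the paper. The function name `mantel_haenszel`, the tuple layout of each 2×2 table, and the stratification of examinees by raw score are assumptions made for the example.

```python
from typing import Iterable, Tuple

# One 2x2 table per stratum (here assumed to be a raw-score group):
# (a, b, c, d) = (reference correct, reference incorrect,
#                 focal correct,     focal incorrect)
Table = Tuple[int, int, int, int]


def mantel_haenszel(tables: Iterable[Table]) -> Tuple[float, float]:
    """Return (common odds ratio estimate, continuity-corrected MH chi-square)."""
    num = den = 0.0                  # numerator / denominator of the odds-ratio estimator
    sum_a = sum_e = sum_v = 0.0      # observed, expected, and variance sums for cell a
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue                 # a stratum this small carries no information
        num += a * d / n
        den += b * c / n
        n_ref, n_foc = a + b, c + d  # group sizes within the stratum
        m1, m0 = a + c, b + d        # numbers of correct / incorrect responses
        sum_a += a
        sum_e += n_ref * m1 / n      # hypergeometric expectation of a
        sum_v += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))
    if den == 0 or sum_v == 0:
        raise ValueError("tables do not permit a Mantel-Haenszel estimate")
    odds_ratio = num / den
    chi2 = (abs(sum_a - sum_e) - 0.5) ** 2 / sum_v
    return odds_ratio, chi2


if __name__ == "__main__":
    # Hypothetical counts for two score groups, for illustration only.
    print(mantel_haenszel([(10, 10, 10, 10), (20, 10, 20, 10)]))
```

An estimated common odds ratio near 1 indicates no DIF for the studied item under this procedure; the chi-square value is referred to a χ² distribution with one degree of freedom.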

Keywords

measurement; IRT; Rasch model; measurement of change; DIF; Mantel-Haenszel statistic

Type: Original Paper
Information: Psychometrika, Volume 60, Issue 4, December 1995, pp. 459–487

DOI: https://doi.org/10.1007/BF02294324
Copyright: © 1995 The Psychometric Society

Footnotes

Presidential Address delivered at the 30th Annual Meeting of the Psychometric Society, 16–18 June 1995 in Minneapolis.

This research was supported in part by the Fonds zur Förderung der Wissenschaftlichen Forschung under Grant No. P10118-HIS.
