Book contents
- Frontmatter
- Contents
- Series Editor's Preface
- Acknowledgements
- Abbreviations
- Part I Basic concepts and statistics
- Part II Statistics for test analysis and improvement
- 4 Analyzing test tasks
- 5 Investigating reliability for norm-referenced tests
- 6 Investigating reliability for criterion-referenced tests
- Part III Statistics for test use
- Bibliography
- Appendix: Statistical tables
- Index
6 - Investigating reliability for criterion-referenced tests
Published online by Cambridge University Press: 05 May 2010
Summary
Introduction
The approaches to estimating reliability discussed in Chapter 5 have been developed largely for use with norm-referenced (NR) tests and are generally not appropriate for use with criterion-referenced (CR) tests for several reasons. Unlike NR tests, whose scores are intended for making relative decisions, CR tests are intended for situations in which the decisions to be made are absolute (see Chapter 1 for a discussion of CR and NR tests and relative and absolute decisions). NR scores are thus interpreted with reference to the relative standing of individuals within a particular group, whereas scores from CR tests are interpreted with reference to an absolute criterion that is independent of the groups or individuals taking the test. This criterion may be defined in terms of levels of ability (e.g. novice – expert, beginning – advanced) or in terms of a well-specified domain of content, such as the content of a course syllabus. As discussed in Chapter 5, NR reliability estimates are based on the variance component for relative error, and provide estimates of how consistently test scores distinguish among different test takers. NR reliability estimates thus do not provide direct information about how dependable the test scores are as indicators of test takers’ levels of ability with respect to criterion ability levels, or of their degrees of mastery of the criterion content/ability domain.
With CR tests, we typically specify one or more criterion levels of performance or ability that represent degrees of mastery and that can be used to classify test takers into groups.
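The contrast between absolute and relative interpretation can be made concrete with a small sketch. The cut scores, level labels, group scores, and function names below are all hypothetical, chosen only for illustration; they do not come from the chapter. The sketch classifies a score against fixed criterion levels (a CR interpretation) and, for comparison, computes the same score's percentile rank in two different groups (an NR interpretation).

```python
from bisect import bisect_right

# Hypothetical cut scores and level labels, for illustration only.
# Each cut score is the minimum score needed to reach the next level.
CUT_SCORES = [60, 80]
LEVELS = ["non-master", "partial master", "master"]


def classify(score, cut_scores=CUT_SCORES, levels=LEVELS):
    """CR interpretation: compare a score with fixed criterion levels."""
    return levels[bisect_right(cut_scores, score)]


def percentile_rank(score, group):
    """NR interpretation: percentage of the group scoring below this score."""
    below = sum(1 for s in group if s < score)
    return 100 * below / len(group)


# Two hypothetical groups of test takers that both include a score of 70.
group_a = [45, 52, 58, 63, 70]
group_b = [70, 78, 85, 91, 96]

score = 70
print(classify(score))                   # does not depend on the group
print(percentile_rank(score, group_a))   # relative standing in group A
print(percentile_rank(score, group_b))   # relative standing in group B
```

Under these assumptions the score of 70 receives the same classification whichever group it is placed in, while its relative standing shifts from near the top of one group to the bottom of the other.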
Type: Chapter
Information: Statistical Analyses for Language Assessment, pp. 192-206
Publisher: Cambridge University Press
Print publication year: 2004