Book contents
- Frontmatter
- Contents
- Series Editor's Preface
- Acknowledgements
- Abbreviations
- Part I Basic concepts and statistics
- Part II Statistics for test analysis and improvement
- 4 Analyzing test tasks
- 5 Investigating reliability for norm-referenced tests
- 6 Investigating reliability for criterion-referenced tests
- Part III Statistics for test use
- Bibliography
- Appendix: Statistical tables
- Index
5 - Investigating reliability for norm-referenced tests
Published online by Cambridge University Press: 05 May 2010
Summary
Introduction
When we give a language test, we want to be sure that the scores we obtain are consistent measures of the ability we want to assess. If the scores from a given test were perfectly consistent, test takers would obtain the same score under different conditions in the testing procedure. That is, their scores would be the same irrespective of, for example, when they took the test, which particular set of test tasks they took, or how their responses were scored. The fundamental definition of reliability, then, is consistency of measures across different conditions in the measurement procedure.
Test scores can be affected by a variety of factors in addition to the ability we want to assess, such as differences in administrative procedures, changes in test takers over time, differences among forms of the test, and differences among raters. These variations in the testing conditions are sources of inconsistency, or measurement error. Looking at reliability from this perspective, we can say that scores are reliable to the extent that they are free from measurement error. Or, to put it another way, test scores that reflect mostly the ability we want to measure, rather than measurement error, will be relatively reliable. The most important step we can take to minimize measurement error is to identify and try to control the sources of measurement error in the way we design and develop our test tasks. In this regard, Bachman & Palmer (1996, Chapter 7) list a number of questions that test developers can ask during the test development process to help ensure that test tasks will yield consistent scores.
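The idea that scores are reliable to the extent that they are free from measurement error can be illustrated with a small simulation. The sketch below is not taken from the chapter. It assumes a classical true-score setup in which each observed score is a true score plus random error, and it uses the correlation between two simulated parallel forms as a rough index of consistency. The function names, the mean of 50, and the standard deviations are all hypothetical values chosen for illustration.

```python
import random
import statistics


def simulate_parallel_forms(n_takers, true_sd, error_sd, seed=0):
    """Simulate scores on two forms of a test for the same test takers.

    Assumption (not from the source): observed = true score + random error,
    with independent, normally distributed errors on each form.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n_takers)]
    form_a = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    form_b = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return form_a, form_b


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Same spread of ability in both runs; only the amount of error differs.
low_error = pearson_r(*simulate_parallel_forms(2000, true_sd=10.0, error_sd=2.0))
high_error = pearson_r(*simulate_parallel_forms(2000, true_sd=10.0, error_sd=10.0))

print(f"small measurement error: r = {low_error:.2f}")
print(f"large measurement error: r = {high_error:.2f}")
```

Under these assumptions, the run with less measurement error should show a noticeably higher correlation between the two forms, which is one way of seeing why controlling sources of error in test design tends to yield more consistent scores.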
- Type: Chapter
- Information: Statistical Analyses for Language Assessment, pp. 153-191
- Publisher: Cambridge University Press
- Print publication year: 2004