
2 - The Demand for Cognitive Diagnostic Assessment

Published online by Cambridge University Press: 23 November 2009

Kristen Huff, Senior Director, K-12 Research & Psychometrics, The College Board, New York
Dean P. Goodman, Assessment Consultant
Jacqueline Leighton, University of Alberta
Mark Gierl, University of Alberta

Summary

In this chapter, we explore the nature of the demand for cognitive diagnostic assessment (CDA) in K–12 education and suggest that the demand originates from two sources: assessment developers who are arguing for radical shifts in the way assessments are designed, and the intended users of large-scale assessments who want more instructionally relevant results from these assessments. We first highlight various themes from the literature on CDA that illustrate the demand for CDA among assessment developers. We then outline current demands for diagnostic information from educators in the United States by reviewing results from a recent national survey we conducted on this topic. Finally, we discuss some ways that assessment developers have responded to these demands and outline some issues that, based on the demands discussed here, warrant further attention.

THE DEMAND FOR COGNITIVE DIAGNOSTIC ASSESSMENT FROM ASSESSMENT DEVELOPERS

To provide the context for assessment developers' call for a revision of contemporary assessment practices that, on the whole, do not operate within a cognitive framework, we offer a perspective on existing CDA literature, and we outline the differences between psychometric and cognitive approaches to assessment design. The phrases working within a cognitive framework, cognitively principled assessment design, and cognitive diagnostic assessment are used interchangeably throughout this chapter. They can be generally defined as the joint practice of using cognitive models of learning as the basis for principled assessment design and reporting assessment results with direct regard to informing learning and instruction.

Type: Chapter
In: Cognitive Diagnostic Assessment for Education: Theory and Applications, pp. 19–60
Publisher: Cambridge University Press
Print publication year: 2007


