8 - Evaluating Test and Survey Items for Bias Across Languages and Cultures

Published online by Cambridge University Press:  05 June 2012

David Matsumoto
Affiliation:
San Francisco State University
Fons J. R. van de Vijver
Affiliation:
Universiteit van Tilburg, The Netherlands

Summary

Introduction

The world is growing smaller at a rapid rate in this 21st century, and it is little wonder that interest and activity in cross-cultural research are at their peak. Examples of cross-cultural research activities include international comparisons of educational achievement, the exploration of personality constructs across cultures, and investigations of employees’ opinions, attitudes, and skills by multinational companies. In many, if not all, of these instances, the research involves measuring psychological attributes across people who have very different cultural backgrounds and often function using different languages. This cultural and linguistic diversity poses significant challenges for researchers who strive for standardization of measures across research participants. In fact, the backbone of scientific research in psychology – standardization of measures – may lead to significant biases in the interpretation of results if the measuring instruments do not take linguistic and cultural differences into account.

The International Test Commission (ITC) has long pointed out problems in measuring educational and psychological constructs across languages and cultures. Such problems are also well documented in the Standards for Educational and Psychological Testing (American Educational Research Association [AERA], American Psychological Association, & National Council on Measurement in Education, 1999). For example, the Guidelines for Adapting Educational and Psychological Tests (Hambleton, 2005; ITC, 2001) provide numerous guidelines for checking the quality of measurement instruments when they are adapted for use across languages. These guidelines include careful evaluation of the translation process and statistical analysis of test and item response data to evaluate test and item comparability. Many of these guidelines are echoed by the aforementioned Standards. Table 8.1 presents some brief excerpts from the Guidelines and Standards that pertain to maximizing measurement equivalence across languages and cultures while ruling out issues of measurement bias. As can be seen from Table 8.1, both qualitative and quantitative procedures are recommended to comprehensively evaluate test comparability across languages. The qualitative procedures involve use of careful translation and adaptation designs and comprehensive evaluation of the different language versions of a test. Quantitative procedures include the use of dimensionality analyses to evaluate construct equivalence, differential predictive validity to evaluate the consistency of test-criterion relationships across test versions, and differential item functioning procedures to evaluate potential item bias.
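As an illustration of the last of these quantitative procedures, the following is a minimal sketch of one widely used differential item functioning statistic, the Mantel–Haenszel common odds ratio, for a single dichotomously scored item. It is not taken from the Guidelines or the Standards. The function names, the group labels "reference" and "focal", and the toy data are assumptions made for this example. Examinees are matched on a total-score stratum, the odds of a correct response are compared across groups within each stratum, and the pooled ratio is rescaled to the conventional MH D-DIF metric (−2.35 times the natural log of the odds ratio).

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(records):
    """Return (alpha_mh, mh_d_dif) for one dichotomous item.

    records: iterable of (group, matching_score, correct), where group is
    "reference" or "focal" (hypothetical labels), matching_score is the
    stratum used for matching, and correct is True or False.
    """
    # Per-stratum 2x2 counts: [ref right, ref wrong, focal right, focal wrong]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        if group not in ("reference", "focal"):
            raise ValueError(f"unknown group: {group!r}")
        column = (0 if group == "reference" else 2) + (0 if correct else 1)
        strata[score][column] += 1

    numerator = denominator = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        numerator += a * d / n    # reference right x focal wrong
        denominator += b * c / n  # reference wrong x focal right

    if numerator == 0 or denominator == 0:
        raise ValueError("odds ratio undefined for these counts")

    alpha_mh = numerator / denominator
    mh_d_dif = -2.35 * math.log(alpha_mh)
    return alpha_mh, mh_d_dif


def make_records(group, score, n_right, n_wrong):
    """Build toy response records for one group within one score stratum."""
    return [(group, score, True)] * n_right + [(group, score, False)] * n_wrong


# Toy data (invented): one stratum in which the reference group does better.
toy = make_records("reference", 1, 3, 1) + make_records("focal", 1, 1, 3)
alpha, delta = mantel_haenszel_dif(toy)
print(alpha, delta)
```

An odds ratio above 1 (a negative MH D-DIF) indicates that, among matched examinees, the item is easier for the reference group. A ratio near 1 gives no indication of DIF. In practice the statistic would be accompanied by a significance test and an effect-size classification, which this sketch omits.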

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2010

References

Allalouf, A., Hambleton, R. K., & Sireci, S. G. (1999). Identifying the sources of differential item functioning in translated verbal items. Journal of Educational Measurement, 36, 185.
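American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing.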
Angoff, W. H. (1972). Use of difficulty and discrimination indices for detecting item bias. In R. A. Berk (Ed.), Handbook of methods for detecting test bias, 96. Baltimore: Johns Hopkins University Press.
Angoff, W. H., & Cook, L. L. (1988). (Report No. 88-2). New York: College Entrance Examination Board.
Angoff, W. H., & Modu, C. C. (1973). Equating the scores of the Prueba de Aptitud Academica and the Scholastic Aptitude Test. New York: College Entrance Examination Board.
Brislin, R. W. (1970). Back-translation for cross-cultural research. Journal of Cross-Cultural Psychology, 1, 185.
Budgell, G., Raju, N., & Quartetti, D. (1995). Analysis of differential item functioning in translated assessment instruments. Applied Psychological Measurement, 19, 309.
Camilli, G., & Shepard, L. A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.
Clauser, B. E., & Mazor, K. M. (1998). Using statistical procedures to identify differentially functioning test items. Educational Measurement: Issues and Practice, 17, 31.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281.
Day, S. X., & Rounds, J. (1998). Universality of vocational interest structure among racial and ethnic minorities. American Psychologist, 53, 728.
Dorans, N. J., & Holland, P. W. (1993). DIF detection and description: Mantel–Haenszel and standardization. In P. W. Holland & H. Wainer (Eds.), Differential item functioning, 35. Hillsdale, NJ: Erlbaum.
Dorans, N. J., & Kulick, E. (1986). Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. Journal of Educational Measurement, 23, 355.
Fischer, G. (1993). Notes on the Mantel–Haenszel procedure and another chi-squared test for the assessment of DIF. Methodika, 7, 88.
Geisinger, K. F. (1994). Cross-cultural normative assessment: Translation and adaptation issues influencing the normative interpretation of assessment instruments. Psychological Assessment, 6, 304.
Gierl, M. J., & Khaliq, S. N. (2001). Identifying sources of differential item and bundle functioning on translated achievement tests: A confirmatory analysis. Journal of Educational Measurement, 38, 164.
Hambleton, R. K. (1993). Translating achievement tests for use in cross-national studies. European Journal of Psychological Assessment, 9, 57.
Hambleton, R. K. (1994). Guidelines for adapting educational and psychological tests: A progress report. European Journal of Psychological Assessment, 10, 229.
Hambleton, R. K. (2005). Issues, designs, and technical guidelines for adapting tests into multiple languages and cultures. In R. K. Hambleton, P. Merenda, & C. Spielberger (Eds.), Adapting educational and psychological tests for cross-cultural assessment, 3. Hillsdale, NJ: Erlbaum.
Hambleton, R. K., Sireci, S. G., & Robin, F. (1999). Adapting credentialing exams for use in multiple languages. CLEAR Exam Review, 10, 24.
Hauger, J. B., & Sireci, S. G. (2008). Detecting differential item functioning across examinees tested in their dominant language and examinees tested in a second language. International Journal of Testing, 8, 237.
Holland, P. W., & Thayer, D. T. (1988). Differential item functioning and the Mantel–Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity, 129. Hillsdale, NJ: Erlbaum.
Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.
International Test Commission. (2001). International Test Commission guidelines for test adaptation. London: Author.
Jodoin, M. G., & Gierl, M. J. (2001). Evaluating power and Type I error rates using an effect size with the logistic regression procedure for DIF. Applied Measurement in Education, 14, 329.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 19.
Millsap, R. E., & Everson, H. T. (1993). Methodology review: Statistical approaches for assessing measurement bias. Applied Psychological Measurement, 17, 297.
Muniz, J., Hambleton, R. K., & Xing, D. (2001). Small sample studies to detect flaws in test translation. International Journal of Testing, 1, 115.
Penfield, R. D. (2005). DIFAS: Differential item functioning analysis system. Applied Psychological Measurement, 29, 150.
Potenza, M. T., & Dorans, N. J. (1995). DIF assessment for polytomously scored items: A framework for classification and evaluation. Applied Psychological Measurement, 19, 23.
Raju, N. S. (1988). The area between two item characteristic curves. Psychometrika, 53, 495.
Raju, N. S. (1990). Determining the significance of estimated signed and unsigned areas between two item response functions. Applied Psychological Measurement, 14, 197.
Reise, S. P., Widaman, K. F., & Pugh, R. H. (1993). Confirmatory factor analysis and item response theory: Two approaches for exploring measurement invariance. Psychological Bulletin, 114, 552.
Robin, F. (1999). SDDIF: Standardization and delta DIF analyses. Amherst: University of Massachusetts, Laboratory of Psychometric and Evaluative Research.
Robin, F., Sireci, S. G., & Hambleton, R. K. (2003). Evaluating the equivalence of different language versions of a credentialing exam. International Journal of Testing, 3, 1.
Rogers, H. J., & Swaminathan, H. (1993). A comparison of logistic regression and Mantel-Haenszel procedures for detecting differential item functioning. Applied Psychological Measurement, 17, 105.
Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159.
Sireci, S. G. (1997). Problems and issues in linking tests across languages. Educational Measurement: Issues and Practice, 16, 12.
Sireci, S. G. (2005). Using bilinguals to evaluate the comparability of different language versions of a test. In R. K. Hambleton, P. Merenda, & C. Spielberger (Eds.), Adapting educational and psychological tests for cross-cultural assessment, 117. Hillsdale, NJ: Erlbaum.
Sireci, S. G., Bastari, B., & Allalouf, A. (1998). Evaluating construct equivalence across adapted tests. San Francisco, CA.
Sireci, S. G., & Berberoglu, G. (2000). Evaluating translation DIF using bilinguals. Applied Measurement in Education, 13, 229.
Sireci, S. G., Fitzgerald, C., & Xing, D. (1998). Adapting credentialing examinations for international uses (Laboratory of Psychometric and Evaluative Research Report No. 329). Amherst, MA: University of Massachusetts, School of Education.
Sireci, S. G., Harter, J., Yang, Y., & Bhola, D. (2003). Evaluating the equivalence of an employee attitude survey across languages, cultures, and administration formats. International Journal of Testing, 3, 129.
Sireci, S. G., Patsula, L., & Hambleton, R. K. (2005). Statistical methods for identifying flawed items in the test adaptations process. In R. K. Hambleton, P. Merenda, & C. Spielberger (Eds.), Adapting educational and psychological tests for cross-cultural assessment, 93. Hillsdale, NJ: Erlbaum.
Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361.
Thissen, D. (2001). http://www.unc.edu/~dthissen/dl.html
Thissen, D., Steinberg, L., & Wainer, H. (1988). Use of item response theory in the study of group differences in trace lines. In H. Wainer & H. I. Braun (Eds.), Test validity, 147. Hillsdale, NJ: Erlbaum.
Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.), Differential item functioning, 67. Mahwah, NJ: Erlbaum.
Van de Vijver, F. J. R., & Poortinga, Y. H. (1997). Towards an integrated analysis of bias in cross-cultural assessment. European Journal of Psychological Assessment, 13, 29.
Van de Vijver, F. J. R., & Poortinga, Y. H. (2005). Conceptual and methodological issues in adapting tests. In R. K. Hambleton, P. Merenda, & C. Spielberger (Eds.), Adapting educational and psychological tests for cross-cultural assessment, 39. Hillsdale, NJ: Erlbaum.
Van de Vijver, F., & Tanzer, N. K. (1997). Bias and equivalence in cross-cultural assessment. European Review of Applied Psychology, 47, 263.
Wainer, H., & Sireci, S. G. (2005). Item and test bias. In Encyclopedia of social measurement, 365. San Diego, CA: Elsevier.
Wainer, H., Sireci, S. G., & Thissen, D. (1991). Differential testlet functioning: Definitions and detection. Journal of Educational Measurement, 28, 197.
Waller, N. G. (1998). EZDIF: The detection of uniform and non-uniform differential item functioning with the Mantel–Haenszel and logistic regression procedures. Applied Psychological Measurement, 22, 391.
Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, Canada: Directorate of Human Resources Research and Evaluation, Department of National Defense.
Zwick, R., Donoghue, J. R., & Grima, A. (1993). Assessment of differential item functioning for performance tasks. Journal of Educational Measurement, 30, 233.
