
Mixed Methods Research in Language Testing and Assessment

Published online by Cambridge University Press: 22 December 2014

Abstract

As an alternative paradigm, mixed methods research (MMR) endorses pluralism in order to understand the complex nature of the social world from multiple perspectives and through multiple methodological lenses, each of which offers partial, yet valuable, insights. This methodological mixing is not limited to the mixing of methods, but extends to the entire inquiry process. Researchers in language testing and assessment (LTA) are increasingly turning to MMR to understand the complexities of language acquisition and interaction among various language users, and also to expand opportunities to investigate validity claims beyond the three traditional facets of construct, content, and criterion validity. We use current conceptualizations of validity as a guiding framework to review 32 empirical MMR studies published in LTA since 2007. Our systematic review encompassed multiple foci, including the rationale for the use of MMR, evidence of collaboration, and synergetic effects. The analyses revealed several key trends: (a) triangulation and complementarity were the prevalent uses of MMR in LTA; (b) the studies took place predominantly in higher education learning contexts with adult immigrant or university populations; (c) writing assessment was the most frequent focus of the studies, compared with other language modalities; (d) many of the studies explicitly addressed facets of validity, and others had significant implications for expanding notions of validity in LTA; (e) the majority of the studies avoided mixing at the data analysis stage by distinguishing data types and reporting results separately; and (f) integration occurred primarily at the discussion stage. We contend that LTA should embrace MMR through creative designs and integrative analytic strategies to gain new insights into the complexities and contexts of language testing and assessment.

Type
Research Article
Copyright
Copyright © Cambridge University Press 2014 


References

ANNOTATED BIBLIOGRAPHY

Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. New York, NY: Oxford University Press.

This volume thoroughly engages the reader in each major phase of research by addressing key issues, data collection, data analysis, and reporting. Each of these sections contains discussions specific to qualitative, quantitative, and mixed methods research, providing the reader with integral information about MMR as well as knowledge for comparing and contrasting the benefits and challenges of each method in specific contexts. The volume also includes a historical overview of the research paradigms and a summary of their strengths and weaknesses, along with a summary of Dörnyei's own paradigmatic stance, which culminates in his call to use MMR.

Morgan, D. L. (2007). Paradigms lost and pragmatism regained. Journal of Mixed Methods Research, 1, 48–76.

Morgan introduces this article by asking a pointed question: “To what extent is combining qualitative and quantitative methods simply about how we use methods, as opposed to raising basic issues about the nature of research methodology in the social sciences?” (p. 48). In response to this question, this thought-provoking article takes the reader through a historical review of the development of research methodology in the social sciences, using the concept of paradigms as a conceptual framework. Morgan presents four conceptualizations of paradigms: as worldviews, as epistemological stances, as shared beliefs in a research field, and as model examples. Through this discussion of paradigms, Morgan opens up a discussion of the methodological issues that arise when researchers engage in MMR. Morgan's central thesis is that a “metaphysical” paradigm in the social sciences has been exhausted and should be replaced by a “pragmatic” paradigmatic approach (p. 55).

Tashakkori, A., & Teddlie, C. (Eds.). (2010). Handbook of mixed methods in social and behavioral research (2nd ed.). Thousand Oaks, CA: Sage.

This handbook, weighing in at close to 900 pages, is one of the most comprehensive resources for understanding and applying MMR in any field including LTA. The handbook is divided into three sections: (a) conceptual issues (philosophical, theoretical, sociopolitical); (b) issues regarding methods and methodology; and (c) contemporary applications of MMR. While not every chapter has a direct connection to education or assessment, there is much to be learned from the applications of MMR across these fields. The authors address emergent issues and present examples that provide learning opportunities about the different stances and purposes of MMR.

REFERENCES

Abbuhl, R., & Mackey, A. (2008). Second language acquisition research methods. In King, K. A. & Hornberger, N. H. (Eds.), Encyclopedia of language and education: Research methods in language and education (Vol. 10, pp. 113). Dordrecht, The Netherlands: Springer.Google Scholar
Anthony, J. J. (2009). Classroom computer experiences that stick: Two lenses on reflective timed essays. Assessing Writing, 14, 194205.Google Scholar
Bachman, L. F. (2005). Building and supporting a case for test use. Language Assessment Quarterly, 2, 134.CrossRefGoogle Scholar
Baker, B. A. (2010). Playing with the stakes: A consideration of an aspect of the social context of a gatekeeping writing assessment. Assessing Writing, 15, 133153.Google Scholar
Baker, B. A. (2012). Individual differences in rater decision-making style: An exploratory mixed-methods study. Language Assessment Quarterly, 9, 225248.Google Scholar
Barkaoui, K. (2010). Do ESL essay raters’ evaluation criteria change with experience? A mixed-methods, cross-sectional study. TESOL Quarterly, 44, 3157.Google Scholar
Barkaoui, K. (2011). Effects of marking method and rater experience on ESL essay scores and rater performance. Assessment in Education: Principles, Policy, & Practice, 18, 279293.Google Scholar
Bryman, A. (2007). Barriers to integrating quantitative and qualitative research. Journal of Mixed Methods Research, 1, 822.Google Scholar
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81105.CrossRefGoogle ScholarPubMed
Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi- experimental designs for research. Skokie, IL: Rand McNally.Google Scholar
Caracelli, V. J., & Greene, J. C. (1993). Data analysis strategies for mixed-method evaluation designs. Educational Evaluation and Policy Analysis, 15, 195207.CrossRefGoogle Scholar
Colby-Kelly, C., & Turner, C. E. (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. The Canadian Modern Language Review/La Revue Canadienne des Langues Vivantes, 64, 937.CrossRefGoogle Scholar
Cook, T. D. (1985). Postpositivist critical multiplism. In Shotland, R. L. & Mark, M. M. (Eds.), Social science and social policy (pp. 129–146). Newbury Park, CA: Sage.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Skokie, IL: Rand McNally.Google Scholar
Creswell, J. W. (2007). Qualitative inquiry and research design: Choosing among five approaches (2nd ed.). Thousand Oaks, CA: Sage.Google Scholar
Creswell, J. W., & Plano Clark, V. L. (2007). Designing and conducting mixed methods research. Thousand Oaks, CA: Sage.Google Scholar
Cronbach, L. J. (1988). Five perspectives on validity argument. In Wainer, H. & Braun, H. (Eds.), Test validity (pp. 317). Hillsdale, NJ: Erlbaum.Google Scholar
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281302.CrossRefGoogle ScholarPubMed
Datta, L. (1997). Multimethod evaluations: Using case studies together with other methods. In Chelimsky, E. & Shadish, W. R. (Eds.), Evaluation for the 21st century: A handbook (pp. 344359). Thousand Oaks, CA: Sage.Google Scholar
Davies, A. (Ed.). (1997). Ethics in language testing [special issue]. Language Testing, 14.Google Scholar
Dellinger, A. B., & Leech, N. L. (2007). Toward a unified validation framework in mixed methods research. Journal of Mixed Methods Research, 1, 309332.CrossRefGoogle Scholar
Denzin, N. K. (1978). The research act: A theoretical introduction to sociological methods. New York, NY: Praeger.Google Scholar
Derwing, T. M., Munro, M. J., & Thomson, R. I. (2008). A longitudinal study of ESL learners’ fluency and comprehensibility development. Applied Linguistics, 29, 359380.Google Scholar
DiPardo, A., Storms, B. A., & Selland, M. (2011). Seeing voices: Assessing writerly stance in the NWP analytic writing continuum. Assessing Writing, 16, 170188.Google Scholar
Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. New York, NY: Oxford University Press.Google Scholar
Fulcher, G. (1996). Does thick description lead to smart tests? A data-based approach to rating scale construction. Language Testing, 13, 208238.CrossRefGoogle Scholar
Gage, N. (1989). The paradigm wars and their aftermath: A “historical” sketch of research on teaching since 1989. Educational Researcher, 18, 410.Google Scholar
Ghanbari, B., Barati, H., & Moinzadeh, A. (2012). Problematizing rating scales in EFL academic writing assessment: Voices from Iranian context. English Language Teaching, 5, 7690.CrossRefGoogle Scholar
Greene, J. C. (2007). Mixed methods in social inquiry. San Francisco, CA: Jossey-Bass.Google Scholar
Greene, J. C. (2008). Is mixed methods social inquiry a distinctive methodology? Journal of Mixed Methods Research, 2, 722.Google Scholar
Greene, J. C. (2011). The construct(ion) of validity as argument. In Chen, H. T., Donaldson, S. I., & Mark, M. M. (Eds.), Advancing validity in outcome evaluation: Theory and practice, New Directions for Evaluation (pp. 8192). San Francisco, CA: Jossey-Bass.Google Scholar
Greene, J. C., & Caracelli, V. J. (1997). Defining and describing the paradigm issues in mixed-method evaluation. In Greene, J. C. & Caracelli, V. J. (Eds.), Advances in mixed-method evaluation: The challenges and benefits of integrating diverse paradigms (pp. 518). San Francisco, CA: Jossey-Bass.Google Scholar
Greene, J. C., Caracelli, V. J., & Graham, W. F. (1989). Toward a conceptual framework for mixed-method evaluation designs. Educational Evaluation and Policy Analysis, 11, 255274.Google Scholar
Guilford, J. P. (1946). New standards for test evaluation. Educational and Psychological Measurement, 6, 427439.Google Scholar
Guion, R. M. (1980). On trinitarian doctrines of validity. Professional Psychology, 11, 385398.Google Scholar
Habermas, J. (1984). The theory of communicative action. Boston, MA: Beacon Press.Google Scholar
Hamid, M., Sussex, R., & Khan, A. (2009). Private tutoring in English for secondary school students in Bangladesh. TESOL Quarterly, 43, 281308.Google Scholar
Harsch, C., & Martin, G. (2012). Adapting CEF-descriptors for rating purposes: Validation by a combined rater training and scale revision approach. Assessing Writing, 17, 228250.Google Scholar
Hashemi, M. R. (2012). Reflections on mixing methods in applied linguistics research. Applied Linguistics, 33, 206212.Google Scholar
Hesse-Biber, S., & Leavy, P (2006). Analysis and interpretation of qualitative data. In Hesse-Biber, S. & Leavy, P. (Eds.), The practice of qualitative research (pp. 343374). London, UK: Sage.Google Scholar
Hyland, T. A. (2009). Drawing a line in the sand: Identifying the borderzone between self and other in EL1 and EL2 citation practices. Assessing Writing, 14, 6274.Google Scholar
Isaacs, T. (2008). Toward defining a valid assessment criterion of pronunciation proficiency in non-native English-speaking graduate students. The Canadian Modern Language Review/La Revue Canadienne des Langues Vivantes, 64, 555580.Google Scholar
Isaacs, T., & Trofimovich, P. (2012). Deconstructing comprehensibility. Studies in Second Language Acquisition, 34, 475505.Google Scholar
Jang, E. E. (2009). Cognitive diagnostic assessment of L2 reading comprehension ability: Validity arguments for applying Fusion Model to LanguEdge assessment. Language Testing, 26, 3173.Google Scholar
Jang, E. E. (2013). Mixed methods research in SLA. In Robinson, P. (Ed.), The Routledge encyclopedia of SLA (pp. 429431). New York, NY: Routledge.Google Scholar
Jang, E. E., & Roussos, L. (2009). Integrative analytic approach to detecting and interpreting L2 vocabulary DIF. International Journal of Testing, 9, 238259.Google Scholar
Johnson, R. B., & Onwuegbuzie, A. J. (2004). Mixed methods research: A research paradigm whose time has come. Educational Researcher, 33, 1426.Google Scholar
Johnson, R. B., Onwuegbuzie, A. J., & Turner, L. A. (2007). Toward a definition of mixed methods research. Journal of Mixed Methods Research, 1, 112133.CrossRefGoogle Scholar
Kane, M. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527535.Google Scholar
Kane, M. (2002). Validating high stakes testing programs. Educational Measurement: Issues and Practice, 21, 3141.Google Scholar
Kane, M. (2006). Validation. In Brennan, R. L. (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education and Praeger.
Kim, Y. (2009). An investigation into native and non-native teachers’ judgments of oral English performance: A mixed methods approach. Language Testing, 26, 187–217.
Kim, Y., & Jang, E. E. (2009). Differential functioning of reading subskills on the OSSLT for L1 and ELL students: A multidimensionality model-based DBF/DIF approach. Language Learning, 59, 825–865.
Kirkhart, K. E. (2005). Through a cultural lens: Reflections on validity and theory in evaluation. In Hood, S., Hopson, R. K., & Frierson, H. T. (Eds.), The role of culture and cultural context: A mandate for inclusion, the discovery of truth, and understanding in evaluative theory and practice (pp. 21–39). Greenwich, CT: Information Age.
Knoch, U. (2009). Diagnostic assessment of writing: A comparison of two rating scales. Language Testing, 26, 275–304.
Lee, H., & Winke, P. (2013). The differences among three-, four-, and five-option-item formats in the context of a high-stakes English-language listening test. Language Testing, 30, 99–123.
Lee, Y., & Greene, J. (2007). The predictive validity of an ESL placement test. Journal of Mixed Methods Research, 1, 366–389.
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.
Magnan, S. S. (2006). From the editor: The MLJ turns 90 in a digital age. The Modern Language Journal, 90, 1–5.
Mathison, S. (1988). Why triangulate? Educational Researcher, 17, 13–17.
Maxwell, J. A. (1992). Understanding and validity in qualitative research. Harvard Educational Review, 62, 279–300.
Maxwell, J. A., & Loomis, D. M. (2003). Mixed methods design: An alternative approach. In Tashakkori, A. & Teddlie, C. (Eds.), Handbook of mixed methods in social & behavioral research (pp. 241–271). Thousand Oaks, CA: Sage.
Mertens, D. M. (2007). Transformative paradigm. Journal of Mixed Methods Research, 1, 212–225.
Mertens, D. M. (2010). Philosophy in mixed methods teaching: The transformative paradigm as illustration. International Journal of Multiple Research Approaches, 4, 9–18.
Messick, S. (1989). Validity. In Linn, R. L. (Ed.), Educational measurement (3rd ed., pp. 13–103). New York, NY: Macmillan.
Miles, M. B., & Huberman, A. M. (1984). Qualitative data analysis: A sourcebook of new methods. Thousand Oaks, CA: Sage.
Mislevy, R. (1995). Test theory and language-learning assessment. Language Testing, 12, 341–369.
Morgan, D. L. (2007). Paradigms lost and pragmatism regained. Journal of Mixed Methods Research, 1, 48–76.
Moss, P. A. (1994). Can there be validity without reliability? Educational Researcher, 23, 5–12.
Nakatsuhara, F. (2011). Effects of test-taker characteristics and the number of participants in group oral tests. Language Testing, 28, 483–508.
Nastasi, B. K., Hitchcock, J. H., & Brown, L. M. (2010). An inclusive framework for conceptualizing mixed methods design typologies: Moving toward fully integrated synergistic research models. In Tashakkori, A. & Teddlie, C. (Eds.), Sage handbook of mixed methods in social and behavioral research (pp. 305–338). Thousand Oaks, CA: Sage.
Onwuegbuzie, A. J., & Johnson, R. B. (2006). The validity issue in mixed research. Research in the Schools, 13, 48–63.
Patton, M. Q. (2002). Qualitative research and evaluation methods (3rd ed.). Thousand Oaks, CA: Sage.
Perrone, M. (2011). The effect of classroom-based assessment and language processing on the second language acquisition of EFL students. Journal of Adult Education, 40, 20–33.
Plakans, L., & Gebril, A. (2012). A close investigation into source use in integrated second language writing tasks. Assessing Writing, 17, 18–34.
Poonpon, K. (2011, April). Synergy of mixed method approach to development of ESL speaking rating scale. Paper presented at Doing Research in Applied Linguistics [conference], Bangkok, Thailand.
Reichardt, C. S., & Cook, T. D. (1979). Beyond qualitative versus quantitative methods. In Cook, T. D. & Reichardt, C. S. (Eds.), Qualitative and quantitative methods in evaluation research (pp. 7–32). Thousand Oaks, CA: Sage.
Reiss, A. J. (1968). Stuff and nonsense about social surveys and participant observation. In Becker, H. S., Geer, B., Riesman, D., & Weiss, R. S. (Eds.), Institutions and the person: Papers in memory of Everett C. Hughes. Chicago, IL: Aldine.
Schegloff, E. A. (1993). Reflections on quantification in the study of conversation. Research on Language and Social Interaction, 26, 99–128.
Schwandt, T. A., & Jang, E. E. (2004). Linking validity and ethics in language testing: Insights from the hermeneutic turn in social science. Studies in Educational Evaluation, 30, 265–280.
Shepard, L. A. (1992). What policy makers who mandate tests should know about the new psychology of intellectual ability and learning. In Gifford, B. R. & O’Connor, M. C. (Eds.), Changing assessment: Alternative views of aptitude, achievement and instruction (pp. 301–328). Boston, MA: Kluwer.
Tashakkori, A., & Teddlie, C. (Eds.). (2003). Handbook of mixed methods in social and behavioral research. Thousand Oaks, CA: Sage.
Tashakkori, A., & Teddlie, C. (Eds.). (2010). Handbook of mixed methods in social and behavioral research (2nd ed.). Thousand Oaks, CA: Sage.
Teddlie, C., & Tashakkori, A. (2003). Major issues and controversies in the use of mixed methods in the social and behavioral sciences. In Tashakkori, A. & Teddlie, C. (Eds.), Handbook of mixed methods in social and behavioral research (pp. 3–51). Thousand Oaks, CA: Sage.
Teddlie, C., & Tashakkori, A. (2006). A general typology of research designs featuring mixed methods. Research in the Schools, 13, 12–28.
Turner, C. E. (2009). Examining washback in second language education contexts: A high stakes provincial exam and the teacher factor in classroom practice in Quebec secondary schools. International Journal on Pedagogies and Learning, 5, 103–123.
Uchikoshi, Y., & Maniates, H. (2010). How does bilingual instruction enhance English achievement? A mixed-methods study of Cantonese-speaking and Spanish-speaking bilingual classrooms. Bilingual Research Journal, 33, 364–385.
Weir, C. J. (2005). Language testing and validation. Basingstoke, UK: Palgrave Macmillan.
Winke, P. (2011). Evaluating the validity of a high-stakes ESL test: Why teachers’ perceptions matter. TESOL Quarterly, 45, 628–660.
Wiseman, C. S. (2012). Rater effects: Ego engagement in rater decision-making. Assessing Writing, 17, 150–173.
Xi, X. (2010). Aspects of performance on line graph description tasks: Influenced by graph familiarity and different task features. Language Testing, 27, 73–100.
Yin, M., Sims, J., & Cothran, D. (2012). Scratching where they itch: Evaluation of feedback on a diagnostic English grammar test for Taiwanese university students. Language Assessment Quarterly, 9, 78–104.
Yu, G. (2010). Effects of presentation mode and computer familiarity on summarization of extended texts. Language Assessment Quarterly, 7, 119–136.