Charting the landscape of data-driven learning using a bibliometric analysis

Jihua Dong; Yanan Zhao; Louisa Buckingham

doi:10.1017/S0958344022000222

Published online by Cambridge University Press: 22 November 2022

Jihua Dong: Shandong University, China
Yanan Zhao: Shandong University, China
Louisa Buckingham: The University of Auckland, New Zealand

Abstract

This study employs a bibliometric approach to analyse common research themes, high-impact publications and research venues, identify the most recent transformative research, and map the developmental stages of data-driven learning (DDL) since its genesis. A dataset of 126 articles and 3,297 cited references (1994–2021) retrieved from the Web of Science was analysed using CiteSpace 6.1.R2. The analysis uncovered the principal research themes and high-impact publications, and the most recent transformative research in the DDL field. Based on Shneider’s (2009) scientific model and the timeline generated by CiteSpace, three evolutionary stages of DDL were identified: the conceptualising stage (1980s–1998), the maturing stage (1998–2011), and the expansion stage (2011–now), with Stage 4 just emerging. Finally, the analysis discerned potential future research directions, including the implementation of DDL in larger-scale classroom practice and the role of variables in DDL.

Keywords

data-driven learning; co-citation analysis; structural variation analysis; bibliometric analysis

Type: Research Article
Information: ReCALL, Volume 35, Issue 3, September 2023, pp. 339–355

DOI: https://doi.org/10.1017/S0958344022000222
Copyright: © The Author(s), 2022. Published by Cambridge University Press on behalf of EUROCALL, the European Association for Computer-Assisted Language Learning
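
The co-citation analysis named in the abstract and keywords rests on a simple count: two references are co-cited whenever they appear together in the reference list of the same citing article. The sketch below illustrates that count on hypothetical toy data. It is not the CiteSpace procedure used in the study, and the function name `cocitation_counts` and the labels `ref_A`, `ref_B`, `ref_C` are assumptions made only for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of references is cited together.

    reference_lists: an iterable of lists, one per citing article,
    each holding the identifiers of the references that article cites.
    """
    counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so every pair has one canonical ordering.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: three citing articles and their reference lists.
toy_articles = [
    ["ref_A", "ref_B", "ref_C"],
    ["ref_A", "ref_B"],
    ["ref_B", "ref_C"],
]

counts = cocitation_counts(toy_articles)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Pairs with high counts would form the strong links of a co-citation network; clustering and timeline views such as those described in the abstract are built on top of counts of this kind.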

References

Anthony, L. (2004) AntConc (Version 3.0.1). Tokyo: Waseda University. http://www.antlab.sci.waseda.ac.jp/

Aryadoust, V., Zakaria, A., Lim, M. H. & Chen, C. (2020) An extensive knowledge mapping review of measurement and validity in language assessment and SLA research. Frontiers in Psychology, 11: 1–29. https://doi.org/10.3389/fpsyg.2020.01941

Birkle, C., Pendlebury, D. A., Schnell, J. & Adams, J. (2020) Web of Science as a data source for research on scientific and scholarly activity. Quantitative Science Studies, 1(1): 363–376. https://doi.org/10.1162/qss_a_00018

Bloch, J. (2009) The design of an online concordancing program for teaching about reporting verbs. Language Learning & Technology, 13(1): 59–78. https://doi.org/10125/44168

Borja, A. (2007) Corpora for translators in Spain. The CDJ-GITRAD Corpus and the GENTT Project. In Anderman, G. & Rogers, M. (eds.), Incorporating corpora: The linguist and the translator. Clevedon: Multilingual Matters, 243–265. https://doi.org/10.21832/9781853599873-016

Boulton, A. (2008) But where’s the proof? The need for empirical evidence for data-driven learning. In Edwardes, M. (ed.), Proceedings of the BAAL annual conference 2007. London: Scitsiugnil Press, 13–16.

Boulton, A. (2009a) Testing the limits of data-driven learning: Language proficiency and training. ReCALL, 21(1): 37–54. https://doi.org/10.1017/S0958344009000068

Boulton, A. (2009b) Data-driven learning: Reasonable fears and rational reassurance. Indian Journal of Applied Linguistics, 35(1): 81–106.

Boulton, A. (2010a) Data-driven learning: Taking the computer out of the equation. Language Learning, 60(3): 534–572. https://doi.org/10.1111/j.1467-9922.2010.00566.x

Boulton, A. (2010b) Learning outcomes from corpus consultation. In Moreno Jaén, M., Serrano Valverde, F. & Calzada Pérez, M. (eds.), Exploring new paths in language pedagogy: Lexis and corpus-based language teaching. London: Equinox, 129–144.

Boulton, A. (2011) Data-driven learning: The perpetual enigma. In Goźdź-Roszkowski, S. (ed.), Explorations across languages and corpora. Frankfurt: Peter Lang, 563–580. https://doi.org/10.3726/978-3-653-04563-5

Boulton, A. (2012) Corpus consultation for ESP: A review of empirical research. In Boulton, A., Carter-Thomas, S. & Rowley-Jolivet, E. (eds.), Corpus-informed research and learning in ESP: Issues and applications. Amsterdam: John Benjamins, 261–291. https://doi.org/10.1075/scl.52.11bou

Boulton, A. (2017) Corpora in language teaching and learning. Language Teaching, 50(4): 483–506. https://doi.org/10.1017/S0261444817000167

Boulton, A. & Cobb, T. (2017) Corpus use in language learning: A meta-analysis. Language Learning, 67(2): 348–393. https://doi.org/10.1111/lang.12224

Boulton, A. & Tyne, H. (2013) Corpus linguistics and data-driven learning: A critical overview. Bulletin Suisse de Linguistique Appliquée, 97: 97–118.

Boulton, A. & Vyatkina, N. (2021) Thirty years of data-driven learning: Taking stock and charting new directions over time. Language Learning & Technology, 25(3): 66–89. https://doi.org/10125/73450

Braun, S. (2005) From pedagogically relevant corpora to authentic language learning contents. ReCALL, 17(1): 47–64. https://doi.org/10.1017/S0958344005000510

Burnard, L. (ed.) (2004) BNC Baby [CD-ROM]. Oxford: Oxford University Research and Technology Service. http://www.natcorp.ox.ac.uk/corpus/babyinfo.html

Chambers, A. (2005) Integrating corpus consultation in language studies. Language Learning & Technology, 9(2): 111–125. https://doi.org/10125/44022

Chambers, A. (2007) Popularising corpus consultation by language learners and teachers. In Hidalgo, E., Quereda, L. & Santana, J. (eds.), Corpora in the foreign language classroom. Amsterdam: Rodopi, 3–16.

Chambers, A. (2019) Towards the corpus revolution? Bridging the research–practice gap. Language Teaching, 52(4): 460–475. https://doi.org/10.1017/S0261444819000089

Charles, M. (2014) Getting the corpus habit: EAP students’ long-term use of personal corpora. English for Specific Purposes, 35: 30–40. https://doi.org/10.1016/j.esp.2013.11.004

Charles, M. (2015) Same task, different corpus: The role of personal corpora in EAP classes. In Leńko-Szymańska, A. & Boulton, A. (eds.), Multiple affordances of language corpora for data-driven learning. Amsterdam: John Benjamins, 131–154. https://doi.org/10.1075/scl.69.07cha

Chen, C. (2012) Predictive effects of structural variation on citation counts. Journal of the American Society for Information Science and Technology, 63(3): 431–449. https://doi.org/10.1002/asi.21694

Chen, C. (2017) Science mapping: A systematic review of the literature. Journal of Data and Information Science, 2(2): 1–40. https://doi.org/10.1515/jdis-2017-0006

Chen, C., Ibekwe-SanJuan, F. & Hou, J. (2010) The structure and dynamics of cocitation clusters: A multiple-perspective cocitation analysis. Journal of the American Society for Information Science and Technology, 61(7): 1386–1409. https://doi.org/10.1002/asi.21309

Chen, M. & Flowerdew, J. (2018) A critical review of research and practice in data-driven learning (DDL) in the academic writing classroom. International Journal of Corpus Linguistics, 23(3): 335–369. https://doi.org/10.1075/ijcl.16130.che

Chen, M., Flowerdew, J. & Anthony, L. (2019) Introducing in-service English language teachers to data-driven learning for academic writing. System, 87: 102148. https://doi.org/10.1016/j.system.2019.102148

Chen, X. L., Zou, D., Xie, H. R. & Su, F. (2021) Twenty-five years of computer-assisted language learning: A topic modeling analysis. Language Learning & Technology, 25(3): 151–185.

Cobb, T. & Boulton, A. (2015) Classroom applications of corpus analysis. In Biber, D. & Reppen, R. (eds.), The Cambridge handbook of English corpus linguistics. Cambridge: Cambridge University Press, 478–497. https://doi.org/10.1017/CBO9781139764377.027

Coden, A. R., Pakhomov, S. V., Ando, R. K., Duffy, P. H. & Chute, C. G. (2005) Domain-specific language models and lexicons for tagging. Journal of Biomedical Informatics, 38(6): 422–430. https://doi.org/10.1016/j.jbi.2005.02.009

Cotos, E. (2014) Enhancing writing pedagogy with learner corpus data. ReCALL, 26(2): 202–224. https://doi.org/10.1017/S0958344014000019

Craig, I. D., Plume, A. M., McVeigh, M. E., Pringle, J. & Amin, M. (2007) Do open access articles have greater citation impact? A critical review of the literature. Journal of Informetrics, 1(3): 239–248. https://doi.org/10.1016/j.joi.2007.04.001

Cresswell, A. (2007) Getting to ‘know’ connectors? Evaluating data-driven learning in a writing skills course. In Hidalgo, E., Quereda, L. & Santana, J. (eds.), Corpora in the foreign language classroom. Amsterdam: Rodopi, 267–287. https://doi.org/10.1163/9789401203906_018

Crosthwaite, P., Luciana, & Wijaya, D. (2021) Exploring language teachers’ lesson planning for corpus-based language teaching: A focus on developing TPACK for corpora and DDL. Computer Assisted Language Learning. Advance online publication. https://doi.org/10.1080/09588221.2021.1995001

Crosthwaite, P., Storch, N. & Schweinberger, M. (2020) Less is more? The impact of written corrective feedback on corpus-assisted L2 error resolution. Journal of Second Language Writing, 49: 100729. https://doi.org/10.1016/j.jslw.2020.100729

Crosthwaite, P., Wong, L. L. C. & Cheung, J. (2019) Characterising postgraduate students’ corpus query and usage patterns for disciplinary data-driven learning. ReCALL, 31(3): 255–275. https://doi.org/10.1017/S0958344019000077

Daskalovska, N. (2015) Corpus-based versus traditional learning of collocations. Computer Assisted Language Learning, 28(2): 130–144. https://doi.org/10.1080/09588221.2013.803982

Davies, M. (2008) The Corpus of Contemporary American English (COCA). https://www.english-corpora.org/coca/

Flowerdew, L. (2012) Corpora and language education. Houndmills: Palgrave Macmillan. https://doi.org/10.1057/9780230355569

Francis, W. N. & Kučera, H. (1989) Manual of information to accompany a standard corpus of present-day edited American English, for use with digital computers. Brown University, Department of Linguistics.

Frankenberg-Garcia, A. (2012) Learners’ use of corpus examples. International Journal of Lexicography, 25(3): 273–296. https://doi.org/10.1093/ijl/ecs011

Frankenberg-Garcia, A. (2014) The use of corpus examples for language comprehension and production. ReCALL, 26(2): 128–146. https://doi.org/10.1017/S0958344014000093

Geluso, J. & Yamaguchi, A. (2014) Discovering formulaic language through data-driven learning: Student attitudes and efficacy. ReCALL, 26(2): 225–242. https://doi.org/10.1017/S0958344014000044

Gilquin, G. & Granger, S. (2010) How can data-driven learning be used in language teaching? In O’Keeffe, A. & McCarthy, M. (eds.), The Routledge handbook of corpus linguistics. Abingdon: Routledge, 359–370.

He, C. & Wei, X. (2019) Study of corpus’ influences in EAP research (2009-2018): A bibliometric analysis in CiteSpace. English Language Teaching, 12(12): 59–66. https://doi.org/10.5539/elt.v12n12p59

Huang, Z. (2014) The effects of paper-based DDL on the acquisition of lexico-grammatical patterns in L2 writing. ReCALL, 26(2): 163–183. https://doi.org/10.1017/S0958344014000020

Hyland, K. & Jiang, F. K. (2021a) A bibliometric study of EAP research: Who is doing what, where and when? Journal of English for Academic Purposes, 49: 100929. https://doi.org/10.1016/j.jeap.2020.100929

Hyland, K. & Jiang, F. K. (2021b) Delivering relevance: The emergence of ESP as a discipline. English for Specific Purposes, 64: 13–25. https://doi.org/10.1016/j.esp.2021.06.002

Johns, T. (1988) Whence and whither classroom concordancing? In Bongaerts, T., de Haan, P., Lobbe, S. & Wekker, H. (eds.), Computer applications in language learning. Dordrecht: Foris Publications, 9–27. https://doi.org/10.1515/9783110884876-003

Johns, T. (1991) From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning. English Language Research Journal, 4: 27–45.

Jung, U. O. H. (2005) CALL: Past, present and future – A bibliometric approach. ReCALL, 17(1): 4–17. https://doi.org/10.1017/S0958344005000212

Kennedy, C. & Miceli, T. (2010) Corpus-assisted creative writing: Introducing intermediate Italian learners to a corpus as a reference resource. Language Learning & Technology, 14(1): 28–44. https://doi.org/10125/44201

Kita, K. & Ogata, H. (1997) Collocations in language learning: Corpus-based automatic compilation of collocation and bilingual collocation concordancer. Computer Assisted Language Learning, 10(3): 229–238. https://doi.org/10.1080/0958822970100303

Köhler, J., Philippi, S., Specht, M. & Rüegg, A. (2006) Ontology based text indexing and querying for the semantic web. Knowledge-Based Systems, 19(8): 744–754. https://doi.org/10.1016/j.knosys.2006.04.015

Lee, H., Warschauer, M. & Lee, J. H. (2019) The effects of corpus use on second language vocabulary learning: A multilevel meta-analysis. Applied Linguistics, 40(5): 721–753. https://doi.org/10.1093/applin/amy012

Lei, L. & Liu, D. (2019) Research trends in applied linguistics from 2005 to 2016: A bibliometric analysis and its implications. Applied Linguistics, 40(3): 540–561. https://doi.org/10.1093/applin/amy003

Lin, Z. & Lei, L. (2020) The research trends of multilingualism in applied linguistics and education (2000–2019): A bibliometric analysis. Sustainability, 12(15): 6058. https://doi.org/10.3390/su12156058

Liu, S. & Zhang, S. (2021) A bibliometric analysis of computer-assisted English learning from 2001 to 2020. International Journal of Emerging Technologies in Learning (iJET), 16(14): 53–67. https://doi.org/10.3991/ijet.v16i14.24151

Liu, Y. & Hu, G. (2021) Mapping the field of English for specific purposes (1980–2018): A co-citation analysis. English for Specific Purposes, 61: 97–116. https://doi.org/10.1016/j.esp.2020.10.003

Luo, Q. & Zhou, J. (2017) Data-driven learning in second language writing class: A survey of empirical studies. International Journal of Emerging Technologies in Learning (iJET), 12(3): 182–196. https://doi.org/10.3991/ijet.v12i03.6523

Meara, P. (2012) The bibliometrics of vocabulary acquisition: An exploratory study. RELC Journal, 43(1): 7–22. https://doi.org/10.1177/0033688212439339

Mizumoto, A. & Chujo, K. (2015) A meta-analysis of data-driven learning approach in the Japanese EFL classroom. English Corpus Studies, 22: 1–18.

Murphy, B. (1996) Computer corpora and vocabulary study. The Language Learning Journal, 14(1): 53–57. https://doi.org/10.1080/09571739685200391

O’Keeffe, A. (2021) Data-driven learning – A call for a broader research gaze. Language Teaching, 54(2): 259–272. https://doi.org/10.1017/S0261444820000245

O’Sullivan, Í. (2007) Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy. ReCALL, 19(3): 269–286. https://doi.org/10.1017/S095834400700033X

Park, H. & Nam, D. (2017) Corpus linguistics research trends from 1997 to 2016: A co-citation analysis. Linguistic Research, 34(3): 427–457. https://doi.org/10.17250/KHISLI.34.3.201712.008

Pérez-Paredes, P. (2022) A systematic review of the uses and spread of corpora and data-driven learning in CALL research during 2011–2015. Computer Assisted Language Learning, 35(1–2): 36–61. https://doi.org/10.1080/09588221.2019.1667832

Pérez-Paredes, P., Sánchez-Tornel, M. & Calero, J. M. A. (2012) Learners’ search patterns during corpus-based focus-on-form activities: A study on hands-on concordancing. International Journal of Corpus Linguistics, 17(4): 482–515. https://doi.org/10.1075/ijcl.17.4.02par

Pritchard, A. (1969) Statistical bibliography or bibliometrics? Journal of Documentation, 25: 348–349. https://doi.org/10.1108/eb026482

Pritchard, A. & Wittig, G. R. (1981) Bibliometrics: A bibliography and index. Watford: ALLM Books.

Sebastian, Y. & Chen, C. (2021) The boundary-spanning mechanisms of Nobel Prize winning papers. PLOS ONE, 16(8): e0254744. https://doi.org/10.1371/journal.pone.0254744

Shneider, A. M. (2009) Four stages of a scientific discipline; four types of scientist. Trends in Biochemical Sciences, 34(5): 217–223. https://doi.org/10.1016/j.tibs.2009.02.002

Smart, J. (2014) The role of guided induction in paper-based data-driven learning. ReCALL, 26(2): 184–201. https://doi.org/10.1017/S0958344014000081

Sun, X. & Hu, G. (2020) Direct and indirect data-driven learning: An experimental study of hedging in an EFL writing class. Language Teaching Research. Advance online publication. https://doi.org/10.1177/1362168820954459

Vannestål, M. E. & Lindquist, H. (2007) Learning English grammar with a corpus: Experimenting with concordancing in a university grammar course. ReCALL, 19(3): 329–350. https://doi.org/10.1017/S0958344007000638

Vyatkina, N. (2016) Data-driven learning for beginners: The case of German verb-preposition collocations. ReCALL, 28(2): 207–226. https://doi.org/10.1017/S0958344015000269

Vyatkina, N. (2020) Corpora as open educational resources for language teaching. Foreign Language Annals, 53(2): 359–370. https://doi.org/10.1111/flan.12464

Yoon, C. (2011) Concordancing in L2 writing class: An overview of research and issues. Journal of English for Academic Purposes, 10(3): 130–139. https://doi.org/10.1016/j.jeap.2011.03.003

Zareva, A. (2017) Incorporating corpus literacy skills into TESOL teacher training. ELT Journal, 71(1): 69–79. https://doi.org/10.1093/elt/ccw045
