1. Background
Among all academic genres, including conference proceedings, chapters, books and doctoral dissertations, research articles (RAs) in academic journals are often considered to have the greatest prestige and impact. In computer-assisted language learning (CALL), the major field-specific journals trace the “current trends and fashions in technologies” (Stockwell, 2007: 107), and their evolution over time is a “testimony to the growth in quality, scope and influence of CALL research” (Gillespie, 2020: 127). Given the dynamic nature (Stockwell, 2007) and long history of CALL (Bax, 2003), one outstanding question is how the field has evolved over time and what trends have been dominant in research. This research synthesis provides an overview of CALL history through highly cited RAs in four major CALL journals: ReCALL, CALICO Journal, Computer Assisted Language Learning (CALL) and Language Learning & Technology (LL&T). The aim of the study is to trace the evolution of CALL research from the past to the present day, complementing other recent syntheses (e.g. Gillespie, 2020; Lai, 2019; Lim & Aryadoust, 2021; Shadiev & Yang, 2020) by concentrating on a larger dataset collected from CALL-specific journals and potentially providing a roadmap for future CALL studies. The value of a rigorous synthetic study – compared to other forms of secondary research such as narrative reviews, which are often considered to be traditional and non-systematic (Chong & Plonsky, in press) – lies in providing readers with “a systematic (i.e., exhaustive, trustworthy, and replicable) understanding of the state of accumulated knowledge” (Norris & Ortega, 2006: 6). Such syntheses also have the potential to cast light on the published literature at significant turning points in the progression of ever-changing fields (Norris & Ortega, 2006).
Chronologically, a number of reviews, syntheses and historical surveys have been published in both applied linguistics and educational technology to shed light on the evolution of CALL over time. Such informative reviews in the past three decades have covered, but are not limited to, CALL history and development, the usefulness of a particular type of technology such as mobile-assisted language learning (MALL), and CALL research trends and paradigms reflected in selected journals/databases.
We identified three major review studies, among many others, on CALL history and development. By offering a critical analysis of the history of the field up to 2003 and introducing three new phases – that is, restricted, open and integrated CALL – Bax (2003) argued that a proposed reassessment could pave the way for a more fine-grained analysis of academic institutions and classrooms in CALL contexts. He predicted that a state of normalization, through which technology becomes invisible through full integration, would dominate CALL classrooms in the coming years. With rapid technological advances, Levy and Hubbard (2005) highlighted the importance of accepting the term “CALL” and the role it has had in evaluating new language learning tutors and tools. A decade later, and to extend Bax’s concept of integrative CALL, Gimeno-Sanz (2016) reflected upon the evolution of technology-enhanced language learning during 1990–2016 and discussed how future trends in CALL might develop. By describing the field from 2010 onwards as atomized CALL, Gimeno-Sanz (2016) argued that the choice of technology depends on major factors such as mobility requirements and connectivity capabilities in the process of pedagogy-driven learning.
Another group of reviews have focused on the usefulness or adoption of a particular technology. Liu, Moore, Graham and Lee (2002), for instance, reviewed 246 studies during 1990–2000 and found that computers increased self-esteem, professional preparedness and overall language proficiency. Their findings also recognized the potential of software tools used in certain language skill areas. Burston’s (2015) detailed analysis of MALL studies – a subset of CALL – over 20 years revealed that over half reported no objectively quantifiable learning outcomes, a shortcoming that stemmed from the short duration of projects and small number of students involved. Also in MALL, Lai (2019) acknowledged the significance of journal impact (Xu et al., in press) and the role of high-impact papers in driving research trends, whittling down an initial pool of 2,445 MALL papers over 17 years to a final dataset of 100 highly cited papers to research new learning strategies and examine seldom-investigated domains and issues in MALL. Highly cited papers were found to have concentrated not only on comparing different mobile learning modes for finding effective learning approaches but also on learners’ higher-order thinking performance and learning behaviors.
The final group of reviews have traced CALL research trends and paradigms. Gillespie (2020) recently conducted an integrative overview of all RAs published in three leading CALL journals (ReCALL, CALL and CALICO Journal) from 2006 to 2016. He found that writing has been the most investigated research topic, and that the research methods employed are rigorous in writing, structure, theory, literature awareness, discussion and presentation of results; nonetheless, there are still weaknesses. Most empirical studies are small scale: based on one institution, a small group of students, over a short period of time and seldom followed up. Gillespie’s investigation is similar to ours in its emphasis on tracing the overall evolution of researching CALL. Shadiev and Yang (2020) examined 398 RAs published between 2014 and 2019. In a rather descriptive manner, the findings confirmed Gillespie’s (2020) results: English was the most common target language; writing, speaking and vocabulary gained the most attention in published papers; and 23 different technologies were identified. Lei and Liu (2019) conducted a bibliometric analysis of System over a sustained period of time (1973–2017). The findings indicated that foreign and second language learning/teaching practice issues, especially the use of various learning strategies and technologies, have a high weight in System publications, while more needs to be done to develop research into sociocultural and socio-psychological aspects of technology-mediated language instruction. Lastly, Lim and Aryadoust’s (2021) retrospective scientometric investigation of CALL research trends in 11 Scopus-indexed journals (1977–2020) identified seven major research clusters with a focus on synchronous computer-mediated communication (CMC) and negotiated interaction, multimedia, telecollaboration or email exchanges, blogs, digital games, wikis and podcasts to support language learning.
The present study addresses the pressing matters raised by the above syntheses and reviews, focuses on issues such as context, method and research focus that have been addressed, and complements – from a more inclusive, analytic scope – prior reviews that are deliberately limited in selection or remain largely descriptive. For maximum inclusiveness, our initial pool of papers featured all 2,397 RAs from four major CALL journals from their first appearance up to and including 2019. Principled and consistent procedures targeted the most influential of those papers, resulting in a final collection of 426 high-impact RAs – larger than most reviews (see, e.g., Plonsky & Ziegler, 2016) and over a longer period of time. Although others have attempted to reduce a large pool by various means, which often results in bias towards older publications (see, e.g., Lai, 2019), our novel methodology offers a solution to this issue. Another problem is that many reviews lean towards descriptive accounts (e.g. Shadiev & Yang, 2020), while our analytic synthesis goes further, shedding light on less investigated dimensions – that is, methodological and theoretical orientations as well as the research foci – giving us the chance to extend the scope of the study to provide a full picture of CALL research in different decades.
Four decades after the appearance of the first dedicated CALL journals, we thus propose a critical investigation into broad contextual, participant-related, methodological and theoretical considerations of CALL RAs. This can familiarize readers of CALL journals with the most engaged CALL micro- and macro-contexts, reveal what groups of participants are over- or under-researched and cast light on long-standing methodological as well as theoretical issues that have been overlooked in previous reviews. The synthesis also highlights the relative popularity of research foci and shows the progression of CALL research in response to the latest changes in academia concerning paradigm shifts and technological advances. The following questions were addressed to serve these purposes:
1. What research contexts and participant profiles characterize high-impact CALL RAs?
2. What research methodologies are employed and what theories underpin high-impact CALL RAs?
3. What are the research foci in high-impact CALL RAs, and how have these developed over time?
2. Methodology
2.1 Data source
Analysis of a large number of RAs published over a substantial period can suggest major findings for a technical and changing field such as CALL. Given the vast number of publications, the primary step was deciding a cut-off point and limiting the scope of the synthesis to a manageable but representative pool of data. To identify CALL journals with the highest impact in the field, we first used two of the most widely used resources to assess the quality and prestige of academic journals: Scimago Journal and Country Rank (SJR) powered by Scopus, and Journal Citation Reports (JCR) developed by the Web of Science (WoS) Group, 2019 edition. As both SJR and JCR (the master journal list) provide quartile rankings (Q1, Q2, Q3, Q4), we included only Q1 CALL journals to keep the study size within reasonable bounds. Additionally, we intended to focus on high-profile publications that are part of many CALL researchers’ cultures rather than more obscure items (Xu et al., in press). With ReCALL, CALL, LL&T and CALICO Journal as the top Q1 journals in the reference rankings, our pool of data was formed from all and only published RAs (thus excluding editorials, reports, commentaries and book and software reviews) in these journals over four decades (Supplementary material 1). All RAs, including the special issues, from ReCALL (483), CALL (776), LL&T (350) and CALICO Journal (788), published in English from the first issue to the end of 2019, were downloaded and then sorted chronologically for the second round of data processing.
2.2 Final dataset of high-impact papers
Working with 2,397 RAs was not feasible for the type of analysis we had in mind, and an all-inclusive approach might fail to identify real trends that are apparent from highly cited papers – a case of not seeing the wood for the trees. To reduce the pool, traditional syntheses (e.g. Smith & Lafford, 2009) tend to focus on citation metrics for journals (e.g. JCR) rather than articles, or data from various databases with complex algorithms to aggregate citations (e.g. CRExplorer); this increases the risk of missing well-known papers or including ones that are rarely cited. Thus, given the significance of high-impact RAs in reflecting recurring research trends (Lai, 2019) and the role academic search engines play in assessing the quality and influence of academic publications (Xu et al., in press), our solution used the number of citations each paper has on Google Scholar.
Total citation numbers for each of the 2,397 RAs were recorded, involving a comprehensive manual search of tens of thousands of pages of Google Scholar (Supplementary material 2). After considering various options, we arranged all the RAs in groups based on their year of publication in each journal and, as a cut-off point, we selected the top 15% of widely cited papers (including papers with equal ranks) from each individual year (Supplementary material 3). This procedure minimized the time bias between years as the RAs published in a given year were only in competition with each other and not with other years. Inclusion of the top 15% of widely cited papers was a principled choice that helped us build a dataset with papers that had the greatest impact, many being cited dozens or even hundreds of times. This method was particularly useful with very early (e.g. 1983) and recent (e.g. 2018) years, as the citation count of included papers from these specific years was below the overall average of total citations. Moreover, the 15% yardstick was intended to reduce the number of studies and to help build a dataset of around 400 to 500 RAs. The final dataset thus consists of 426 high-impact papers in the field (Supplementary material 4). As can be seen in Figure 1, despite scattered peaks in the first two decades, almost 70% (294 RAs) of the dataset consists of papers published in the last two decades. The upward trend reflects the growing numbers of publications but also indicates a higher tendency to cite recently published papers: our journals have a mean citing half-life of 8.2 years, while half the papers cited in Q1 journals are over 10 years old.
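The per-year selection rule described above can be illustrated with a minimal sketch. It is not the authors' actual procedure, which was carried out manually; the field names (`journal`, `year`, `citations`), the function name `select_top_cited` and the decision to round the 15% quota up to the next whole paper are all assumptions made for illustration.

```python
from collections import defaultdict


def select_top_cited(papers, percent=15):
    """Keep the top `percent` of papers by citations within each (journal, year) group.

    Papers tied with the last paper inside the quota are also kept,
    mirroring the "including papers with equal ranks" rule.
    """
    groups = defaultdict(list)
    for paper in papers:
        groups[(paper["journal"], paper["year"])].append(paper)

    selected = []
    for key in sorted(groups):
        group = sorted(groups[key], key=lambda p: p["citations"], reverse=True)
        # Assumption: the quota is rounded up (integer ceiling of n * percent / 100).
        quota = (len(group) * percent + 99) // 100
        cutoff = group[quota - 1]["citations"]
        # Every paper at or above the cut-off count is kept, so ties are included.
        selected.extend(p for p in group if p["citations"] >= cutoff)
    return selected


# Hypothetical records, not taken from the study's dataset.
sample = [
    {"journal": "A", "year": 1990, "citations": c}
    for c in (100, 90, 90, 50, 40, 30, 20, 10, 5, 1)
] + [
    {"journal": "B", "year": 2018, "citations": c}
    for c in (5, 3, 2, 1)
]
top = select_top_cited(sample)
```

Because each paper competes only within its own journal and year, a 2018 paper with a handful of citations can be selected alongside a 1990 paper with a hundred, which is the time-bias correction the procedure aims at.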
2.3 Coding scheme
A coding scheme was developed (Supplementary material 5) to identify research trends and extract the main themes: research context, research participants, methodological and theoretical considerations, and research foci. To ensure consistency in the analysis, we followed Riazi, Shi and Haggerty (2018) in opting for “a data-driven thematic approach” where coders “dr[a]w on the actual words of the authors as much as possible to describe the research focus and orientation rather than imposing a set of pre-conceived categories” (p. 44). Given the complexity of scholarly text analysis, particularly treating synonyms and hyponyms in abstract themes (i.e. research theories and foci), we accepted the authors’ terminology at face value rather than risk putting words in their mouths, as some individual terms are fuzzy or even polysemous, while other close pairs may be more or less synonymous but with specific distinctions for different authors. However, a certain degree of interpretation was inevitable in the analysis of more factual themes (i.e. contexts and participants of the studies) in order to standardize the categorization for meaningful quantifications. To ensure a maximum level of granularity, and given the significance of theme and subtheme extraction at the semantic level rather than the word level, we also cross-checked the results of the analysis of more abstract themes against a similar batch of papers in the main dataset. For that purpose, we sampled 43 RAs (10% of the total) to record all possible synonyms and hyponyms of the adopted technologies (e.g. mobiles, tablets, apps, etc.), theories (e.g. learner autonomy, socio-cultural theory, etc.) and research foci (e.g. CMC, telecollaboration, etc.). The new results and the categorization scheme were cross-checked against the original ones attained in the main round of analysis, and a chi-square test revealed no significant difference.
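As an illustration of how such a cross-check might be tested, the sketch below computes a Pearson chi-square statistic for a contingency table of category counts from two coding rounds. The exact form of the test used in the study is not reported, so the table layout, the function name `chi_square_statistic` and the counts are hypothetical.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis that both rounds
            # distribute items across categories in the same way.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows are coding rounds, columns are three theme groups
# (technologies, theories, research foci).
hypothetical_counts = [
    [14, 17, 12],  # main round
    [15, 16, 12],  # cross-check round
]
statistic, dof = chi_square_statistic(hypothetical_counts)
# With 2 degrees of freedom, the critical value at alpha = .05 is 5.991;
# a statistic below it indicates no significant difference between rounds.
no_significant_difference = statistic < 5.991
```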
Recognizing the types of RAs was a crucial primary step, as the nature of the studies would determine what to expect in their structure and design. Prior categories were not imposed owing to the data-driven nature of our analysis, though as a rule of thumb, we defined empirical RAs as studies that collected and analyzed new (primary) data and had empirical results; theoretical RAs (position papers) traced and discussed CALL-related/linguistic theories or conjecture; descriptive RAs provided a description of a setting or phenomenon under investigation (cf. Mackey & Gass, 2012); studies that analyzed secondary sources were considered as reviews (see Section 3.1). Where available, information was collected on the countries where the studies were conducted (irrespective of the authors’ affiliations), settings, programs, learning environments and technology types in order to assist with understanding the individual research context. The participants’ status, age group, proficiency and first and target languages were collected to provide a snapshot of demographic information. The methodologies for data collection/analysis, together with the theories, approaches or frameworks that the authors adopted to conceptualize their work, were also collected by close reading of the relevant sections. Similarly, the research focus identified the primary research areas, including salient aspects of language learning/teaching. To establish reliability, two independent coders analyzed 10% of the data. Occasional discrepancies, mostly concerning research theories and foci, were resolved through discussion, and an inter-coder r of 89.8% was subsequently attained using Pearson correlation. Although a major part of the analysis was done manually, NVivo 12 and AntConc (Anthony, 2022) helped to overcome some limitations of working with a coding scheme. NVivo 12 was mainly adopted to assist with identifying, recording and merging the research theories and foci within the relevant sections of texts of the RAs. AntConc was used as a supplementary tool for keyword search and cluster analysis to double-check the recurring list of research foci.
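For readers unfamiliar with the reliability statistic mentioned above, the following minimal sketch computes a Pearson correlation between two coders. How the coders' decisions were quantified is not reported, so pairing per-category frequency counts is an assumption, and the function name `pearson_r` and the numbers are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts: how often each coder assigned five categories
# within the jointly coded 10% sample.
coder_a = [12, 7, 9, 4, 11]
coder_b = [11, 8, 9, 5, 10]
agreement = pearson_r(coder_a, coder_b)
```

A correlation only shows that the two coders' counts rise and fall together; chance-corrected indices such as Cohen's kappa are a common alternative when agreement on individual items is of interest.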
3. Results
Considering the synthetic nature of the study, an initial report on the entire collection of RAs in each time period is presented prior to exploring the three main research themes in the main dataset. CALICO Journal was founded in 1983, followed by ReCALL (1989), CALL (1990) and LL&T (1997). With the exception of CALICO Journal, which started with five issues per year and then published three to four in subsequent years, there was an increase in the number of published issues in ReCALL (one to three), LL&T (two to three) and CALL (one to eight). Accordingly, the number of published RAs has increased significantly over almost 40 years (Table 1).
The number of CALL-specific journals increased significantly in the first two decades in particular (1980s–1990s), and the number of published RAs multiplied correspondingly (from 204 in the 1980s to 893 in the 2010s). The first finding is that, similar to broader fields of enquiry such as applied linguistics (Plonsky, 2014) and SLA (Plonsky, 2013), publication output in CALL is increasing steadily (see the trendline in Figure 2). However, such a quantitative increase might not reflect CALL research quality, particularly owing to the existing requirements of academic evaluation and professional promotion – what Colpaert (2012) has referred to as the “publish and perish” syndrome.
3.1 Context and participant-related dimensions of high-impact CALL RAs
Raw frequencies and percentages are given in the following tables or texts to reveal the recurring themes of the final dataset. Not all percentages add up to 100%, as only highlights and emerging themes are reported here for reasons of space; the full list of research themes is given in Supplementary material 5.
Our findings indicate that empirical studies are by far the most prevalent type of RAs. A large number (163, 38.2%) measure participants’ attitudes or survey their technology-related experiences; another large batch (90, 21%) focuses on learning outcomes in and out of educational settings. This may be rooted in the tendency for CALL researchers to set up empirical projects to explore a new teaching technology and use it to develop research participants’ knowledge and skills (Gillespie, 2020). Literature reviews, meta-analyses, research syntheses and other types of secondary analyses ranked next, followed by descriptive and theoretical studies (Table 2). Studies linked to more than one specific RA type were combined in a “merged” category.
a Literature review (25), meta-analysis (8), critical (7), synthesis (7), state-of-the-art (1).
b Empirical & descriptive (11), evaluative & critical (1), evaluative & empirical (3), evaluative & theoretical (1), review & descriptive (3), review & evaluative (1), theoretical & descriptive (1), theoretical & review (1).
3.1.1 Research context
Countries. A total of 37 countries are represented, with the top nine featured 10 or more times, contributing 55.7% of the total. The dominance of inner-circle English-speaking countries (particularly the USA, the UK and Australia) alongside East Asian countries (notably Taiwan, Japan and Hong Kong) is noticeable over the four time periods. However, CALL research has expanded internationally over the last two decades owing to the occasional contribution (13.4% of the total) of the 16 outer-circle countries (e.g. France, Turkey, Italy, Iran, Mexico and Thailand) and the remaining 12 (2.7%) from the Americas, Europe, Asia, Australasia and Africa (e.g. Chile, Macau, Colombia, South Africa and Oman).
Settings. The setting is largely dominated by foreign languages (FL) (189, 44.3%), overwhelmingly English as a foreign language (EFL) (103, 24.1%). English as a second language (ESL) (57, 13.3%) is the second dominant setting, followed by combinations: EFL and ESL (9, 2.1%), EFL and other FLs (9, 2.1%), as well as French as a foreign/second language (9, 2.1%). English as a lingua franca (5, 1.1%) and Chinese as a second/foreign language (5, 1.1%) were among the less common settings.
Programs. In all periods, universities (240, 56.3%) were the leading macro-level programs in which the studies were conducted, reflecting the type of participants that CALL researchers are willing to investigate and have easy access to. CALL research has also been conducted among children from kindergarten (age 5+) to 12th grade (age 17–18, pre-university) (28, 6.5%), in colleges (7, 1.6%) and institutes (6, 1.4%), although their overall weight in CALL publications is well below that of universities. There were very few cases of mixed contexts (7, 1.6%), and two non-educational programs.
Learning environments. The micro-level environments where learning takes place have changed significantly with the emergence and accessibility of the internet in the last three decades. The largest group of studies (103, 24.1%) was conducted in virtual environments, though their presence in CALL research is less than the combined weight (134, 31.4%) of face-to-face environments, whether classrooms (91, 21.3%) or lab settings (43, 10%). There were also occasional cases of combinations of two or more environments (35, 8.2%), where research participants were physically and virtually present.
Technology types. Given the different types of technologies in CALL, some degree of merging while categorizing the tools and media was inevitable. The top five technologies adopted over the years have been software or courseware (45, 10.5%), text chatting and WeChat (28, 6.5%), multimedia (22, 5.1%), corpora and concordancing (22, 5.1%), as well as Web 2.0 (20, 4.6%), which overall contribute almost one third of all technology types. They are followed by videoconferencing (15, 3.5%), wikis and blogs (15, 3.5%), apps (14, 3.2%), mobile phones and tablets (14, 3.2%) and digital games (13, 3%) – all technologies widely used during the past two decades. Among less common technologies, ICALL (11, 2.5%), videos and videodiscs (11, 2.5%), various forms of digital boards (10, 2.3%) and emails (9, 2.1%) have been adopted over the four decades; however, Second Life (9, 2.1%), Facebook and other social networking sites (9, 2.1%), as still-in-use technologies (Shadiev & Yang, 2020), were introduced to educational settings in the most recent decade.
3.1.2 Research participants
Status. In line with Lai (2019), CALL research here is largely concerned with “learners” (260, 61%), reflecting the nature of technology-mediated instruction. Indeed, regardless of the learning environment, be it on- or off-campus, private or public language schools or even a teacher training course, virtually every participant can essentially be deemed a learner in a CALL program, learning a skill or language component with the use of technology. There were just 22 studies (5.1%) in which both learners and instructors were involved, 9 (2.1%) with the active involvement of (preservice) instructors and 2 with the participation of student-teachers.
Age group. Age was not explicit in 279 studies (65.4% of the total), with 12% mentioning only the educational/course level (e.g. undergraduate, graduate, first/second year or master’s) instead of the specific age group. There were 113 (26.5%) studies that included adult participants aged 18–70 (105 undergraduates and eight graduates), 12 (2.8%) that included participants in the 7–17 age group (eight primary and four high school), and 22 (5.1%) that had a mix of adult and younger participants.
Proficiency. In keeping with the results of Burston and Arispe (2018), our findings indicate that almost 60% of the studies do not report the proficiency level – or, if they do, use vague terms that complicate categorization. With limited information, we adopted the general descriptors the authors themselves use (e.g. “a group of high-intermediate learners”), along with the criteria proposed (Akiyama & Cunningham, 2018) in the conversion table (Supplementary material 6). The target language proficiency of participants in 162 studies (38% of the total) fell within the range of low to high intermediate (83, 19.4%), advanced (41, 9.6%) and beginner (38, 8.9%). The results also revealed that a large batch (103, 24.1%) reported groups with mixed levels of proficiency. Ten others did not lend themselves to any of the existing categories since the authors used general educational or course-related descriptors (e.g. “engagement with English in the wild in a 1-week LD [Language Diary]”).
First and target language (L1 and L2). One third of the papers did not report the L1. A large number of research participants speak English (92, 21.5%) or Chinese (37, 8.6%) as their L1; Japanese (10, 2.3%), Turkish, Spanish and Dutch (8 each, 1.8%) are among the other common L1s. The results also showed that an additional 15 languages, such as Korean, Persian and French, accounted for 5.8% of the total publications, and 90 studies (21.1%) had groups with participants of more than one language background (i.e. mixed) or bi/multilingual participants. Among L2s, English (150, 35.2%) was overwhelmingly dominant, followed by Spanish (36, 8.4%), French (27, 6.3%), German (22, 5.1%), Japanese (12, 2.8%) and Chinese (7, 1.6%). Disturbingly, 28% of these highly cited RAs did not report the target language.
3.2 Methodological and theoretical considerations
3.2.1 Research methodology
The methodological orientations of influential CALL RAs are summed up in Table 3. Confirming the findings of Xu et al. (in press) for applied linguistics and Lai (2019) for MALL studies, CALL researchers tend to adopt quantitative more than qualitative methodologies, a trend that was more pronounced in the 1980s and the 1990s. Between the two, eclectic methodology (i.e. a mix of quantitative and qualitative methodologies) has been widely used over the entire period; this contrasts with the much smaller number of studies framed specifically as mixed methods in the latest decade. Given the interdisciplinarity of CALL research (Hubbard & Colpaert, 2019), it appears natural for CALL researchers to employ diverse methodological paradigms.
Note. “Eclectic” studies combine quantitative and qualitative data and analysis but with no convergence or any explicit mention of mixed methods, and the authors do not draw on the literature of mixed methods to frame their work. “Mixed methods” only applies if the authors explicitly used this term in the text. The total stands at 426 as each study was assigned to a single category.
A second-round analysis of the 91 RAs with an eclectic methodology revealed that equal weight (Table 4) was assigned to qualitative and quantitative data/analysis in 58 studies (63.7%), a trend that has increased in the 2000s; 23 papers (25.2%) were predominantly qualitative and 10 (10.9%) mainly quantitative.
Note. Upper-case letters show the heavier weight of the adopted methodologies.
3.2.2 Research theories, approaches or frameworks
A large number of research theories, approaches or frameworks (henceforth grouped as “theories” under Hubbard’s, 2009, classification) are rooted in numerous disciplines, from SLA and psychology to educational technology and computer sciences. Overall, 634 recurring (116 unique) theories that conceptualize CALL research were collected – substantially more than similar syntheses (e.g. Hew, Lan, Tang, Jia & Lo, 2019). In other words, the final dataset included 184 studies that adopted more than one theory to frame their study (e.g. Aydın & Yıldız, 2014, based upon four theories) alongside 162 single-theory papers. Our analysis identified 80 (18%) atheoretical papers inasmuch as they did not explicitly adopt any theory to frame their research. All the recurring theories identified are ordered in Table 5 (based on percentages for an easier visual representation) from the most to the least prevalent. Among widely adopted SLA-related theories, sociocultural (63, 9.9%), interactionist (40, 6.3%), collaborative learning (34, 5.3%) and constructivism (32, 5%) contribute to framing one third of high-impact CALL RAs over the decades. However, theories heavily rooted in educational technology and computer sciences such as data-driven learning (18, 2.8%) or dual-coding of memory and cognition (7, 1.1%) have contributed to theorizing CALL papers in the last two decades.
It should be noted that an additional 58 (9.1%) theories were used only once in the final dataset (Supplementary material 5, column R). Moreover, 26 RAs adopted theories or approaches not to frame the study but only to inform it (e.g. the data collection and analysis), and are not presented here.
3.3 Research focus
The diverse aspects of CALL and the multiple research foci in each study posed major challenges. Therefore, to attain a high level of granularity, an exhaustive list of research foci was compiled, including all identified aspects of language teaching and learning (initial coding): 690 recurrent research foci, in the form of relevant text strings, were collected using NVivo 12. Following Charmaz (2006), those strings were later manually merged into major themes adopting a constant comparison method (focused coding), and 119 unique research foci were identified (Supplementary material 5, column T). Due to space constraints, only the dominant ones appearing in at least three studies (n = 45) are discussed here.
3.3.1 Dominant research foci
As illustrated in Table 6, the top 10 characterize nearly half of all the research foci over the four time periods.
The most dominant research areas are CMC (9.8%), writing (6.5%) and vocabulary (4.7%). One of the main findings is that although most of the top 10 topics increased in popularity in the first three decades, as a group, five have subsequently declined, a fact that is noticeable with the decline in entries (108 RAs) during 2010–2019 (Figure 3). The rise of telecollaboration in the most recent decades due to the prevalence of the internet and also the considerable attention of CALL researchers to speaking should also be highlighted.
3.3.2 Intermediary research foci
The 17 mid-frequency foci (Supplementary material 7) constitute almost one third of the total (213, 30.8%); together with the top 10, they represent 76.4% of the overall dataset. But there is a major difference: with the exception of reading and teacher education, which see an increase from the second to third time period and then a decline in the fourth, most of the research foci of this category have remained popular or increased over time.
3.3.3 Less dominant research foci
The 18 foci in Supplementary material 8 largely relate to teaching practices or learning issues. They represent 77 (11.1%) of the total and their frequency has been low (often zero) across the four time periods.
4. Discussion
The current synthesis traces the evolution of contextual descriptors, methodological and theoretical underpinnings as well as the research focus in influential RAs published in four major CALL journals from their inception up to and including 2019. Compared to reviews conducted by Gillespie (2020) and Lim and Aryadoust (2021), our synthesis is unique in its methodology for extracting a principled sample of RAs from CALL journals. It also shifts the focus to RAs with a high impact, not merely because researchers tend to cite RAs they perceive to be high quality (Aksnes, Langfeldt & Wouters, 2019), but because such papers are considered to contribute to scientific influence (Xu et al., in press). Our study also offers detailed information about the settings, programs and learning environments – not just countries or territories – where the research was conducted, the technologies adopted, the participants’ profiles, and the theoretical orientations that conceptualize CALL research.
4.1 Contexts and participants
Addressing the first research question on contexts and participant profiles, this synthesis reveals that a large proportion of CALL research has taken place in inner-circle English-speaking countries, expanding recently to the outer circle. FL settings, university undergraduate programs and face-to-face learning environments form the micro-level context. Research participants are mainly learners aged 18–30 who are predominantly at the intermediate level of proficiency and use software, courseware and chat devices for EFL or ESL. The findings, partly supported by previous reviews (Gillespie, 2020; Lai, 2019; Shadiev & Yang, 2020), reflect the reality of teaching and learning practices in academia and the research cultures of the CALL community.
The results testify that CALL has expanded internationally owing to the significant contribution of Asia, Europe and North America in the 2000s and 2010s. Given the recent decline of input from the USA (Gillespie, 2020), the emergence of other CALL-oriented journals (e.g. CALL-EJ and IJCALLT) in Asia with international editors, reviewers and readers, together with the establishment of localized organizations (e.g. AsiaCALL), demonstrates that CALL research is increasingly embraced in contexts beyond the traditional inner-circle English-speaking countries. It should also be added that despite the high engagement of the regions mentioned, few high-impact RAs have been published in Africa and South America (just one in South Africa and three in Argentina, Colombia and Chile), perhaps due to infrastructural issues. All studies have noted constant changes in technology adoption decade after decade: microcomputers, parsers and HyperCards have been replaced with phones, Second Life and MOOCs. The use of cutting-edge technologies in micro- and macro-educational settings has become normalized and is expected to increase (Gimeno-Sanz, 2016) owing to the widespread adoption of such technologies in learners’ daily lives. An example of this is Lan, Lyu and Chin (2019) who adopted authentic contexts in Second Life to enhance Mandarin essay writing by learners of Chinese as a second language. As for research participants, the dominance of adult learners – but not middle-aged adults – who speak English (21.5%) or Chinese (8.6%) as their L1 was an expected outcome considering the status of these two commonly studied/spoken languages, though the proportion with Spanish (8.4%) as a target language was less expected. The participants’ status shows that a large number of RA authors and researchers are teachers at tertiary levels, though clearly most teachers are not researchers in higher education.
This issue, along with other contextual realities such as the high concentration of CALL papers with adult participants in universities (56.3%) as opposed to younger learners (6.5%), raises the question of whether research published in high-impact CALL journals mirrors the conventional teaching and learning practices at school level. Overall, though, CALL research contexts and participants have become progressively more diverse over the last two decades, leading to the active engagement of countries, learning environments and participants in empirical studies, educational technology projects and classroom-based research.
4.2 Research methodologies and theoretical underpinnings
Regarding the second research question, our analysis indicates that quantitative (24.6%), eclectic (21.3%) and qualitative (20.6%) methodologies have been present over the four time periods, while mixed methods (5.1%) have become more frequent in the last two decades. This is in line with Shadiev, Hwang and Huang (2017) who found – albeit in a study of more limited scope – that quantitative, eclectic and qualitative methodologies were predominant in 57 MALL RAs published in the top 10 SSCI journals in the field of educational technology. Tracing the trend, we found that the early years of CALL research were largely dominated by quantitative methodologies to assess the impact of new technologies of the time (e.g. microcomputers and parsers) on learning and teaching practices. Qualitative data collection and analysis as well as eclectic methodologies were adopted more frequently in subsequent years as more complex research questions were formed, and the issues associated with real-life problems of the research participants were addressed. As an expected outcome, an increasing reliance on methodology triangulation might indicate a tendency among CALL researchers to add depth and complexity (Xu et al., in press) to the design of their studies. Another notable trend has been the greater adoption of mixed methods in exploratory studies, such as that of Xu and Peng (2017) who investigated mobile-assisted oral feedback in teaching Chinese as a second language; an encouraging move that suggests more CALL researchers are conducting multiple analyses of the same dataset rather than running the traditional quasi-experimental pre-/post-tests or questionnaires with limited groups of participants over a short period of time.
An unexpected finding, however, is that over 28% of the RAs did not adopt a specific methodology, or, if they did, it was vaguely discussed – a point that testifies to poor reporting practices even in prestigious CALL publications.
Our observations confirm findings in Hew et al. (2019) and Yim and Warschauer (2017) that there is very little explicit engagement with theories in CALL, with 18% of the studies here entirely atheoretical. However, given that adopting well-established theories should be a common practice to add coherence and depth to research (Levy, 2000), we found that high-impact CALL RAs have become more theory driven in the most recent decade, particularly with the development of “middle-range theories” such as dual-coding of memory and cognition. Hew et al. (2019) believe such CALL-related theories “can both explain empirical findings in a concrete way and demonstrate the ability to frame a variety of research topics in the field” (p. 13). Over the first two decades though, SLA-related theories such as sociocultural (63, 9.9%) and interactionist (40, 6.3%) paradigms attracted more attention, an expected result that may indicate that CALL – similar to other fields, such as linguistics and educational technology – has been informed by the social and cognitive turn observed in the dominant theories of its time.
4.3 Research foci
For the third and final research question, the identification of 119 research foci (out of 690 recurrent themes in our collection) complements similar syntheses like that of Lei and Liu (2019), who listed the 168 most frequent research topics in System over five decades, and Lim and Aryadoust (2021), who identified seven major research clusters in 11 CALL journals. However, unlike the above-mentioned studies, our synthesis is unique in offering a bird’s-eye view of the research foci reflecting all identified aspects of language teaching and learning. In other words, by taking account of the technologies adopted as a separate context-related subtheme, the research foci identified truly reflect the primary research areas in high-impact CALL RAs and are not overrepresented by technology-related themes. As expected, and similar to the findings of Gillespie (2020), the most central themes overall are CMC (9.8%) and the four language skills, along with the rudiments of first/second language learning/teaching such as vocabulary (4.7%), feedback (3.7%), evaluation (3.6%), design (3.3%) and grammar (3.3%). In other words, classroom-related research areas, physical and virtual learning environments and in-house projects still inform the research foci after four decades. However, in the last 20 years, the concentration has slowly shifted from skills and knowledge-based research foci – which could be an indication of saturation of such areas – towards studies that consider learner-related variables (e.g. learner autonomy) and the effectiveness of educational technology.
Additionally, as Gillespie (2020) puts it, the increasing breadth of research foci informing the field can be observed in the most recent time periods, when new foci such as collocations, learning outcomes and identity practices have emerged and been addressed in special issues of CALL journals in the form of position and review papers.
4.4 Roadmap for future CALL studies
Despite encouraging findings, some long-standing issues of CALL research practices are yet to be resolved. First, clear reporting of basic demographic information and design should be seen as a requirement in future publications. For instance, a key concept such as the level of language proficiency is notoriously vague – “advanced” can mean very different things to different people in different contexts; therefore, clarification on the in-house tests adopted or the language background of the participants can give readers of CALL journals real insights into the learners’ level of proficiency and situate the studies more helpfully.
Collaborative projects could expand empirical studies to larger samples and across contexts (cf. Gillespie, 2020: 138), and could help switch the focus from typical research contexts and participants (e.g. undergraduates, ESL/EFL settings, universities and colleges) to other micro- and macro-contexts, plus less commonly taught languages that have remained under-researched in educational settings over time. Less investigated strategic research foci proposed in other syntheses, such as CALL and ethics (Gillespie, 2020) and cultural CALL (Kramsch, 2018), especially through longitudinal studies with integrated research methodologies, also need further work. Moreover, CALL researchers have recently tended to move towards sophisticated areas and interdisciplinary projects in the field of computer sciences, including eye-tracking technology and robot-assisted language learning. The shift towards such trends necessitates the adoption of integrated methodologies, complex designs and well-established theories so that CALL research can reflect its current interdisciplinary state (Hubbard & Colpaert, 2019).
A final observation concerns the theoretical orientation of high-impact CALL publications. The CALL community would benefit from more theoretical or position papers – considering their lower weight (18, 4.2%) compared to the empirical studies (283, 66.4%) – that propose models and develop frameworks for CALL research, helping novice authors think strategically when it comes to the adoption of CALL in learning environments. Theory can drive empirical research, and empirical studies can drive theoretical developments in a virtuous circle. Some SLA-oriented theories have informed research over time, and CALL-driven theories such as the technological pedagogical content knowledge framework, data-driven learning and ecological CALL attracted attention in the 2000s and 2010s. In general though, one important finding is that theory is often mentioned just in passing, a nod to something researchers feel they have to acknowledge in the introduction, background or literature review, but not a major part of the study itself. Overall, the weight of entirely atheoretical papers has been high, nearly one fifth of the dataset. There is plenty of scope to address the significance of explicit theoretical engagement – not just a mention of the word theory – in future publications by stakeholders in the CALL research community and CALL associations (e.g. EUROCALL) at conferences or annual gatherings. After all, theory can give rise to new research questions that might not necessarily be obvious in the day-to-day of teaching, which is often what inspires CALL research. In addition, by issuing submission guidelines, the editorial boards of CALL journals can also ensure that authors articulate the theoretical framework of their studies clearly.
Our methodology was designed to target exclusively high-impact CALL publications from top journals in English. Many other aspects are indeed present in the rich universe of CALL research; however, they are not gaining widespread acknowledgement precisely because they are less frequently cited. Any suggestions for future avenues should thus be combined with strategies not just for research areas but also for publication venues and for attracting a wider readership among the citing community. This, of course, is no mean feat, but it does underline that publishing alone is not enough.
5. Conclusion
The present computer-assisted research synthesis sheds light on the contextual, methodological and theoretical aspects, as well as the research foci of high-impact RAs in four major CALL journals. The growth of CALL research after four decades of existence aligns with several encouraging findings such as the international reach of CALL, the preponderance of empirical RAs with adult learners in physical or virtual learning environments, the adoption of pioneering technologies and the increase of mixed-methods RAs framed by well-established social and cognitive theories. On the downside, a number of long-standing issues remain, notably the dearth of theoretical, review and meta-analytical studies, the lack of research among less traditional contexts and participants, the number of atheoretical papers and the focus on a relatively limited number of research areas. By extending the scope of previous syntheses, our aim was to go beyond the traditional features examined in CALL syntheses and consider less investigated areas such as theories and methodologies underlying RAs over four decades, using computer software to compile and analyze a targeted dataset of high-impact CALL papers. Nonetheless, research syntheses of this kind have their inevitable limitations at various phases of data collection (periods covered, journals included) or analysis (interpretation of abstract themes such as underpinning theories and research foci). For obvious reasons, it is not feasible to go through thousands of RAs one by one, analyzing all sections and commenting on them individually. Our observations could be complemented and tested against other sources such as other Scopus and SSCI-indexed CALL journals (e.g. JALT CALL) and CALL RAs in applied linguistics or educational technology journals.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0958344023000137
Ethical statement and competing interests
The authors declare no competing interests. Elements of this paper were presented at EUROCALL conferences in 2020, 2021 and 2022.
About the authors
Yazdan Choubsaz is a PhD candidate in applied linguistics in the Department of English Language and Literature at Shahid Chamran University of Ahvaz, Iran. Over the years, he has published research articles in peer-reviewed scholarly journals, authored book chapters in his area of expertise, technology-mediated language instruction, and presented papers at international ELT and CALL conferences.
Alireza Jalilifar is Professor of Applied Linguistics in the Department of English Language and Literature at Shahid Chamran University of Ahvaz, Iran, where he teaches discourse analysis and advanced research at the postgraduate level. He has published and presented papers on academic discourses. His main interests include second language writing, genre analysis, and academic discourse.
Alex Boulton is Professor of English and Applied Linguistics at the ATILF, CNRS & University of Lorraine, France. Particular research interests focus on corpus linguistics and potential uses for “ordinary” teachers and learners (data-driven learning), and research syntheses of related fields in applied linguistics.
Author ORCIDs
Yazdan Choubsaz, https://orcid.org/0000-0002-4916-5573
Alireza Jalilifar, https://orcid.org/0000-0002-8123-6757
Alex Boulton, https://orcid.org/0000-0001-6306-8158