IMPLICIT AND EXPLICIT CORRECTIVE FEEDBACK AND THE ACQUISITION OF L2 GRAMMAR
Published online by Cambridge University Press: 21 April 2006
Abstract
This article reviews previous studies of the effects of implicit and explicit corrective feedback on SLA, pointing out a number of methodological problems. It then reports on a new study of the effects of these two types of corrective feedback on the acquisition of past tense -ed. In an experimental design (two experimental groups and a control group), low-intermediate learners of second language English completed two communicative tasks during which they received either recasts (implicit feedback) or metalinguistic explanation (explicit feedback) in response to any utterance that contained an error in the target structure. Acquisition was measured by means of an oral imitation test (designed to measure implicit knowledge) and both an untimed grammaticality judgment test and a metalinguistic knowledge test (both designed to measure explicit knowledge). The tests were administered prior to the instruction, 1 day after the instruction, and again 2 weeks later. Statistical comparisons of the learners' performance on the posttests showed a clear advantage for explicit feedback over implicit feedback for both the delayed imitation and grammaticality judgment posttests. Thus, the results indicate that metalinguistic explanation benefited implicit as well as explicit knowledge and point to the importance of including measures of both types of knowledge in experimental studies. This research was funded by a Marsden Fund grant awarded by the Royal Society of Arts of New Zealand. Researchers other than the authors who contributed to the research were Jenefer Philp, Satomi Mizutami, Keiko Sakui, and Thomas Delaney. Thanks go to the editors of this special issue and to two anonymous SSLA reviewers of a draft of the article for their constructive comments.
© 2006 Cambridge University Press
Corrective feedback takes the form of responses to learner utterances that contain an error. The responses can consist of (a) an indication that an error has been committed, (b) provision of the correct target language form, or (c) metalinguistic information about the nature of the error, or any combination of these.
There has been a growing interest in the role of corrective feedback in SLA in the last decade. A number of descriptive studies based on data collected in classrooms (e.g., Panova & Lyster, 2002; Sheen, 2004) and on data collected in a laboratory-type setting (e.g., Iwashita, 2003; Mackey, Oliver, & Leeman, 2003; Philp, 2003) have examined the types of corrective feedback received by learners and the extent to which this feedback is noticed, taken up, or both by the learners. Experimental studies have attempted to examine the contribution that corrective feedback makes to acquisition (e.g., Ayoun, 2004; Han, 2002; Leeman, 2003; Lyster, 2004). This research has addressed, among other issues, the relative efficacy of implicit and explicit types of corrective feedback.
THEORETICAL ISSUES
A distinction can be drawn between implicit/explicit learning and implicit/explicit knowledge. In the case of learning, the term implicit refers to “acquisition of knowledge about the underlying structure of a complex stimulus environment by a process that takes place naturally, simply and without conscious operations,” whereas explicit learning is “a more conscious operation where the individual makes and tests hypotheses in search for a structure” (N. Ellis, 1994, p. 2).
In the case of knowledge, the term implicit refers to knowledge that learners are only intuitively aware of and that is easily accessible through automatic processing, whereas explicit knowledge consists of knowledge that learners are consciously aware of and that is typically only available through controlled processing. Explicit knowledge might be linked to metalinguistic labels. These two types of knowledge are not mutually exclusive; that is, speakers can hold implicit and explicit representations of the same linguistic feature, as, for example, in the case of linguists who formulate explicit rules on the basis of their implicit knowledge of a language.
Schmidt (1994) stated that “implicit and explicit learning and implicit and explicit knowledge are related but distinct concepts that need to be separated” (p. 20). It can be argued, for example, that implicit knowledge is not entirely dependent on implicit learning but can arise as a product of learners intentionally practicing linguistic forms that they initially know explicitly (DeKeyser, 2003). It has also been argued that the development of implicit learning involves at least some degree of consciousness, as when learners notice specific linguistic forms in the input (Schmidt).
Corrective feedback differs in terms of how implicit or explicit it is. In the case of implicit feedback, there is no overt indicator that an error has been committed, whereas in explicit feedback types, there is. Implicit feedback often takes the form of recasts, defined by Long (in press) as
a reformulation of all or part of a learner's immediately preceding utterance in which one or more non-target like (lexical, grammatical etc.) items are replaced by the corresponding target language form(s), and where, throughout the exchange, the focus of the interlocutors is on meaning not language as an object. (p. 2)
Recasts, therefore, provide positive evidence, but, as Nicholas, Lightbown, and Spada (2001) noted, it is not clear whether they provide negative evidence, as learners might have no conscious awareness that the recast is intended to be corrective. Explicit feedback can take two forms: (a) explicit correction, in which the response clearly indicates that what the learner said was incorrect (e.g., “No, not goed—went”) and thus affords both positive and negative evidence or (b) metalinguistic feedback, defined by Lyster and Ranta (1997) as “comments, information, or questions related to the well-formedness of the learner's utterance” (p. 47)—for example, “You need past tense,” which affords only negative evidence.
It can also be argued that recasts and explicit corrective strategies differ in terms of whether they cater to implicit or explicit learning. For Long (1996), recasts work for acquisition precisely because they are implicit, connecting linguistic form to meaning in discourse contexts that promote the microprocessing (i.e., noticing or rehearsing in short-term memory) required for implicit language learning. Doughty (2001), building on Long's rationale for focus-on-form, argued that recasts constitute the ideal means of achieving an “immediately contingent focus on form” and afford a “cognitive window” (p. 252) in which learners can rehearse what they have heard and access material from their interlanguage. In contrast, explicit corrective feedback strategies, such as metalinguistic feedback, are more likely to impede the natural flow of communication and to activate the kind of learning mechanisms that result in explicit rather than implicit second language (L2) knowledge. However, such a view is not unproblematic.
First, it is not certain that all recasts are as implicit as Long (1996) and Doughty (2001) assumed. Some recasts are quite explicitly corrective. Indeed, the kind of corrective recasts that Doughty and Varela (1998) employed in their experimental study were remarkably explicit. They were preceded by a repetition of the learner's utterance with the erroneous elements highlighted by emphatic stress. If the learner did not self-correct, recasts with emphatic stress to draw attention to the reformulated elements followed. Thus, if the corrective force of the recast becomes self-evident, it is difficult to argue that it constitutes an implicit or even a relatively implicit technique. Second, recasts can only work for acquisition if learners notice the changes that have been made to their own utterances, and there are reasons to believe that they do not always do so. Lyster (1998) has shown that the levels of repair in uptake following recasts are notably lower than those following more explicit types of feedback. The findings from Lyster's research, which examined immersion classrooms in Canada, were corroborated by Sheen (2004), who found that repair occurred less frequently following recasts than following explicit correction and metalinguistic feedback in four different instructional contexts (immersion, Canadian English as a second language [ESL], New Zealand ESL, and Korean English as a foreign language). Even though repair cannot be taken as a measure of learning, it is reasonable to assume that it constitutes a measure of whether learners have noticed the key linguistic forms (although noticing can occur even if there is no uptake). Further evidence of the difficulty that learners might experience in attending to the key forms comes from a study by Mackey, Gass, and McDonough (2000), which demonstrated that learners often failed to perceive recasts that contained morphosyntactic reformulations as corrections. Finally, we cannot be certain that recasts promote acquisition of implicit knowledge. Indeed, it is entirely possible that recasts result in explicit knowledge, as demonstrated in Long, Inagaki, and Ortega (1998); in this study, all eight students who had learned the target structure (Spanish adverb word order) through recasts were able to explicitly and correctly formulate an explanation of the rule. Thus, there are some doubts as to how effective recasts are in promoting learning as well as to what kind of learning and knowledge they cater.
A case can also be made that corrective strategies that are self-evidently corrective contribute to learning. Carroll's (2001) autonomous induction theory posits that feedback can only work for acquisition if the corrective intentions of the feedback are recognized by the learner. Additionally, learners must be able to locate the error; Carroll noted that "most of the indirect forms of feedback do not locate the error" (p. 355). Recasts do not overtly signal that an error has been made, and whether they assist in locating the error depends on whether the recast is full (i.e., the whole erroneous utterance is reformulated) or partial (i.e., only the erroneous part of the utterance is reformulated), as Sheen's (in press) study indicates. In contrast, explicit types of feedback not only make the corrective force clear to the learner but also give clues as to the exact location of the error. As such, they might be more likely to induce learners to carry out the cognitive comparison between their error and the target form (R. Ellis, 1994), which is believed to foster acquisition (Schmidt, 1994).
Connectionist models also lend support to explicit error correction. N. C. Ellis (2005) distinguished between the mechanisms of conscious and unconscious learning, emphasizing the role of attention and consciousness in the former and of connectionist learning in the latter. He proposed the learning sequence in (1).
Ellis went on to suggest that “conscious and unconscious processes are dynamically involved together in every cognitive task and in every learning episode” (p. 340). Although he did not suggest that explicit corrective feedback is the ideal mechanism for achieving this continuous synergy (indeed, his discussion of feedback is restricted to recasts), it would seem that the metalinguistic time-outs from communicating afforded by explicit correction constitute a perfect context for melding the conscious and unconscious processes involved in learning. Within the context of a single interactional exchange, such a time-out creates an opportunity for learners to traverse the learning sequence sketched out in (1). Of course, no single exchange can guarantee that the targeted form will enter implicit memory, but repeated exchanges—directed at the same linguistic form—might be expected to do so. Thus, according to such a theoretical perspective, explicit corrective feedback caters not just to explicit learning and explicit memory but also to implicit learning and implicit memory.
PREVIOUS RESEARCH ON CORRECTIVE FEEDBACK
This review will focus on studies that have compared the effects of implicit and explicit corrective feedback on L2 acquisition and will address, in particular, methodological issues in keeping with the theme of this special issue.1
Other studies have examined the relationship between implicit/explicit feedback and learner uptake (e.g., Oliver & Mackey, 2003), but these are not included in this review, which focuses exclusively on the effects of feedback on L2 acquisition as measured in posttests. The extent to which uptake constitutes a measure of acquisition is controversial, with many researchers, including ourselves, preferring to view it as evidence of noticing.
Table 1 summarizes 11 studies that have compared implicit and explicit corrective feedback. It is not easy to come to clear conclusions about what these studies reveal due to a number of factors. First, whereas some of the studies are experimental in nature (e.g., Carroll, 2001; Carroll & Swain, 1993; Lyster, 2004; Rosa & Leow, 2004), others are not (e.g., DeKeyser, 1993; Havranek & Cesnik, 2003), as this second group of researchers investigated corrective feedback through post hoc analyses of normal classroom lessons. Second, the studies vary in terms of whether they involved laboratory, classroom, or computer-based interaction. Third, the nature of the treatment activities performed by the learners in these studies differed considerably. In some cases, the activities involved fairly mechanical exercises (e.g., Carroll; Carroll & Swain; Nagata, 1993), in others they involved communicative tasks (e.g., Leeman, 2003; Muranoi, 2000; Rosa & Leow), and in others they involved a mixture of the two (DeKeyser). Fourth, the treatment also differed in terms of whether it involved output processing (the vast majority of the studies) or input processing (Rosa & Leow; Sanz, 2003). Fifth, the studies vary considerably in how they operationalized implicit and explicit feedback. Given the importance of this variable, it is discussed in greater detail later in this section. Sixth, variation is evident in how learning was measured: Some studies utilized metalinguistic judgments (e.g., Muranoi), selected response, or constrained constructed response formats (e.g., Havranek & Cesnik; Rosa & Leow), all of which might be considered to favor the application of explicit knowledge, whereas others opted for a free constructed response format (e.g., Leeman), which is more likely to tap implicit knowledge. Finally, the studies differ in another important respect: Some included an explicit explanation of the grammatical target prior to the practice activity (e.g., Lyster; Muranoi), whereas others did not (e.g., Leeman; Sanz). These differences in design reflect the different purposes of the studies, not all of which were expressly intended to compare implicit and explicit corrective feedback.
Implicit feedback in these studies has typically taken the form of recasts (Carroll, 2001; Carroll & Swain, 1993; Kim & Mathes, 2001; Leeman, 2003; Lyster, 2004). However, Muranoi (2000) employed both recasts and requests for repetition. Sanz (2003) made use of only requests for repetition (“Sorry, try again.”). In Havranek and Cesnik's (2003) classroom study, which investigated naturally occurring corrective feedback, a variety of more or less implicit forms were identified, including recast, rejection + recast, and recast + repetition. This bears out the claims of Nicholas et al. (2001) and Ellis and Sheen (in press) that recasts actually vary considerably in how implicit or explicit they are. It should be noted, therefore, that the recasts used in the different studies might not have been equivalent in their degree of implicitness versus explicitness.
Explicit feedback has also been operationalized in very different ways. A minimal form of explicit feedback consists of simply indicating that an error has been committed (e.g., Carroll & Swain's, 1993, explicit rejection or Leeman's, 2003, negative evidence). Rosa and Leow's (2004) implicit condition actually consisted of indicating whether the learners' answers were right or wrong and, thus, might have been more accurately labeled semiexplicit. Carroll (2001), Carroll and Swain, Nagata (1993), and DeKeyser (1993) distinguished between a form of minimal explicit feedback, involving some specification of the nature of the error, and extensive corrective feedback, involving more detailed metalinguistic knowledge. Lyster's (2004) prompts consisted of clarification requests, repetitions (with the error highlighted suprasegmentally), metalinguistic clues, and elicitation of the correct form. Therefore, in Lyster's sense, prompts include both implicit and explicit forms of feedback. The nonexperimental classroom studies (DeKeyser; Havranek & Cesnik, 2003) inevitably involved a variety of explicit forms of correction. All of these studies examined explicit correction provided online immediately following learner utterances that contained errors. In contrast, Muranoi's (2000) study investigated the effects of providing an explicit grammar explanation after the treatment task had been completed.
Given the substantial differences in the purposes and designs of these studies, care needs to be taken in any attempt to generalize the findings. However, overall, the results point to an advantage for explicit over implicit corrective feedback in studies in which the treatment involved production. Carroll and Swain (1993) and Carroll (2001) reported that the group that received direct metalinguistic feedback outperformed all of the other groups in the production of sentences involving dative verbs and noun formation and that this type of feedback aids generalization to novel items. Muranoi (2000) found that the group that received formal debriefing (which included metalinguistic information) outperformed the group that received meaning-focused debriefing, although only on the immediate posttest. Havranek and Cesnik (2003) found that bare recasts were the least effective form of feedback in their classroom study. Lyster (2004) reported that the group that received prompts (which included metalinguistic feedback) performed better than the group that received recasts on both immediate and delayed posttests. There is also some evidence that a comparison between two types of explicit feedback will show that the more detailed metalinguistic feedback works better (Nagata, 1993; Rosa & Leow, 2004). It is also worth noting that the two studies that asked learners about what type of feedback they preferred (Kim & Mathes, 2001; Nagata) reported a clear preference for more explicit feedback.
However, not all of the studies point to an advantage for explicit feedback. DeKeyser (1993) found no difference between the group that received extensive explicit feedback and the group that received limited explicit feedback. Nevertheless, his study indicated that when individual difference factors, such as the learners' proficiency and language aptitude, were taken into account, the more explicit feedback was of greater benefit to the more able learners. Kim and Mathes (2001), in a study that replicated that of Carroll and Swain (1993), also failed to find any statistically significant differences in the scores of the explicit and implicit groups. Explicit feedback that consists of simply indicating that a problem exists does not appear to be helpful (Leeman, 2003). In the one study that examined feedback as part of input-processing instruction (Sanz, 2003), explicit metalinguistic feedback did not confer any advantage. It is also important to recognize that these studies provide evidence that implicit methods of feedback can assist learning. The implicit groups in Carroll and Swain, Carroll (2001), Muranoi (2000), Leeman, and Lyster (2004) all scored higher than the control groups on the posttests.
The main limitation of the research to date lies in the method of testing. As noted previously, most of the studies did not include tests that can be considered valid measures of implicit knowledge (i.e., tests that call on learners to access their linguistic knowledge rapidly online). The kinds of test used (grammaticality judgment tests, sentence completion, picture prompt tests, translation tests) favored the use of explicit knowledge. It can be argued, therefore, that they were biased in favor of explicit corrective feedback. The studies that included a test likely to measure implicit knowledge did not provide clear comparisons of the effects of explicit and implicit feedback. For example, Muranoi (2000) did not examine online feedback; in his study, feedback was provided after the treatment tasks were completed. Leeman (2003), as already pointed out, did not examine explicit feedback that contained metalinguistic explanations. Lyster (2004) did not examine metalinguistic clues separately from other types of nonexplicit feedback designed to elicit the negotiation of form.
The study reported in this article investigated the following research question:
Do learners learn more from one type of corrective feedback than from another type?
This study was designed to provide a precise comparison between implicit and explicit corrective feedback by operationalizing these constructs in terms of (a) partial recasts of those portions of learners' utterances that contained an error and (b) metalinguistic explanations in which the learner's error was repeated and followed by metalinguistic information about the target language rule but the correct target language form was not provided. The effects of the corrective feedback on learning were assessed by means of tests designed to measure learning of both implicit and explicit L2 knowledge.
METHOD
The present study compares the effectiveness of two types of corrective feedback: explicit error correction in the form of metalinguistic information and implicit error correction in the form of recasts. Group 1 received implicit feedback (recast group), group 2 received explicit feedback (metalinguistic group), and group 3 (a testing control) had no opportunity to practice the target structure and, thus, received no feedback. The relative effectiveness of both types of feedback was assessed on an oral elicited imitation test, a grammaticality judgment test, and a test of metalinguistic knowledge. There were three testing times: a pretest, an immediate posttest, and a delayed posttest. The target grammatical structure was past tense –ed.
Participants
The study was conducted in a private language school in New Zealand. Three classes of students (n = 34) were involved. The school classified these classes as lower intermediate, according to scores on a placement or a previous class achievement test. Information obtained from a background questionnaire showed that the majority of learners (77%) were of East Asian origin. Most of them had spent less than a year in New Zealand (the mean length of stay was just over 6 months). The mean age of all participants was 25 years. The learners indicated that they had been formally engaged in studying English for anywhere from 8 months to 13 years, with an average length of time of 7 years. Around 44% of participants indicated that their studies had been mainly formal (grammar-oriented) in nature, whereas 30% had received mainly informal instruction, and the rest had received a mixture of both formal and informal instruction.
The teaching approach adopted by the school placed emphasis on developing communicative skills in English. Learners received between 3 and 5 hr of English language instruction a day, for which they were enrolled as part-time or full-time students. Classes were arbitrarily assigned to one of the two treatment options (group 1 = 12 students, group 2 = 12 students) or to the control group option (group 3 = 10 students).
Target Structure
Regular past tense –ed was chosen as the target structure for two reasons. First, learners at the lower intermediate level are likely to already be familiar with and have explicit knowledge of this structure. Our purpose was not to examine whether corrective feedback assists the learning of a completely new structure, but whether it enables learners to gain greater control over a structure they have already partially mastered. Pretesting demonstrated that this was indeed the case: On the grammaticality judgment test, the learners scored a mean of 75% on past tense –ed. The second reason was that past tense –ed is known to be problematic for learners and to cause errors (e.g., Doughty & Varela, 1998); thus, it was hypothesized that although learners at this level would have explicit knowledge of this structure, they would make errors in its use, especially in a communicative context and especially in oral production (oral production poses a problem because of Asian learners' phonological difficulties in producing consonant clusters with final [t] or [d]). Once again, pretesting demonstrated that this was indeed the case: On the oral elicited imitation test, which served as a measure of unplanned language use, the learners scored a mean of 30% on regular past tense –ed.
Regular past tense –ed is typically introduced in elementary and lower intermediate textbooks, but it is not among the morphemes acquired early (Dulay & Burt, 1974; Makino, 1980). It is acquired after such morphemes as articles, progressive –ing, and plural –s but before such morphemes as long plural (–es) and third person –s. The typical error made by learners is the use of the simple or present form of the verb in place of V-ed: *Yesterday I visit my sister. Hawkins (2001) noted that some L2 learners “have difficulty in establishing the regular pattern (for past tense) at all” (p. 65).
Instructional Materials
For the purposes of the study, each experimental group received the same amount of instruction (i.e., a total of 1 hr over 2 consecutive days during which they completed two different half-hour communicative tasks). The control group continued with their normal instruction. They did not complete the tasks and did not receive any feedback on past tense –ed errors.
The tasks were operationalized according to R. Ellis' (2003) definition of tasks; that is, they included a gap, they required learners to focus primarily on meaning and to make use of their own linguistic resources, and they had a clearly defined outcome. They constituted what Ellis called focused tasks; in other words, they were designed to encourage the use of particular linguistic forms and, to this end, learners were provided with certain linguistic prompts (see the description of each task).
Task 1 (Day 1)
Learners were assigned to four triads. Each triad was given the same picture sequence, which narrated a short story. They were also given one of four versions of a written account of the same story. Each version differed in minor ways from the others. Learners were told that they would have only a couple of minutes to read the written account of the story and that they needed to read it carefully because they would be asked to retell it in as much detail as possible. They were not allowed to make any written notes. The stories were removed and replaced with the list of verbs in (2) that learners were told they would need in order to retell the story.
Learners were given about 5 min to plan the retelling of their story. They were told that they would not be able to use any prompts other than the picture sequence and verb list. The opening words of the story were written on the board, to clearly establish a context for past tense: “Yesterday, Joe and Bill. …” Learners were then asked to listen to each triad's collective retelling of the story. They were also told that each triad had been given a slightly different version of the same story and that they were to listen carefully to identify what was different.
Task 2 (Day 2)
Learners were once again assigned to triads. Each triad was given a picture sequence depicting a day in the life of one of two characters (Gavin or Peter). Each picture sequence was different. Pictures were chosen to depict actions that would require the use of verbs with regular past tense –ed forms. Learners were given 5 min to prepare for recounting the day of either Gavin or Peter. Again, they were not allowed to take any written notes.
Each triad was told to begin their account with “Yesterday Peter/Gavin had a day off.” Learners in the other triads who were listening to the narrated story were provided with an empty grid and pictures that they were to place on the grid in the appropriate sequence, according to the narration. One picture card did not fit, and learners were told they would be asked to identify which card remained.
Instructional Procedures
The same instructor—one of the researchers—was responsible for conducting both tasks. The learners had not met the instructor prior to the first treatment session. An observer sat in the classroom during each session to manually record on paper all instances of the use of the target structure and each instance of corrective feedback. The treatment sessions were also audio-recorded.
The learners received corrective feedback while they performed the tasks. Group 1 received implicit feedback in the form of recasts, as in (3).
The recasts were typically declarative and of the partial type and, as such, might be considered to lie at the explicit end of the implicit-explicit continuum for recasts (see Sheen, in press). Nevertheless, because they intruded minimally into the flow of the discourse, they might not have been very salient to the learners.
The learners in group 2 received explicit feedback in the form of metalinguistic information, as in (4).
In this example, which was typical of the corrective feedback episodes in this study, the instructor first repeated the error and then supplied the metalinguistic information. It is important to note that although corrective feedback was directed at individual learners, the task was designed to ensure that the attention of the whole class was focused as much as possible on the speaker at these times.
Table 2 indicates the number of target forms that were elicited during each task and the total number of incorrectly produced forms. The number of instances of feedback is also given.
Testing Instruments and Procedures
Five days prior to the start of the instructional treatments, the learners involved in the study signed the consent forms, as required by the University of Auckland Human Participants Ethics Committee, and completed all of the pretests. The immediate posttesting was completed the day after the second (and last) day of instruction, and the delayed posttesting was completed 12 days later. During each testing session, three tests were administered in the following order: untimed grammaticality judgment test, metalinguistic knowledge test, oral imitation test. The oral imitation test (Erlam, in press) was intended to provide a measure of the learners' implicit knowledge, whereas the untimed grammaticality judgment test (especially the ungrammatical sentences in this test; see R. Ellis, 2005) and the metalinguistic test were designed to provide measures of learners' explicit knowledge.
R. Ellis (2004, 2005) discussed the theoretical grounds for these claims. He argued that tests of implicit knowledge need to elicit use of language where the learners operate by feel, are pressured to perform in real time, are focused on meaning, and have little need to draw on metalinguistic knowledge. In contrast, tests of explicit knowledge need to elicit a test performance in which the learners are encouraged to apply rules, are under no time pressure, are consciously focused on form, and have a need to apply metalinguistic knowledge. The oral imitation test was designed to satisfy the criteria for tests of implicit knowledge, whereas the untimed grammaticality judgment test and the metalinguistic test were designed to meet the criteria for tests of explicit knowledge. The tests are described in detail in the following subsections.
Oral Imitation Test
This test consisted of a set of 36 belief statements. Statements were grammatically correct (n = 18) or incorrect (n = 18). They consisted of 12 statements that targeted simple past tense –ed, 12 that targeted comparative adjectives (a focus of another study), and 12 distracter items. Examples of the past tense –ed items are given in (5).
Statements included target items introduced during the instructional treatments (old items) and new items. The statements containing new items were designed to test whether learners were able to generalize what they had learned to new vocabulary items. Eight of the 12 statements targeting past tense –ed presented the target structure in the context of new items.
Each statement was presented orally, one at a time, on an audiotape: Test-takers were required to first indicate on an answer sheet whether they agreed with, disagreed with, or were not sure about the statement. They were then asked to repeat the statement orally in correct English. Pretest training presented learners with both grammatical and ungrammatical statements (not involving past tense –ed) to practice with, and they were given the correct responses to these items. Learners' responses to all items were audio-recorded. These were then analyzed to establish whether obligatory occasions for use of the target structure had been established. Errors in structures other than the target structure were not considered. Each imitated statement was allocated a score of either 1 (the grammatically correct target structure was correctly imitated or the grammatically incorrect target structure was corrected) or 0 (the target structure was avoided, the grammatically correct target structure was attempted but incorrectly imitated, or the grammatically incorrect target structure was imitated but not corrected). If a learner self-corrected, then only the initial incorrect production was scored, as it was felt that this would provide the better measure of learners' implicit knowledge. Scores were expressed as percentage correct. Three versions of the test were created for use over the three testing sessions; in each, the same statements were used but presented in a different order. Reliability (Cronbach's alpha) for the pretest was .779. For more information about the theoretical rationale for this test and its design, see Erlam (in press).
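To make the scoring and reliability procedure concrete, the following minimal sketch (in Python) shows how dichotomous 1/0 item scores of the kind described above can be converted to percentage-correct scores and how Cronbach's alpha is conventionally computed. The data, group sizes, and function names are purely illustrative assumptions and are not taken from the study's materials.

```python
import numpy as np

def percentage_correct(item_scores):
    """Convert one learner's vector of 0/1 item scores to a percentage-correct score."""
    item_scores = np.asarray(item_scores, dtype=float)
    return 100 * item_scores.mean()

def cronbach_alpha(scores):
    """Cronbach's alpha for a learners-by-items matrix of 0/1 scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical example: 5 learners responding to 12 past tense -ed items, each scored 1 or 0.
rng = np.random.default_rng(0)
scores = rng.integers(0, 2, size=(5, 12))
print([percentage_correct(row) for row in scores])   # per-learner percentage scores
print(round(cronbach_alpha(scores), 3))              # internal-consistency reliability
```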
Grammaticality Judgment Test
This was a pen-and-paper test consisting of 45 sentences. Fifteen sentences targeted past tense –ed, and the remainder targeted 30 other structures. Of the 15 sentences, 7 were grammatically correct and 8 were grammatically incorrect. Sentences were randomly scrambled in different ways to create three versions of the test. Test-takers were required to (a) indicate whether each sentence was grammatically correct or incorrect, (b) indicate the degree of certainty of their judgment (as proposed by Sorace, 1996) by writing a score on a scale marked from 0% to 100% in the box provided, and (c) self-report whether they used a rule or feel for each sentence. Learners were given six sentences to practice on before beginning the test. Each item was presented on a new page, and test-takers were told that they were not allowed to turn back to look at any part of the test that they had already completed. For past tense –ed, 7 of the 15 statements presented the target structure in the context of new vocabulary and 8 presented the target structure in the context of vocabulary included in the instruction. Learners' responses were scored as either correct (1 point) or incorrect (0 points). In addition to the total score, separate scores for grammatical and ungrammatical test items and also for new and old verb items were calculated. Reliability (Cronbach's alpha) for the pretest was .63. Test-retest reliability (Pearson's r) was calculated for the control group (n = 10) only. For the pretest and immediate posttest, it was .65 (p < .05), and for the pretest and delayed posttest, it was .74 (p < .05).
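For readers unfamiliar with the test-retest index reported above, the sketch below shows how a Pearson correlation between two administrations of the same test is typically obtained. The score vectors are hypothetical placeholders for a small control group and do not reproduce the study's data.

```python
from scipy.stats import pearsonr

# Hypothetical pretest and immediate-posttest totals for ten control-group learners.
pretest  = [10, 12, 9, 11, 13, 8, 10, 12, 11, 9]
posttest = [11, 12, 10, 11, 14, 8, 11, 13, 10, 9]

r, p = pearsonr(pretest, posttest)
print(f"test-retest r = {r:.2f}, p = {p:.3f}")
```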
Metalinguistic Knowledge Test
Learners were presented with five sentences and were told that they were ungrammatical. Two of the sentences contained errors in past tense –ed. The part of the sentence containing the error in each example was underlined. Learners were asked to (a) correct the error and (b) explain what was wrong with the sentence (in English, using their own words). They were shown two practice examples. As in the previous test, each item was presented on a new page and test-takers were told that they were not allowed to turn back. Learners were scored one point for correcting the error and one point for a correct explanation of the error. A percentage accuracy score was calculated.
Analysis
Descriptive statistics for the three groups on all three tests were calculated. On the oral imitation test and the grammaticality judgment test, (a) total scores, (b) separate scores for grammatical and ungrammatical items, and (c) separate scores for old and new items were calculated. The decision to examine grammatical and ungrammatical items separately was motivated by previous research (see R. Ellis, 2005), which showed that these might measure different types of knowledge (i.e., ungrammatical sentences provide a stronger measure of explicit knowledge). The decision to examine old items (i.e., items that tested verbs included in the instructional treatment) versus new items (i.e., verbs not included in the instructional treatment) was motivated by the wish to examine whether the instruction resulted only in item learning or whether there was also evidence of system learning.
The t-tests showed that there were statistically significant differences among the groups on the oral imitation and grammaticality judgment pretests. To take account of this difference, analyses of covariance (ANCOVAs; with pretest scores as the covariate) were computed to investigate to what extent group differences on the two posttests were statistically significant.
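As a rough illustration of the analysis just described, the following sketch runs an ANCOVA in which posttest scores are modeled as a function of group membership with pretest scores entered as the covariate. The data frame, column names, and values are hypothetical assumptions used only to show the shape of the computation, not the study's actual analysis scripts.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical data: one row per learner, with group membership, a pretest score,
# and a posttest score on the same test.
df = pd.DataFrame({
    "group":    ["recast"] * 4 + ["metalinguistic"] * 4 + ["control"] * 4,
    "pretest":  [30, 28, 35, 32, 25, 27, 31, 29, 33, 30, 26, 34],
    "posttest": [38, 35, 40, 37, 45, 48, 50, 46, 32, 31, 29, 36],
})

# ANCOVA: group effect on posttest scores, adjusted for pretest differences.
model = smf.ols("posttest ~ pretest + C(group)", data=df).fit()
print(anova_lm(model, typ=2))  # F test for group, with pretest as covariate
```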
RESULTS
Oral Imitation Test
The descriptive statistics for regular past tense on the imitation test (see Table 3) show a range in overall accuracy from 24% to 39% on the pretest. The scores increase on both posttests. The ungrammatical items have lower accuracy scores than the grammatical items.
The results of the ANCOVAs reveal that there is a significant difference between the groups on their pretest scores. Once these differences are taken into account, there is no effect for group on the immediate posttest, F(2, 34) = 0.961, p = .394; however, there is on the delayed posttest, F(2, 34) = 7.975, p < .01. The post hoc contrasts for the delayed posttest show that the metalinguistic group differed significantly from the recast and control groups. There was also a tendency toward a significant difference between the recast and control groups.
The analysis of the grammatical and ungrammatical items showed no significant group differences on the immediate posttest for either the grammatical, F(2, 34) = 0.853, p = .436, or ungrammatical items, F(2, 34) = 0.753, p = .480. However, on the delayed posttest, there were significant differences on both the grammatical, F(2, 34) = 6.697, p < .01, and ungrammatical items, F(2, 34) = 4.769, p < .05, with the metalinguistic group differing significantly from the control on both. Additionally, the metalinguistic group differed significantly from the recast group on the grammatical items, with a similar trend toward significance for metalinguistic over recasts on the ungrammatical items.
Untimed Grammaticality Judgment Test
The descriptive statistics for regular past tense on the grammaticality judgment test (see Table 4) show relatively high levels of accuracy on the pretest, ranging from 69% to 78%. These accuracy scores generally increased over both posttests.
The ANCOVAs show overall that there is no difference for group on the immediate posttest, F(2, 34) = 0.714, p = .498, although there is for the delayed posttest, F(2, 34) = 4.493, p < .05. The post hoc contrasts for the delayed posttest showed that the metalinguistic group differed significantly from the recast group and that there was a trend toward significance for metalinguistic over the control.
The ANCOVAs do not reveal any group differences on the immediate posttest for the grammatical, F(2, 34) = 1.482, p = .243, or ungrammatical items, F(2, 34) = 0.092, p = .912. Additionally, there were no differences on the delayed posttest for the ungrammatical items, F(2, 34) = 0.900, p = .417. However, there were significant differences on the delayed posttest for the grammatical items, F(2, 34) = 5.194, p < .05, with the post hoc contrasts showing that the metalinguistic group differed significantly from the recast group, and the control differed significantly from the recasts.
The Metalinguistic Test
The results from the metalinguistic test (see Table 5) show that all three groups had high accuracy scores on the pretest and that these generally remained high over the two posttests. Due to the small number of test items (n = 2), inferential statistics were not calculated for the metalinguistic test.
Old versus New Items
Tables 6 and 7 present the descriptive statistics for test performance on past tense verbs that appeared in the treatment tasks (i.e., old items) and for past tense verbs that were not the object of feedback (i.e., new items). If the effect of the treatment were only evident on the old items, this would suggest that corrective feedback caters only to item learning; if the effect could be shown to extend to new items, this would constitute evidence of generalization (i.e., system learning).
The results of the ANCOVAs reveal that there were significant differences between the groups on the pretest oral imitation scores. Once these differences are taken into account, there were no differences for the new items on the immediate posttest, F(2, 34) = 0.397, p = .676; however, there were differences on the delayed posttest, F(2, 34) = 8.943, p < .01. The post hoc contrasts show that the metalinguistic group is significantly more accurate than both the recast and control groups. Similarly, for the old items, there was no difference on the immediate posttest, F(2, 34) = 1.211, p = .312, but there was on the delayed posttest, F(2, 34) = 3.188, p = .056, with the post hoc contrasts showing that the metalinguistic group was significantly more accurate than the control group. Thus, the results reported for the oral imitation test as a whole apply equally to old and new items.
The results of the ANCOVAs for the grammaticality judgment test reveal that there were significant differences between the groups on the pretest scores. Once these differences are taken into account, there was no difference among the groups for the old items on either the immediate posttest, F(2, 34) = 0.452, p = .640, or the delayed posttest, F(2, 34) = 0.817, p = .451, nor for the new items on the immediate posttest, F(2, 34) = 0.467, p = .632. However, there was a significant difference for the new items on the delayed posttest, F(2, 34) = 4.295, p < .05, with the post hoc contrasts showing the metalinguistic group to be significantly more accurate than the recast group and a trend favoring the metalinguistic group over the control group.
Summary
Table 8 summarizes the main results, focusing on the statistically significant differences in the pairwise comparisons.
DISCUSSION
An inspection of the pretest oral imitation test scores suggests that all of the learners initially had only limited implicit knowledge of past tense –ed. This was especially apparent in their inability to produce the correct forms when asked to imitate and correct sentences containing errors in structure. In contrast, the grammaticality judgment pretest scores were high (i.e., above 70% in the two experimental groups). It is also noticeable that scores on the ungrammatical sentences were higher than on the grammatical sentences. If the ungrammatical sentences are taken as affording a better measure of explicit knowledge than the grammatical sentences, as suggested by R. Ellis (2004) and demonstrated in R. Ellis (2005), this might explain the higher scores on the ungrammatical sentences. Also, the metalinguistic test indicated a high level of explicit knowledge of past tense –ed. Thus, the pretest scores can be interpreted as showing that the learners generally possessed a high level of explicit knowledge of past tense –ed but were lacking in implicit knowledge.
The descriptive statistics in Table 3 and the results of the ANCOVAs show that the corrective feedback resulted in significant differences among the groups on the oral imitation test for past tense –ed, but these differences were only evident on the delayed posttest. Corrective feedback also led to gains on the grammaticality judgment test. However, the gains were almost entirely due to improved performance on the grammatical sentences, which R. Ellis (2004, 2005) argued tap more into implicit knowledge.
These results suggest that corrective feedback has an effect on the learning of implicit knowledge. Indeed, overall, the feedback appears to have had a greater effect on the learners' implicit knowledge than on their explicit knowledge, although this might simply reflect the fact that the learners possessed ceiling levels of explicit knowledge at the beginning of the study. It is possible, of course, that the treatments increased learners' awareness of the grammatical targets of the oral imitation test, thus encouraging them to monitor their output using their explicit knowledge. However, we do not believe that this occurred. First, when asked at the end of the final test if they were aware of which grammatical structures the test was measuring, only one learner was able to identify past tense. Second, as Table 9 shows, there is no clear evidence that the experimental groups were monitoring more than the control group or more in the posttests than in the pretest. If the learners were attempting to use their explicit knowledge in this test, we would have expected a much higher incidence of self-correction.3
When scoring the oral imitation test, self-corrections were discounted.
Further evidence that the corrective feedback induced changes in learners' implicit knowledge can be found in the fact that the effects of the experimental treatments on the oral imitation test scores were more evident 2 weeks after the instruction than 1 day after. This finding reflects previous research (e.g., Mackey, 1999), which has also shown that the effects of instruction become more apparent in delayed tests that tap the kind of language use likely to measure implicit knowledge. The enhanced accuracy evident in the oral imitation delayed posttest is indicative of the learners' successful incorporation of the target structure into their interlanguage systems.
The main purpose of the study was to investigate the relative effects of explicit and implicit corrective feedback on the acquisition of both types of knowledge. In this study, explicit corrective feedback was operationalized as metalinguistic information, and implicit corrective feedback was operationalized as recasts. The results point to a distinct advantage for metalinguistic information despite the fact that the learners in the recast group received substantially more corrective feedback than those in the metalinguistic group (see Table 2). Nor was the advantage found for the metalinguistic group only evident in the grammaticality judgment test: It was also clearly evident in the oral imitation test. Also, metalinguistic feedback (but not recasts) was found to result in learning that generalized to verbs not included in the treatment, which suggests that system learning took place.
How can we explain the general superiority of explicit feedback over implicit feedback? In the earlier discussion of theoretical issues relating to corrective feedback, we noted that in connectionist models of L2 acquisition, explicit corrective feedback in the context of communicative activity can facilitate the conversion of explicit knowledge into implicit knowledge.4
A reviewer of a draft version of this paper pointed out that the results could be explained in terms of the learners' having automatized their declarative knowledge of past tense –ed as a result of the treatment. This interpretation draws on the distinction between declarative and procedural knowledge, which informs skill-building theories of the kind advocated by DeKeyser (1998). However, we have chosen to frame the paper in terms of the implicit/explicit distinction, noting with Eysenck (2001) that recent changes in the definitions of both pairs of terms have brought them closer together, making it “increasingly difficult to decide on the extent to which different theories actually make significantly different predictions” (p. 213).
The superiority of the metalinguistic feedback only reached statistical significance in the delayed oral imitation and grammaticality judgment posttests. However, gains from pretest to the immediate posttest were also evident. Thus, the general pattern of the results was: pretest scores < immediate posttest scores < delayed posttest scores; that is, the benefits of the metalinguistic feedback became more evident as time passed. This finding supports the claims of R. Ellis (1994) and N. C. Ellis (2005) that explicit L2 knowledge can enhance the processes involved in the development of implicit knowledge (e.g., noticing and cognitive comparison); that is, the awareness generated by metalinguistic feedback promotes the kind of synergy between explicit and implicit knowledge that is hypothesized to underlie L2 learning.
The relatively weak effect found for either type of feedback on the ungrammatical sentences in the grammaticality judgment test reflects the fact that the learners possessed the explicit knowledge required for judging such sentences from the beginning, which was clearly evident from their high pretest scores on the ungrammatical sentences on the grammaticality judgment test and near-perfect scores on the metalinguistic test.
One final comment is in order. All of the learners in this study demonstrated partial implicit knowledge of past-tense –ed, as demonstrated by their performance on the oral imitation pretest. It is possible that for corrective feedback of any kind to have an effect on learning, the structures must be at least partially established in the learners' interlanguages. Further research is needed to establish whether corrective feedback is effective in enabling learners to acquire completely new grammatical structures.
CONCLUSION
This study demonstrates that explicit feedback in the form of metalinguistic information is, overall, more effective than implicit feedback (in the form of recasts) and contributes to system as well as item learning. Table 10 summarizes the actions that learners are hypothesized to carry out in order to process feedback for acquisition (based on Carroll's, 2001, account of corrective feedback), and the extent to which the two types of feedback engage these processes. It illustrates how both implicit and explicit types of feedback might facilitate these actions and it demonstrates why explicit feedback might do so more effectively than implicit feedback. In particular, explicit feedback seems more likely to promote the cognitive comparison that aids learning.
The study reported in this paper incorporated a number of unique methodological features. To the best of our knowledge, this is the first experimental study that has compared the effects of online explicit corrective feedback in the form of metalinguistic explanations and implicit corrective feedback in the form of recasts in classroom-based instruction. Other studies that have compared explicit and implicit corrective feedback (see Table 1) have either examined other forms of corrective feedback or have been laboratory based, or both. We argue that metalinguistic explanation and recasts constitute the best exemplars of explicit and implicit corrective feedback, as both are supported by previous research that shows them to be effective in promoting learning. We also argue that from a pedagogical perspective, it is important to examine corrective feedback within the classroom context. We do not believe that it is easy to extrapolate the results obtained from laboratory studies that involve one-on-one interactions to classrooms in which the teacher interacts with the whole class. In our view, ecological validity can only be achieved through classroom-based research.
Another unique feature is that the corrective feedback occurred in the context of learners performing communicative tasks. In Long's (1991) terms, therefore, the feedback constituted focus-on-form. In the majority of studies listed in Table 1, the feedback was part of a focus-on-forms approach. The importance of the study reported in the preceding sections is that it demonstrates that when learners receive explicit feedback on their attempts to communicate, acquisition takes place. Thus, the study supports Long's claims about focus-on-form. In the view of the teacher who taught the lessons and of the researchers who inspected the transcripts of the corrective feedback episodes, the learners were engaged in performing the tasks and the primary focus was on meaning, not form. The study demonstrates that metalinguistic feedback does not detract unduly from the communicative flow of a lesson.
A third unique feature of the study is to be found in the instruments used to measure learning. These measures were based on prior research designed to develop relatively separate measures of implicit and explicit knowledge (R. Ellis, 2005; Erlam, in press). These measures enable us to consider whether the effects of the corrective feedback were related to implicit as well as explicit knowledge. The results reported earlier suggest that corrective feedback results in gains in implicit knowledge. As such, they constitute evidence against the claims that have been advanced by some theorists (Krashen, 1982; Schwartz, 1993) that corrective feedback plays no part in the development of implicit knowledge.
As in all classroom studies, there are inevitable limitations. First, the sample size for this study was small. Also, we were forced to use intact groups, with the result that the groups were not equivalent at the commencement of the study, thus necessitating the use of ANCOVA. Second, because our main aim was to compare the relative effectiveness of the two types of corrective feedback, we only included a testing group as a control group (i.e., we did not have a control group that completed the communicative tasks without any corrective feedback). Third, the length of the treatments was very short (approximately 1 hr). It is possible that with a longer treatment, recasts would have proved more effective. Fourth, the structure we chose for study was a structure that the learners had already begun to acquire. In one respect, this can be considered a strength, as it enabled us to examine which type of corrective feedback works best for structures already partially acquired. However, in another respect, it constitutes a weakness, in that we are unable to say whether corrective feedback (and what type of corrective feedback) is effective in establishing new knowledge. As always, further research is needed.