The usefulness of providing L2 learners with explicit information (EI) about a target feature and subsequent practice in processing the input is not fully understood. As noted by Henry, Culman, and VanPatten (2009, p. 573), “not all EI is the same, not all structures are the same, and the interaction of EI, structure, and processing problem may yield different results in different studies.” The current study investigates the effects of EI with practice in the L2. In light of research documenting persistent difficulty when the L1 and L2 express the same meaning differently (Izquierdo & Collins, 2008; McManus, 2013, 2015; Roberts & Liszka, 2013), it also investigates whether additional EI about the L1 with L1 practice can help with a specific processing problem: interpreting the habitual versus ongoing meanings of the L2 French Imparfait (IMP) for L1 English learners. We tested whether making this conceptual distinction explicit, through EI and meaning-based practice in both L1 and L2, would aid form-meaning mapping. We first briefly discuss research into L2 EI and practice, then justify investigating a role for L1 EI and practice, and finally discuss why EI and practice (in L2 and L1) may have an effect on online processing.
L2 EI AND PRACTICE
EI about the L2 is useful for learning, according to information processing and skill acquisition theories, because some declarative information can become proceduralized through practice and automatized, resulting in automatized declarative knowledge and/or knowledge that appears indistinguishable from implicit knowledge (DeKeyser, 2015). “Weaker” accounts suggest that learners can use EI to segment or parse the input (Terrell, 1991), notice features (Schmidt, 1990), understand a rule and help production (Leow, 2015), and arrive at correct interpretations with fewer practice items (Henry et al., 2009). The effectiveness of EI is likely to depend on several factors, including its precise nature, that is, the type of information conveyed and the feature in focus. There is a considerable body of research into feature difficulty and amenability to different kinds of instruction (e.g., DeKeyser, 2012). VanPatten and Rothman (2015) suggest that features most likely to benefit might be those that must be learned directly from the input, such as representations of inflectional verb morphology, as EI may make them better noticed in the input. In addition, the selection of which features could benefit from which kinds of EI and practice could partially be informed by the nature of L1-L2 differences.
L1 EI AND PRACTICE
Long-term difficulties have been documented for learning L2 features that share some similarity across the L1 and L2 and yet have different form-meaning mappings (Murakami & Alexopoulou, 2016; Spada, Lightbown, & White, 2005). Yet, as Ellis and Shintani (2014, p. 247) noted, “there is almost no research that has investigated the actual effects of the classroom use of the L1 on L2 learning.” A few classroom studies have shown that raising learners’ awareness about L1-L2 differences benefits learning of L2 lexis (Laufer & Girsai, 2008; White & Horst, 2012) and grammar (Horst, White, & Bell, 2010; Kupferberg, 1999; Spada et al., 2005), as measured during the actual learning events or immediately afterward through offline vocabulary and writing tests. Evidence also suggests that providing EI about L1-L2 differences correlated positively with learners’ performance on untimed tests that allowed access to that awareness, that is, grammaticality judgments and sentence construction tasks (Ammar, Lightbown, & Spada, 2010). One of the few intervention studies to address the amenability of L1-L2 differences to instruction was conducted by Tolentino and Tokowicz (2014). Grammaticality judgment tests showed that features that exist in both languages but are realized differently did not in fact benefit from instruction consisting of EI, input flood, and repetition practice (although benefits were found for features unique to the L2). However, as that study provided no EI about the L1 or any meaning-based practice, the effectiveness of instruction for cross-linguistically different features remains to be explored further.
To our knowledge, the current study addresses three significant gaps in this agenda. First, learning has not yet been documented using delayed posttests. Second, research has not yet investigated the benefits of systematic L1 practice as an intentional component of instruction. Third, research has not yet examined the effects of L1 EI and practice on online sentence processing, discussed next.
ONLINE EFFECTS OF EI AND PRACTICE
Studies showing benefits for EI with (or without) practice have almost exclusively used offline tests that allow access to explicit knowledge, with very little use of online measures (as noted by DeKeyser & Prieto Botana, 2015; VanPatten & Rothman, 2015). Online measures can provide “fine-grained information about moment-by-moment sentence comprehension [. . . to] examine what happens at precise points in a sentence” (Keating & Jegerski, 2015, p. 2). For example, longer reaction times relative to comparison items may indicate a processing cost brought about by ungrammaticality, ambiguity, or complexity (Roberts, 2016). Some theorists argue for an even stronger, causal relation between online processing and learning (e.g., O’Grady, 2005, 2015; see Phillips & Ehrenhofer, 2015). That is, online processing may be a mechanism by which learning is driven and constrained, and processing difficulty (e.g., complexity or cost) may be a key factor in learning. To this end, VanPatten and Rothman (2015, p. 113) recommended:
moving away from knowledge-testing more generally and more into the interface between knowledge and processing via techniques such as . . . self-paced listening/reading. . . . Currently, these are used largely to understand the processing outcomes of acquisition. We think they can be used to study acquisition-as-processing itself.
To our knowledge, the relationship between online L2 processing and explicit instruction has been investigated in two published studies to date [Footnote 1]. First, Andringa and Curcic (2015) provided half their participants with brief EI explaining that a preposition predicts the animacy of the upcoming direct object in a novel language. Learners in both conditions (±EI) were then exposed to 104 sentences, 52 of which provided exposure to the direct object marker rule. Although the +EI group performed better offline in a grammaticality judgment test, there was no evidence that EI developed into knowledge that was beneficial online, as measured by predictive eye movements. Second, again in an artificial language study, Marsden, Williams, and Liu (2013) investigated the effects of task-essential practice with yes/no feedback on interpreting inflections for tense and number (Experiment 3). Again, offline measures (accuracy of lexical decisions) demonstrated learning, but online measures (reaction times in a cross-modal priming task) provided no evidence that training the learners to orient their attention to the meaning of the inflection had affected online processing of cross-modal representations.
Related to the knowledge gap about the effects of L2 EI and practice on online performance is whether L1 EI and practice can affect L2 online processing. This is of particular relevance for features with cross-linguistic differences in processing routines. Of course, some theories foreground a role for the L1 in critical aspects of input processing, such as attention allocation being entrenched by the L1 (Ellis, 2006); processing routines being influenced by the L1 when a feature is not unique to the L2 (MacWhinney, 2005); or L2 processing routines being difficult to learn in cases where adopting related L1 routines would require fewer processing resources (O’Grady, 2005, 2015). There is also growing evidence of L1 coactivation/influence during online L2 sentence processing (e.g., Tolentino & Tokowicz, 2011). Tokowicz and Warren’s (2010) self-paced reading (SPR) study with beginner learners reported slower reading times at morphosyntactic violations for L2 Spanish features that were cross-linguistically similar (verb aspect licensing), but not for those that were entirely unique to the L2 (determiner–noun gender agreement). Similarly, Roberts and Liszka (2013) with advanced learners found slower reading times at morphosyntactic (aspectual) violations in L2 English when L1 and L2 both grammaticalized aspect (L1 French), but not when the means of expressing aspect was unique to the L2 (L1 German). L1-L2 morphosyntactic coactivation has also been documented among bilinguals, for example, during comprehension (Sanoudaki & Thierry, 2014), production (Runnqvist, Gollan, Costa, & Ferreira, 2013), and cross-linguistic priming (Hartsuiker & Pickering, 2008).
However, there is surprisingly little evidence about the role of the L1 in online processing among learners, rather than bilinguals/near natives. As VanPatten (2015, p. 120) noted, “the question is open as to whether and to what degree there is L1 influence in basic Input Processing, and whether that influence is an actual processing procedure or lexical influence.” Critically, as noted in the preceding text, there is very little, if any, research on the influence of L1 explicit information and practice on online processing that could inform us about the development of sentence processing in instructed contexts.
RATIONALE FOR THE CURRENT STUDY
In sum, the preceding lines of research inform our understanding about the roles of EI with practice in L2 offline knowledge and about the existence of L1 effects in L2 online processing. But there is no research, to our knowledge, that investigates (a) the potential benefits of EI with practice (L1 or L2) on online processing for learning a natural language over time and (b) the benefits of EI about the L1 with practice in processing it (on offline or online performance).
It remains an empirical question as to whether providing EI about and practice in the different processing routines in learners’ L1 can benefit L2 learning. Benefits may be found if making a processing routine explicit could create some declarative knowledge that might serve processing and learning in a variety of ways. For example, clarifying and rehearsing nontransparent, conceptual distinctions in the L1 (e.g., polyfunctionality of the English “-ed” verbal morpheme; McManus, 2015) may facilitate accurate mapping of those concepts to L2 forms, compatible with views that assume a role for EI and explicit rehearsal in language development (DeKeyser, 2015). Thus, L1 and L2 EI with meaning-based practice may lead to new or more efficient L2 processing routines. Initially represented declaratively, these routines may gradually become proceduralized and automatized through practice (DeKeyser, 2015). Combined L1 and L2 EI and practice may also provide data about morphosyntactic distributions of features in the input that could, according to some theories (e.g., Ellis, 2006, 2008; see Andringa & Curcic, 2015), interface with the developing language system and promote L2 form-meaning remapping.
A POTENTIAL TESTING GROUND: THE FRENCH IMPARFAIT
The IMP is well documented to be late acquired, even after considerable naturalistic exposure (Bartning & Schlyter, 2004), and its habitual meaning in particular has been shown to be influenced by L1 background (Howard, 2005; Izquierdo & Collins, 2008; McManus, 2013, 2015). Although French and English both express past habituality and ongoingness with verbal morphology, these are mapped differently. For English-speaking learners of French, this is illustrated in 1-2.
1. Past ongoing and habitual meanings can be expressed by one morpheme in French (a), but not in English (b and c):
   (a) Il jouait[IMP: ongoing_a/habit_b] au foot quand j’ai[a] appelé / quand nous étions[b] petits
       “He play[IMP: ongoing_a/habit_b] football when I[PERF_a] called / when we were[IMP_b] little”
   (b) He was[ongoing_a] playing football
   (c) He played[habit_b] football
2. Past perfective and habitual meaning can be expressed by one morpheme in English (a), but not in French (b and c):
   (a) He played[PERF_a/IMP: habit_b] football once last year[a] / every Saturday[b]
   (b) Il jouait[habit_b] au foot quand nous étions[b] petits
       “He play[IMP: habit_b] football when we were[IMP_b] small”
   (c) Il a[PERF_a] joué au foot
       “He PERF-play football”
These sentences illustrate that one French inflectional verb morpheme alone does not disambiguate habitual from ongoing meaning in the past. Nor do lexical phrases reliably co-occur with IMP to distinguish between ongoing and habitual (“la semaine dernière” [“last week”] co-occurs with morphology other than IMP). However, morphosyntactic information in the discourse context is a reliable cue to meaning (De Swart, 1998; Smith, 1997). That is, to resolve the aspectual ambiguity inherent in IMP, the discourse context either provides an “interruption” to the event through the past perfective, Passé Composé (PC), thus coercing an ongoing meaning of IMP (sentence 1a_a); or it provides concurrent information with another IMP, coercing a habitual meaning (sentences 1a_b and 2b_b) (Comrie, 1976). The IMP can be before or after its disambiguating verb, and not necessarily in the same sentence or speech turn.
English, by contrast, does not require interclausal morphosyntax to disambiguate past habituality from ongoingness. Instead, this is done within a clause using verb morphology and/or lexical means, leaving no ambiguity in need of resolution, for example, I was walking = past ongoingness; I walked/used to walk/would walk = habituality.
If learners tend to rely on L1 English processing routines, they would not reliably use morphological information in the discourse to disambiguate past habituality from ongoingness (which arguably demands more processing resources than immediate disambiguation within the verb phrase [O’Grady, 2005, 2015]). This could result in nonoptimum (less accurate and slower) interpretations of IMP. It seems possible that EI with practice that renders explicit the mapping procedures required in the L1 could facilitate the remapping of procedures for interpreting the L2.
TASK-ESSENTIAL FORM-MEANING MAPPING PRACTICE FOR ASPECTUAL DISTINCTIONS
A large body of research has demonstrated the learning benefits of focusing learners’ attention on making form-meaning mappings from the input (Loschky & Bley-Vroman, 1993; VanPatten, 1996, 2002). This has included presenting stimuli without temporal adverbs, thus forcing attention on the temporal meaning of verb inflections (e.g., Benati, 2005; Marsden, 2006; Marsden & Chen, 2011; Sagarra & Ellis, 2013), without overt subjects, thus forcing attention on person and number meanings (Marsden, 2006; Marsden et al., 2013), or without lexical phrases for doubt and certainty, forcing attention on subjunctive versus indicative inflections (Fernández, 2008). Task-essential form-meaning input mapping practice has been found to lead to more learning than input activities with equal numbers of target features that focus attention on verb semantics or sentential meaning (Marsden, 2006; Marsden & Chen, 2011).
To date, two studies have examined task-essential activities involving IMP (Benati, Lee, & Laval, 2008; Lee, Benati, Aguilar-Sánchez, & McNulty, 2007) [Footnote 2]. These studies removed aspecto-temporal adverbs (e.g., tous les jours “every day”) that can sometimes co-occur with IMP and may render it less likely to be attended (VanPatten, 2002). However, no research has yet investigated how to make task-essential the two different aspectual meanings of IMP: ongoingness and habituality. As described in the preceding text, these meanings can only be determined reliably by morphosyntactic information in the discourse context. To date, we do not know whether task-essential practice can help learning that requires co-indexation with morphosyntax in another clause to ascertain the correct form-meaning mapping.
CURRENT STUDY
The current study begins to address the gaps identified in the preceding text by (a) investigating a hitherto neglected target feature—two aspectual dimensions of the French IMP that require interclausal morphosyntactic cues (past habituality and ongoingness); (b) measuring both the speed of online processing and offline interpretation to investigate learning; and (c) investigating the role of L1 EI and L1 task-essential practice in L2 learning.
Based on previous research, we expected that L2 EI and task-essential practice would result in gains in offline measures, whereas the control condition (tests only) would not. As little or no research has examined specifically (a) the online effects of EI with task-essential practice (L1 or L2) or (b) the role of L1 EI with practice, we could not adopt strong expectations for these two dimensions of the study. However, for (a), based on research suggesting links between processing and learning, we thought that online measures might show increased sensitivity to correct/incorrect use of IMP following both L2-only and L2+L1 treatments. This is partly because our treatments included extensive form-meaning mapping practice, unlike previous studies that have not found online effects for EI or practice. For (b), based on research suggesting L1-L2 coactivation during processing and evidence of potential benefits of making explicit and rehearsing cross-linguistically complex form-meaning mappings, we thought that, relative to the control group, the L2+L1 treatment would lead to larger and more consistent online and offline effects than the L2-only treatment.
METHOD
Participants
Participants were 50 (42 females, 8 males) English-speaking learners of French as a foreign language in semester two of a four-year bachelor of arts honors degree in French at a large university in England. We required every participant to be a native speaker of English, to have completed A-level (A2) French (equivalent to level B2 in the Common European Framework of Reference for Languages, normally after about 700 to 800 hours of instruction), and not to have spent more than six weeks in a French-speaking country. We collected background information using a questionnaire and excluded six people based on these criteria. Participants’ mean age was 19, and the mean time spent in a French-speaking country was 3.8 weeks. Thirty-nine participants declared knowledge of other Romance languages (Spanish = 29, Portuguese = 5, Italian = 4), twelve declared knowledge of German, and twelve declared knowledge of other languages (two each for Greek, Latin, and Welsh, and one each for Arabic, British Sign Language, Japanese, Mandarin, Polish, and Russian).
Design
The study had three between-subjects groups (control, L2-only, L2+L1) and three within-subject tests (pretest in week 1, posttest in week 5, delayed posttest in week 12). All tests and treatments were administered one-to-one with laptops using E-Prime 2.0 (Schneider, Eschman, & Zuccolotto, 2012). We assigned participants to a group using matched randomization [Footnote 3], resulting in 17 participants in the L2+L1 group, 17 in the L2-only group, and 16 in the control group (tests only). By contemporary standards these are small numbers, a consequence of the amount of time and relatively long timescale required by the study, and we therefore acknowledge that this constitutes an exploratory study. The treatments were delivered in four 45-minute sessions over three weeks, totaling 3.5 hours. Sessions 1 and 2 were delivered in week 2, session 3 in week 3, and session 4 in week 4. The control group completed all tests and did not receive any intervention treatment, but continued normal instruction along with all other participants between pre- and posttests. Due to the vacation between posttests and delayed posttests, none of the participants received any instruction during that period (either as part of their university program or our experiment). This increased the likelihood that any effects found at delayed posttest were due to our intervention. In the university program from which all participants were drawn, explicit grammar instruction only took place in semester 1, that is, prior to the study, as corroborated by interviews with university tutors. The entire study was piloted in a condensed timescale with 10 English-speaking learners of French at another university.
TARGET STRUCTURE: FRENCH IMP
All exemplars of IMP were third-person singular forms. These comprised 25 regular (e.g., jouait) and 23 irregular (e.g., finissait) verb types. Regulars and irregulars were included because (a) the study’s focus was inflectional morphology that remains orthographically and phonologically constant in the IMP across regular and irregular stems (L’Huillier, 1999); (b) the tests were receptive and so production of irregular stems was not measured; (c) there is some evidence that learning the two functions of IMP (habitual and ongoing) may relate to lexical verb type (e.g., activities: manger; achievements: arriver), and as we wished to counterbalance lexical types this entailed inclusion of frequent irregular verbs (Andersen & Shirai, 1994) [Footnote 4]; (d) the study’s ecological validity was increased by including both regulars and irregulars; and (e) any potential effect of verb type was experienced by all three participant groups equally, as verb types were counterbalanced across test versions, and test versions were counterbalanced across conditions and test phases.
INSTRUCTIONAL TREATMENTS
The L2+L1 and L2-only treatments included an identical core of EI and practice in interpreting French IMP [Footnote 5]. We first describe this common core, before describing the L1 treatment received by the L2+L1 group (see also Supplementary Materials: Treatment). All instructional treatments and outcome measures are publicly accessible in the IRIS digital repository at http://www.iris-database.org.
EI About the L2
EI was provided in two ways: (a) prepractice, approximately five minutes at the start of each session; and (b) during the task-essential practice activities following only incorrect answers, which, as Supplementary Materials: Treatment shows, was infrequent and occurred in almost identical amounts in both treatments. The prepractice EI depicted conceptual-semantic information using a short video, image, or sound file of events. Then the appropriate aural and written forms were presented, and information given about how to interpret their meaning.
Practice in Interpreting the L2
The short prepractice EI was followed by task-essential, form-meaning mapping practice, with listening and reading in equal amounts, that focused attention on meaning contrasts expressed by different forms. In line with other task-essential activities, this was done through learners choosing the meaning of a stimulus from fixed options (e.g., Marsden, 2006; Marsden & Chen, 2011; Sanz & Morgan-Short, 2004; VanPatten, 2002).
The numbers of French exemplars are shown in Table 1. The practice drew on 48 lexical verb types: each one occurred eight times with IMP (n = 384): four for reading (two habitual, two ongoing/interrupted) and four for listening (two habitual, two ongoing/interrupted). The lexical semantic properties of verb types were counterbalanced across listening/reading and ongoing/habitual items: 12 states (e.g., be happy), 12 activities (e.g., swim in the sea), 12 accomplishments (e.g., walk to the shop), and 12 achievements (e.g., arrive home). Verb type frequency was balanced across the four lexical semantic classes using Lonsdale and Le Bras’s (2009) frequency dictionary of French. Aural stimuli were recorded by two native French speakers. The French sentences were verified for authenticity by 26 native French speakers: all were rated as 100% acceptable, with the meanings (ongoing/habitual, present/past) as intended by the researchers.
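The exemplar counts above can be checked with a short enumeration. This is only an illustrative sketch in Python: the constant and function names are hypothetical, and the two tokens per verb, modality, and meaning are read off the description in the text rather than taken from the authors' materials.

```python
from itertools import product

# Values taken from the description of the practice items in the text.
N_VERB_TYPES = 48
MODALITIES = ("reading", "listening")
MEANINGS = ("habitual", "ongoing")
TOKENS_PER_CELL = 2  # two items per verb, modality, and meaning


def build_practice_cells():
    """Enumerate every (verb, modality, meaning, token) combination."""
    return list(
        product(range(N_VERB_TYPES), MODALITIES, MEANINGS, range(TOKENS_PER_CELL))
    )


cells = build_practice_cells()
tokens_per_verb = len(cells) // N_VERB_TYPES
reading_habitual = sum(
    1 for _, modality, meaning, _ in cells
    if modality == "reading" and meaning == "habitual"
)
print(len(cells), tokens_per_verb, reading_habitual)
```

Under these assumptions the enumeration yields 384 IMP exemplars in total, eight per verb type, matching the figures reported above.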
The L2+L1 Treatment
The L2+L1 group received the exact same treatment as previously mentioned, with no additional French L2 exemplars. The L2+L1 treatment additionally included EI about English and task-essential form-meaning mapping practice in English. Table 2 shows the numbers of tokens of English exemplars (all third-person singular). See Supplementary Materials: Treatment for full descriptions and example activities and stimuli.
OUTCOME MEASURES
Two versions of each outcome measure were administered in a split-block design. The versions alternated the lexical items carrying IMP and PC inflections and the order of items.
Context-Matching Tests (Listening and Reading)
All participants took two context-matching tests: first a listening (CMT-Listen), then a reading (CMT-Read), each with 24 target and 8 filler trials. Each trial consisted of two parts: (a) the English context: two sentences describing either a habitual or an ongoing activity written in English and (b) the French stimulus: a two-clause French sentence that either matched (k = 12) or mismatched (k = 12) the meaning of the English context. Critically, the French stimuli and English contexts were never translations of each other; rather, the context gave a fuller description of an event in which either a habitual or ongoing function of IMP would be required, on different lexical items, in the shorter stimulus sentence. In this way, we were not eliciting direct translations between context and stimulus, but specific functions of IMP. For example, see the following text.
Matched Trial in CMT-Read
Context (ongoing): Yesterday, Patrick was expecting his wife to come back from work any minute. Just as he was on his way out, she appeared in the driveway.
Stimulus (ongoing): Quand Patrick quittait la maison, il a vu sa femme
“When Patrick was leaving the house, he saw his wife”
Mismatched Trial in CMT-Read
Context (ongoing): Yesterday, Patrick was expecting his wife to come back from work any minute. Just as he was on his way out, she appeared in the driveway.
Stimulus (habitual): Quand Patrick quittait la maison, il voyait sa femme
“When Patrick left the house, he used to see his wife”
In both CMTs the English context appeared on screen for 10 seconds. Then, the French stimulus was presented aurally (CMT-Listen) or in writing (CMT-Read). Participants were instructed to rate how good the match was between the meaning of the French stimulus and the English context by pressing a number on the keyboard: 1 (“very good”), 2 (“good”), 3 (“neither good nor bad”), 4 (“poor”), or 5 (“very poor”), with a separate option for “I don’t know” (9). The written French stimulus remained on screen until a number was pressed, after which participants could not change their answer. The task was untimed and took approximately 20 to 25 minutes.
The CMTs drew on 24 of the 48 lexical verbs (third-person singular) from the intervention [Footnote 6]. Items were counterbalanced across the match and mismatch conditions for ongoingness/habituality, verb frequency, lexical aspect class, verb regularity, and clause ordering (main > subordinate / subordinate > main).
In addition to the pilot study, we checked the English contexts with three native speakers of English, the French stimuli with 26 native speakers of French, and the match and mismatch combinations with three L1 English very advanced learners of French.
Self-Paced Reading Test
The SPR test was administered after the CMTs and used 16 items from the CMT-Listen [Footnote 7], with eight context-stimulus matches and eight mismatches, counterbalanced as described in the preceding text for the CMTs. Half the items were followed by yes/no comprehension questions to increase the likelihood that participants focused on meaning (see Keating & Jegerski, 2015) [Footnote 8]. The answers to the questions depended only on a lexico-semantic feature (but not verb stems) and not on inflectional morphology. For each trial, the English context appeared for 10 seconds before an X appeared in the center of the screen. A spacebar press brought up the first and then each subsequent word of the French stimulus. After the last word, the next screen displayed “END.” Participants were instructed to read as quickly as possible. Reaction times were collected from each word according to the noncumulative moving-window procedure (Marinis, Roberts, Felser, & Clahsen, 2005). The font was 18-point Courier New, displayed in the center of a white background, without line breaks.
DATA SCORING AND ANALYSIS
For the CMTs, responses were coded following standard protocols for judgment tasks (see Mackey & Gass, 2013): five points for each correct response (i.e., pressing 1 or 2 for match trials, and 4 or 5 for mismatch trials); three points for midway responses (pressing 3 for match and mismatch trials); and one point for each incorrect response (pressing 4 or 5 for match trials, and 1 or 2 for mismatch trials). Cronbach’s alphas were: CMT-Listen version A (α = .72), CMT-Listen version B (α = .74), CMT-Read version A (α = .72), and CMT-Read version B (α = .74).
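The coding scheme just described can be expressed as a small function. The following is a minimal sketch in Python, not the authors' scoring script: the function and constant names are hypothetical, and the handling of the "I don't know" key (9) is an assumption, since the text does not say how those responses were scored.

```python
from typing import Optional

# Points per outcome, as described in the scoring protocol in the text.
CORRECT, MIDWAY, INCORRECT = 5, 3, 1


def score_cmt_response(key: int, is_match: bool) -> Optional[int]:
    """Score one rating (1 = "very good" ... 5 = "very poor") for one trial."""
    if key == 3:
        # Midway responses earn three points on match and mismatch trials.
        return MIDWAY
    if key in (1, 2):
        # Rating the pair as a good match is correct only on match trials.
        return CORRECT if is_match else INCORRECT
    if key in (4, 5):
        # Rating the pair as a poor match is correct only on mismatch trials.
        return INCORRECT if is_match else CORRECT
    # Assumption: "I don't know" (9) and any other key are left unscored.
    return None


print(score_cmt_response(2, is_match=True))   # good rating on a match trial
print(score_cmt_response(2, is_match=False))  # good rating on a mismatch trial
```

Returning no score for key 9 keeps the sketch from guessing at a rule the text leaves open; a real analysis would need to apply whatever rule the authors used.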
In the SPR, reaction times in each French stimulus were measured on the critical word, underlined for illustrative purposes in Table 3: the verb in the coordinating clause (either IMP or PC) that disambiguated the main clause verb’s meaning (habitual vs. ongoing). Mean sentence length was 10 words (SD = 1.5, range 9–14). Mean length of the critical word was 2.3 syllables for IMP and 2.9 for PC (all auxiliaries were one syllable, past participle mean = 1.9), with bi- and trisyllabic words counterbalanced across test versions. Reaction times (RTs) for the critical word were calculated from the onset of the critical word to the onset of the next word [Footnote 9]. We also analyzed whole sentence RTs, calculated as the time taken to read from the onset of the first word to the onset of the “END” screen. We analyzed the raw RT data, which we trimmed in line with recommendations for SPR (Keating & Jegerski, 2015), removing critical word RTs less than 150 ms and greater than 2,000 ms (three [0.5%, habitual match] and four [0.7%, ongoing match] data points across 50 participants).
a Whether the analysis was carried out on the auxiliary (a), the past participle (vu) or both combined (a + vu) did not make any difference to the pattern of findings. Thus, for parity with the IMP critical words, the raw data we present for the Passé Composé are for the auxiliary.
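The trimming rule for critical word RTs (dropping values below 150 ms and above 2,000 ms) can be illustrated as follows. This is a minimal sketch; the function name and the list-based data layout are assumptions, not a description of the software actually used.

```python
def trim_rts(rts_ms, lower=150, upper=2000):
    """Drop RTs below `lower` or above `upper` (both in milliseconds).

    Returns the retained RTs and the number of data points removed.
    """
    kept = [rt for rt in rts_ms if lower <= rt <= upper]
    return kept, len(rts_ms) - len(kept)


# Hypothetical critical word RTs for one participant
kept, n_removed = trim_rts([120, 480, 655, 2310, 900])
```

Here `kept` holds the three RTs inside the window and `n_removed` is 2.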
Slower RTs are usually interpreted as indications of a processing burden, and, conversely, faster RTs as indications of relative processing ease (e.g., Leung & Williams, Reference Leung and Williams2014; Marinis et al., Reference Marinis, Roberts, Felser and Clahsen2005). We therefore analyzed RTs for changes over time in different training conditions. We also compared RTs in matched versus mismatched trials, to detect changes in sensitivity to violations in the use of IMP. If learners became more sensitive to the different functions of IMP following training, it was expected that differences in their RTs between match and mismatch trials would become (more) apparent. For example, following a habitual context, if learners were sensitive to a context-stimulus anomaly, an IMP+PC stimulus (mismatch) would cause a slower RT compared to an IMP+IMP (match).
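The match versus mismatch logic can be expressed as a simple difference score. The sketch below uses means purely for illustration (the analyses reported here rely on nonparametric tests), and the function name is hypothetical.

```python
from statistics import mean


def mismatch_cost(match_rts_ms, mismatch_rts_ms):
    """Mean mismatch RT minus mean match RT, in ms.

    A positive value means reading was slower on mismatched trials,
    which is the pattern expected if a learner is sensitive to the
    context-stimulus anomaly.
    """
    return mean(mismatch_rts_ms) - mean(match_rts_ms)
```

A learner who shows no sensitivity would be expected to produce a cost near zero.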
As none of the data were normally distributed (according to Shapiro-Wilk tests, all data sets p < .05, see Supplementary Materials: Statistics, Table 5), we present the results of nonparametric tests (Field, Reference Field2013; Norris, Plonsky, Ross, & Schoonen, Reference Norris, Plonsky, Ross and Schoonen2015). Nevertheless, for parity with other studies, we note that the patterns of findings did not differ when parametric tests were used (i.e., mixed-design ANOVAs with planned contrasts).
First, between-group differences at pretest were checked using Kruskal-Wallis H tests. Second, Friedman tests were used to compare pretest, posttest, and delayed posttest scores within each group, and, if a significant difference was found, then within-subject comparisons were made using Wilcoxon Signed-Rank tests with Bonferroni corrected alpha levels (equivalent to post hoc analyses) between pairs of test results: pre-post, pre-delayed, and post-delayed. Finally, we compared SPR performance between matched and mismatched trials, in each group, with Wilcoxon Signed-Rank tests. Footnote 10
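To make the omnibus step concrete, the sketch below computes a Friedman statistic for three test times and its chi-square approximation. It is a textbook version without a tie correction, written with the standard library only; it is not the authors' analysis code, and all function names are hypothetical. With three test times the statistic has 2 degrees of freedom, for which the chi-square upper-tail probability is exactly exp(-χ²/2).

```python
import math


def rank_row(values):
    """Rank one participant's scores (1 = lowest), averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = average_rank
        start = end + 1
    return ranks


def friedman_chi2(rows):
    """Friedman statistic; rows holds one [pre, post, delayed] list per participant."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for col, rank in enumerate(rank_row(row)):
            rank_sums[col] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def chi2_p_df2(chi2):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)
```

As a usage example, a statistic of 7.28 on 2 degrees of freedom gives `chi2_p_df2(7.28)` of roughly .03. A significant omnibus result would then be followed by the pairwise Wilcoxon Signed-Rank comparisons.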
Following recent discussion on decreasing the probability of Type II errors, including the observation that p-values can be strongly influenced by sample size (Plonsky, Reference Plonsky and Plonsky2015; Plonsky & Oswald, Reference Plonsky and Oswald2014) and that low stakes outcomes should entail setting higher alpha levels (Norris, Reference Norris2015), for the Kruskal-Wallis H and Friedman tests the alpha level was set at 0.10. Footnote 11 For the post hoc Wilcoxon Signed-Rank tests, we used a Bonferroni adjustment making the revised alpha value 0.10/3 = .033 because three comparisons were carried out for each significant omnibus test result (Field, Reference Field2013). For interpreting the magnitude of change, we present Cohen’s d effect sizes for all paired comparisons. Effect sizes between tests, that is, within-subjects, were calculated in relation to the mean and standard deviation of the pretest as a baseline (and the posttest for effect sizes at delayed posttest). As within-subject effect sizes tend to be larger than between-subject, we also give effect sizes compared to the control group using the mean and standard deviation of the control group (Tables 5 and 7), with both the raw effect size and an adjusted effect size corrected for baseline differences (even though none of these differences was statistically significant; Plonsky & Oswald, Reference Plonsky and Oswald2014). Following Plonsky and Oswald (Reference Plonsky and Oswald2014), Cohen’s d field-specific benchmarks are used for interpretation: d = .40 (small), .70 (medium), and 1.00 (large). Confidence intervals for effect sizes are in the Supplementary Materials: Statistics (Tables 2 and 3).
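The alpha adjustment and effect size conventions above can be sketched as follows. The formula for d, (later mean minus baseline mean) divided by the baseline standard deviation, is our reading of "in relation to the mean and standard deviation of the pretest" and should be treated as an assumption; the strict cut-offs in `interpret_d` (with 1.00 taken as the large benchmark from Plonsky & Oswald, 2014) are likewise a simplification, and all names are hypothetical.

```python
from statistics import mean, stdev


def bonferroni_alpha(alpha, n_comparisons):
    """Divide the family-wise alpha equally across the planned comparisons."""
    return alpha / n_comparisons


def cohens_d_baseline(baseline, comparison):
    """Cohen's d standardized on the baseline scores (assumed formula)."""
    return (mean(comparison) - mean(baseline)) / stdev(baseline)


def interpret_d(d):
    """Label the magnitude of d with the .40 / .70 / 1.00 benchmarks."""
    size = abs(d)  # the sign only shows direction, e.g. faster RTs are negative
    if size >= 1.00:
        return "large"
    if size >= 0.70:
        return "medium"
    if size >= 0.40:
        return "small"
    return "negligible"


post_hoc_alpha = bonferroni_alpha(0.10, 3)  # three pairwise comparisons
```

With three comparisons, `post_hoc_alpha` is approximately .033, matching the value stated in the text.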
RESULTS
Results are presented first for habitual contexts, then for ongoing contexts, separately for matched and mismatched trials. After establishing baseline parity at pretest, we present change over time within each group on each test (CMT-Read, CMT-Listen, and SPR critical word and whole sentence) and effect sizes compared to the control group on each test. For the SPR, we present comparisons of matched versus mismatched trials in each group.
HABITUAL CONTEXTS
The accuracy of judgments (CMTs) and reaction times (SPR) are presented in Table 4. All participants achieved 100% accuracy in SPR comprehension questions. Clause ordering comparisons showed no significant effects (see Table 1, Supplementary Materials: Statistics).
Note. CMT-Listen = Context-matching listening test, CMT-Read = Context-matching reading test, SPR = Self-paced reading test, M = Mean, SD = Standard deviation, Max = Maximum, RT = Reaction time. a Reaction times for processing PC are for the auxiliary (patterns of results were the same for aux. + participle).
MATCHED TRIALS
Kruskal-Wallis tests revealed no statistically significant between-group differences at pretest in CMT-Listen (χ²(2) = .36, p = .84), CMT-Read (χ²(2) = .48, p = .77), SPR critical word (χ²(2) = .32, p = .85), and SPR whole sentence (χ²(2) = .15, p = .93).
For the control group, in all three tests, there were no statistically significant changes over time:
• CMT-Read, χ²(2) = 2.03, p = .36 (pre-post, d = .33; pre-delayed, d = -.65; post-delayed, d = -.26).
• CMT-Listen, χ²(2) = 1.35, p = .51 (pre-post, d = -.13; pre-delayed, d = -.13; post-delayed, d = .00).
• SPR critical word, χ²(2) = 2.63, p = .27 (pre-post, d = -.25; pre-delayed, d = -.29; post-delayed, d = -.05).
• SPR whole sentence, χ²(2) = 3.88, p = .14 (pre-post, d = -.63; pre-delayed, d = -.67; post-delayed, d = .03).
For the L2-only group:
• CMT-Read accuracy did not change over time (χ²(2) = .25, p = .88: pre-post, d = -.21; pre-delayed, d = .43; post-delayed, d = .51).
• CMT-Listen accuracy increased over time (χ²(2) = 7.28, p = .03), due to pre-post changes (Z = -2.16, p = .03, d = .69), but not pre-delayed (Z = -.51, p = .61, d = .51) or post-delayed (Z = -1.18, p = .24, d = .00).
• SPR critical word RTs were not significantly different, though effect sizes suggest possible trends (χ²(2) = 1.53, p = .47: pre-post, d = -.63; pre-delayed, d = -.67; post-delayed, d = -.03).
• SPR whole sentence RTs got faster over time (χ²(2) = 12.12, p = .00), pre-post (Z = -2.34, p = .02, d = -.55) and pre-delayed (Z = -2.72, p = .01, d = -.68), with minimal change post-delayed (Z = -1.16, p = .25, d = -.16).
For the L2+L1 group, we found significant improvement over time on all measures:
• CMT-Read (χ²(2) = 9.25, p = .01), with small effects pre-post (Z = -1.48, p = .14, d = .41), large for pre-delayed (Z = 2.69, p = .01, d = 1.67), and with small effects post-delayed (Z = -2.28, p = .02, d = .58).
• CMT-Listen (χ²(2) = 7.36, p = .03), with small effects pre-post (Z = -.91, p = .36, d = .35), large for pre-delayed (Z = -2.84, p = .00, d = 1.13), and medium effects post-delayed (Z = -1.62, p = .10, d = .74).
• SPR critical word (χ²(2) = 17.77, p = .00), pre-post (Z = -3.43, p = .00, d = -1.20) and pre-delayed (Z = -3.29, p = .00, d = -1.14), but not post-delayed (Z = -.59, p = .55, d = .02), with larger effect sizes than the L2-only group.
• SPR whole sentence (χ²(2) = 12.82, p = .00), for pre-post (Z = -3.24, p = .00, d = -1.06) and pre-delayed (Z = -3.19, p = .00, d = -.90), with no differences post-delayed (Z = -1.16, p = .25, d = .03).
Effect Sizes Compared to Control
As shown in Table 5, negligible effects were found between L2-only and control at post and delayed in CMT-Read, SPR whole sentence and critical word. In CMT-Listen, negligible effects were found at post and a small effect at delayed. In contrast, larger effects were found between L2+L1 and control: in CMT-Listen, medium effects at post and large at delayed; in CMT-Read, small at post and large at delayed; in SPR, for the critical word, large at post and medium at delayed, and for the whole sentence small at post and delayed.
MISMATCHED TRIALS
Kruskal-Wallis tests revealed no statistically significant between-group differences at pretest in CMT-Listen (χ²(2) = 1.22, p = .544), CMT-Read (χ²(2) = .01, p = .998), SPR whole sentence (χ²(2) = 1.29, p = .52), and SPR critical word (χ²(2) = .19, p = .91).
For the control group, there was no positive change over time:
• CMT-Read, χ²(2) = .93, p = .63 (pre-post, d = .12; pre-delayed, d = .10; post-delayed, d = -.02).
• CMT-Listen accuracy scores deteriorated (χ²(2) = 6.14, p = .05), pre-delayed (Z = -2.63, p = .02, d = -.84) and post-delayed (Z = -2.30, p = .02, d = -.63), but not pre-post (Z = -.79, p = .43, d = -.28).
• SPR critical word, χ²(2) = .00, p = 1.00 (pre-post, d = -.18; pre-delayed, d = -.18; post-delayed, d = -.06).
• SPR whole sentence, χ²(2) = 3.88, p = .14 (pre-post, d = -.49; pre-delayed, d = -.53; post-delayed, d = -.11).
For the L2-only group, performance did not change over time:
• CMT-Read, χ²(2) = .892, p = .89 (pre-post, d = .34; pre-delayed, d = .20; post-delayed, d = -.10).
• CMT-Listen, χ²(2) = 1.58, p = .45 (pre-post, d = -.18; pre-delayed, d = .18; post-delayed, d = .37).
• SPR critical word, χ²(2) = 1.41, p = .49 (pre-post, d = -.11; pre-delayed, d = -.16; post-delayed, d = -.09).
• SPR whole sentence, χ²(2) = 4.35, p = .11 (pre-post, d = -.78; pre-delayed, d = -.59; post-delayed, d = -.15).
For the L2+L1 group, improvement was observed for all tests:
• CMT-Read (χ²(2) = 9.27, p = .01): although negligible effects were found pre-post (Z = -1.09, p = .28, d = .37), large effects were found pre-delayed (Z = -3.00, p = .00, d = 1.65) and medium-to-large effects post-delayed (Z = -2.29, p = .02, d = .90), notable given that no participant received instruction post-delayed.
• CMT-Listen accuracy (χ²(2) = 12.41, p = .00), with negligible effects pre-post (Z = -.96, p = .34, d = .34), large effects pre-delayed (Z = -2.82, p = .01, d = 1.22), and medium-to-large effects post-delayed (Z = -2.87, p = .00, d = 1.02), despite no participant receiving instruction post-delayed.
• SPR critical word RTs also got faster (χ²(2) = 19.88, p = .00), with medium-large effect sizes pre-post (Z = -3.43, p = .00, d = -.85) and pre-delayed (Z = -3.57, p = .00, d = -1.09), but not post-delayed (Z = -.49, p = .62, d = -.34).
• SPR whole sentence RTs did not significantly change (χ²(2) = 3.29, p = .19: pre-post, d = -.56; pre-delayed, d = -.26; post-delayed, d = .26).
Effect Sizes Compared to Control
For the L2-only group, effects in CMT-Listen were negligible at post but large at delayed; in CMT-Read, effects were negligible; in SPR (critical word and whole sentence), effects were also negligible (see Table 5). For the L2+L1 group, effect sizes were, in CMT-Listen, large at post and delayed; in CMT-Read, small at post and large at delayed; in SPR, effects on the critical word were small at post and delayed, and for the whole sentence, small at post, but negligible at delayed.
The results for habitual trials (matched and mismatched) suggest patterns of greater accuracy and faster processing speeds following the L2+L1 training (corroborated by confidence intervals in Supplementary Materials: Statistics, Table 2). In terms of SPR results, we also note (a) larger effects for critical word RT in L2+L1 than for whole sentence RT; (b) effects for both critical word and whole sentence are larger for L2+L1 than for L2-only and control, suggesting a larger effect for the L2+L1 treatment on processing in general; and (c) any pre-delayed effects for L2-only and control found for whole sentence RT are larger than for critical word RT. Lastly, we do not see speed of processing effects after post.
Reading Times in Matched Versus Mismatched Trials
At pretest, no group performed significantly differently across different trial types:
• SPR critical word RT: control (Z = -1.66, p = .098, d = -.47), L2-only (Z = -1.44, p = .15, d = -.4), L2+L1 (Z = -1.30, p = .19, d = -.37).
• SPR whole sentence RT: control (Z = -.26, p = .79, d = -.08), L2-only (Z = -1.30, p = .19, d = -.37), L2+L1 (Z = -1.54, p = .124, d = -.45).
However, the L2+L1 group’s RTs were significantly slower in mismatched compared to matched trials at both post and delayed for critical word (post, Z = -2.72, p = .01, d = -.82; delayed, Z = -2.39, p = .02, d = -.82) and whole sentence (post, Z = -1.97, p = .05, d = -.58; delayed, Z = -2.01, p = .04, d = -.58). In contrast, we found no between-trial differences in the L2-only and control groups:
• SPR critical word
○ L2-only: post (Z = -.02, p = .98, d = -.01), delayed (Z = -.40, p = .69, d = -.12)
○ Control: post (Z = -.67, p = .50, d = -.19), delayed (Z = -.57, p = .61, d = -.16)
• SPR whole sentence
○ L2-only: post (Z = -.31, p = .76, d = -.08), delayed (Z = -1.25, p = .21, d = -.37)
○ Control: post (Z = -1.18, p = .26, d = -.35), delayed (Z = -.155, p = .88, d = -.04)
ONGOING CONTEXTS
Accuracy of judgment (CMTs) and reaction times (SPR) are presented in Table 6. All groups achieved 100% accuracy in SPR comprehension questions. Clause ordering comparisons showed no significant effects (Supplementary Materials: Statistics, Table 1).
Note. CMT-Listen = Context-matching listening test, CMT-Read = Context-matching reading test, SPR = Self-paced reading test, M = Mean, SD = Standard deviation, Max = Maximum, RT = Reaction time. a Reaction times for processing PC are for the auxiliary (patterns of results were the same for aux. + participle).
MATCHED TRIALS
There were no significant between-group differences at pretest on any measure (CMT-Read, χ²(2) = .81, p = .67; CMT-Listen, χ²(2) = .07, p = .96; SPR critical word, χ²(2) = .04, p = .98; and SPR whole sentence, χ²(2) = .01, p = .99).
For the control group, there was no significant change over time on any measure:
• CMT-Listen, χ²(2) = 3.05, p = .22 (pre-post, d = .20; pre-delayed, d = -.22; post-delayed, d = -.44).
• CMT-Read, χ²(2) = .26, p = .88 (pre-post, d = .20; pre-delayed, d = .16; post-delayed, d = -.05).
• SPR critical word, χ²(2) = 3.88, p = .14 (pre-post, d = -.63; pre-delayed, d = -.47; post-delayed, d = .02).
• SPR whole sentence, χ²(2) = .50, p = .78 (pre-post, d = -.56; pre-delayed, d = -.54; post-delayed, d = .09).
For the L2-only group, there was significant change pre-post in the CMTs and SPR critical word, with very little improvement pre-delayed:
• CMT-Read (χ²(2) = 12.00, p = .00), improved pre-post (Z = -2.76, p = .01, d = 1.02) and pre-delayed (Z = -2.77, p = .01, d = 1.08), but not post-delayed (Z = -.09, p = .93, d = .00).
• CMT-Listen (χ²(2) = 3.18, p = .20), improved pre-post (Z = -2.303, p = .02, d = .75), but not pre-delayed (Z = -.88, p = .38, d = .37) or post-delayed (Z = -.70, p = .48, d = -.24).
• SPR critical word (χ²(2) = 10.71, p = .01), RTs quickened pre-post (Z = -2.72, p = .01, d = -.98), but not pre-delayed (Z = -1.97, p = .05, d = -.34), with no change post-delayed (Z = -1.34, p = .18, d = -.54).
• SPR whole sentence RTs did not change over time (χ²(2) = 4.24, p = .12: pre-post, d = -.19; pre-delayed, d = -.53; post-delayed, d = -.39).
For the L2+L1 group, performance improved on all measures:
• CMT-Read (χ²(2) = 27.09, p = .00), pre-post (Z = -3.413, p = .00, d = 1.67) and pre-delayed (Z = -3.415, p = .00, d = 1.89), but not post-delayed (Z = -1.29, p = .19, d = .35).
• For CMT-Listen (χ²(2) = 11.83, p = .00), small gains were found pre-post (Z = -1.74, p = .08, d = .54), large gains pre-delayed (Z = -3.25, p = .00, d = 1.72), with some gains post-delayed (Z = -1.89, p = .06, d = .72).
• SPR critical word RTs got faster over time (χ²(2) = 16.33, p = .00): pre-post (Z = -3.48, p = .00, d = -1.37), pre-delayed (Z = -3.10, p = .00, d = -1.20), but not post-delayed (Z = -.36, p = .72, d = .15).
• SPR whole sentence RTs also got faster over time (χ²(2) = 15.18, p = .00), pre-post (Z = -3.19, p = .00, d = -1.22), pre-delayed (Z = -3.34, p = .00, d = -1.04), but not post-delayed (Z = -1.07, p = .29, d = .09).
Effect Sizes Compared to Control
As shown in Table 7, for L2-only in CMT-Listen effects were small at post and delayed; in CMT-Read, medium at post and large at delayed. In SPR, there were negligible effects at post and delayed for critical word and whole sentence RTs. In contrast, for the L2+L1 group we found larger effects: in CMT-Listen, effects were small at post and large at delayed; in CMT-Read, large at post and delayed; in SPR, effects were large for the critical word at post and delayed and medium for whole sentence at both post and delayed.
MISMATCHED TRIALS
There were no significant between-group differences at pretest (CMT-Listen, χ²(2) = .05, p = .974; CMT-Read, χ²(2) = 1.14, p = .57; SPR critical word, χ²(2) = .18, p = .91; SPR whole sentence, χ²(2) = 1.03, p = .59).
For the control group, performance did not improve over time:
• CMT-Read, χ²(2) = .45, p = .79 (pre-post, d = .39; pre-delayed, d = .26; post-delayed, d = -.09).
• CMT-Listen accuracy deteriorated (χ²(2) = 7.17, p = .03): pre-delayed (Z = -2.66, p = .01, d = -.71) but not pre-post (Z = -.91, p = .36, d = -.29) or post-delayed (Z = -.51, p = .61, d = -.40).
• SPR critical word, χ²(2) = .88, p = .65 (pre-post, d = -.12; pre-delayed, d = .10; post-delayed, d = .18).
• SPR whole sentence, χ²(2) = 3.50, p = .17 (pre-post, d = -.28; pre-delayed, d = -.52; post-delayed, d = -.17).
For the L2-only group, we found no change over time on any measure:
• CMT-Read accuracy, χ²(2) = .43, p = .81 (pre-post, d = .19; pre-delayed, d = .31; post-delayed, d = .09).
• CMT-Listen accuracy, χ²(2) = 2.71, p = .26 (pre-post, d = .47; pre-delayed, d = .14; post-delayed, d = -.36).
• SPR critical word, χ²(2) = 1.88, p = .39 (pre-post, d = -.20; pre-delayed, d = -.19; post-delayed, d = -.01).
• SPR whole sentence, χ²(2) = .35, p = .84 (pre-post, d = -.52; pre-delayed, d = -.40; post-delayed, d = .10).
For the L2+L1 group, we found significant improvement over time in all tests:
• CMT-Read (χ²(2) = 14.30, p = .00), medium effects for pre-post (Z = -2.88, p = .00, d = .93) and pre-delayed (Z = -3.24, p = .00, d = 1.46), with small post-delayed differences (Z = -1.96, p = .05, d = .43).
• CMT-Listen (χ²(2) = 11.38, p = .00), accuracy increased pre-delayed (Z = -2.96, p = .00, d = 1.27), but not pre-post (Z = -1.22, p = .22, d = .42), with a medium effect post-delayed (Z = -1.62, p = .11, d = .71).
• SPR critical word RTs got faster (χ²(2) = 22.59, p = .00) at pre-post (Z = -3.57, p = .00, d = -1.01) and pre-delayed (Z = -3.62, p = .00, d = -.98), but not post-delayed (Z = -.639, p = .52, d = .04).
• SPR whole sentence RTs did not change over time (χ²(2) = 4.24, p = .12: pre-post, d = -.49; pre-delayed, d = -.28; post-delayed, d = .17).
Effect Sizes Compared to Control
For the L2-only group, medium effect sizes were found in CMT-Listen at both post and delayed, with negligible effects found in CMT-Read; for SPR, there were no effects for either critical word or whole sentence processing (see Table 7). For the L2+L1 group, in CMT-Listen effect sizes were small at post and large at delayed; in CMT-Read, effects were negligible at post, but medium at delayed; in SPR, effect sizes for the critical word were medium at both post and delayed, and effects were negligible for the whole sentence.
In these comparisons to control, in matched and mismatched trials, effects seemed larger in the L2+L1 group for accuracy and processing speed. Effect sizes for RTs on the critical word seemed particularly affected, and were larger than for the whole sentence, whereas the L2-only group’s effects on both critical word and whole sentence were negligible, suggesting that an across-the-board increase in processing speed due to test familiarity cannot adequately explain the results. (See also confidence intervals in Supplementary Materials: Statistics, Table 3.) For example, for L2+L1, pre-delayed effect sizes are larger for critical word RT than for whole sentence, and SPR effect sizes for L2+L1 are consistently larger than for both L2-only and control, suggesting an advantage for the L2+L1 treatment. Lastly, we do not see speed of processing effects after posttest (except for L2-only in ongoing match).
Reading Times in Matched Versus Mismatched Trials
At pretest, Wilcoxon Signed-Rank tests revealed no significant between-trial RT differences in any group:
• SPR critical word: control (Z = -.36, p = .72, d = -.10), L2-only (Z = -.17, p = .87, d = -.04), L2+L1 (Z = -.59, p = .55, d = -.16).
• SPR whole sentence: control (Z = -.26, p = .77, d = -.08), L2-only (Z = -.88, p = .38, d = -.24), L2+L1 (Z = -1.44, p = .15, d = -.41).
However, after the intervention, the L2+L1 group’s processing was significantly slower in mismatched than in matched trials at both post (critical word, Z = -2.40, p = .02, d = -.72; whole sentence, Z = -2.49, p = .01, d = -.75) and delayed (critical word, Z = -2.49, p = .01, d = -.75; whole sentence, Z = -2.58, p = .01, d = -.77). In contrast, no between-trial type differences were found for the L2-only and control groups:
• SPR critical word
○ L2-only: post (Z = -1.16, p = .25, d = -.32), delayed (Z = -1.11, p = .27, d = -.32)
○ Control: post (Z = -.78, p = .44, d = -.24), delayed (Z = -.16, p = .88, d = -.24)
• SPR whole sentence
○ L2-only: post (Z = -.49, p = .62, d = -.14), delayed (Z = -1.35, p = .18, d = -.39)
○ Control: post (Z = -.31, p = .76, d = -.08), delayed (Z = -.21, p = .84, d = -.06)
DISCUSSION
We examined the extent to which EI with task-essential form-meaning practice influenced learners’ online processing and offline interpretation of L2 French morphosyntax for past habituality and ongoingness. An L2-only group received L2 EI with task-essential form-meaning practice. An L2+L1 group received the same, with additional L1 EI with task-essential form-meaning practice.
Summary of Findings
Our expectation that L2 EI with practice would result in learning gains, whereas the control condition would not, was partially supported. The control group showed no significant improvement over time (and performed worse in the CMT-Listen pre-delayed, mismatched, habitual and ongoing). For the L2-only group, some improvements were found, although these were mostly in offline tasks and limited: development up to delayed posttest was maintained only in CMT-Read (ongoing, matched); short-term gains made in CMT-Listen (habitual, matched) and SPR (ongoing, matched) were lost by delayed posttest; and negligible-to-no gains were made in mismatched contexts in all tests. This is somewhat inconsistent with previous research, which has shown clear benefits for task-essential form-meaning mapping practice (e.g., Marsden, Reference Marsden2006; Marsden & Chen, Reference Marsden and Chen2011; VanPatten, Reference VanPatten2002). This discrepancy may relate to the fact that the SPR online measure has not been used in previous form-meaning practice studies. It could be that the interclausal nature of the processing problem, and/or the specific cross-linguistic properties, moderated the benefits of this instructional treatment.
Our expectation that the L2+L1 EI and task-essential practice would result in development was supported, in all outcome measures at six weeks after the intervention. Learners’ interpretation of IMP had improved at delayed posttest for both habitual and ongoing contexts, in matched and mismatched trials, and in reading and listening CMTs. They also became faster at distinguishing habitual and ongoing meanings of IMP from pretest to posttest, a gain that was maintained at delayed posttest, with larger and longer lasting effect sizes than the L2-only group. Additionally, the L2+L1 group demonstrated increased sensitivity to incorrect (mismatch) compared to correct (match) usage of IMP. At pretest, all groups’ processing speeds were similar in matched and mismatched trials, indicating a lack of online sensitivity. Only the L2+L1 group’s processing in mismatched trials became significantly slower relative to matched trials at post and delayed posttest. In contrast, there continued to be no between-trial differences for L2-only and control. These results are in line with findings about sensitivity to the processing cost of anomalous aspectual distinctions (Roberts & Liszka, Reference Roberts and Liszka2013) and other violations (Leung & Williams, Reference Leung and Williams2014; Marinis et al., Reference Marinis, Roberts, Felser and Clahsen2005; Tokowicz & Warren, Reference Tokowicz and Warren2010).
Because participants received no instruction at all between posttest and delayed posttest, these observations are most plausibly ascribable to the intervention delivered between pretest and posttest.
Learning Mechanisms Potentially at Play
Providing learners with EI about and practice in L1 form-meaning mappings may have helped establish, or strengthen, conceptual representations of habituality and ongoingness. In turn, this may have facilitated the strengthening, through L2 instruction, of L2 mappings for these concepts. By practising these (re)mappings, declarative knowledge of new processing routines may have been proceduralized and automatized, reflected in faster online processing (DeKeyser, Reference DeKeyser, VanPatten and Williams2015). That is, the L1 EI clarified concepts and form-meaning mappings; the L1 practice reinforced these concepts and mappings; and the L2 EI and interspersed practice strengthened the mappings between these (now better represented) concepts and French forms. Our data are largely consistent with evidence of difficulties created by cross-linguistic form-meaning mapping differences (Izquierdo & Collins, Reference Izquierdo and Collins2008; McManus, Reference McManus2013, Reference McManus2015).
Observation of Online Effects
Our findings from SPR may be accounted for in various ways. The L2+L1 treatment could have reduced reliance on L1-based processing strategies for interpreting aspect (which do not require co-indexation with interclausal morphosyntax) and routinized L2 processing strategies that do require interclausal co-indexation. Indeed, the interweaving of the English and French practice items may have promoted some coactivation of French morphosyntax when, at test, English contexts were read. Thus, after reading the English, the L2+L1 group were readier to make a fast decision about the expected French morphosyntax. Clearer conceptual distinctions between habituality and ongoingness could have given anticipatory benefits for accessing the appropriate French in the stimuli, for example, increasing the speed of processing interclausal forms required for habituality. These findings are broadly compatible with existing evidence about L1 effects during L2 processing (e.g., Sanoudaki & Thierry, Reference Sanoudaki, Thierry, Thomas and Mennen2014; Tolentino & Tokowicz, Reference Tolentino and Tokowicz2011). However, further research is required to examine whether (a) this coactivation is unique to mixed language tests (our CMTs and SPR both included an L1 context and L2 stimulus) and (b) current findings would hold for tests only presented in the L2. To this end, preliminary findings from spoken narrative tests (beyond the scope of the current article) show some similar patterns of results in tests that do not provide an L1 context, suggesting that L1-L2 coactivation is not entirely restricted to mixed language testing contexts. Further research should also examine whether comprehension and SPR tests that are only in the L2 would pattern similarly.
However, our observations of online effects contrast with findings from Andringa and Curcic (Reference Andringa and Curcic2015) and Marsden et al. (Reference Marsden, Williams and Lui2013). A number of reasons could explain this difference: our provision of L1 EI and practice; our fuller EI about the L2 (compared to Andringa & Curcic’s brief EI containing two examples, and Marsden et al.’s yes/no feedback); our longer practice with its task-essential form-meaning mappings; our different outcome measures (SPR here vs. anticipatory eye movements in Andringa & Curcic and cross-modal priming in Marsden et al.); and, finally, our different target features and languages (a more established, natural lexicon here vs. a novel artificial lexicon in Andringa & Curcic and Marsden et al.).
In sum, our findings suggest that providing L2+L1 EI and task-essential form-meaning practice for a feature exhibiting complex L1-L2 differences resulted in L2 performance that appeared to benefit from L1 knowledge, rather than being adversely affected by it. Compatible with previous studies that explored teaching/knowledge about the L1 for L2 learning (Horst et al., Reference Horst, White and Bell2010; Kupferberg, Reference Kupferberg1999; Spada et al., Reference Spada, Lightbown, White, Housen and Pierrard2005), it seems likely that our L2+L1 treatment was beneficial (and more reliably so than L2-only) because of the specific nature of the learning problem, that is, L1-L2 form-meaning mapping differences. Our evidence therefore supports the points raised by the opening quotation (Henry et al., Reference Henry, Culman and VanPatten2009): that the effectiveness of EI seems dependent on the nature of the EI, the target structure, and processing problem.
LIMITATIONS AND FUTURE RESEARCH
First, we emphasize that our findings and accounts of learning are tentative, given our small sample sizes.
As described in the preceding text, the English exemplars in the L2+L1 condition were in addition to the L2-only treatment, and this difference in exposure requires interpretation. We consider it highly unlikely that, for English speakers who already have 19 years of exposure to English as L1 speakers, mere exposure to the English exemplars would affect L2 French online or offline performance. Furthermore, as described in the preceding text, English habituality and ongoingness are expressed with entirely different morphology to French (played => jouait and a joué; was playing => jouait; used to play/would play => jouait). Thus, additional exposure to English alone is unlikely to change the processing of French. We suspect, therefore, that the combination of the EI about English and practice in interpreting the English and French resulted in stronger conceptualizations, and more accurate and faster (re)mapped interpretations of L2 French IMP. However, further investigation, including replication, is required to isolate the role of the English EI from the English practice in order to understand the contribution of each to learning.
This data set clearly offers opportunities for further analyses that are beyond the scope of this article, such as detailed statistical analyses on the effects of lexical verb type and grammatical/viewpoint aspect, in line with research informed by the Aspect Hypothesis (Andersen & Shirai, Reference Andersen and Shirai1994). We found tentative support for previous evidence that habituality in French seems more difficult than ongoingness for L1 English learners (Howard, Reference Howard, Labeau and Larrivée2005; McManus, Reference McManus2013, Reference McManus2015), as, descriptively, pretest scores were slightly lower for habituality. However, we find no clear patterns of more improved performance in habituality versus ongoingness as a result of either treatment. That is, the effect sizes on our interpretation measures did not suggest more benefits for one meaning of IMP than the other, although further research is required.
This study has provided some evidence that EI about the L1 with interpretation practice had benefits on both offline and online measures, at least for our feature with cross-linguistic form-meaning differences. However, further research is required, with larger sample sizes, to ascertain whether this type of L2+L1 instruction would be as beneficial for different language features, including syntactic phenomena with and without referential meaning, for different L1-L2 combinations, and for different L2 proficiency levels. In particular, further research should consider (a) the influence of different types of EI and practice on processing over time and (b) the relationship of the L1-L2 morphosyntax and the extent to which types of EI and practice interface with this, for online and offline behavior. It is also important to investigate how changes in online and offline interpretation behavior relate to the development of production.
SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit https://doi.org/10.1017/S027226311600022X.