1. Introduction
The English phenomenon of interest here is wanna, a contracted form of want to. What makes it interesting is that wanna contraction is sometimes possible, as in (1), but not always, as in (2) (Lakoff, 1970).
(1)
a. Who do you [want to, wanna] kiss ___?
b. Who do you [want to, wanna] dive with ___? (Zukowski & Larsen, 2011, p. 213, (1a)–(1b))
(2)
a. Who do you [want ___ to, *wanna] kiss you?
b. Who do you [want ___ to, *wanna] dive first? (Zukowski & Larsen, 2011, p. 213, (2))
This contrast can be framed in terms of subject wh-questions versus non-subject wh-questions. Whereas contraction between want and to is impossible if the question is about the subject of the to-infinitival clause (e.g., (2)), such contraction is possible if the question is about any other constituent within the to-infinitival clause, such as the direct object of a verb (e.g., (1a)) or the object of a preposition (e.g., (1b)).
The contrast at issue has been explained by several theoretical accounts, including the wh-trace account (Chomsky, 1977; Chomsky & Lasnik, 1977; Lightfoot, 1976), the subject-sharing account (Postal & Pullum, 1982), the clitic climbing account (Goodall, 1991), and the emergentist account (O'Grady et al., 2008).[Footnote 1] (For a constructivist approach, see Boas, 2004.[Footnote 2]) The most widely accepted of these is the wh-trace account, which attributes the wanna contrast to the blocking effect of wh-movement. Specifically, the sentences in (2) involve extraction of the ‘subject’ of the to-infinitival complement from the position intervening between want and to to the front; this extraction leaves the trace of the subject (indicated by ___) in the original position, thus blocking the contraction of want and to over the trace. By contrast, the sentences in (1) involve extraction of non-subject constituents within the to-infinitival clause from positions other than between want and to to the front; hence, these cases do not give rise to a blocking effect. As such, there is a structurally-based constraint on wanna contraction.
Knowledge of the constraint on wanna contraction has been demonstrated in adult native language (L1) speakers of English, most clearly in comprehension. While production experiments using elicited production tasks or oral repair tasks have shown overgeneralized use of wanna contraction even over an intervening wh-trace (Kweon & Bley-Vroman, 2011; Zukowski & Larsen, 2011; see also Pullum, 1997), comprehension experiments using grammaticality judgment tasks have shown that adult native speakers accept possible wanna sentences (e.g., (1)) and reject impossible wanna sentences (e.g., (2)) (Ito, 2018; Kweon & Bley-Vroman, 2011).
When it comes to acquisition, however, the constraint on wanna contraction raises learnability challenges for both English-speaking children and second language (L2) learners of English. First, traces are not phonetically realized and so they are not observable in the input that learners are exposed to. Furthermore, it is unlikely that learners would be able to learn the target contrast (and particularly the ungrammaticality of subject questions involving wanna contraction) from input alone because subject questions have been found to be nearly absent in input. In an analysis of 700,000 adult utterances from the CHILDES database (MacWhinney, 2000), Zukowski and Larsen (2011) found 842 non-subject questions, including 576 questions using want to and 266 using wanna. However, they found only 12 subject questions, of which three were illicit (e.g., What do you wanna go in our car?). More crucially, learners do not receive negative evidence that would help them disallow illicit wanna contraction for subject questions. The same learnability issue also holds for L2ers of English (a) whose L1 does not have a wanna phenomenon, such as Japanese or Korean, and (b) who do not receive explicit instruction on wanna contraction in L2 classrooms.
In sum, the grammatical constraint on wanna contraction involves a learnability issue. Although a number of empirical studies on wanna contraction have attempted to investigate whether knowledge of the constraint might exist among both L1-English children (e.g., Crain & Thornton, 1998; Getz, 2019; Thornton, 1990; Zukowski & Larsen, 2011) and L2ers of English (e.g., Ito, 2018; Kweon, 2000; Kweon & Bley-Vroman, 2011; Witzel & Witzel, 2008), their findings have been inconsistent, making further work necessary.
Furthermore, previous acquisition studies have had some methodological issues. For example, the majority of them rely on elicited production tasks (cf. Ito, 2018; Kweon, 2000; Kweon & Bley-Vroman, 2011), which impose more cognitive burden on learners than comprehension tasks (Ionin & Zyzik, 2014) and are not as sensitive to the nuances of grammar as acceptability judgment tasks (Kweon & Bley-Vroman, 2011). The sentences tested in these studies are also rather complex, involving do-insertion as well as wh-movement. To address these issues, this study employs an acceptability judgment task created to be child-friendly and less cognitively demanding. The stimulus sentences are presented both aurally and in written form, and each is played twice in the task (e.g., Deen et al., 2018). To minimize the processing burden on the participants, both the words and structures of the stimulus sentences are as simple as possible (e.g., Friedmann et al., 2009). The four conditions and example items are shown in (3)–(6); (5) and (6) serve as baseline conditions.
(3) Who + Gap (object question, k = 5): I wonder who you wanna work with.
(4) *Who + No gap (subject question, k = 5): *I wonder who you wanna work.
(5) *If + Gap (k = 5): *I wonder if you wanna work with.
(6) If + No gap (k = 5): I wonder if you wanna work.
Another important issue in previous L2 research on wanna contraction is that it has focused solely on adult L2ers. To expand the current body of research, this study tests both child and adult L2ers of English. What makes looking at both groups particularly interesting is that these two groups differ in cognitive resources and age of acquisition. For example, children are known to have fewer working memory resources than adults (Kharitonova et al., 2015; Swanson et al., 2006). Because sentences containing wanna contraction are syntactically complex, they may impose too much processing burden on L2ers, and especially on child L2ers. A possible result is that whereas adult L2ers will show target-like knowledge of the constraint on wanna contraction, the child L2ers will fail to show such knowledge. Alternatively, the critical period hypothesis (Abrahamsson & Hyltenstam, 2009; Lenneberg, 1967) suggests that successful acquisition of L2 morphosyntax is no longer possible after the age of about 7 years (e.g., Johnson & Newport, 1989), which would give child L2ers an advantage over adult L2ers in acquiring wanna contraction and thus yield the opposite pattern of results. Therefore, testing child L2ers of English and adult L2ers of English as well as L1-English children will provide us with a deeper understanding of how these groups compare with respect to the acquisition of a structurally-based constraint.
2. L1 acquisition research on wanna contraction
The first work on wanna contraction in L1 acquisition is Thornton's (1990) study (also reported in Crain & Thornton, 1998). In an elicited production task, she led 14 young children (age: 3;6–5;5) to produce wh-questions for a puppet who, the children were told, was too shy to talk to adults. The task had two conditions manipulated by prompts. The non-subject question condition asked about the direct object, the object of a preposition, or the second object in a double object construction in the to-infinitive complement of want, as in (7).
(7) The rat looks kind of hungry. I bet he wants to eat something. Ask him what.
(Crain & Thornton, 1998, p. 181)
The subject question condition asked about the subject of the to-infinitive complement of want, as in (8).
(8) In this game, there's a baby, a dog, and Cookie Monster, OK? And some different things are going to happen, and the rat gets to choose who gets to do different things. Now, one of these guys gets to take a walk, one of these guys gets to take a nap, and one of these guys gets to eat a cookie, right? Let's do the cookie first. So, one of these guys gets to eat a cookie, right? Ask the rat who he wants.
(Crain & Thornton, 1998, p. 182)
The results showed that children as young as 3;6 had target-like knowledge of the constraint on wanna contraction. Whereas they used (licit) wanna contraction in 60 out of 68 non-subject questions (88%), they used (illicit) wanna contraction in only 6 out of 74 subject questions (8%). These results indicate the presence of knowledge of possible versus impossible wanna contraction in young children.
However, Zukowski and Larsen (2011) pointed out that Thornton's (1990) prompts include a confound. Unlike the non-subject question prompts (e.g., Ask him what he wants to eat ___ in (7)), the subject question prompts (e.g., Ask the rat who he wants ___ to eat a cookie in (8)) have the ellipsis in the position of the wh-trace, where a blocking effect arises. According to Zukowski and Larsen, the study's use of such prompts may have led to the extremely low rates of wanna contraction in the children's subject questions. To address this confound, they modified Thornton's method in their Experiment 1 by using the full forms of embedded questions for both non-subject questions (e.g., Ask him what he wants to eat) and subject questions (e.g., Ask the rat who he wants to eat a cookie). (In Experiment 2, they indeed found an effect of the form of questions, which this paper will not discuss due to space limitations.) In addition, they varied critical questions by having two types of wh-phrase (what, who), which resulted in four conditions with five items each. Two groups, (a) children (n = 13; age 4;00 to 7;03) and (b) adult controls (n = 14), completed the modified elicited-production task.
Zukowski and Larsen (2011) did not find a significant effect of wh-phrase in either group. Their individual analysis showed that 12 adults (86%) made one or fewer wanna violations out of 10 and only one adult (7%) made four or more violations. By contrast, only two children (15%) produced one or fewer violations and seven (54%) made four or more wanna violations out of 10. This result was taken by the researchers to cast doubt on the claim that young children do have categorical knowledge of possible versus impossible wanna contraction.
However, Zukowski and Larsen's (2011) statistical analysis did not reveal any group-related interaction effect (i.e., group × question type, group × wh-phrase, group × wh-phrase × question type), which makes it hard to claim a significant difference between children and adults in how they treated possible and impossible wanna contraction. In addition, the two groups exhibited somewhat similar patterns in that they both used wanna contraction far less often in (illicit) subject questions than in (licit) non-subject questions.
Another study modeled on Thornton's (1990) was conducted by Getz (2019), who carried out two production experiments that varied by the lexical frequency of the embedded verbs (Experiment 1: low-frequency, medium-frequency; Experiment 2: high-frequency). Both experiments included younger children (Experiment 1: n = 12; age 3;09–4;10; Experiment 2: n = 10; age 3;09–4;09) and older children (Experiment 1: n = 14; age 5;03–7;03; Experiment 2: n = 14; age 5;00–7;03). Getz hypothesized that if the wanna constraint is purely structural, children should not produce the impossible wanna contraction whether the frequency of the embedded verbs is high (e.g., do) or low (e.g., explore). Overall, the results showed that the children's use of wanna was not adult-like regardless of their age. Furthermore, their accuracy differed depending on the frequency of the embedded verbs only in the context of the impossible wanna contraction: their use of the impossible wanna contraction decreased with higher-frequency embedded verbs.
Based on these results, Getz (2019) argued that a learning procedure for wanna contraction is necessary because “children apparently do not have access to [the structural constraint on it]” (p. 136). (To preview, our results, however, show target-like knowledge of wanna contraction in young children.) She proposed that children may learn the contrast in wanna contraction by storing instances of (a) wanna + embedded verb and (b) want ___ to + embedded verb as separate item-based constructions and keeping track of the different distributions of (a) and (b), which would lead to the adult-like generalization at a later stage of language acquisition. Although her study offers novel perspectives on the acquisition of wanna contraction, it raises a few concerns. First, lexical frequency was treated as categorical (high, medium, low) rather than continuous. Second, the embedded verbs were not identical across the critical conditions. For example, whereas the low-frequency verbs for the possible condition were carry, meet, share, surprise, paint, and visit, those for the impossible condition were cover, explore, fly, grow, hide, and kick. To address this issue, we used the same embedded verbs across the conditions.
More crucially, it remains unclear why Getz (2019) observed a frequency effect only for the impossible wanna contraction. If children's learning mechanisms rely on the storage of particular item-based patterns and thus their frequency, then one would also expect this frequency effect to arise for the possible wanna contraction. Getz argued that “[h]igher-frequency verbs will be more likely to be encountered (hence represented) in a particular construction, so these verbs will be correctly acquired earlier” (p. 136); if so, it is conceivable that at least younger children, who are still in the process of generalizing the item-based patterns, would show a frequency effect in the possible wanna contraction, as well.
To summarize, whereas Thornton's (1990) child data suggest intact knowledge of the grammaticality contrast in wanna contraction in children as young as 3;6, Getz's (2019) data suggest the lack of such knowledge in children. Zukowski and Larsen's (2011) data are not clear enough to draw a firm conclusion. This discrepancy in previous literature highlights the need for further research.
3. L2 acquisition research on wanna contraction
Whether L2ers come to know the constraint on wanna contraction is controversial. Some studies report that they do so, or seem to do so (Ito, 2018; Witzel & Witzel, 2008), while other studies demonstrate that they do not (Kweon, 2000; Kweon & Bley-Vroman, 2011).
One of the most influential L2 studies on wanna contraction is Kweon and Bley-Vroman (2011; see also Kweon, 2000). They administered three tasks to 39 native speakers of English and 104 L1-Korean L2ers of English, whose proficiency the authors deemed ‘advanced’ based on, for example, their TOEFL scores of 550 or higher. However, it may not be fair to group all 104 L2ers as advanced, given that, for example, the 50th percentile of TOEFL scores falls between 550 and 589. The first task was an elicited production task based on Thornton's (1990) study, which was designed to elicit six subject questions as well as six object questions. Note that it is in the subject questions that wanna contraction is impossible. The second task was an oral repair task where participants heard five subject questions (e.g., *Who do you wanna help you with your homework?) and five object questions (e.g., What kind of book do you wanna buy in the book store?); after hearing each question, they were asked to repeat it while rephrasing anything that sounded unnatural. The last task was a grammaticality judgment task in which participants rated 10 subject questions (e.g., *Who do you think the boys wanna go to the beach?) and 10 object questions (e.g., What do you think they wanna see in Hawai‘i?) on a four-point Likert scale (0: absolutely impossible; 1: probably impossible; 2: probably possible; 3: absolutely possible).
For data analysis, Kweon and Bley-Vroman (2011) proposed four categories to classify participants, as shown in Appendix S1 in Supplementary Materials. Participants who did not use/accept wanna in either subject or object questions were classified into the CONSERVATIVE group; those who used/accepted wanna only in object questions were classified into the CORRECTLY DIFFERENTIAL group; those who used/accepted wanna only in subject questions were classified into the BACKWARD group; and those who used/accepted wanna in both subject and object questions were classified into the OVERGENERAL group. In order for wanna contraction to be considered part of the L2 grammar for each of the subject and object questions, a cut-off point was set at (a) more than one instance of wanna for the elicited production task and the oral repair task and (b) an average score of 2 for the grammaticality judgment task. Critically, Kweon and Bley-Vroman considered the difference between CORRECTLY DIFFERENTIAL and BACKWARD as a key indicator of the grammaticality contrast between subject and object questions with wanna contraction. In particular, those sensitive to this distinction can never fall into the category of BACKWARD.
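The four-way scheme lends itself to a short illustration. The following Python sketch is not Kweon and Bley-Vroman's own procedure: the function names are hypothetical, and reading "an average score of 2" as "a mean rating of at least 2" is an assumption made only for the example.

```python
def uses_wanna_in_production(wanna_count: int) -> bool:
    """Production and oral repair cut-off: more than one instance of wanna."""
    return wanna_count > 1


def accepts_wanna_in_judgment(mean_rating: float) -> bool:
    """Judgment cut-off on the 0-3 scale.

    Assumption: 'an average score of 2' is read as a mean of at least 2.
    """
    return mean_rating >= 2


def classify(subject_ok: bool, object_ok: bool) -> str:
    """Map use/acceptance of wanna in subject and object questions to a category."""
    if object_ok and not subject_ok:
        return "CORRECTLY DIFFERENTIAL"
    if subject_ok and not object_ok:
        return "BACKWARD"
    if subject_ok and object_ok:
        return "OVERGENERAL"
    return "CONSERVATIVE"


# Hypothetical participant: wanna in 4 object questions and 1 subject question.
category = classify(
    subject_ok=uses_wanna_in_production(1),
    object_ok=uses_wanna_in_production(4),
)
print(category)
```

Because each participant receives exactly one label from two binary decisions, the outcome depends heavily on where the cut-off is placed, which is the concern raised about this scheme later in this section.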
The results showed obvious differences between native speakers of English and L1-Korean L2ers of English across the three tasks (see Appendix S2 in Supplementary Materials). In all three tasks, the majority of the native English speakers fell into the CORRECTLY DIFFERENTIAL category although many of them were also classified as CONSERVATIVE and OVERGENERAL in the elicited production task and the oral repair task. However, none of the native English speakers were BACKWARD. In the case of the L2ers, there is some suggestive evidence that learners’ grammars do distinguish the subject and object questions: for all tasks, a greater number of L2ers are in the CORRECTLY DIFFERENTIAL category than in the BACKWARD category. This difference, however, approached statistical significance only in the oral repair data and the grammaticality judgment data, not in the elicited production data. Furthermore, at least some L2ers (n = 2–14) fell into the BACKWARD category in every task.
One thing to note about Kweon and Bley-Vroman's (2011) categorization scheme, which has been established as a paradigm for examining individual grammar in follow-up L2 studies, is that the categories CONSERVATIVE and OVERGENERAL do not address the availability of the constraint on wanna contraction in learner grammar (p. 210). In fact, there seem to be many native English speakers who are liberal in the use of wanna in their production, overgeneralizing it to any question type (see also Pullum, 1997). Also problematic is what constitutes a valid cut-off criterion to classify contractors and non-contractors. Furthermore, the categorization scheme cannot take response bias into account. This study addresses all these issues by proposing a sensitivity score for wanna contraction (see Section 4.4); this score offers its own advantages as it is a continuous measure that effectively addresses response bias and provides a clearer understanding, compared to other sensitivity measures like d′ scores (Wickens, 2002), of the extent to which individuals differentiate possible from impossible wanna contraction.
On the other hand, there is some evidence for knowledge of wanna contraction in L2 English. For example, Witzel and Witzel (2008) administered an elicited production task to 54 L1-Japanese L2ers of English whose proficiency was said to be intermediate by their instructor; this task aimed to elicit subject questions (k = 6) and object questions (k = 6). In their statistical analysis, Witzel and Witzel first excluded 19 participants who did not produce at least three wh-questions containing either want to or wanna; the average rate of wanna wh-questions from the remaining 35 L2ers was significantly higher in object questions (49%) than in subject questions (39%). For their individual analysis, they adjusted the cutoff value from more than one instance of wanna (Kweon & Bley-Vroman, 2011) to the use of wanna in at least 33.3% of the produced subject or object questions. This choice was made because the more-than-one criterion was considered too conservative for their participants, as not all of them produced six subject questions and six object questions. Their analysis showed 16 CONSERVATIVE L2ers (45.71%), 4 CORRECTLY DIFFERENTIAL L2ers (11.43%), and 15 OVERGENERAL L2ers (42.86%). Importantly, not a single L2er showed the BACKWARD pattern.
Despite some differences between the two L2 studies discussed above, their results from production tasks are consistent in that many of the L2ers are CONSERVATIVE or OVERGENERAL. Crucially, the same pattern was observed even in the native English speakers (e.g., Kweon & Bley-Vroman, 2011). This fact casts doubt on the method of tapping into grammar using only production data, which has many limitations. Ionin and Zyzik (2014) pointed out that (a) the absence of a particular expression or (b) a production error does not necessarily indicate a lack of knowledge. It can be, in fact, attributed to other sources, such as “avoidance, phonological complexity, or difficulty with retrieval from memory” (p. 37). This study, therefore, employs an acceptability judgment task with the aim of examining how wanna contraction is represented in learner grammars.
Using an acceptability judgment task, Ito (2018) claimed that L1-Japanese L2ers’ development pattern constitutes evidence for successful acquisition of the constraint on wanna contraction. The task was a paper-and-pencil offline task with four items for each of the six conditions. The critical condition was the subject question with contraction (e.g., *Who do you wanna advise Mary at the training session next week?), which was the only ungrammatical condition among the six conditions. Also included were the subject question without contraction (e.g., Who do you want to advise Mary at the training session next week?), the object question with contraction (e.g., Who do you wanna advise at the training session next week?), and the object question without contraction (e.g., Who do you want to advise at the training session next week?). There were also two control conditions: the question without extraction but with contraction and the question without extraction and without contraction (e.g., Do you {wanna/want to} advise Mary at the training session next week?). Based on a Cloze proficiency test (Brown, 1980), 103 L2ers were divided into three proficiency groups: high, intermediate, and low. Results showed that with increasing proficiency, there was an increase of correct differentiations between the licit and illicit contraction patterns and a decrease of overgeneralized wanna patterns. This suggests the development of knowledge of wanna contraction in L1-Japanese L2ers of English.
Expanding on previous research, the current study attempts a novel method of conducting an acceptability judgment task to minimize the cognitive burden on participants in multiple ways (see Section 4.2).
4. The present study
The following research questions frame this study:
- RQ1: How early do native English-speaking children have knowledge of wanna contraction?
- RQ2: Do child L1-Korean L2ers have knowledge of wanna contraction? What role does L2 proficiency play?
- RQ3: Do adult L1-Korean L2ers have knowledge of wanna contraction? What role does L2 proficiency play?
With a novel way of conducting an acceptability judgment task, we predict that (some) participants in all three learner groups will show evidence of knowledge of wanna contraction. Although we do not have a specific prediction for RQ1, one possibility is that native English-speaking children will show target-like knowledge of the target phenomenon before or at the age of four, when children have been found to master most of their L1 grammar (e.g., Guasti, 2002). For RQ2 and RQ3, a proficiency effect is predicted such that higher-proficiency child and adult L2ers will show target-like knowledge of wanna contraction.
4.1 Participants
Seventy native-English-speaking adult controls (‘L1 adults’) and 46 native-English-speaking children (‘L1 children’) were recruited in Honolulu, Hawai‘i. Forty-three L1-Korean child L2ers of English (‘child L2ers’) and thirty-one L1-Korean adult L2ers of English (‘adult L2ers’) were recruited in Seoul, Korea. This study received administrative and ethical clearance, and informed consent was obtained from each participant (or from one of the parents in the case of child participants) before the study began.
The two L2 groups differ in the age of their first exposure to English. The child L2ers were first exposed to English between ages 4 and 6, and the adult L2ers were first exposed to English between ages 8 and 12 (e.g., Kim & Schwartz, 2022). Note that the age of 7 was chosen as the cutoff age between the two L2 groups following Johnson and Newport (1989), who showed that children who begin L2 acquisition earlier than age 8 demonstrate target-like knowledge of various morphosyntactic phenomena. It is thus conceivable that the participants who started learning English above and below this age will show a qualitative difference in their grammar.
Among these participants, the data from seven L1 children and four child L2ers who exhibited an acceptance bias (Crain & Thornton, 1998) were excluded from the analyses. Participants were retained only if they rejected at least one of six ungrammatical control items (e.g., *Andrew played soccer because Paul baseball; see Thornton et al., 2016). Also excluded were the data from two child L2ers whose proficiency level was impossible to gauge because they did not produce any sentence-level utterances in the picture narration proficiency task (Park, 2014; Unsworth, 2005; Whong-Barr & Schwartz, 2002; see Appendix S3 in Supplementary Materials). This procedure left us with data from 39 L1 children (two 3-year-olds, two 4-year-olds, nine 5-year-olds, 17 6-year-olds, and nine 7-year-olds) and 37 child L2ers, as well as 70 L1 adults and 31 adult L2ers.
The background information of the four participant groups is summarized in Table 1.
4.2 Acceptability judgment task
The stimuli for the acceptability judgment task (AJT) were 20 critical sentences and 48 filler sentences. The 20 critical sentences were crossed in a 2 × 2 Latin square design with the factors Clause (Who; If) and Gap (Gap; No gap), as shown in (3)–(6). Ten of these sentences were grammatical and the other 10 were ungrammatical. The two If conditions served as the baseline for the two Who conditions with respect to grammaticality: the *If + Gap condition contrasts with the Who + Gap condition, and the If + No gap condition contrasts with the *Who + No gap condition. Appendix S4 in Supplementary Materials provides a full list of all critical items.
All critical and filler sentences were created so as to minimize the processing burden on participants. First, they were short in length (6 to 8 words). Second, for the critical items, wanna appeared in a simple indirect question that did not involve the operation of do-support because our L2ers’ L1 (Korean) lacks this operation. Third, we ensured that the lexical items in the critical sentences were frequent and easy. The lemmas of all these items were among the 3,000 basic words of the English curriculum in Korea (Ministry of Education, 2020). We also confirmed that each lemma was frequent in the Corpus of Contemporary American English (Davies, 2008–): the mean rank of the lemmas of the lexical items was 716.86 (SD = 843.86; range = 6–3163). In addition, all arguments in the critical sentences were pronouns (i.e., I, you).
The embedded verbs following wanna were clearly intransitive (k = 16; e.g., work) or optionally (in)transitive verbs that, when transitive, do not typically allow a human object (k = 4; e.g., read, walk, eat, drink).[Footnote 3] Importantly, the use of these verbs in the illicit sentence type in (4) was intended to force participants to interpret this sentence type as a wh-question about the subject and not the object of the embedded verb (e.g., work in I wonder who you wanna work). If learners do overgeneralize wanna contraction everywhere as argued in previous studies (e.g., Kweon & Bley-Vroman, 2011; Zukowski & Larsen, 2011), they should accept I wonder who you wanna work because for them, it is essentially the same as I wonder who you want ___ to work.
Each item was presented to participants as both an audio stimulus and a written sentence, the latter of which was expected to help L2ers whose main exposure to English might have been through written materials. For the ungrammatical sentences, the audio stimuli were created using cross-splicing techniques in Praat (Boersma & Weenink, 2017). To construct the audio stimulus for (4), for example, the first part of (3) (i.e., I wonder who) and the last part of (6) (i.e., you wanna work) were cut at the nearest zero-crossing points and then combined. Native English speakers confirmed that the sentences created in this manner sounded natural in terms of prosody.
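The splicing step can be illustrated outside Praat. The sketch below is plain Python rather than the Praat procedure used for the stimuli; the function names are hypothetical, the audio is assumed to be a list of signed sample values, and restricting the search to upward zero crossings (so that both pieces meet with the same slope direction) is a simplifying assumption.

```python
def nearest_upward_zero_crossing(samples, index):
    """Return the position i closest to `index` where the signal rises
    through zero, i.e. samples[i - 1] < 0 <= samples[i]."""
    crossings = [
        i for i in range(1, len(samples))
        if samples[i - 1] < 0 <= samples[i]
    ]
    if not crossings:
        raise ValueError("signal has no upward zero crossing")
    return min(crossings, key=lambda i: abs(i - index))


def cross_splice(first, cut_first, second, cut_second):
    """Join the start of `first` to the end of `second`.

    Each requested cut point is moved to the nearest upward zero crossing,
    so the joined signal passes through zero at the seam instead of jumping.
    """
    a = nearest_upward_zero_crossing(first, cut_first)
    b = nearest_upward_zero_crossing(second, cut_second)
    return first[:a] + second[b:]
```

In terms of the example in the text, `first` would hold the recording of (3), cut after I wonder who, and `second` the recording of (6), cut before you wanna work. Cutting at zero crossings avoids an abrupt amplitude jump at the seam, which would otherwise be audible as a click.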
4.3 Procedure
First, adult participants, or a parent in the case of child participants, filled out a language background questionnaire in Google Forms (see Appendix S5 in Supplementary Materials). All participants then completed our main task, the AJT, which was designed and presented in PsychoPy (Peirce, 2017). Next, they performed a picture-sentence matching task, which will not be discussed in this paper, and then the picture narration task. All participants were tested individually in a quiet room. The entire session took approximately 40–60 minutes (5 minutes for the questionnaire; 15–20 minutes for the AJT; 15 minutes for the picture-sentence matching task; 5–10 minutes for the picture narration task) including the break.
In the AJT, participants received instructions (see Appendix S4 in Supplementary Materials) and performed the task with four practice sentences. They then proceeded to the experimental session with two blocks. A break of approximately 5–10 minutes was given between the two blocks to prevent participants from losing interest. In each block, participants were presented with 10 critical sentences and 24 fillers in a pseudo-random order. Two stimuli from the same condition were never presented consecutively. For each item in the AJT, participants heard an audio stimulus twice with a one-second interval in between; at the same time, they were presented with the corresponding written sentence on the computer screen. At the offset of the audio stimulus, they were asked to judge each sentence on a 4-point ‘smiley face’ scale (see Figure 1) that popped up on the screen. The four smiley faces were described as ‘very bad/definitely impossible,’ ‘bad/impossible,’ ‘good/possible,’ and ‘very good/definitely possible’; an additional ‘I don't know’ option was provided in the form of a question mark in case participants could not give a rating for any reason. Participants provided their judgments by pressing one of the images, which were attached to buttons on the keyboard. For analysis, their judgments as well as their reaction times (RTs) were recorded in PsychoPy. The RT data will not be reported in this paper.
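The ordering constraint (no two consecutive stimuli from the same condition) can be sketched as a constrained shuffle. The text does not say how the pseudo-random orders were generated in PsychoPy, so the rejection-sampling approach, the function name, and the (item, condition) representation below are assumptions.

```python
import random


def pseudo_random_order(items, rng, max_tries=10_000):
    """Shuffle (item_id, condition) pairs until no two neighbours share
    a condition. Rejection sampling: simple, but it only works when
    valid orders are reasonably common for the given item set."""
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(a[1] != b[1] for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid order found within max_tries")
```

Passing an explicit `random.Random(seed)` instance keeps a given presentation order reproducible.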
As an independent measure of English proficiency, the picture narration task was administered to all L2 participants but to only a subset of L1-English participants (32 L1 adults and 37 L1 children). In this task, the pictures were presented in PowerPoint and the stories the participants produced were recorded in Praat. Detailed information on the analysis of the picture narration data for proficiency is provided in Appendix S3 in Supplementary Materials. Notably, there happened to be a significant group difference, with the adult L2ers showing higher proficiency scores than the child L2ers (t(66) = 3.710, p < .001, Cohen's d = 0.903).
4.4 Data analysis
Before the main analysis, we removed the ‘I don't know’ judgments (0.14% of the L1 adult data; 6.15% of the L1 child data; 5.27% of the L2 child data; 0.16% of the L2 adult data). We then converted all sentence ratings to binary format by coding the two smiling face responses as 1 (accept) and the two frowning face responses as 0 (reject).
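The preprocessing described above can be sketched as follows. The response labels are hypothetical stand-ins for the five on-screen options; the actual values stored by PsychoPy are not given in the text.

```python
from typing import List, Optional

# Hypothetical labels for the four smiley faces and the question mark.
ACCEPT = {"good", "very_good"}   # two smiling faces
REJECT = {"bad", "very_bad"}     # two frowning faces
DONT_KNOW = "dont_know"          # question-mark option


def to_binary(rating: str) -> Optional[int]:
    """Map a rating to 1 (accept) or 0 (reject); None marks 'I don't know'."""
    if rating in ACCEPT:
        return 1
    if rating in REJECT:
        return 0
    if rating == DONT_KNOW:
        return None
    raise ValueError(f"unexpected rating: {rating!r}")


def prepare(ratings: List[str]) -> List[int]:
    """Drop 'I don't know' responses, then binarize the remaining ratings."""
    coded = (to_binary(r) for r in ratings)
    return [c for c in coded if c is not None]
```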
Using the whole dataset, we first ran a binary logistic regression analysis (with a generalized linear mixed-effects model) on the judgment values (1; 0), with Clause (If; Who), Gap (Gap; No gap), and Group (L1 adults; L1 children; child L2ers; adult L2ers) as categorical fixed effects and participant and item as random effects. The fixed effects were contrast-coded for Clause (If: −.5; Who: .5) and Gap (Gap: −.5; No gap: .5) and simple-coded for Group with ‘L1 adults’ as the reference level. To further explore any group-related effects and carefully examine group-specific performance on the AJT, we constructed a separate model for each group. For the L1 adult group, we ran a binary logistic mixed-effects model on the judgments, which included Clause and Gap as fixed effects and participant and item as random effects. To this model, we added Age as a continuous fixed effect for the L1 child data to investigate a potential effect of Age. Age was computed as the difference in months between the participant's date of birth and date of testing. Separately for the child L2 data and the adult L2 data, we added the continuous fixed effect of Proficiency to the model. All these logistic mixed-effects models were constructed in R (R Core Team, 2022) with the maximal random effects structure justified by the design, using the ‘glmer’ function within the ‘lme4’ package (Bates et al., 2015). All models with the maximal random effects structure including all by-participant and by-item random slopes for all fixed effects converged, except the model based on the whole dataset.
For this exceptional model, we progressively simplified the random effects structure until the model reached convergence (Barr et al., 2013); the final model had the by-participant random slopes for Clause and Gap and the by-item random slope for Group (model formula: Judgment ~ Clause * Gap * Group + (1 + Clause * Gap | Participant) + (1 + Group | Item)).
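To make the coding scheme concrete, the following sketch builds the numeric predictors for a single trial. It is illustrative only: the models themselves were fitted in R, and the function and column names here are hypothetical. Simple coding is taken in its usual sense, with each non-reference level scored (k − 1)/k on its own column and −1/k elsewhere, so that each column compares one group with the reference level.

```python
CLAUSE = {"If": -0.5, "Who": 0.5}
GAP = {"Gap": -0.5, "No gap": 0.5}
GROUPS = ["L1 adults", "L1 children", "child L2ers", "adult L2ers"]  # first = reference


def simple_code(group, levels=GROUPS):
    """Return the k - 1 simple-coded columns for `group`."""
    k = len(levels)
    return [(1 - 1 / k) if group == level else -1 / k for level in levels[1:]]


def design_row(clause, gap, group):
    """Fixed-effect predictors for one trial (main effects and Clause x Gap)."""
    c, g = CLAUSE[clause], GAP[gap]
    row = {"Clause": c, "Gap": g, "Clause:Gap": c * g}
    for level, value in zip(GROUPS[1:], simple_code(group)):
        row[f"Group[{level}]"] = value
    return row
```

Under this coding, the Clause × Gap product is −.25 for the two grammatical conditions (Who + Gap, If + No gap) and .25 for the two ungrammatical ones, so a negative interaction coefficient corresponds to higher acceptance of the grammatical conditions.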
Any significant interaction effects between Clause and Gap and between Clause and Group in a model were unpacked by post-hoc pairwise comparisons using the ‘emmeans’ package (Lenth, 2018) in R. Any significant three-way interactions among Clause, Gap, and Age or Proficiency were further examined by running a simple regression with Age or Proficiency as a predictor and ‘sensitivity scores on wanna contraction’ as a dependent variable. The sensitivity scores were computed by subtracting the acceptance rates of the ungrammatical conditions (*Who + No gap, *If + Gap) from the acceptance rates of the grammatical conditions (Who + Gap, If + No gap), so that higher scores indicate higher sensitivity to the constraint on wanna contraction. The use of this measure in analysis allowed us to test the relationship between (a) sensitivity to wanna contraction and (b) age (for the child data) or proficiency (for the L2 data).
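A per-participant sensitivity score of this kind could be computed along the following lines. The sketch assumes that acceptance rates are pooled over the two grammatical conditions and over the two ungrammatical conditions, and that they are expressed as proportions; the text does not specify either detail, and the function names are hypothetical.

```python
GRAMMATICAL = {("Who", "Gap"), ("If", "No gap")}


def acceptance_rate(judgments):
    """Proportion of 1 (accept) responses among binary judgments."""
    if not judgments:
        raise ValueError("no judgments available for this condition set")
    return sum(judgments) / len(judgments)


def sensitivity(trials):
    """trials: (clause, gap, judgment) triples for one participant.
    Returns grammatical minus ungrammatical acceptance (range -1 to 1)."""
    trials = list(trials)
    gram = [j for c, g, j in trials if (c, g) in GRAMMATICAL]
    ungram = [j for c, g, j in trials if (c, g) not in GRAMMATICAL]
    return acceptance_rate(gram) - acceptance_rate(ungram)
```

A positive score means the participant accepted grammatical sentences more often than ungrammatical ones; a score near zero means no distinction was made.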
5. Results
A visual inspection of the graphed results (Figure 2) shows that the L1 adult controls accepted the grammatical conditions (Who + Gap, If + No gap) and rejected the ungrammatical conditions (*Who + No gap, *If + Gap). Although the acceptance patterns of the L1 children, child L2ers, and adult L2ers were similar to those of the L1 adults, the three learner groups accepted the two ungrammatical conditions more often than the L1 adults.
A mixed-effects regression analysis constructed on the whole dataset did not show a significant effect of Clause (β = 0.251, SE = 0.214, p = .242) or Gap (β = −0.193, SE = 0.213, p = .363). There was a significant Group effect between the child L2ers and L1 adults (β = 1.117, SE = 0.377, p = .003) and between the adult L2ers and L1 adults (β = 1.144, SE = 0.436, p = .009), but not between the L1 children and L1 adults (β = 0.557, SE = 0.370, p = .132). Importantly, the analysis showed a significant interaction between Clause and Gap (β = −9.336, SE = 0.573, p < .001). Post-hoc analyses to unpack the interaction revealed that the participants accepted the Who + Gap condition more often than the *Who + No gap condition (β = 5.184, SE = 0.405, p < .001) and *If + Gap condition (β = −5.518, SE = 0.472, p < .001). They also accepted the If + No gap condition more often than the *Who + No gap condition (β = 4.646, SE = 0.346, p < .001) and *If + Gap condition (β = −4.979, SE = 0.424, p < .001). Such a clear distinction between the grammatical sentences and the ungrammatical sentences in our participants as a group points to their knowledge of the constraint on wanna contraction.
On the other hand, there was no interaction between Gap and Group for any of the three learner groups ([L1 children vs. L1 adults]: β = −0.472, SE = 0.432, p = .275; [child L2ers vs. L1 adults]: β = −0.752, SE = 0.444, p = .090; [adult L2ers vs. L1 adults]: β = −0.584, SE = 0.471, p = .216). However, an interaction between Clause and Group reached significance for all learner groups ([L1 children vs. L1 adults]: β = −1.511, SE = 0.436, p < .001; [child L2ers vs. L1 adults]: β = −1.268, SE = 0.450, p = .005; [adult L2ers vs. L1 adults]: β = −1.932, SE = 0.481, p < .001). Post-hoc analyses showed that the child L2ers (β = −0.815, SE = 0.199, p = .001) and the adult L2ers (β = −0.784, SE = 0.207, p = .004) accepted If conditions more often than the L1 adults did; and the child L2ers accepted Who conditions more often than the L1 adults did (β = −0.620, SE = 0.172, p = .007). The L1 children showed no particular pattern. In addition, a significant three-way interaction emerged among Clause, Gap, and Group ([L1 children vs. L1 adults]: β = 7.561, SE = 1.259, p < .001; [child L2ers vs. L1 adults]: β = 11.343, SE = 1.278, p < .001; [adult L2ers vs. L1 adults]: β = 5.169, SE = 1.390, p < .001).
To carefully examine these group-related effects, we conducted a separate analysis on the judgment data for each group, whose results are reported in the following sub-sections.
5.1 L1 adults
The mixed-effects regression model for the L1 adults did not show a significant effect of Clause (β = 2.099, SE = 1.166, p = .072) or Gap (β = 0.259, SE = 1.177, p = .826). However, it found a significant interaction between Clause and Gap (β = −19.331, SE = 2.143, p < .001). Post-hoc analyses revealed that this interaction stemmed from the fact that the L1 adults accepted the Who + Gap condition more often than the *Who + No gap condition (β = 9.406, SE = 1.735, p < .001) and *If + Gap condition (β = −11.765, SE = 1.962, p < .001); and that they accepted the If + No gap condition more often than the *Who + No gap condition (β = 7.566, SE = 1.079, p < .001) and *If + Gap condition (β = −9.925, SE = 1.434, p < .001). This result indicates that L1 adults have firm knowledge of the constraint on wanna contraction.
5.2 L1 children
The mixed-effects regression model for the L1 children did not display a significant effect of Clause (β = −0.481, SE = 1.428, p = .736), Gap (β = 0.822, SE = 1.617, p = .611), or Age (β = 0.011, SE = 0.300, p = .971). Nor was there any significant interaction effect between Clause and Age (β = 0.087, SE = 0.281, p = .757) or between Gap and Age (β = −0.204, SE = 0.283, p = .470). The model, however, showed a significant interaction effect of Clause and Gap (β = 12.683, SE = 4.578, p = .006) and a significant three-way interaction effect of Clause, Gap, and Age (β = −3.370, SE = 0.767, p < .001).
Our post-hoc analyses to unpack the two-way interaction showed that, as in the case of L1 adults, this effect arose from (a) a higher acceptance rate for the Who + Gap condition than for the *Who + No gap condition (β = 4.818, SE = 0.969, p < .001) and *If + Gap condition (β = −4.361, SE = 0.930, p < .001); and (b) a higher acceptance rate for the If + No gap condition than for the *Who + No gap condition (β = 4.120, SE = 0.690, p < .001) and *If + Gap condition (β = −3.663, SE = 0.691, p < .001). This result indicates that like L1 adults, L1 children have knowledge of the constraint on wanna contraction.
Crucially, a follow-up simple regression analysis run on the sensitivity scores on wanna contraction revealed an age effect such that older children showed higher grammaticality sensitivity (β = 18.714, SE = 4.486, p < .001), as shown in Figure 3. For the sensitivity scores for all participants, see Appendix S6 in Supplementary Materials. In our individual analysis, all but three L1 children were found to have a positive sensitivity score, which indicates that they distinguished grammatical from ungrammatical wanna contraction in the right direction. This pattern was found as early as the age of 3;11.
5.3 Child L2ers
In the mixed-effects regression model for the child L2ers, there was no significant effect of Clause (β = 0.080, SE = 0.308, p = .795), Gap (β = −0.318, SE = 0.290, p = .273), or Proficiency (β = −0.039, SE = 0.106, p = .712). No significant interaction was found between Clause and Proficiency (β = 0.092, SE = 0.114, p = .423) or between Gap and Proficiency (β = 0.009, SE = 0.103, p = .932). However, the model displayed a significant two-way interaction between Clause and Gap (β = −3.947, SE = 0.719, p < .001) and a significant three-way interaction among Clause, Gap, and Proficiency (β = −1.121, SE = 0.255, p < .001).
Post-hoc analyses showed that the sources of the two-way interaction are the same as those found for the L1 adults and the L1 children. The child L2ers accepted the Who + Gap condition more often than the *Who + No gap condition (β = 2.235, SE = 0.497, p < .001) and *If + Gap condition (β = −1.974, SE = 0.518, p < .001). In addition, the If + No gap condition was accepted more often than the *Who + No gap condition (β = 1.906, SE = 0.507, p = .001) and *If + Gap condition (β = −1.645, SE = 0.506, p = .006). This result suggests the child L2ers’ sensitivity to the constraint on wanna contraction.
A simple regression of the sensitivity scores on wanna contraction showed that the three-way interaction came from the learners’ different judgment patterns depending on their proficiency. As shown in Figure 4 (see also Appendix S6 in Supplementary Materials), child L2ers with higher proficiency had higher sensitivity to the wanna grammaticality contrast (β = 8.344, SE = 1.714, p < .001).
5.4 Adult L2ers
The mixed-effects regression model for the adult L2 data did not show a significant effect of Clause (β = 1.215, SE = 2.084, p = .560), Gap (β = −2.096, SE = 2.118, p = .322) or Proficiency (β = −0.172, SE = 0.195, p = .379). Two-way interactions between Clause and Proficiency (β = −0.509, SE = 0.341, p = .135) and between Gap and Proficiency (β = 0.336, SE = 0.354, p = .342) and a three-way interaction among Clause, Gap, and Proficiency (β = 0.301, SE = 0.807, p = .710) were not significant, either. It was only an interaction between Clause and Gap (β = −13.620, SE = 3.920, p < .001) that reached significance in the model.
As with the L1 adults, the L1 children, and the child L2ers, post-hoc analyses revealed that the Clause-by-Gap interaction was attributable to (a) the adult L2ers’ (marginally) higher acceptance of the Who + Gap condition than of the *Who + No gap condition (β = 8.051, SE = 2.821, p = .022) and *If + Gap condition (β = −7.138, SE = 2.800, p = .053) and (b) their higher acceptance of the If + No gap condition than of the *Who + No gap condition (β = 5.843, SE = 1.527, p < .001) and *If + Gap condition (β = −4.930, SE = 1.545, p = .008). This result demonstrates the adult L2ers’ knowledge of wanna contraction.
6. Discussion
The findings of this study indicate that English-speaking children, child L2ers of English (with higher proficiency), and adult L2ers of English manage to develop knowledge of the constraint on wanna contraction.
6.1 Fundamental identity in first language acquisition, child second language acquisition, and adult second language acquisition
Despite the fact that the grammaticality contrast involved in wanna contraction raises learnability challenges equally for English-speaking children, L1-Korean child L2ers of English, and L1-Korean adult L2ers of English, they successfully distinguished between possible wanna contraction and impossible wanna contraction in the acceptability judgment task. Note that our findings are inconsistent with the L1 findings from Getz (2019) and Zukowski and Larsen (2011) and the L2 findings from Kweon and Bley-Vroman (2011). One possible source of this discrepancy is the method. The previous studies adopted production tasks, which are vulnerable to performance errors and impose a large processing burden on learners (Ionin & Zyzik, 2014); Kweon and Bley-Vroman's study also used an acceptability judgment task, but some of their wanna sentences were formed as direct questions involving the complex operation of do-support. By contrast, the current study designed its acceptability judgment task to be child-friendly, with the wanna sentences made as simple as possible at both lexical and syntactic levels (see Section 4.2). Furthermore, each sentence was presented twice (e.g., Deen et al., 2018) to allow the learners to fully process the structural information in it (Ferreira & Patson, 2007). This novel design might have made it possible to unveil knowledge in children, child L2ers with higher proficiency, and adult L2ers where previous studies failed to do so (see also Kim & Schwartz, 2022).
Our findings also have further theoretical importance in that they suggest an answer to the controversial question of whether L1 acquisition and L2 acquisition are fundamentally different (e.g., Bley-Vroman, 1989, 1990, 2009) or the same (e.g., Hopp, 2007, 2009; Schwartz & Sprouse, 1996, 2013). The Fundamental Difference Hypothesis (FDH; Bley-Vroman, 1989, 1990) maintains that adult L2ers do not have access to a cognitive system specific to language that is available in early childhood, as in L1 acquisition and child L2 acquisition; thus adult L2ers must resort to domain-general problem-solving skills and L1 grammar, which results in the failure of adult L2ers’ grammars to converge on the target grammar.
According to Song and Schwartz (2009), the FDH is testable by (a) investigating whether adult L2ers have knowledge that is subject to learnability problems and (b) looking at how adult L2ers and child L2ers compare with respect to developmental sequences. The logic behind the latter is that similar developmental trajectories would indicate that the same mechanism specific to language guides the acquisition process both in child L2 acquisition and in adult L2 acquisition (Schwartz, 1987, 1992, 2004). In Song and Schwartz's study, L1-Korean-speaking children, L1-English child L2ers of Korean, and L1-English adult L2ers of Korean were tested for knowledge of wh-questions with the negative polarity item amwuto ‘anyone’ in Korean, which constitutes a learnability problem. Korean is a wh-in-situ language with subject-object-verb as its canonical word order, and it is generally possible to scramble the object to presubject position without a considerable meaning change beyond a shift of focus to the object. However, in the context of negative wh-object questions with the negative polarity item as subject, scrambling of the object wh-phrase is obligatory, as in (9a).
(9)
a. Scrambled word order (object-subject-verb) Mwues-ul amwuto sa-ci anh-ass-ni? what-acc anyone buy-comp neg-past-q ‘What didn't anyone buy?’
b. Canonical word order (subject-object-verb) Amwuto mwues-ul sa-ci anh-ass-ni? anyone what-acc buy-comp neg-past-q ‘Didn't anyone buy something?’ (adapted from Song & Schwartz, 2009, (3))
Crucially, the counterpart of (9a) with the canonical word order in (9b) cannot have a wh-question reading, but only a yes-no question reading. For L1-English L2ers of Korean, these properties of negative questions with the negative polarity item cannot be transferred from their L1, or learned from L2 instruction or target language input. Nevertheless, Song and Schwartz, using a variety of tasks (elicited production, acceptability judgment, and interpretation verification), showed that (a) high-proficiency child and adult L2ers demonstrated firm knowledge of the target phenomenon and (b) child and adult L2ers follow the same developmental route to convergence on the target grammar. Their findings indicate that “the nature of language is fundamentally similar in natives and (child and adult) nonnatives” (p. 324), providing evidence against the FDH.
The findings of this study are consistent with those of Song and Schwartz (2009), thereby adding support to the fundamental identity position (Hopp, 2007, 2009). While the high proficiency of the adult L2ers in this study (see Section 4.1) means the findings do not provide information on their developmental sequence, all of the adult L2ers, as well as the higher-proficiency child L2ers, demonstrated firm knowledge of the constraint on wanna contraction, just like the L1 adults.
6.2 Child versus adult second language acquisition
Much research has documented child-adult differences in L2 acquisition, and such differences are arguably due to age of acquisition and working memory capacity. An effect of age of onset was demonstrated in Johnson and Newport's (1989) seminal study in which L2ers of English with different ages of acquisition (range: 3–39) were tested on a wide variety of structures of English grammar (for more recent work, see Bosch et al., 2019). The main finding from the study indicated an advantage for child L2ers over adult L2ers such that the performance of L2ers whose age of onset was greater than 7 never patterned like the performance of native speakers of English. Such deficits in knowledge among adult L2ers (see also Bley-Vroman, 1990; Clahsen & Muysken, 1986; Hawkins & Hattori, 2006; Tsimpli & Dimitrakopoulou, 2007) have been taken as evidence for the presence of a critical period in L2 acquisition (Abrahamsson & Hyltenstam, 2009; Lenneberg, 1967).
Regarding working memory, adults have greater capacity than children (Kharitonova et al., 2015; Swanson et al., 2006). Children's ability to encode and maintain information indeed exhibits a remarkably protracted developmental improvement until adolescence. It is thus reasonable to consider that adults with greater working memory resources are more likely than children with fewer working memory resources to efficiently parse complex sentences, such as those containing a dependency between a wh-filler and a gap, as in the case of wanna contraction. Albeit controversial, a few studies have identified an adult advantage over children at least in the beginning stage of L2 acquisition (Hyltenstam & Abrahamsson, 2000; Long, 1990).
This study does not provide a conclusive answer as to the issues around age of onset or working memory because (a) it showed that both child L2ers and adult L2ers converged on the target-like knowledge and (b) it did not independently test working memory. However, we showed continuity between child and adult grammar in L2 acquisition, a result that would not be explained by either age of onset or working memory. All adult L2ers (who had higher proficiency than the child L2ers) did have tacit knowledge of the constraint on wanna contraction. The child L2ers as a group also showed target-like knowledge of wanna contraction. Further analyses revealed a proficiency effect in this group such that child L2ers’ sensitivity to the wanna contrast increased along with their proficiency. We leave the issue of child versus adult L2 acquisition open for future research with larger sample sizes of child and adult L2 participants whose age of onset and working memory are systematically varied but whose length of exposure to the target language and proficiency are controlled.
Acknowledgements
I would like to thank all participants in this study as well as Leilani Au, Jeffery Bock, Jihyun Kim, Rakhun Kim, Ju Young Min, Alison Onishi, and Yangon Rah, who helped me with data collection. I am also grateful to Robert Bley-Vroman, Kamil Deen, William O'Grady, Amy Schafer, and Bonnie D. Schwartz for their invaluable input to this work. This study was supported in part by a Fulbright Scholarship to the author.
Data availability statement
The data and materials that support the findings of this study are openly available in OSF at http://doi.org/10.17605/OSF.IO/TR34D.
Competing interests
The author declares none.
Supplementary Material
The supplementary material for this article can be found at https://doi.org/10.1017/S1366728923000640.
Appendix S1: Categorization Criteria for Participants in Kweon and Bley-Vroman (2011)
Appendix S2: Summary of Results in Kweon and Bley-Vroman (2011)
Appendix S3: Proficiency Data from L2 learners
Appendix S4: Critical Stimuli for the Acceptability Judgment Task
Appendix S5: Background Questionnaires
Appendix S6: Individual Sensitivity Scores in the Acceptability Judgment Task