1. Introduction
The general context for this article is the study of recursion as a grammatical property of languages and, specifically, its emergence in child language. Previous research indicates that recursive self-embedding constructions present difficulties for children in different languages. For instance, previous studies have examined recursive possession and recursive PP modification in English, noting that these constructions are difficult for children to understand and also not very frequent in the input; see Roeper and Snyder (Reference Roeper, Snyder and Sciullo2004), Limbach and Adone (Reference Limbach, Adone, Franich, Iserman and Keil2010), and Roeper (Reference Roeper2011). Studies of elicited production suggest that specific recursive constructions might become productive at different ages (Pérez-Leroux et al. Reference Pérez-Leroux, Castilla-Earls, Béjar and Massam2012, Roberge et al. Reference Roberge, Pérez-Leroux and Frolova2018). At the same time, despite substantial cross- and intra-linguistic variation in the structures that can be recursive and in the type of embedding markers involved (prepositions, particles, case markers, etc.), recursion appears to develop robustly in child language (Pérez-Leroux and Roberge Reference Pérez-Leroux and Roberge2018). Japanese represents a fascinating case study for recursion in L1 acquisition: while certain languages use a wide variety of markers to build recursive nominals (e.g., prepositions in English), Japanese relies on two uniform strategies: an adnominal particle (no) that is used to embed DPs inside DPs, expressing several types of semantic relations (possession, accompaniment, location, etc.), or a relative clause (RC), where the nature of the relation between the two DPs is fully specified by a clause. We investigate the development of recursive no in Japanese in order to determine whether the recursive properties of no emerge at the same time for the different semantic functions it fulfills.
We adopt a minimalist approach to syntactic derivations (Chomsky Reference Chomsky1995, Reference Chomsky2005) in which the basic operation Merge results in the creation of a new entity by combining two separate entities. Our focus is not on the formal computational properties of the operation Merge, but on one of its grammatical consequences: recursive outputs generated by repeated applications of Merge to syntactic categories of the same type; see Widmer et al. (Reference Widmer, Auderset, Nichols, Widmer and Bickel2017). A recursive DP can be formed when other DPs are generated inside it. In such cases, a relation is formed between the various DPs and this semantic relation is expressed through various grammatical means: prepositions or Saxon 's in the case of possession in English, repeated PP adjunctions, with prepositions expressing various semantic connections (possession, location, accompaniment, etc.). In Japanese, all these forms can be expressed by linking the nominals with no. This is shown in (1) where 3 nominals are involved.
Clearly, in the study of the L1 development of recursive nominals in English, the question is whether the structures with Saxon 's develop similarly to the structures with PPs. Likewise, within the PP structures, we would want to determine if different semantic relations (for instance locative vs. comitative) develop simultaneously and, if not, why not. In contrast, Japanese presents a very different system with a maximally simple inventory of possibilities for DP embedding, since the same multifunctional marker no can be used to express the modification relations shown in (1). What are the implications of this for L1 development?
The article is organized as follows: section 2 provides some background on no and its acquisition, and introduces our research questions; section 3 describes our study and its results, followed in section 4 by a discussion and conclusion.
2. On の (no)
Particles perform a wide range of functions in Japanese grammar (Kuno Reference Kuno1973) and serve among other things to express relationships that are conveyed by prepositions and conjunctions in a language like English. In this section, we first delve into the nature of no, its various uses and how it has been analyzed (section 2.1), before we briefly consider previous findings on its development in child language (section 2.2). We then formulate the research questions (section 2.3) that we seek to answer through the experimental study that is presented in section 3.
2.1. Description and analysis
As we have already seen, at a very basic descriptive level, no is a particle that attaches to an N in the context of N_N and the relationship it creates between the two nouns is subject to a wide variety of interpretations. The no morpheme is frequently labeled as a genitive case marker in Japanese. It is, indeed, commonly found in contexts expressing possession and kinship relationship.Footnote † But no, as we have hinted, is also found in various other contexts. Due to the diversity of contexts in which no is observed, the role as well as the identity of no have been highly debated. In the literature on Japanese syntax and morphology, no is variously referred to as a particle, a modifying particle, a case (genitive), a case-marking particle, a connector, a linking element, a prenominal particle, or a functional postposition. Despite these terminological differences, there is general consensus among studies that no is multifunctional (e.g., Kuno Reference Kuno1973; Kitagawa and Ross Reference Kitagawa and Ross1982; Murasugi Reference Murasugi1991; Kuroda Reference Kuroda1992, Reference Kuroda, Muraki and Iwamoto1999; Watanabe Reference Watanabe2010; Shibatani and Chung Reference Shibatani, Chung, Fukuda, Kim and Park2017; Hiraiwa Reference Hiraiwa2018; Ishizuka and Koopman Reference Ishizuka and Koopman2018; Shibatani Reference Shibatani, Shibatani, Miyagawa and Noda2018).
The very rich array of uses of no is nicely illustrated in Tan Chyn Ngian (Reference Ngian2004) in a series of 21 examples in which no serves to indicate: possession, the producer responsible for an outcome, a group that someone is a member of, existence and location, a time period, a material, a quantity or sequence, a purpose, related areas and topics, among many others. It is thus reasonable to ask whether there is in fact only one no or several homonyms that account for the various meanings; but this question extends beyond the scope of our research, and we focus here on its use as a marker in the DPs targeted by our experiment.Footnote ‡
Note also that while Japanese has locative postpositions such as ni and de (both mean ‘in/at’), the equivalents of some English locative prepositions do not exist. In such cases, the corresponding Japanese structure requires the presence of no followed by a noun which indicates a location, as shown in (2).
This property of Japanese means that the recursive DPs in our experiment can contain extra no's beyond the ones we are targeting. For instance, “the cat in the park” in Japanese (3) includes an extra no connecting “park” and “inside” in addition to the no we are interested in, which connects “park + inside” and “cat”.
As for the grammar of no, there appears to be no widely accepted view on the morpho-syntactic status of the no marker among previous studies on Japanese syntax and morphology. Should no be handled as part of narrow syntax, or the product of post-syntactic application of a morphological operation? Should no be viewed as a functional head (or part thereof), or the spell-out of a feature found on a nominal? These questions boil down to whether or not the no marker should be represented structurally as part of the syntactic derivation of a complex NP, or be inserted in the string after Spell-Out.
The latter approach – the view that no is inserted post-syntactically – was adopted by Kitagawa and Ross (Reference Kitagawa and Ross1982) and, in one form or another, in several further studies, including Saito and Murasugi (Reference Saito, Murasugi, Ormazabal and Tenny1990) and Kitagawa (Reference Kitagawa2005), among others. In its extreme form, this approach treats no as a default marker linking an N to a following N: “regardless of the semantic relationship it expresses […] It is not a genitive marker as such but is the only means that the language has, to express whatever semantic relationships the ‘genitive’ can express. There is just one surface restriction on no: it cannot be attached to an inflected predicate” (Sells Reference Sells, Akatsuka, Hoji, Iwasaki, Sohn and Strauss1998: 8). This general view on no was maintained in subsequent research; Watanabe (Reference Watanabe2006), for instance, does not generate no as part of the syntactic derivation, despite arguing for an elaborate functional structure within Japanese nominal projections. More specifically, the post-syntactic approach to no is exemplified by the late insertion rule proposed in Saito and Murasugi (Reference Saito, Murasugi, Ormazabal and Tenny1990: 296), which is shown in (4).
(4) Ø → no / [Y X _ Z ],
where X is DP or PP and Y, Z are (projections of) N or D
The rule would also apply in a recursive context where Z counts as a Y projection, as in (5), which would result in a DP with two no's within a recursive structure:
(5) [Y X _ [Y X _ Z ] ]
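The recursive application of rule (4) to a configuration like (5) can be illustrated with a toy linearization. This is only a string-level sketch under simplifying assumptions, not an implementation of Saito and Murasugi's analysis: the tuple encoding of [Y X _ Z] and the function name `insert_no` are hypothetical, and the category conditions on X, Y and Z are not checked.

```python
# Toy sketch: a structure is either a bare nominal (a string) or a pair
# (X, Z) standing for the configuration [Y X _ Z] in rule (4).
# Both the encoding and the function name are illustrative assumptions.

def insert_no(structure):
    """Linearize a structure, filling every X _ Z slot with 'no'."""
    if isinstance(structure, str):
        # A bare nominal contains no insertion site.
        return structure
    x, z = structure
    # Apply the rule at this level, then recurse into both daughters,
    # so a Z that is itself a [Y X _ Z] projection receives its own 'no'.
    return f"{insert_no(x)} no {insert_no(z)}"


# The configuration in (5): [Y X _ [Y X _ Z ] ]
print(insert_no(("X1", ("X2", "Z"))))
```

Because the rule is stated over configurations rather than over specific meanings, each additional level of embedding in this sketch simply contributes one more no to the output string.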
The above late insertion analysis of no, as opposed to a fully syntactic approach, must be contextualized within the general issue concerning the nature of functional categories in Japanese syntax as a whole; see Fukui (Reference Fukui1986). Fukui and Sakai (Reference Fukui and Sakai2003) present an in-depth examination of this theoretical issue and argue that we cannot automatically assume that the functional categories postulated for other languages exist in Japanese. For them, this potential fundamental difference extends to case-marking, which they view as one of the narrow syntactic mechanisms in Japanese that “[seem] to be transferred to the mechanisms in the phonological component.” (p. 366). The PF approach to case-marking, including Japanese no, with the aforementioned assumption of the non-existence or impoverished role of functional categories, allows for tremendous flexibility in analyzing recursion with NPs, “since Japanese does not really have active functional categories with agreement features […] the phrases/projections in Japanese are never closed. Thus, additional elements are rather freely merged with a lexical projection” (Fukui and Sakai Reference Fukui and Sakai2003: 355).
However, PF approaches crucially do not assign any role to no in the interpretation of the whole DP, and in particular in the determination of the semantic relation between the nominals it connects. Obviously, the nominal structures we are interested in are always recursive, regardless of whether a PF or syntactic approach is adopted. Therefore, for our purposes, we focus on syntactic treatments of no.
A syntactic analysis of no simply means that no is considered an integral part of the derivation of the DP structure. A challenge, under such a view, is the determination of the label or labels of no, from which follow its position and its immediate syntactic environment. Zushi (Reference Zushi1996), among others, proposes a syntactic treatment of no as a D head:
which allows for recursive DP formation:
This, however, leaves the specific contribution of no as an open question. As such, analyses that address this question propose that the semantic relation realized by no comes either directly or indirectly from the marker itself. The option of labeling no as a functional projection of a specific type thus offers the possibility of directly combining the label and the function. As a matter of fact, no is often assumed to be a POSS heading a PossP, as illustrated in Terunuma et al. (Reference Terunuma, Isobe, Nakajima, Okabe, Inada, Inokuma, Nakato, LaMendola and Scott2017):
(8) Taro no kuruma
Taro no car
‘Taro's car’
(9)
A DP with two no's such as (10a) would thus be represented as (10b):
(10)
a. Taro no otoosan no kuruma
Taro no father no car
‘Taro's father's car.’
b.
But it is also possible to preserve a D label for no, while accounting for the various interpretations of the relations between the Ns it connects via an elaboration of the constituent selected by D. For Ishizuka and Koopman (Reference Ishizuka and Koopman2018), no is indeed a D head, but the DP projection includes a substructure involving relativization represented by a CP comprising a subject-predicate structure. The semantic interpretation of the relation between the two Ns derives from a silent elementary predicate within the CP. As shown in the basic structure of a DP with no in (11), in their analysis, the ordering of the constituents follows from subject raising (subject relative) and remnant movement to satisfy an EPP feature on no, and the potential interpretations of the semantic relation of the no construction are based on “a restricted set of (silent) elementary predicates, such as AT, FROM, IN, TO, FOR” (p. 7).
(11) [DP Spec [D’ [D no] [CP N1 pred N2 ]]]
This idea of dissociating the interpretation of the DP from the label of no or, in other words, of attributing such an interpretation to another component of the structure, makes it possible to envision a unified analysis of no. Shibatani (Reference Shibatani, Shibatani, Miyagawa and Noda2018, Reference Shibatani, Zariquiey, Shibatani and Fleck2019) pursues this line of analysis by dissociating the interpretation of no from its syntactic role, while invoking another module of the grammar, namely pragmatics, instead of expanding the substructure of the DP (cf. Ishizuka and Koopman Reference Ishizuka and Koopman2018). In a series of single-authored and co-authored works on nominalizations in a variety of languages, Shibatani argues that no is a nominalizer. Structurally, the NP containing no is therefore a nominalized NP that enters into a modification relation with another N to form a complex NP, as illustrated in (12); we include a case marker in the example to show that the NP can be used like any other NP in a sentence.
(12)
a. Taro no kuruma o
Taro no car acc
‘Taro's car’
b.
As was the case for all other analyses discussed here, Shibatani's approach is fully compatible with a recursive use of no, as shown in (13).
(13)
a. Taro no otoosan no kuruma o
Taro no father no car acc
‘Taro's father's car.’
b.
Shibatani defines nominalization as a metonymic process that derives new nominal expressions. It might seem peculiar to consider the nominalization of an N, since nominalization is most readily thought of as a process that takes a verb (or other non-nominal category) and makes it into a noun, as in the agentive nominalization of “drive” as “driver”. But Shibatani points out that the nominalization of a noun is also a very common process cross-linguistically, providing numerous examples from English: villager, quarter-pounder, left-fielder, etc. This is essentially what no does, and the resulting grammatical nominalizations are created for the nonce, with a meaning that is determined compositionally.
As we have seen above, the interpretation of the relationship between two nouns connected by no varies considerably, and this is because “N-based [nominalizations] denote [entities] metonymically evoked in close association with the denotation of the base nouns, such as people associated with specific locations, and those associated with a specific organization, philosophical orientation, quantity, time or manner” (Shibatani and Chung Reference Shibatani, Chung, Fukuda, Kim and Park2017: 65). Stated differently, the nominalization of a noun allows for the meaning of the noun to shift from the denotation of the noun itself to something or some concept outside of that noun but related to it. For example, to refer to Taro, I say “Taro” but when I say “Taro's” I no longer refer to Taro but to something that is associated with Taro: his father, his car, his hospital, his politics, etc.
What remains to be accounted for is, then, how one can determine the intended denotation if such a wide range of possibilities are made available by no nominalizations. In Shibatani's analysis, this is done by pragmatic inference (i.e., speech context and Grice's Cooperative Principle). Under this view, when we group examples that contain one or more no's into categories of meaning such as location, possession, etc., these categories correspond to groups of relations that are similarly inferred, rather than to clearly circumscribed semantic relations. This insight, derived from Shibatani's analysis, will play a significant role in the interpretation of our experimental results.
2.2. Acquisition
Before we consider the acquisition of recursive modification, let us consider first how young children start connecting two or more nominal elements. When do they first use the no particle? Are relative clauses, a more complex structure, later to enter the grammar? According to Clancy (Reference Clancy and Slobin1985), the first uses of no in the acquisition of Japanese appear at the two-word utterance stage, when children can already use the particle no in its possessive sense, often in an elliptical structure following a single noun: Maho no (‘Maho's_’). This is quickly followed by frequent uses of two-noun utterances (at approximately age two), when other particles emerge (-mo, -ga, -ni, -de). In the early years, no is often overgeneralized (for all its various functions). Murasugi et al. (Reference Murasugi, Nakatani and Fuji2012) note that some of this overgeneration is due to the miscategorization of adjectives as nominals (*Adj no N). As the set of morphological devices continues to expand after the third year of age, other uses of no enter children's speech, including complex locatives “N no tokoro ni (in N's place)” and ‘N no N’ for a variety of other relations between Ns. Interestingly, the earliest uses of relative clauses appear almost as early as no. Ozeki and Shirai (Reference Ozeki and Shirai2010) report on spontaneous use of relative clauses by five children. Three of these children had used relative clauses by age 2;01, the other two by 2;07 and 2;09. On the comprehension side, Terunuma and Nakato (Reference Terunuma, Nakato, Amaral, Maia, Nevins and Roeper2018) describe a process of staged acquisition. Japanese children first manage to comprehend a single possessive, and subsequently, two possessives. However, structures with more than two possessives are acquired almost at the same time as those with two possessives. Terunuma et al. (Reference Terunuma, Isobe, Nakajima, Okabe, Inada, Inokuma, Nakato, LaMendola and Scott2017) confirm that comprehension of structures with two and three possessors is solidly mastered by age four, for both locatives and possessives.
A recent study of recursive possessives by Hirayama et al. (Reference Hirayama, Colantoni and Pérez-Leroux2020) finds that by the age of five, most children produce recursive possessives in an elicitation task, but at low rates. Between the ages of five and six, production of recursive possessives increases drastically, doubling in rate. Hirayama and colleagues asked whether Japanese children who have acquired the prosodic patterns that characterize recursive embedding were better at producing recursive DPs. This did not seem to be the case. While children had clearly learned the Japanese pattern of downstep with accented phrases, they were uneven in differentiating this pattern from those of similar non-embedded phrases.Footnote § Ability to mark prosodic contrasts was unrelated to their capacity to produce the recursive structure.
We can thus conclude from this that no appears not to pose a particular challenge for young children. It emerges relatively early, among the set of first particles to appear in the speech of children (Clancy Reference Clancy and Slobin1985). However, it has not been made clear when children start using it in more complex configurations, and whether the various recursive uses of no are acquired homogeneously; these questions are the focus of this article.
2.3. Questions and hypotheses
Our review of the previous literature suggests that simple uses of modifiers (no) and of relative clauses enter the speech of young Japanese children very early. While there is no documentation on initial use of recursive structures in Japanese, studies from other languages indicate that children first start to use recursive descriptions around the age of four (Pérez-Leroux et al. Reference Pérez-Leroux, Castilla-Earls, Béjar and Massam2012, Roberge et al. Reference Roberge, Pérez-Leroux and Frolova2018, Giblin et al. Reference Giblin, Zhou, Bill, Shi, Crain, Brown and Dailey2018, for English; Roberge et al. Reference Roberge, Pérez-Leroux and Frolova2018, for French; Pérez-Leroux Reference Pérez-Leroux, Alboiu and King2022, for Spanish; Pérez-Leroux et al. Reference Pérez-Leroux, Roberge, Lowles and Schulz2021, for German). In these studies, many four-year-old children still do not produce any form of recursive modification, but rates grow substantively over time. Although the syntax of Japanese no structures remains a matter of deep controversy within the theoretical literature, from a broad perspective there is no question that the structure with two nouns linked by no entails fewer derivational steps than a structure mediated by a relative clause. Naturally, this asymmetry in structural complexity extends to recursive configurations.
We have adopted a straightforward analysis of no, as a simple, highly polysemous nominalizer that links two nouns into a configuration that introduces a very abstract metonymic sense, which can refer to the broader sense of adjacency (location/accompaniment) or to possession (ownership or part–whole).Footnote ** In Shibatani's analysis the various senses of no reflect an abstract, underspecified unique meaning, rather than many types, or structural alternatives, with context supplying the details.
Our goal is to examine children's production of recursive modification in Japanese. Is the path of acquisition of recursive modification in Japanese determined by matters of structural simplicity? We ask three questions:
Q1) Are children's response patterns qualitatively different from adults’? When children fail to produce a recursive NP, how do they respond?
Q2) Is performance comparable across conditions? This question has two dimensions:
a) are there differences in terms of what semantic types of modification relations (i.e., possession, accompaniment, location, part–whole) children can successfully produce?
b) and conversely, are structures that primarily rely on no comparably successful in children?
Q3) Do children generally prefer the simpler no structure over relative clauses, when compared to adults?
3. Elicited production study
We conducted an elicited production study comparing the production of recursive DPs in Japanese-speaking children and adults.
3.1. Methods
The following sections describe the design and methodological components of our study.
3.1.1. Participants
Children and adults were recruited using the snowball method. Fifteen adults participated in the study; ten speakers were from the city of Tokyo or surrounding areas and five were from Nagoya. There were 37 child participants in total, 17 five-year-olds and 20 six-year-olds. One additional participant did not complete the session and was removed from the analysis. Most of the children (n = 27) were from the Tokyo region (age range 5;00–6;07; median age 5;11; mean 70.3 months, SD 5.2 months). The Nagoya children were a few months older on average (range 5;00–6;11; median age 6;08; mean 74.7 months, SD 9.7 months). Gender and socioeconomic status were not considered. There are phonological differences between these two varieties of Japanese, but no identified differences in DP syntax. Table 1 summarizes the overall sample.
3.1.2. Materials and procedures
The task employed was a referential elicitation task based on the design used in Pérez-Leroux et al. (Reference Pérez-Leroux, Castilla-Earls, Béjar and Massam2012) to elicit recursive possessives and recursive comitative modifiers, extended to two additional types. The materials were designed in the context of a comparative project investigating the acquisition of nominal recursion in five languages (Pérez-Leroux and Roberge Reference Pérez-Leroux and Roberge2018). The stories were written first in English (Pérez-Leroux et al. Reference Pérez-Leroux, Peterson, Béjar, Castilla-Earls, Massam and Roberge2018a), and subsequently translated to Japanese by one of the authors, a native speaker of Japanese. The task targeted production of four types of self-embedded nominals: recursive possessives, recursive comitatives, recursive locatives and recursive relational nouns, as in (14)–(17).
Note that the two formal strategies available in Japanese for nominal embedding, no and relative clauses, can both be used to express these various notional-based relations.Footnote †† This protocol aims to elicit no constructions, but speakers have alternative choices, such as identifying a referent by using relative clauses. Given a picture that has a boy who is wearing glasses, a participant may describe the boy in the picture using either a construction with a no morpheme linking the two nominals (18a) or a relative clause (18b).
The materials contained a set of picture-based elicitation stories, portraying double sets of referentially contrasting pairs, which required the use of two layers of DP modification to disambiguate the target reference (i.e., in the case of (19), two similar girls, accompanied by similar dogs; one of the dogs had something special: a hat). Each trial involved two pictures. The first picture was used to verbally introduce each of the three entities required to formulate the target response, and to focus attention on the contrasting referents. The second image illustrated some change in the scene, which served as focal point for the prompting question (i.e., the girls now had ice cream). The experimenter then asked a simple referential question (which x…?) to prompt participants to provide a description. In (19), the prompt asks: Of the two girls in the context, which has the larger ice cream? It is the visual scenario that biases towards a recursive response. The girls are highly similar in appearance and dress, and they have similar dogs. The most visually salient contrast pertains not to the girls, but to the dogs, namely, that one of them is wearing a hat. As such, an answer that states the girl with the hat is incorrect, since the hat is not related to the girl except through her dog.
The materials contained six trials per condition, for a total of 24 stories plus the same number of additional distractors of various types. The trials were presented in two semi-randomized orders, evenly across participants. Participants were first presented with two short practice trials. After the recursion test, participants took part in a sentence repetition task involving complex NPs, for the purpose of prosodic analysis, reported elsewhere (Hirayama et al., Reference Hirayama, Colantoni and Pérez-Leroux2020).
Test sessions were conducted individually, by two of the authors, native speakers of Japanese. With adult participants, the session took place either in a room with sound-attenuated walls, or in a quiet room. Child participants were interviewed in their homes or at a friend's home, after obtaining parental consent. If the participants manually pointed to the referent in the picture, they were asked to answer in words. If they gave a deictic answer such as kore ‘this,’ they were asked to give a more specific answer (“which one…?”). If the child did not respond to a prompt, the prompt was repeated. After two repetitions, the experimenter moved on to the next story.
3.1.3. Coding
Data was recorded and transcribed by the native speaker authors. Elicited production data grants a degree of freedom to the speaker. Given our focus on structural matters, our definition of “target” responses was limited to DPs that contained each of the three nominals, which were identified as N1 (target head noun), N2 (target first modifier), and N3 (target second modifier), configured in a recursive DP. However, in a referential task of this nature, there is always a possibility for speakers to describe an entity in an alternative manner. Our picture scenarios elicited successful descriptions that did not involve the intended structures, such as calling attention to the spatial configuration of the array (“the one to the left”, etc.). For this reason, the syntactic and referential properties of the responses were coded separately (see also Pérez-Leroux et al. Reference Pérez-Leroux, Castilla-Earls, Béjar, Massam, Peterson, Maia, Amaral and Nevins2018b).
For the syntactic analysis, responses were classified for the level of embedding (Single, Level 1 and Level 2), and for type of linking mechanism (no, or relative clauses). Single NPs included unmodified nouns, as in (20a), and adjective + noun sequences. A head noun modified by a single noun-no phrase or a RC was coded as Level 1, as in (20b). A head noun containing a modifier which itself had a nominal or clausal modifier was coded as Level 2 (20c); subsequent levels of embedding were coded as higher levels (e.g., Level 3, etc.). A head noun with two non-interacting modifiers was coded as 2 Level 1, as in (20d). Levels were not added for locational nouns, colour nouns, and demonstratives (e.g., kocchi ‘here’, midori-no ‘green’) occurring with the particle no, as in examples in (20b–d), since these nouns did not add further differentiation to the contrastive referents, and for lexicalized instances of no (e.g., onna-no ko: female-no child: ‘girl’; kami-no ke: head hair-no hair: ‘hair’).
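The level-of-embedding scheme just described can be sketched as a small recursive procedure. This is a minimal illustration under stated assumptions, not the coding protocol actually used: the class `Nominal`, the flag `adds_level`, and the function names are hypothetical, and a response mixing an embedded modifier with a second independent modifier is simply labeled by its deepest chain, a case the text does not spell out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Nominal:
    """Hypothetical representation of a coded noun phrase."""
    noun: str
    modifiers: List["Nominal"] = field(default_factory=list)
    # False for modifiers that do not add a level: locational nouns,
    # colour nouns, demonstratives, and lexicalized instances of no.
    adds_level: bool = True


def depth(np: Nominal) -> int:
    """Depth of the deepest chain of level-adding modifiers."""
    counted = [m for m in np.modifiers if m.adds_level]
    if not counted:
        return 0
    return 1 + max(depth(m) for m in counted)


def code_level(np: Nominal) -> str:
    """Return the embedding label for a coded response."""
    counted = [m for m in np.modifiers if m.adds_level]
    d = depth(np)
    if d == 0:
        return "Single"
    if d == 1 and len(counted) > 1:
        # Several non-interacting modifiers on the same head noun.
        return f"{len(counted)} Level 1"
    return f"Level {d}"


# N1 = head noun, N2 = first modifier, N3 = second modifier.
recursive = Nominal("N1", [Nominal("N2", [Nominal("N3")])])
print(code_level(recursive))
```

Under this sketch, a head noun whose only modifier is flagged with `adds_level=False` (for instance a colour noun with no) is still coded as Single, in line with the exclusions listed above.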
For each referential relation we identified the linking strategies used to connect the relevant nouns, including the target structural strategies which consisted of the linker no, as in (21a), and relative clause (RC), as in (21b). At times, speakers connected a pair of referents without actually embedding NPs, using other configurations such as inclusion in the same clause (22a), or compounding (22b).
Independently of this analysis, responses were analyzed in terms of whether they were referentially successful, and whether the referent was described using the targeted referential expressions (e.g., wani ‘alligator,’ mizu ‘water,’ and kotori ‘bird’ for (20c)), or not (20a–b). The referential coding included five categories, as given in (23) with examples. Incomplete responses (23a) are clearly not referentially successful, but sequential and alternative responses (23b–c) are. Non-embedded answers (23d) often come close, but frequently represent pragmatically or syntactically degraded responses (see Pérez-Leroux et al. Reference Pérez-Leroux, Castilla-Earls, Béjar, Massam, Peterson, Maia, Amaral and Nevins2018b).
3.2. Results
In this section, we present the findings of our study.
3.2.1. Overall response patterns
Our first analysis examined the types of referential responses produced by children and adults. We aimed to determine whether children's response patterns were different from those of adults. Table 2 reports the referential coding.
This data shows a clear developmental trade-off between incomplete responses, which were the primary responses for the younger children, and target responses, which are the dominant responses for adults. All groups produced a fair number of alternative responses, as well as some non-embedded responses. Six-year-olds had slightly more non-embedded responses than either five-year-olds or adults. Children also provided a few sequential responses, but this response type was absent in adults. A comparison of response types across groups shows that response patterns are significantly different across groups (χ2 = 307.82, df = 8, p <.001).
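The reported degrees of freedom can be checked against the design. The sketch below assumes that the test was run on a contingency table crossing the five referential response categories with the three participant groups; the function name `chi_square_df` is hypothetical, and the test statistic itself cannot be recomputed here because the counts in Table 2 are not reproduced.

```python
def chi_square_df(n_rows: int, n_cols: int) -> int:
    """Degrees of freedom for a chi-square test on an r x c contingency table."""
    return (n_rows - 1) * (n_cols - 1)


# Assumed layout: response categories (incomplete, sequential, alternative,
# non-embedded, target) by groups (five-year-olds, six-year-olds, adults).
n_response_categories = 5
n_groups = 3

print(chi_square_df(n_response_categories, n_groups))
```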
3.2.2. The development of targets across conditions
We then focused on the distribution of recursive target responses. As shown in Figure 1, there are substantive changes in the ability to produce recursive NPs during the age span under observation.‡‡ Although, individually, one child failed to produce targets, once we consider all possible forms of no, including lexicalized forms and compounds, all children produced sequences of at least two consecutive instances of no. In fact, over four fifths of the children produced sequences of three or more no’s. Nonetheless, few children approached the adult averages.
Our next step was to analyze statistically the effect of condition and group on the distribution of target responses. Here, we asked two questions: 1) Are the various conditions equally likely to yield successful responses?; and 2) Does development proceed uniformly across conditions, that is, are there differences in the relative success of the various conditions across age groups? To answer these questions, we entered the data into a generalized linear mixed-effects (logit) model in R (version 3.6.2, R Core Team, 2019) fit by the maximum likelihood method (Adaptive Gauss-Hermite Quadrature). The dependent measure was Response (target or not), with Age Group (five-year-olds, six-year-olds and adults) and Condition (Possessive, Comitative, Locative and Relational) as fixed effects, and Participant and Item as random effects. Treatment coding was set with Possessive and six-year-olds as baseline levels. The model (Target ~ Age Group + Condition + (1 | Participant) + (1 | Item)) was based on 1243 observations from 52 participants and 24 items. Five pictures were mistakenly skipped during data collection, which accounts for the difference from the 1248 observations of a complete 52 × 24 design. The contribution of both fixed effects was significant. We will discuss each in turn.
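The model formula given above can be spelled out as an equation. This is a minimal sketch of the standard random-intercept logit model that such a formula conventionally denotes. The index names p (participant), i (item), g(p) (the age group of participant p) and c(i) (the condition of item i) are notation assumed here, as are the normality assumption on the random intercepts and the assumption that each item belongs to a single condition.

```latex
% Requires amsmath for \operatorname and \text.
% Baselines under treatment coding (from the text): six-year-olds, Possessive.
\[
\operatorname{logit}\,\Pr(\text{Target}_{pi}=1)
  = \beta_0 + \beta^{\text{Age}}_{g(p)} + \beta^{\text{Cond}}_{c(i)} + u_p + w_i
\]
\[
u_p \sim \mathcal{N}(0,\sigma^2_{\text{Participant}}),
\qquad
w_i \sim \mathcal{N}(0,\sigma^2_{\text{Item}}),
\qquad
\beta^{\text{Age}}_{\text{six-year-olds}} = \beta^{\text{Cond}}_{\text{Possessive}} = 0
\]
```

With this coding, the intercept is the log-odds of a target response for a six-year-old on a possessive item, and every other coefficient is a shift in log-odds away from that baseline.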
Figure 2 shows: 1) differences between groups, and 2) that all conditions are growing in parallel. Two groupings emerge in terms of target productions, according to the statistical analysis. On the one hand, recursive possessives and comitatives elicit a higher proportion of target responses for adults and older children, but not for younger children, who give fewer target responses to comitatives. On the other hand, recursive locative and relational nouns elicit fewer target responses across speaker groups.
The results of the model confirm the developmental progression observed above. Five-year-olds had fewer target responses than the older children (β = -1.007, Z = -3.101, p < 0.01). In turn, adults had more target responses than the older children (β = 2.175, Z = 6.369, p < 0.001). The overall characterization of performance across conditions is confirmed by the statistical model. Comitatives were not statistically different from the possessive baseline. Locatives yielded significantly lower rates of target responses (β = -1.514, Z = -2.796, p = 0.005), whereas the lower rate of target responses associated with relational nouns was marginal (β = -1.018, Z = -1.800, p = 0.071). A subsequent model was run to test the group by condition interaction. This augmented model did not yield a significant interaction, and crucially, the more basic model had a better fit (AIC = 1280.1) than the augmented model (AIC = 1283.9).
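Because the model is a logit model, each reported coefficient is a difference in log-odds relative to the baseline (six-year-olds, Possessive), and exponentiating it gives an odds ratio. The worked conversion below uses only the coefficients and AIC values reported above. The rounded odds ratios are computed here for illustration and are not taken from the original analysis.

```latex
% Odds ratios are conditional on the random effects (same participant, same item).
\begin{align*}
\text{five-year-olds vs. six-year-olds:} \quad & e^{-1.007} \approx 0.37 \\
\text{adults vs. six-year-olds:}         \quad & e^{2.175}  \approx 8.80 \\
\text{Locative vs. Possessive:}          \quad & e^{-1.514} \approx 0.22 \\
\text{Relational vs. Possessive:}        \quad & e^{-1.018} \approx 0.36
\end{align*}
\[
\Delta\mathrm{AIC} = \mathrm{AIC}_{\text{augmented}} - \mathrm{AIC}_{\text{basic}}
                   = 1283.9 - 1280.1 = 3.8
\]
```

On this reading, the odds of a target response for adults are roughly nine times those for six-year-olds, and the odds for locative items are roughly a fifth of those for possessive items. Since lower AIC indicates the preferred model, the positive difference favours the model without the interaction.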
3.2.3. A potential bias for no
Japanese is an ideal language to explore how structural complexity affects the development of recursion, given that it contains only two comparably unrestricted structural strategies: 1) the simple marker no, which functions as a possessive (encompassing many of the multiple senses of the term, including ownership, kinship, and others) or a locative, and 2) relative clauses, with their full expressive power. To test whether children show a preference for using the simplest embedding structure (no), we first compare the overall frequency of no and relative clauses in Level 1. The reason for analyzing the relative frequency of no vs. RC embeddings in the Level 1 trials (i.e., trials where participants did not achieve the target level of embedding) is to establish a baseline of children's use. This data is taken from the structurally simpler responses that were classified as incomplete, sequential or non-embedded. Table 3 shows the percentage of Level 1 responses that used no. As adults produced few incomplete (Level 1) responses, their data is not included here. For five-year-olds, no represented about half of the Level 1 possessive trials, and about two-thirds of the relational noun trials. It was a less frequent option in locative and comitative trials. For six-year-olds, it was rarely used in comitative and locative Level 1 responses, but represented about two-thirds of the Level 1 responses to possessives and relational nouns. This shows that the use of no is unevenly distributed across conditions, independently of whether the structure is recursive or not. This is what one might expect if the adult targets are taken as baselines.
As our second step, we classified all target trials in terms of which types of configuration they contained (no only, RC only, or a mix of these). Individually, we observed that children tended to use all configurations. However, two of the five-year-old children (out of all 36 children who produced recursive responses) had targets exclusively configured with no.
Figure 3 shows the overall distribution of target types across groups. The response patterns of the two groups of children were similar, but a chi-square analysis shows that the frequency of types of target responses was significantly different across groups (χ² (df = 4) = 63.509, p < 0.001). The relative frequency of mixed responses seems stable across the age groups. The source of the asymmetry lies in the fact that children produced more recursive no responses and fewer recursive RC responses than expected. Conversely, adults produced more relative clauses than expected. The frequency of no responses in adults was significantly low relative to expected values (residuals beyond -4 SDs), as seen in the pattern of residuals obtained from the mosaic plot function in R (Friendly Reference Friendly1994).
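The residuals mentioned above can be made explicit. The sketch below assumes they are Pearson residuals from a 3 × 3 table (three groups by three target types: no only, RC only, mixed). The source does not state the exact residual type, so this is an illustrative reading, and the symbols are notation introduced here.

```latex
% O_{ij}: observed count for group i and target type j.
% E_{ij}: expected count under independence; R_i, C_j: row and column totals; N: grand total.
\[
r_{ij} = \frac{O_{ij}-E_{ij}}{\sqrt{E_{ij}}},
\qquad
E_{ij} = \frac{R_i\,C_j}{N},
\qquad
\chi^2 = \sum_{i,j} r_{ij}^{2},
\qquad
df = (3-1)(3-1) = 4
\]
```

A negative residual marks a cell with fewer responses than expected under independence, which is the direction described for no responses in adults. The degrees of freedom under this assumed layout match the df = 4 reported for the test.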
Interestingly, when the data is broken down by condition, we observe that while the early preference for no holds across conditions, it manifests differently in the various conditions. In Figure 4, we see that conditions can be characterized as being of two different types: conditions with a no bias (possessive and relational nouns) and conditions with a bias towards relative clauses (locatives and comitatives).
The two no-dominant conditions, possessives and relationals, show a strong early dominance of recursive no responses.§§ For the two conditions where RC was the expected pattern, children gave fewer recursive RCs and more mixed responses than adults; in the case of locatives, they also use recursive no even though this response type is absent in the adult data. In all cases, however, the prevalence of no is higher in the recursive responses (adding the uses of no in recursive and mixed responses) than would be expected when compared to the frequency of single no in Level 1 responses. This is highly significant for both groups of children (five-year-olds: χ² (df = 1) = 28.395, p < 0.001; six-year-olds: χ² (df = 1) = 28.748, p < 0.001).
In the next section, we discuss the importance of these results for our research objectives.
4. Discussion and conclusion
Our study compared production of recursive NPs in Japanese-speaking children and adults. Children were aged five and six, that is, a point at which most children are already producing recursive NPs, but their ability to do so is still undergoing substantive development. Overall, our results show that young Japanese children are very much like adults in their patterns of responses to our recursive modification conditions, but very different in other respects. Let's consider our three main research questions in light of the results.
First, comparing children and adults, when we analyzed the types of responses children and adults gave, we saw that for adults, the primary response is target (81% of the overall responses). In contrast, children's primary response is incomplete, for about half of all responses given by five-year-olds, and about one third of the responses of the six-year-olds. As the frequencies of these incomplete responses decrease with age, target responses increase almost complementarily, from about 25% for five-year-olds to 41% for six-year-olds. Clearly, children's capacities, whether syntactic or pragmatic, are changing significantly during the period of study, but are still not on par with the adults’ capacities.
Second, comparing conditions, children were exactly like adults in how they reacted to the various notional conditions. Performance varied substantially across conditions, but the relative rates of target responses between conditions followed the same pattern in children and adults. Statistically, the number of target responses in the comitative condition was not different from what we observed in the possessive condition; locatives and relationals, however, had lower rates of target responses. This difference was significant for locatives, but only marginal for relationals. Crucially, we observed no interaction between age group and condition. So in answer to our Question 1, we found that children's responses are not qualitatively different from the adults'. As for Question 2, we found differences between the different types of semantic relations represented in our experimental conditions; but the children's and adults’ patterns of differences are similar.
Now, structurally, we see that the primary type of structure favoured does not predict that more target responses will be given to a condition. In Japanese, the primary bias is for no in possessive and relational noun contexts. Similarly, we see that for recursive possessive and part-whole relations the primary response is no, with frequent use of mixed no and relative clauses, whereas for recursive locatives and comitative relations the primary response is relative clauses, with a combination of relative clauses and no also used frequently.
It was not the case, however, that these associations between form and meaning either hindered or facilitated children's performance with a particular set of trials, as compared to adults. It was not the case that the two uses of no, the relational and the possessive, emerged at distinct points. Most children were able to produce both types, although at different frequencies, and the ratio of children who could do one but not the other was to be expected from the overall frequency patterns. We infer that the data does not provide specific support for a possessive no first stage.
Our third Question concerned the impact of simpler vs. more complex structures. This exploration of a possible child bias for simpler structures did show some positive results. Overall, the use of no in recursively modified NPs is higher in children than in adults, with use of recursive relative clauses increasing across age groups. In possessive and relational conditions, children produce mostly recursive no, compared to adults for whom recursive no is roughly a third of the targets. In the locative and comitative trials, children produce more mixed responses; adults, in contrast, have recursive relative clauses as the primary response. This is compatible with results obtained in the Terunuma et al. (Reference Terunuma, Isobe, Nakajima, Okabe, Inada, Inokuma, Nakato, LaMendola and Scott2017) comprehension study.
How can we interpret these three general results: quantitative differences, but qualitative similarities between children and adults, and a preference for simpler structures in children? First, can the sizeable adult-child differences in rates of success be due to processing and sentence planning capacities? Likely. Work on recursion suggests important differences between NPs of comparable length involving phrasal coordination vs. embedding (Pérez-Leroux et al. Reference Pérez-Leroux, Castilla-Earls, Béjar and Massam2012), and even between superficially similar complex DPs containing a noun followed by two PP modifiers. Pérez-Leroux et al. (Reference Pérez-Leroux, Peterson, Béjar, Castilla-Earls, Massam and Roberge2018a) showed that both children and adults were more successful when these complex NPs had two PP modifiers in a non-recursive configuration, with the two PPs modifying the head noun, than when one PP modified a noun inside a PP modifier which in turn modified the highest noun.
Embedding is costly, in terms of both processing and sentence planning. Research on sentence production suggests that, while children's sentence planning processes begin to mirror adult patterns early, there is clear evidence of resource differences. It has been observed that at the telegraphic speech stage, there is a point where speech errors shift towards sentence-initial position (Wijnen Reference Wijnen1990), and lexical retrieval and phrasal planning processes start to operate over multiple elements. As such, McKee et al. (Reference McKee, McDaniel, Garrett, Fernández and Cairns2018) suggest that sentence planning is adult-like by age four to six. Thus, when prompted to produce single- and doubly-embedded object and subject relative clauses, children show the same effects of sentence type as adults (McDaniel et al. Reference McDaniel, McKee and Garrett2010). These authors observed developmental changes in the patterns of disfluencies that suggest that children engage in syntactic planning at multiple points over complex utterances, whereas adults are capable of planning for longer spans. At the same time, when considering speech rate, children under five slow down when producing relative clauses, unlike adults, who tend to accelerate all presupposed materials, such as relative clauses. Potential limitations of memory, efficiency in lexical retrieval, and the ability to coordinate different types of information are argued to explain these specific differences between child and adult sentence production.
This perspective on the challenge of recursive modification is compatible with previous research on working memory and its interaction with the development of recursion in the thought domain. Arslan et al. (Reference Arslan, Hohenberger and Verbrugge2017:2) ask: “Why do children need some years to pass second-order false belief tasks once they are able to pass first-order false belief?” Their answer suggests that complexity increases with the number of beliefs involved and the recursive organization of second-order theory of mind stories, because of added demands on working memory. A “serial processing bottleneck” (Verbrugge Reference Verbrugge2009) occurs on the serialization task needed to process the structures. Multiple recursive embedding may create a similar issue, but proving this is beyond the scope of this article. For the moment, however, we see this possibility as a further “attempt to account for properties of language in terms of general considerations of computational efficiency, eliminating some of the technology postulated as specific to language and providing more principled explanation of linguistic phenomena” (Chomsky Reference Chomsky2005, p. 1). This would imply that the substantial differences we observed between children and adults, both in rates of success and in preference for no, are not due to grammatical differences in the two groups, but rather to third-factor considerations that may very well apply in other domains of human cognition.
As to the comparison between children and adults, we saw that despite the similarity in forms across conditions, the frequencies of target responses differ between conditions. The above interpretation of the difference between adults and children says nothing about which conditions should be more successful. The differences between conditions, i.e., the more successful possessive and comitative conditions vs. the less successful relational and locative conditions, require a separate account. As we noted above, the key observation is that differences between conditions do not change over the course of development.
As a minimally specified marker, no is the natural choice for children. Our results are highly compatible with Shibatani's analysis of no as a nominalizer. Recall that according to him, the NP containing no is a nominalized NP that enters into a modification relation with another N to form a complex NP. The function of no is not to specify the nature of the semantic modificational relation between two nouns; it simply allows for the head noun to shift its denotation from itself to something associated with it instead. Pragmatic inference then leads to more specific and contextual interpretations of the relation. This approach provides for a strict unified grammatical analysis of no, and at the same time for the flexibility needed to account for the fact that our participants were more successful with possessive and comitative than with relational and locative. As Shibatani indicates, it is likely that this asymmetry between our meaning-based conditions does not arise from anything in the grammatical domain. Our results thus favour unified analyses of no as a multifunctional marker (see section 2.1). More drastic differences in children's responses among our four conditions might have pointed to the possibility that there are several different nos corresponding to different labels and structures; but this was not the case.
Rather, that participants are more successful with the possession and accompaniment conditions likely depends on the fact that, as speakers, we are generally more likely, when describing a referent, to rely on certain properties of the object or scene than others. More specifically, the relations instantiated in our conditions may differ as to how likely they are to be used as part of a referential description. For example, as observed by Culbertson et al. (Reference Culbertson, Schouwstra and Kirby2020: 696) in a discussion about strength of association between a noun and NP internal constituents, “adjectival properties (e.g., red) are on average more closely related to the objects they modify (e.g., wine) than numerosities are (e.g., two), which are in turn more closely related to the objects they modify than demonstratives are (e.g., this).” In other words, when establishing contrastive reference, we might be more likely to invoke more salient features of a scene, such as other animate entities in a relation of ownership or adjacency (accompaniment), compared to more background properties such as the location of an object, or to some feature of a part of itself (relational nouns). This shifts the explanation to conceptual-cognitive considerations and properties of the visual context, and these general domain differences can be expected to affect children and adults equally.
Our results have shown that core syntactic properties of structures (the simplicity of no, vs. the structural elaboration of relative clauses) do not predict acquisition or adult performance, in the sense that the structural biases of our semantically-defined conditions did not determine the degree of success. While children preferred the more economical no option overall, that preference did not allocate advantages to the no-biased conditions. As previous literature indicates, children learn the basic structural toolkit early (simple relativization and no modification). This, we argue, is followed by a distinct learning step leading to using these forms of embedding recursively (Roeper Reference Roeper2011, Pérez-Leroux et al. Reference Pérez-Leroux, Roberge, Lowles and Schulz2021). These two steps, embedding and recursive embedding, underlie structural complexity and allow the expression of more complex thoughts (Hinzen and Sheehan Reference Hinzen and Sheehan2013, de Villiers Reference de Villiers2020). By five years old, where our study begins, Japanese children are starting to deploy the ability to use these forms of embedding recursively, and this ability undergoes substantial growth during the period of observation. These developmental changes, we propose, are not due to changes in grammatical capacities or representations, but to the performance system.
Finally, our results lead to particular observations on the autonomy of syntax and the syntax-semantics isomorphy. First, as pointed out by Adger (Reference Adger, Hornstein, Lasnik, Patel-Grosz and Yang2018: 153), the view that syntax is autonomous does not entail that “grammaticality is cut off from either probability or meaning. Rather it says that syntax cannot be reduced to either of these.” We have seen here a clear case where recursive structures built with a uniform marker, and derivationally identical, are used at different rates based on the conceptual relations expressed by the structures. The syntax of no is autonomous, but how it is used is influenced by other factors. Second, our study can serve to illustrate standard perspectives on the question of how explicit syntactic derivations are with respect to semantics. In the case of no, while there is perfect isomorphy between syntax and semantics, the (recursively) modified DPs are considerably underspecified semantically. It is the function of no as a nominalizer that allows it to be used to express an impressive array of conceptual relations. It also means that the task of reducing the possibilities must shift to a different module, here pragmatics as suggested by Shibatani (Reference Shibatani, Chung, Fukuda, Kim and Park2017, Reference Shibatani, Zariquiey, Shibatani and Fleck2019).
As hinted at in our introduction, the picture in the same empirical domain in other languages may at first sight appear quite different. In English, for instance, different markers can be used (Saxon 's and various lexical prepositions) to express explicitly, through compositional semantics, the effects of a particular conceptual relation to the exclusion of others (i.e., in does not mean with, or next to). But in reality, Japanese and English, or French, or Spanish function identically and the differences reduce to superficial ones concerning the lexicon and morphological processes: recursive modification structures, corresponding to various meanings, are built through Merge and interpreted; they differ only in the markers used to connect the nominals in the structure. In addition, whereas French and English (and to a lesser degree Spanish) have access to a number of prepositions to narrow the range of interpretations of a modification relation, some prepositions like de in French and Spanish remain drastically underspecified, and the intended conceptual relations must be determined through discourse and context. In other words, de in Spanish and French looks very much like no in Japanese in its multifunctionality, but not morphosyntactically (since it is not a particle). We could say that Japanese is just like Spanish or French, only more so! It is therefore not surprising that our Japanese results are in line with results from other languages, despite differences in the syntactic explicitness of the metonymic relations created by modificational structures.