Introduction
Once young children succeed at producing two-word utterances, there is a steady growth in the array of syntactic constructions that they can understand and produce. Their grammatical competence, however, does not develop homogeneously: various morpho-syntactic structures are mastered at different paces, and a rich literature on language development has focused on the factors of linguistic complexity that affect the emergence of different grammatical structures. Fine-grained comparisons between structure-specific growth trajectories are, however, relatively scarce in language acquisition research for many reasons, such as the practical difficulties of performing longitudinal studies or utilizing samples from large cross-sectional databases of children at different ages. Structure-specific developmental trends are often inferred only indirectly, and even the acquisition trends of the grammatical structures that have received the most attention are hard to compare; this is also true for passives and Romance clitics, the two constructions under investigation in this paper. There is, however, a remarkable source of information that might be exploited: the databases collected for the standardization of language assessment tests. These databases, primarily created for diagnostic purposes, usually cover an array of grammatical structures across different age groups, and their samples are often much larger than those tested in conventional experimental settings. Despite the great potential of these databases, their use in studies focusing on grammatical development remains limited. This is mostly because participants' performance is considered only in aggregated form: scores collected for the various grammatical constructions are usually collapsed into a single composite score, making a comparison between different structures hard or impossible.
This information, however, is available within the databases, and can be extracted through fine-grained analyses that go beyond total scores.
In this study, we looked at a large database of responses from Italian children between 4 and 10 years of age, part of the norming sample used to standardize the Batteria di Valutazione del Linguaggio in bambini dai 4 ai 12 anni / Battery for Language Assessment in children aged 4 to 12 years old (BVL 4–12; Marini et al., 2015). We targeted specific structures, so as to compare the developmental trends of sentences that differ in the morpho-syntactic realization of the internal argument of transitive verbs.
In Romance languages, including Italian, the internal argument of the verb, the theme/patient, appears in post-verbal position in canonical SVO sentences, while it appears preverbally in sentences such as passives and certain clitic constructions. It has been suggested that this phenomenon relates to movement operations, which are in turn associated with different degrees of grammatical complexity in language development. For example, Jakubowicz (2005, 2011) proposed a complexity metric precisely defined in terms of the movement steps involved during the derivation. Her account capitalized on the initial preference for in situ wh-questions in French, both in children with typical development (TD) and those with developmental language disorders (DLD). In the same vein, Moscati and Rizzi (2014, 2021) proposed that the developmental sequence of agreement configurations conforms to a movement-based metric of grammatical complexity. Indeed, children between 3 and 5 years of age mastered local agreement configurations in Italian that do not involve movement (e.g., Determiner-Noun agreement) earlier than those characterized by movement (e.g., Clitic-Past Participle agreement).
Following these previous approaches, other grammatical constructions, such as passives or clitics, can be investigated and compared once movement is taken as a primitive. In fact, both structures feature the dislocation of the internal argument, as illustrated by the Italian examples in (1–3):
In the SVO sentence in (1), the internal argument le ragazze “the girls” occupies its canonical post-verbal position. In (2) and in (3), instead, it surfaces in a position to the left of the finite verb/auxiliary, either as a clitic pronoun or as the subject.
Different approaches exist in the literature to characterize the fine-grained properties of (2) and (3). However, a movement step (see Footnote 1) has been generally assumed in both clitic constructions (see Kayne, 1975; Sportiche, 1996; Belletti, 1999) and passives (Jaeggli, 1986; Collins, 2005). This is schematically illustrated in (4), where the theme/patient is raised from its post-verbal position to a higher one to the left:
Clitic constructions and passives also share other grammatical properties. For example, they both trigger past-participle agreement in Italian, a property that has also been considered in relation to movement. Although the past participle does not agree with the object in canonical SVO sentences as in (1), movement triggers past-participle agreement in gender and number in (2) and (3). The uninflected form salut-ato in (1) is replaced by salut-ate in (2) and (3), a form that carries the same gender and number [−ate = +fem, +plur] as the fronted argument.
Besides considerations based on their grammatical properties in adult language, developmental studies have also unveiled some similarities between the two structures in (2) and (3). Both are produced later than SVO declaratives and are typically mastered only after three years of age. Many factors may be at play here, and they may not be the same for clitics and passives. However, there is at least one family of hypotheses that shares a similar assumption, namely that the observed delay in the acquisition of these structures may be traced back to the movement step in (4). Borer and Wexler (1987) proposed that the comprehension advantage of short passives (e.g., the window is broken) over long ones (e.g., the window is broken by the boy) stems from the fact that the full movement-based computation is obligatorily required only in the latter case, while short passives may receive a simpler adjectival reading in which no movement is involved. The asymmetry between long passives (arguably requiring movement) and short passives has been replicated many times across different languages (Greek: Terzi and Wexler, 2002; Russian: Babyonyshev and Brun, 2003; Portuguese: Rubin, 2009; see also Pierce, 1992, for Spanish and the role of post-verbal subjects). In Italian, Volpato et al. (2016) investigated children's comprehension of passives. They manipulated several factors, and, when the comprehension of actional long “be” passives was considered, their results indicated that children were able to select the correct interpretation shortly before turning four years old.
This time window roughly overlaps with the age at which children begin to master the syntax of clitics. Italian-speaking children can retrieve the correct antecedent of a clitic computing the local domain by the age of three (McKee, 1992), and this is apparently generalizable across Romance languages (e.g., in French: Hamann et al., 1997; in Spanish: Baauw et al., 1997). However, other comprehension problems persist longer. For example, Dispaldro et al. (2014) analyzed the comprehension of 3rd person object clitics in Italian, testing children’s ability to use gender and number (two features that may be checked through movement) to disambiguate between two possible referents in a picture-matching task. Their results showed that by the age of four, children’s performance was still not adult-like.
More evidence, showing that it is at around the age of four that children's difficulties with clitics disappear, comes from studies focusing on production. In Italian (Hamann et al., 1997; Tedeschi, 2009), Catalan (Gavarró et al., 2010), and French (Hamann, Rizzi, and Frauenfelder, 1996; Prévost, 2009), the omission of 3rd person object clitics disappears at around this age (see also the large comparative study reported in Varlokosta et al., 2016). The acquisitional complexity of clitics cannot be reduced to pronominalization in general, since the production of object pronouns in languages lacking a clitic series (for example, English) does not seem to pose the same difficulty (Pérez-Leroux et al., 2008; Serratrice et al., 2004). Again, as for passives, movement-based accounts have been proposed for the late emergence of cliticization. This hypothesis dates back at least to Wexler (1998), who proposed that the complete sequence of movement steps required by cliticization might be subject to maturation (see Footnote 2).
More recently, new movement-based accounts have been proposed and object clitic omissions have been linked to a broader class of movement constraints. As shown in (4), the first movement step raises the internal VP argument across the agent. This may give rise to classic intervention effects (Friedmann et al., 2009) that have been documented in both adults and children (for an overview, see Biondo et al., 2022). This line of reasoning was initially proposed by Zesiger et al. (2010) for French and more recently by Arosio and Giustolisi (2019) for Italian. According to such accounts, intervention effects are modulated by the degree of similarity, in terms of shared grammatical features, between the moved constituent and the one it crosses over.
This brief review highlights several similarities between the two constructions, both on the theoretical and on the empirical side. Although a unifying account seems beyond reach at the moment, many hypotheses already share the core idea that object movement might be a common source of complexity in language development. To the best of our knowledge, however, the only explicit comparison between clitics and passives is the production study reported in Manetti (2013). She observed that Italian-speaking children between 3;6 and 4;6 years systematically avoided passives, preferring the use of clitics. Manetti argued that passives could be harder than clitics at this stage, hence the preference for using clitic constructions instead of passives. To date, however, Manetti's work is an isolated case and no comparative study exists in comprehension. Moreover, it is still impossible to have a comparative estimate of the development of the two structures in child language. Our first goal is therefore to enrich the literature in this direction and extend the observation to children between 4 and 10 years old. This is well beyond the time window considered by previous investigations regarding clitics and passives, which have usually focused on the time window between 4 and 6. In most cases, however, children's performance was still not at ceiling, suggesting that some development might still occur in the grammar school years. For this reason, we decided to also include older children.
A second major goal of this study concerns the relationship between working memory and sentence comprehension. As already stated, grammatical development is characterized by great interindividual variability. This may depend not only on sociolinguistic factors (e.g., environmental variables related to parental education and professional profile; Riva et al., 2017; Romeo et al., 2018) but also on a child’s cognitive profile. Indeed, morpho-syntactic abilities hinge on the interplay between accumulating linguistic knowledge and growing cognitive skills (i.e., attention, executive functions, and memory; Miyake et al., 2000; Miyake and Friedman, 2012; Marini et al., 2020). Accumulating evidence from studies involving children with TD and DLD suggests that, among the cognitive skills involved in grammatical processing, working memory plays a major role (e.g., Baddeley et al., 1998; Gathercole et al., 1999; Engel de Abreu et al., 2011; Marini et al., 2014; Riva et al., 2017). According to an influential multicomponent model, working memory is a complex system made of different sub-components, each dedicated to the storage and management of a specific type of representation (i.e., Visuo-Spatial Sketchpad, Episodic Buffer, Phonological Loop), under the supervision of a Central Executive (Baddeley, 2003, 2007).
Together with the Central Executive, the Phonological Loop, made of a passive component (Phonological Short-Term memory, Pho-STM) and an active one (Verbal Working Memory, Verb-WM) which manipulates the information temporarily stored in the Pho-STM, is involved in lexical and grammatical processing. For example, vocabulary growth correlates with Pho-STM abilities as measured through serial span measures (e.g., Digit Forward recall task, Non-word repetition task) (e.g., Baddeley et al., 1998) in children with TD (Gathercole et al., 1999; Leclercq and Majerus, 2010) and DLD (Archibald and Gathercole, 2006). Pho-STM and Verb-WM arguably operate synergistically in sentence comprehension: Pho-STM holds linguistic material and makes it available for more complex or longer processing cycles; Verb-WM may instead provide the operational resources needed to perform grammatical computations such as those related to syntactic movement. Indeed, long-distance relations generated by the dislocation of a constituent from its original position likely involve working memory, at least until the dependency has been resolved (Gibson, 2000; Lewis and Vasishth, 2005).
To obtain information about the different components involved, serial and composite span tasks can be used: Serial span tasks (e.g., the Digit Forward Recall and the Nonword repetition tasks) have been typically employed to assess the passive capacity of Pho-STM (storage only) whereas composite span tasks (e.g., the Digit Backward Recall and Last Word retrieval tasks) have been related to the functionality of the Central Executive to allocate Verb-WM resources (storage + processing). Results, however, still provide a blurred picture.
In a group of 6-year-old bilingual children, Engel de Abreu et al. (2011) found that both Digit Forward and Backward Recall tasks correlated with accuracy in sentence comprehension. However, they measured accuracy considering the overall score obtained by administering the TROG-2 battery (Bishop, 2003). This choice does not allow us to determine whether sentences with an increasing degree of morpho-syntactic complexity also require an increasing amount of working memory resources. This issue was explicitly addressed by Montgomery, Magimairaj, and O'Malley (2008). They looked at simple and complex sentences. Complex sentences were considered to include different types of non-local relations, such as anaphors (pronouns, reflexives) and movement relations (passives). Comprehension accuracy was measured for simple and complex sentences separately. In children between 6 and 12 years old, the authors failed to find significant correlations between measures of Pho-STM and sentence comprehension (see Footnote 3).
The potential relation between Pho-STM, working memory, and syntactic complexity has also been assessed in studies that focused on clausal embedding. Arosio, Guasti, and Stucchi (2011) analyzed the comprehension of subject and object relative clauses in 9-year-old children. In object relative clauses, the object is moved across the subject, similarly to the movement step shown in (4). This factor has been thought to account for the higher processing load in both children (e.g., Adani, 2011; Arosio et al., 2009) and adults (e.g., Gordon et al., 2001; Grodner and Gibson, 2005; King and Just, 1991; Staub, 2010; Staub et al., 2017; Traxler et al., 2002; Biondo et al., 2022), also considering the involvement of memory resources in maintaining active long-distance dependencies (Gibson, 2000; Lewis and Vasishth, 2005). The results presented in Arosio et al. (2011) showed a clear correlation between digit span and the comprehension of object relatives, i.e., the constructions with higher complexity.
The apparent contrast between the findings in Arosio et al. (2011) and the ones reported in Montgomery et al. (2008) may depend on the different types of sentences that were used in the two studies, thus making any comparison based on grammatical complexity inconsistent. Indeed, in a systematic review of working memory and sentence processing in children, Boyle, Lindell, and Kidd (2013) highlighted that “complexity” has usually been used to refer to either movement or embedding, two very different kinds of grammatical relations. Boyle, Lindell, and Kidd (2013) treated movement and embedding as distinct factors and experimentally manipulated them in a sentence comprehension task administered to children aged 4 through 6 years old. They found a significant effect of movement, as sentences with object movement were harder in both matrix (i.e., passives) and embedded clauses (i.e., object relatives), with a significant interaction between movement and embedding, meaning that object relatives were the hardest configuration to process. Turning to memory, however, Nonword repetition and Digit Backward Recall tasks, associated with Pho-STM and Verb-WM respectively, did not play any significant role (see Footnote 4). More recently, Delage and Frauenfelder (2020) assessed Pho-STM, Verb-WM, and sentence comprehension in children with TD and DLD aged 5 to 14 years old. Notably, the sentence comprehension task included complement clauses as well as subject and object relatives. In children with TD, measures of both Pho-STM and Verb-WM predicted better comprehension only when complex morpho-syntactic constructions were considered. However, again, movement and embedding were not analyzed separately.
This limits the results in important ways, and no conclusions could be drawn about the impact of memory resources on object movement.
Taken together, previous studies on sentence comprehension have started unveiling a relationship between language processing and Verb-WM. However, the contribution of the different components of working memory and their involvement with respect to grammatical constructions with different degrees of derivational complexity is still not clear. The mixed pattern that emerges from the previous investigations is arguably related to the selection of the stimuli used in the experiments. Moreover, differences in the age range, in the size of the experimental groups, and in the measures used to assess working memory make it hard to come up with a fully coherent understanding of this complex relationship.
Overall, this study has two main goals. The first is to compare the developmental trend of sentences that do not involve movement (i.e., canonical SVO sentences) with alternative constructions (i.e., Clitic sentences and Passives) that are derived through a dislocation of the internal argument. This comparison builds on a primitive notion of movement-based syntactic complexity, as proposed in Jakubowicz (2005, 2011), Moscati and Rizzi (2014), and Moscati et al. (2020). The second is to investigate the potential relation between working memory (Pho-STM and Verb-WM) and the comprehension of constructions whose grammatical computation needs temporary storage of non-local dependencies created by a movement operation (i.e., Passives and Clitics).
Cross-sectional data were obtained by administering a grammatical comprehension task to a large cohort of 996 children aged 4 to 10. The participants’ accuracy in sentence comprehension was assessed with a picture-matching task, and separate values for each of the three grammatical constructions (i.e., canonical SVO sentences, passives, and clitic clauses) were collected. Based on the existing literature, we hypothesized that movement would increase the processing load, making it harder for younger children to understand passives and clitics than SVO sentences. Here, we considered matrix clauses only, so as to exclude the potential confound introduced by clausal embedding. We also aimed to assess the relationship between sentence comprehension and working memory. We, therefore, collected two working memory measures: a serial span measure (i.e., Digit Span forward) that is usually associated with Pho-STM and a composite span measure (i.e., Digit Span backward) that is associated with Verb-WM.
Materials and methods
Data availability
The data and the R analysis code for this study are openly available for download at the following OSF link https://osf.io/tn23r.
Participants
We analyzed data from a database of 996 children between 4 and 10 years old, taken from the norming sample used in the standardization of the BVL 4–12 (Marini et al., 2015). For descriptive purposes, participants are here grouped according to age and divided into 7 age groups. Age was treated as a continuous variable in all the statistical analyses. Sample size, mean age, and standard deviations for each age group are reported in Table 1.
Materials
Assessment of working memory skills
Participants were given the forward and backward digit recall subtests of the Wechsler Scales (Wechsler, 1993). The former is a serial span task aimed at evaluating a child’s Pho-STM. The latter is a composite span task used to explore the ability of the child to manipulate information in working memory (i.e., the central executive ability to allocate resources to Verb-WM). In the forward digit recall test, children were asked to repeat sequences of digits in the correct serial order. The sequences ranged from 1 to 9 digits, and the examiner uttered them at a rate of 1 digit per second. The number of sequences that the child was able to repeat formed the Forward Digit Recall score. In the backward digit recall test, children were asked to repeat the sequence of spoken digits in reverse order. The number of sequences that the child was able to adequately invert and repeat formed the Backward Digit Recall score.
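The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy trials are hypothetical, the standard discontinuation rules of the Wechsler subtests are ignored, and the score is simply the count of correctly reproduced sequences, as the text describes.

```python
from typing import List, Sequence, Tuple

# One trial = (digits presented by the examiner, digits produced by the child).
Trial = Tuple[Sequence[int], Sequence[int]]


def forward_recall_score(trials: List[Trial]) -> int:
    """Count the sequences repeated in the same serial order."""
    return sum(1 for presented, response in trials
               if list(response) == list(presented))


def backward_recall_score(trials: List[Trial]) -> int:
    """Count the sequences repeated in exactly reversed order."""
    return sum(1 for presented, response in trials
               if list(response) == list(reversed(presented)))


# Hypothetical responses from one child (not data from the study).
forward_trials = [([3, 8], [3, 8]),
                  ([5, 1, 9], [5, 1, 9]),
                  ([7, 2, 6, 4], [7, 6, 2, 4])]   # order error
backward_trials = [([4, 9], [9, 4]),
                   ([2, 7, 5], [5, 2, 7])]        # inversion error

print(forward_recall_score(forward_trials))    # 2
print(backward_recall_score(backward_trials))  # 1
```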
Assessment of grammatical comprehension skills
All children were administered the 40 items of the Grammatical comprehension task of the BVL 4–12. The items belong to different grammatical constructions, and, for each item, children were asked to select the image that accurately represents the meaning of the heard sentence, choosing among four different alternatives. An example is reported in Figure 1.
A correct response for each item was coded as 1, while an incorrect response was coded as 0. Selected constructions were also isolated: we identified matrix clauses with transitive structures, excluding negative sentences. Thirteen items, grouped according to their grammatical structure, were included within this smaller pool. They were divided into SVO, Passives, and Clitic constructions (Table 2). The full list of these selected items is reported in Appendix A.
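As a rough illustration of how the 0/1 coded responses can be turned into per-construction accuracy, consider the Python sketch below. The item numbers and their assignment to constructions are hypothetical placeholders (the real items are those listed in Appendix A), and the function name is ours.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical item-to-construction mapping; NOT the actual BVL 4-12 items.
ITEM_STRUCTURE: Dict[int, str] = {
    1: "SVO", 2: "SVO",
    3: "Passive", 4: "Passive",
    5: "Clitic", 6: "Clitic",
}


def accuracy_by_structure(responses: List[Tuple[int, int]]) -> Dict[str, float]:
    """Proportion correct per construction.

    responses: (item_id, score) pairs, where score is 1 (correct) or 0 (incorrect).
    Items that are not in the selected pool are ignored.
    """
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for item_id, score in responses:
        structure = ITEM_STRUCTURE.get(item_id)
        if structure is None:
            continue
        totals[structure] += 1
        correct[structure] += score
    return {s: correct[s] / totals[s] for s in totals}


# Toy responses from one child; item 7 falls outside the selected pool.
child = [(1, 1), (2, 1), (3, 1), (4, 0), (5, 0), (6, 1), (7, 1)]
print(accuracy_by_structure(child))
```

Keeping the item-level 0/1 scores, rather than a single composite, is what makes the structure-by-structure comparison possible.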
Results
Grammatical comprehension: overall performance
Before considering children's accuracy with respect to the selected constructions, the whole dataset was inspected to determine the overall trend of children's grammatical comprehension across the different ages. Children's aggregated scores were determined by calculating the proportion of correct answers (i.e., accuracy) out of the total of 40 items. The overall accuracy scores for each age group are reported in Figure 2. As the figure shows, children's overall accuracy grows steadily with age, with higher variability in the younger groups.
Grammatical comprehension: a focus on SVO, Passive, and Clitic sentences
Before turning to the analysis of the different grammatical structures, the performance on each of the 13 selected items was preliminarily inspected. Figure 3 shows average accuracy values for each item in the SVO, Passive, and Clitic conditions. As can be seen from the plot, item 16 (see Appendix A) in the Clitic condition shows remarkably low accuracy values. Specifically, this clitic sentence presents difficulties that persist even in older children, while all the other items reach ceiling levels by the age of 8. Low accuracy values for item 16 are unlikely to be primarily related to grammatical development. We therefore decided to remove this item from further statistical analyses (see Footnotes 5 and 6).
The average proportion of correct responses for the comprehension of grammatical structures without (i.e., SVO) and with (i.e., clitic and passive sentences) movement is reported for each age group in Figure 4. Overall, the three grammatical constructions follow different developmental trends that are clearly distinct until age 7. The two structures with movement appear to be the most challenging, a result that is in line with the previous literature.
To assess the effect of grammatical structure on accuracy, a logit mixed-effect model analysis (Jaeger, 2008) was performed with the package lme4 (Bates et al., 2014) in RStudio (RStudio Team, 2020). Compared to the traditional non-parametric/parametric tests adopted in previous studies, mixed-effect models have the advantage of accounting for by-subject and by-item variability through the definition of a random effect structure. The random effect structure of the model adopted for this study included varying intercepts and slopes for the Item and Subject grouping factors. The fixed effects were contrast-coded according to our hypotheses (Schad et al., 2020; see also Brehm and Alday, 2022), by using the stats package (R Core Team, 2021). In particular, the Helmert contrast was adopted. This contrast allowed us to compare the average accuracy of the easier canonical structure (SVO) with the average accuracy of the two more challenging grammatical constructions involving movement (i.e., clitic and passive sentences), and to compare the two complex structures with each other. The model thus included the following fixed effects: Movement (Passive/Clitic sentences vs. SVO), Complex structures (Clitic vs. Passive sentences), centered Age, as well as the interaction between Age and the two structure-related contrasts. The output of the mixed-effect model analysis is reported in Table 3 and reveals significant effects of Age and Movement, and a significant Movement:Age interaction. No significant effects were found when comparing accuracy in the comprehension of the two most complex structures (i.e., Passives vs. Clitics).
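To make the contrast coding concrete, the Python sketch below shows one possible Helmert-style scheme for the three constructions. The weights are an assumption chosen so that the two coefficients read directly as "movement structures vs. SVO" and "Clitic vs. Passive"; the paper does not report the exact weights, and the authors' own analysis was run in R. The cell means are made-up values on an arbitrary scale, used only to check that the coding recovers the intended comparisons.

```python
from typing import Dict, List, Tuple

# Assumed contrast weights (each column sums to zero; columns are orthogonal).
# Tuple order: (Movement, Complex)
CONTRASTS: Dict[str, Tuple[float, float]] = {
    "SVO":     (-2 / 3,  0.0),
    "Passive": ( 1 / 3, -0.5),
    "Clitic":  ( 1 / 3,  0.5),
}


def contrast_estimates(cell_means: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (intercept, Movement, Complex) implied by the three cell means."""
    intercept = sum(cell_means.values()) / 3                      # grand mean
    movement = (cell_means["Passive"] + cell_means["Clitic"]) / 2 - cell_means["SVO"]
    complex_ = cell_means["Clitic"] - cell_means["Passive"]
    return intercept, movement, complex_


def predicted_mean(structure: str, coefs: Tuple[float, float, float]) -> float:
    """Rebuild a cell mean from the coefficients and the contrast weights."""
    intercept, movement, complex_ = coefs
    w_move, w_complex = CONTRASTS[structure]
    return intercept + w_move * movement + w_complex * complex_


def center(values: List[float]) -> List[float]:
    """Mean-center a predictor such as Age."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical cell means (not results from the study).
means = {"SVO": 2.0, "Passive": 1.0, "Clitic": 1.2}
coefs = contrast_estimates(means)
for structure in means:
    print(structure, round(predicted_mean(structure, coefs), 6))
```

With this coding the intercept is the grand mean, so the Age effect is evaluated at the average of the three constructions rather than at a reference level.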
Relation between working memory and sentence comprehension
The potential relation between working memory and sentence comprehension was addressed by running correlation analyses and linear mixed-effect model analyses. The analyses were conducted on 961 participants (see Table 4) since digit recall scores of 35 participants were missing.
The strength of the relationship between working memory (i.e., forward and backward digit recall scores) and sentence comprehension, while controlling for the effect of age, was assessed with a series of partial correlations (see Table 5). Such correlations were performed in R through the ppcor package, using the default Pearson method to calculate partial correlation coefficients (Kim, 2015). These analyses showed a positive correlation between comprehension accuracy and both forward and backward digit scores for all three structures, supporting the existence of a significant relation between both Pho-STM and Verb-WM and the comprehension of sentences without (i.e., SVO) and with movement (i.e., clitics and passives).
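For readers unfamiliar with partial correlations, the following Python sketch computes a first-order Pearson partial correlation from the three pairwise correlations, which is the quantity obtained when a single covariate (here, age) is controlled for. It is a didactic re-implementation, not the authors' R code, and the toy data are constructed by us so that the association between the two measures is carried entirely by age.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): both measures grow with age and are otherwise unrelated.
age = [4, 5, 6, 7, 8, 9]
digit_span = [5, 4, 6, 7, 7, 10]
comprehension = [5, 6, 4, 5, 9, 10]

print("raw r:", pearson(digit_span, comprehension))
print("partial r (age controlled):", partial_corr(digit_span, comprehension, age))
```

In this constructed example the raw correlation is sizeable while the partial correlation is essentially zero, which is the kind of age-driven confound the partial correlations are meant to rule out.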
To further assess the impact of working memory on the acquisition of different syntactic structures, we added forward and backward span values to our mixed-effect model analysis, which was performed with the same packages used for the first analysis. The main effect of memory and its interaction with sentence types were tested in a (forward) stepwise fashion. The models were compared by using ANOVAs. The best-fitting models for forward memory and backward memory are reported in Tables 6 and 7, respectively. The output of the model selection analysis is reported in Appendix B. Table 6 shows that, in addition to the main effects of grammatical structure and age and their interactions, the forward digit recall span also significantly increased the probability of providing the correct answer. Results are similar for the backward digit recall span, which significantly increased the probability of providing the correct answer (see Table 7). Among children of the same age, those with higher forward or backward digit recall values provided the correct choice more often than those with lower values.
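One step of such a forward model comparison can be sketched as a likelihood-ratio test between two nested models. The Python snippet below assumes that the comparison adds a single parameter (for instance, a main effect of digit span), so that the test statistic is referred to a chi-square distribution with one degree of freedom; whether the authors' comparisons took exactly this form is an assumption on our part, and the log-likelihood values are invented for illustration.

```python
import math
from typing import Tuple


def likelihood_ratio_test_1df(loglik_reduced: float, loglik_full: float) -> Tuple[float, float]:
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns (chi-square statistic, p-value). For one degree of freedom the
    chi-square upper tail equals erfc(sqrt(statistic / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods: model without vs. with a digit-span predictor.
stat, p = likelihood_ratio_test_1df(loglik_reduced=-4210.5, loglik_full=-4205.2)
print(f"chi2(1) = {stat:.2f}, p = {p:.4f}")

# In a forward stepwise procedure, the added term is kept only if p falls
# below the chosen threshold; otherwise the simpler model is retained.
keep_term = p < 0.05
```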
Discussion
From the analysis of our database of children between 4 and 10 years old, two major results can be reported. First, the acquisition of grammatical structures progresses along different trajectories, and the presence (or absence) of dislocations predicts their growth curves. Second, sentence comprehension in children is also related to working memory resources, with a generalized effect across the sentence types considered here. We discuss these two results in light of previous developmental studies.
The first finding supports the idea that different grammatical structures are not acquired at the same pace and that movement can provide an explanation for this developmental difference. This hypothesis had already been expressed in slightly different terms in Jakubowicz (2011) and Moscati and Rizzi (2014, 2021), adapted to the specific features of wh-movement in French or agreement in Italian. Overall, the data coming from wh-movement, agreement configurations, and direct-object fronting provide a coherent picture: that of a movement-based dimension of grammatical complexity.
In relation to this first result, of note is the fact that the comprehension of clitic and passive sentences did not show a distinct pattern among participants at any age. This trend is shown in Figure 4 and is confirmed by the general mixed-effect model analysis. Due to the nature of our dataset, we cannot exclude that differences indeed may exist between the two constructions and that they can be brought to light by targeted experimental manipulations. Several alternative theoretical notions of movement have been proposed to qualify cliticization or passivation, and endorsing just one or some of them goes beyond the opportunities offered by our dataset. In this sense, datasets used for standardizations provide a formidable resource in terms of sample size and the variety of structures that can be contrasted, not to mention the potential for cross-linguistic comparisons. The tradeoff is that they are designed to evaluate a broad spectrum of linguistic dimensions so that the quest for variety hampers the creation of controlled minimal pairs: the distinctive feature of lab-based experiments. Being aware of the limitations of our tool, for the time being, we take our results at face value and assume that the development of clitics and passives is contingent upon the availability of the general movement step sketched in (4). From this point of view, our study is the first one that explicitly compared and found strong parallelism in the comprehension of clitics and passives.
This can be considered in relation to the only other study that, to the best of our knowledge, has discussed the relationship between clitics and passives in Italian: the production experiment reported in Manetti (2013). This study showed that approximately by four years of age, Italian-speaking children avoid the production of passives in contexts where adults may find them appropriate. In that study, children often used a different answering strategy, using a clitic instead of a passive. Manetti’s results show that, in production, the clitic option is ranked above the passive one, with the implication that the development of the two structures is asymmetric. Clearly, there are important differences between Manetti’s study and ours, leading to more general considerations on the divergence between production and comprehension. For example, production may be more severely affected by performance limitations that may obscure some truly grammatical options. It is conceivable that the availability of movement unlocks both passivation and cliticization, but that additional limitations in production only apply to passives. For example, issues related to auxiliary selection may delay the production of passives. On the other hand, comprehension targets the structures of interest more directly and the constructions under investigation are generally provided in the prompt, already formed. A forced-choice comprehension task might thus reveal substantial parallelism in the grammatical competence of passives and clitics, avoiding complications that would occur downstream in production.
Turning to our second result, we found that the comprehension of sentences either with or without movement (i.e., SVO vs. Clitics and Passives) correlated with measures of both Pho-STM and Verb-WM. We maintained previous assumptions under which serial and composite span tasks (i.e., forward digit recall and backward digit recall, respectively) are the expressions of the working capacities of the two subsystems. Previous results diverge on the conclusions that can be drawn regarding the role of working memory in supporting complex sentence comprehension, and this could be due to at least two reasons. The first, already discussed, is related to the different grammatical structures that fall under the definition of “complex.” Here we adopted an explicit notion of grammatical complexity based on movement. In doing so, sentences with clausal embedding have been filtered out from the dataset. This allowed us to eliminate at least one of the confounds pointed out by Boyle, Lindell and Kidd (2013).
A second reason for the differences among previous studies may be related to the characteristics of the participant samples, including sample size. For example, Montgomery, Magimairaj, and O’Malley (2008) did not find any significant effect of memory measures on sentence comprehension in a sample of 51 monolingual children, scattered across an age interval between 6 and 12 years of age. Engel de Abreu et al. (2011) found instead that working memory had a significant impact on comprehension, looking at a larger group of 109 six-year-old bilingual children. Apart from the important distinction between monolinguals and bilinguals, the number of participants in Engel de Abreu et al.’s study was more than double that of Montgomery et al.’s sample. Therefore, we argue that the effects of working memory might go undetected if the study’s statistical power is limited by a low number of participants across wide age ranges, as this may lead to high variability (as in Montgomery et al., 2008). In our study, the sample was between 10 and 20 times larger than the ones considered in previous investigations, and we found significant working memory effects. We believe that this is one of the main advantages of exploiting large, standardized datasets.
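The sample-size argument can be made concrete with a rough power calculation. The Python sketch below uses the Fisher z approximation to estimate the two-sided power for detecting a correlation at the two sample sizes mentioned above. The true correlation of 0.25 is an assumed value chosen only for illustration, the function name is hypothetical, and the sketch ignores the additional variability introduced by wide age ranges.

```python
import math
from statistics import NormalDist

def correlation_power(r, n, alpha=0.05):
    """Approximate two-sided power to detect a true correlation r with n cases."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Fisher z-transformed effect, expressed in standard-error units.
    delta = math.atanh(r) * math.sqrt(n - 3)
    return norm.cdf(delta - z_crit) + norm.cdf(-delta - z_crit)

# r = 0.25 is an assumed effect size, not an estimate from any of the studies.
for n in (51, 109):
    print(n, round(correlation_power(0.25, n), 2))
```

Under these assumptions, the smaller sample has a noticeably lower chance of detecting the same underlying effect, which is consistent with the interpretation offered here.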
Another open question concerns the non-selectivity of working memory on the comprehension of simple and complex sentences: both simple SVO sentences and Clitics/Passives correlated with higher backward and forward digit recall scores. On one hand, this confirms that the Central Executive and the Phonological loop play a role in sentence comprehension; on the other, it suggests that neither of the two subsystems provides resources that are directly involved in the resolution of distance relations. Several models of sentence comprehension, for example the ones in Gibson (2000) and Lewis and Vasishth (2005), have assumed that working memory provides crucial support for the temporary storage of long-distance dependencies. Under this view, we would have expected an interaction between working memory and clause types. In our opinion, there are at least two possibilities that need to be considered in relation to the lack of interactions.
The first is that passives and clitics are relatively light in terms of working memory requirements, so that the effects of working memory would be almost indistinguishable from those found for SVO matrix clauses. If this is correct, in the absence of more sensitive tasks or WM measures, it would be appropriate to increase the memory load and look at more complex sentences, for example, object relative clauses. Here, working memory may be more heavily involved than in subject relatives, due to the greater length of the dependency and the potential emergence of intervention effects. In fact, Arosio et al. (2011) showed that, in Italian 9-year-olds, D-span measures modulate the comprehension of object, but not subject, relatives.
This first line of reasoning is essentially based on the quantity of the working memory resources involved. There is, however, a second possibility, based on their quality. Memory tasks such as backward and forward digit recall may be only indirectly related to sentence processing: processing relations between constituents is arguably different from repeating sequences of digits. In this respect, our study has shown that the phonological loop and the central executive support language acquisition in general. Reiterating sentences in the Phonological Loop and maintaining them active in the Verbal WM may arguably enlarge their processing window. However, this kind of working memory may not fully overlap with other kinds of mnemonic resources that could more directly support the elaboration of movement dependencies. We must therefore leave open the possibility that other syntactic working memory resources would better correlate with grammatical complexities associated with long-distance dependencies.
Conclusions
By analyzing the data collected through the standardization of grammatical tests and their use as screening tools for language-related developmental disorders, we started exploiting a new data source that is potentially available in many languages. Moving from Italian, we showed that these large datasets provide valuable information for tracing structure-specific developmental trends, as in the comparison between SVO declaratives and constructions with the dislocation of the internal argument, such as clitics and passives. In our view, this creates a virtuous cycle, from language development research to clinical practice, and back. Data on syntactic structures that were initially included in the test batteries on the basis of previous research in language development are later analyzed so as to inform current research with new evidence coming from large-scale samplings. Results from our analysis provide new support to movement-based complexity metrics proposed in recent years (Moscati and Rizzi, 2014; Moscati et al., 2020).
In addition to this, finer-grained analyses that go beyond total scores offer a better vantage point to shed light on the relation between the operations that underlie grammatical development and other cognitive abilities. In line with previous results, our study supports the impact of working memory on comprehension tasks. However, it shows that the kind of working memory resources measured through forward and backward digit recall does not offer a greater advantage in processing long-distance dependencies created by movement. This opens the possibility that such measures cannot accurately quantify a different kind of working memory, which we can provisionally label syntactic working memory, that more directly subserves syntactic computation.
A final remark concerns clinical practice. New analyses per grammatical structure, which go beyond total scores, might provide practitioners with construction-specific reference levels, thus improving the precision of the screening tools. This has important implications for language assessment. The descriptive depth of the test batteries could be enhanced with structure-specific acquisitional trends, adding more refined baselines for the evaluation of possibly distinct subgroups of children with DLDs.
Replication package
The data and the R analysis code for this study are openly available for download at the following OSF link: https://osf.io/tn23r.
Competing interests
The authors declare no conflict of interest.
Ethics statement
This study has been approved by the ethical committee of the Scientific Institute “IRCCS Eugenio Medea” in conformity with the Declaration of Helsinki.
Appendix A. List of the items extracted from the Grammatical Comprehension task to assess age-related effects on the comprehension of SVO, passive, and clitic sentences
Appendix B. Output of the stepwise model selection analysis for memory effects, through the anova function.
Memory forward
m0: Accuracy ∼ 1 + (Complex_Structures + Movement) * Age + (Complex_Structures + Movement | Subj) + (1 | Item)
mf1: Accuracy ∼ 1 + (Complex_Structures + Movement) * Age + Forward + (Complex_Structures + Movement | Subj) + (1 | Item)
mf2: Accuracy ∼ 1 + (Complex_Structures + Movement) * Age + Forward + Forward:(Complex_Structures + Movement) + (Complex_Structures + Movement | Subj) + (1 | Item)
Memory backward
m0: Accuracy ∼ 1 + (Complex_Structures + Movement) * Age + (Complex_Structures + Movement | Subj) + (1 | Item)
mb1: Accuracy ∼ 1 + (Complex_Structures + Movement) * Age + Backward + (Complex_Structures + Movement | Subj) + (1 | Item)
mb2: Accuracy ∼ 1 + (Complex_Structures + Movement) * Age + Backward + Backward:(Complex_Structures + Movement) + (Complex_Structures + Movement | Subj) + (1 | Item)
Appendix C