1. Introduction
Bilingual children’s language production abilities have been investigated from different perspectives over the last 20 years. However, many previous studies are based on an (unintended) normativity bias. That is, more often than not they report group comparisons between monolinguals and bilinguals, with monolinguals serving as the ‘control group’ (Rothman et al., 2023). As a result, any difference between bilinguals and monolinguals is interpreted in terms of a bilingual effect (both in typically developing and in non-typically developing samples, e.g., Hambly et al., 2013; Serratrice & Hervé, 2015; Vender et al., 2018). This kind of analysis hides that monolinguals may exhibit the same variation as observed among bilinguals in certain aspects of language production and also hides that some (sub-groups of) bilinguals may behave similarly to monolinguals. The same argument could be applied to language comprehension.
In recent years, more and more researchers have emphasised the individual variation that can be found amongst bi- or multilingual adults and children, focussing, for example, on the development of questionnaires to capture this variation (De Cat et al., 2023; Wigdorowitz et al., 2022), arguing for embracing heterogeneity in bilingualism (Prévost & Tuller, 2021) and suggesting methodological alternatives (Rothman et al., 2023). Although some of these suggested alternatives still include group-based comparisons, for example, between groups of heritage language speakers, the first alternative Rothman et al. (2023) propose is deemed most interesting for the current investigation: having no control/comparison group. In light of this, the present study aims to introduce the concept of a ‘profile’: Speakers may exhibit similar or different behaviours in language production, independently of whether they belong to a specific group (e.g., monolinguals vs. bilinguals) or their language production adheres to some criteria of appropriateness, which is in turn typically based on monolingual samples.
In the current study, we will focus specifically on the production of referring expressions (REs, henceforth), and therefore investigate ‘reference profiles’. We operationalise this concept in terms of groups of speakers with similar reference-production behaviour, as identified here by means of a cluster analysis. The data-driven, ‘bottom-up approach’ used in this study allows us to identify clusters of speakers based on their reference-production behaviour without the need for a control/comparison group. As an additional advantage, this methodological approach allows us to investigate multiple morphosyntactic forms and discourse functions at once, which is rare in research on reference production. Once clusters have been identified, we will examine to what extent the speakers in a cluster share similar cognitive profiles (based on measures of executive function [EF] and theory of mind [ToM]) and language-experience profiles. This way, we aim to shed new light on the relationship between reference production and linguistic and cognitive variables. In the following sections, we will discuss in turn the use of REs in discourse, the role of cognitive abilities in RE production, and RE production in bilinguals.
1.1. Referring expressions and their discourse functions
Reference production is a complex task that requires keeping track of discourse referents and selecting discourse-appropriate REs to refer to them. Whether a RE is appropriate depends on the status of the corresponding referent in discourse. For example, referents that are maintained in discourse tend to be more accessible (or prominent, see von Heusinger & Schumacher, 2019) than referents that are newly introduced or reintroduced in discourse after a hiatus. At the level of linguistic encoding, more accessible referents can be referred to with more reduced REs (e.g., pronouns or clitics), whereas less accessible ones may require (indefinite or definite) full nouns (Footnote 1) to be resolved correctly by a listener (Ariel, 1991). The idea that REs mark the degree of accessibility of the corresponding referent is captured by the accessibility marking hierarchy (Figure 1).
It should be noted that the inventory of REs in the accessibility marking hierarchy varies from language to language. In this contribution, we deal with Greek and Italian. They are both null-subject languages: Null pronouns encode the highest degrees of accessibility. They are in complementary distribution with clitic pronouns, which also encode the highest degree of accessibility but are used in (direct or indirect) object position, whereas null pronouns are used in subject position (Torregrossa et al., 2015, 2020). Full pronouns, that is, non-reduced pronouns, are more explicit than null pronouns and clitic pronouns and can occur both in subject and object position. In both Greek and Italian, definite and indefinite full nouns encode the lowest degrees of accessibility. A notable aspect in which Greek and Italian differ is complement clauses. That is, whereas Italian has nonfinite complement clauses, these are absent in Greek. This is shown by the contrast between (1) and (2): While the verb in the complement clause in (1) is finite, the verb in the complement clause in (2) is nonfinite.
Importantly, an individual’s utterances may not always perfectly adhere to the accessibility marking hierarchy. For example, a RE could be overspecified if an explicit form (e.g., a full noun) is produced to refer to a referent which is high in accessibility (Koolen et al., 2011): The use of ‘Elsi’ in the third clause in (3), adapted from Torregrossa et al. (2021), is redundant, given that Elsi is established as the discourse topic in the first clause and is the subject of the second clause. Alternatively, a RE can also be underspecified, corresponding to a form that is not informative enough to clearly describe a referent which is low in accessibility. This would be the case of the use of ‘she’ in (4) to refer to Sarah, who is neither the discourse topic nor the subject of the second clause.
In the present study, we will examine to what extent speakers differ from each other in how they map discourse functions onto the use of REs. The accessibility marking hierarchy will serve as an indication of (universal) pragmatic appropriateness, but we will not go as far as to (numerically) compare bilingual children’s productions to any norms or control group (based on the considerations in Section 1).
1.2. Role of cognitive abilities in reference production
The production of REs is a complex task from both the linguistic and cognitive point of view (Hendriks, 2016). A referent’s discourse accessibility corresponds to a mental representation of this referent in the speaker’s mind. The task of the speaker is to map this mental representation onto the use of a RE (Kibrik, 2011; Torregrossa et al., 2019). Importantly, the referent’s mental representation interacts with the speakers’ individual cognitive profiles, for example, EFs and ToM (Hendriks, 2016). This interaction motivates (at least part of) the observed inter-individual variation in reference production. Most of the studies considered in this section concern children. If a study refers to adult speakers, we will mention this explicitly.
EFs are a multifaceted construct consisting of several components (Miyake et al., 2000), including ‘inhibition of prepotent responses, shifting mental sets, monitoring and regulating performance, updating task demands, goal maintenance, planning, working memory, and cognitive flexibility, among others’ (McCabe et al., 2010, p. 222). Updating refers to an individual’s ability to remember a previous stimulus, link a current stimulus to a previous one and continuously integrate ‘new’ information into ‘old’ information (Carriedo et al., 2016). In reference production, referents should be kept in memory as discourse unfolds (Swets et al., 2007; Van Rij et al., 2013; Vogels et al., 2015) and a referent’s current mention needs to be linked to its previous mention while new information about this referent is continuously integrated. Research has shown that children with better updating abilities use more appropriate references (Whitely & Colozzo, 2013).
Some studies have specifically found a relationship between limited updating abilities and the use of overspecified REs in bilingual children (e.g., Torregrossa et al., 2021). Torregrossa and colleagues interpreted this use of overspecified REs in terms of speakers’ reduced ability to link a referent’s current mention to its previous mention under limited updating abilities. This would lead to the production of REs encoding low degrees of accessibility in contexts that would allow for the use of more reduced REs (De Cat, 2015).
The use of overspecified REs may also be related to speakers’ limited ability to keep referents in memory, which is also one of the functions of updating. As a result, the accessibility of a referent decays more rapidly than when the speaker has a high working memory (WM) capacity (Hendriks & Vogelzang, 2020; see Vogelzang et al., 2021, for a computational model related to the comprehension of null pronouns). This low accessibility would correspond to the use of explicit REs (Rosa & Arnold, 2011, on adult speakers), in compliance with the accessibility marking hierarchy described in Section 1.1. Alternatively, however, if the speaker struggles to keep referents in memory and therefore fails to retrieve the full noun, an underspecified form may be used instead.
EFs are also related to sustained attention (‘goal maintenance’ in the quote by McCabe et al., 2010, above), intended as the ability to maintain attention to a stimulus (a referent) for an extended period of time (Fisher & Kloos, 2016). In the domain of reference production, less sustained attention may mean that a referent is less accessible in a speaker’s mind (its accessibility fades away more easily than when attention is high). This would lead to a similar prediction as with low WM capacity, namely the use of overspecified REs.
All the above-mentioned studies investigated how variation in EFs affects the production of REs in referent maintenance contexts. However, different patterns may be observed when reintroducing a referent. For example, Van Rij et al. (2013) show that adult speakers with lower WM capacity tend to use underspecified REs when reintroducing a referent; they may use a pronoun to refer to an antecedent in object position, although pronouns tend to refer to subject antecedents (see also Serratrice & De Cat, 2020, for similar results with bilingual children tested in their societal language, English). This is because under limited WM capacity, speakers’ mental representations of a referent’s accessibility are driven more by factors such as recency of mention than by the actual computation of discourse cues (e.g., syntactic position of the antecedent; see Almor et al., 1999; Van Rij et al., 2013). These results show that variation in EFs may affect reference production in different ways, depending on the discourse function to be expressed. Finally, it should be noted that some other studies have found no effects of WM on pronoun production or interpretation (for adults, see Arnold et al., 2018; for children, Kuijper et al., 2021).
Another relevant component of EFs is shifting mental sets, which refers to the ability to switch behavioural responses according to a certain context (Diamond, 2013). In the domain of reference production, this may correspond to the use of variable form–function relationships. For example, Bamberg (1987) has shown that young children often stick to a thematic subject strategy when telling narratives, whereby they tend to use reduced REs only in association with the main character. This may be due to their still developing cognitive flexibility. Cognitive flexibility may also be involved in the mapping between discourse functions and syntactic positions. For example, some children may tend to maintain reference in subject position, whereas others may maintain reference in both subject and object position, which leads to ‘more “dynamic” patterns of switches (from subject to non-subject and vice versa)’ (Bongartz & Torregrossa, 2020, p. 255).
The considerations discussed thus far concern how cognitive variables interact with the representation of a referent’s accessibility in the speaker’s mind. However, reference production is also a pragmatic process whereby a speaker selects a form that can be recovered (i.e., interpreted correctly) by the listener. ToM may be crucially involved in this process. ToM is a broad cognitive construct comprising several sub-skills, including the ability to grasp that others’ beliefs and mental representations may not be the same as one’s own. This ability to keep track of others’ beliefs and mental states is related to reference production: speakers need to know what listeners think/know when determining whether a RE will be understood correctly. Children with developing ToM may take a more self-centred perspective (Hendriks, 2016). Therefore, development of ToM abilities is predicted to be associated with a decrease in the use of underspecified pronouns. Indeed, ToM has been found to be positively associated with the use of full nouns for referents with low accessibility (Kuijper et al., 2015, on production; Kuijper et al., 2021, on comprehension).
The present study will take into account all of these different types of cognitive abilities (updating, cognitive flexibility, ToM) and investigate how they relate to bilingual reference production.
1.3. Bilingualism and reference production
We have thus far established that reference production involves the interaction between cognitive and linguistic factors. Therefore, it is particularly interesting to analyse it from the perspective of bilingual language acquisition, given that bilingual children may exhibit a variety of linguistic profiles both intra-individually – since, for instance, they may have a different competence in each of their languages – and inter-individually – since, for instance, they may differ from each other in their competence in a language.
Indeed, the production of REs varies both within and across bilingual speakers (Montrul, 2004). Both overspecification and underspecification have been attested (Serratrice & Hervé, 2015; Sorace & Serratrice, 2009). Torregrossa et al. (2019, 2021) have argued that the type of reference-production pattern observed is related to the bilingual profile of the participants in terms of language exposure, with overspecification being associated with unbalanced bilinguals tested in the weaker language, and underspecification with balanced bilinguals. Contemori et al. (2024) report similar results in adult bilinguals: overspecified REs were produced in the non-dominant language (Spanish) and underspecified REs in the dominant one (English).
Torregrossa et al. (2019, 2021) proposed a processing account for this phenomenon: overspecification emerges if the syntactic options available in a language (i.e., functional categories such as pronouns) are not proceduralised. As a result, speakers rely on a pragmatic strategy consisting of the repetition of a full noun. Underspecification is shown to be an effect of reduced speed in lexical retrieval. For example, an underspecified null pronoun is produced if children are slower in retrieving the (contextually more appropriate) full noun.
Other studies have proposed a representational account of bilingual children’s use of REs. For example, several studies have observed that bilingual children speaking a combination of a null-subject, clitic language (such as Italian) and a non-null-subject, non-clitic language (such as English) tend to produce full pronouns in subject and object position in the null-subject, clitic language in contexts in which the use of null pronouns or clitics would have been more appropriate (Serratrice et al., 2004; Sorace & Serratrice, 2009; Tsimpli & Sorace, 2006). This has been interpreted in terms of cross-linguistic effects from the non-null-subject language to the null-subject one (Torregrossa & Bongartz, 2018), although similar effects can occur when children speak two null-subject languages (e.g., Sorace et al., 2009). Mishina-Mori et al. (2024) analysed reference production by English–Japanese bilingual children, with Japanese being a null-subject language and English a non-null-subject one. Interestingly, they observed the overproduction of full noun phrases in Japanese only in referent reintroduction contexts and concluded that only discourse contexts requiring more processing resources (i.e., referent reintroduction vs. referent maintenance, according to their analysis) are vulnerable to cross-linguistic effects. Cross-linguistic effects have also been found in the production of REs in Italian by Italian–Greek bilingual children. These children used more null pronouns for referent reintroduction than Italian monolinguals, arguably due to cross-linguistic influence from Greek to Italian (Andreou et al., 2023), given that Greek allows for the use of null subjects in referent reintroduction contexts to a greater extent than Italian (Torregrossa et al., 2020).
In addition to cross-linguistic effects, representational accounts may also examine to what extent bilingual reference production is affected by development. In other words, bilinguals’ referencing abilities may still be developing as in the case of monolinguals or with a slight delay compared to monolinguals. For example, Serratrice and De Cat (2020) found that bilingual children’s tendency to use underspecified REs decreased with increasing WM and language proficiency in English, with both variables strongly correlating with age. This suggests that bilingual children follow the same path of reference acquisition as their monolingual peers. Some of the observed differences in bilingual pronoun production have been attributed to cross-linguistic influence. Because cognitive variables, rather than cross-linguistic influence, are the main focus of the current study, we will investigate two languages with similar, although not identical, reference systems: Greek and Italian (Section 1.1).
Overall, all the studies reviewed in this section that are based on the comparison between monolinguals and bilinguals report quantitative differences between the two groups. However, to the best of our knowledge, no study so far has shown any consistent or systematic violation of the accessibility marking hierarchy by bilinguals, which suggests that bilinguals are in general aware of the pragmatic principles underlying the use of REs in discourse in their two languages (Flores & Rinke, 2020; Torregrossa et al., 2019).
Another important caveat is that most of the studies above have analysed a limited set of REs (e.g., null pronouns, overt subject pronouns or full nouns) within one reference function (introduction, maintenance or reintroduction), without considering which types of reference functions are expressed by other types of REs in the system. There is a gap in the literature when it comes to empirically examining bilinguals’ reference management across forms and across functions. Besides not needing a control/comparison group, the cluster analysis we will apply also has the advantage of being able to take into account different REs and reference functions at the same time.
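As a rough illustration of how forms and functions can be considered jointly, the sketch below represents each speaker as a vector of proportions of RE forms within each discourse function and then groups speakers by average-linkage agglomerative clustering. The form labels, the toy data, the use of Euclidean distance and the choice of clustering algorithm are all assumptions made for illustration; they are not taken from the analysis reported in this study.

```python
from collections import Counter
from itertools import combinations
from math import dist

FUNCTIONS = ("introduction", "maintenance", "reintroduction")
# Hypothetical coding scheme for RE forms (illustrative only).
FORMS = ("null", "clitic", "full_pronoun", "full_noun")


def profile_vector(mentions):
    """Convert one speaker's coded mentions [(function, form), ...] into
    the proportion of each form within each discourse function."""
    counts = Counter(mentions)
    vector = []
    for function in FUNCTIONS:
        total = sum(counts[(function, form)] for form in FORMS)
        for form in FORMS:
            vector.append(counts[(function, form)] / total if total else 0.0)
    return vector


def average_linkage(cluster_a, cluster_b, vectors):
    """Mean Euclidean distance between all cross-cluster speaker pairs."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


def cluster_speakers(vectors, n_clusters):
    """Start with one cluster per speaker and repeatedly merge the two
    closest clusters until n_clusters remain."""
    clusters = [[speaker] for speaker in vectors]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], vectors),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Toy data (invented): two speakers who mostly use null pronouns for
# maintenance and two who mostly repeat full nouns.
toy = {
    "child_A": [("maintenance", "null")] * 8 + [("maintenance", "full_noun")] * 2,
    "child_B": [("maintenance", "null")] * 7 + [("maintenance", "full_noun")] * 3,
    "child_C": [("maintenance", "null")] * 2 + [("maintenance", "full_noun")] * 8,
    "child_D": [("maintenance", "null")] * 3 + [("maintenance", "full_noun")] * 7,
}
vectors = {name: profile_vector(mentions) for name, mentions in toy.items()}
profiles = cluster_speakers(vectors, n_clusters=2)
print(profiles)
```

No group label (monolingual, bilingual, dominant language) enters the computation at any point, which is the sense in which such an approach needs no control/comparison group.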
2. The present study
Although first attempts at examining correlations between reference production and cognitive abilities have been made (Section 1.2), as of yet no clear profiles of bilingual reference production have been identified. In this study, we aim to identify distinct profiles of use of REs by bilinguals and link these to children’s language background and cognitive abilities. In order to provide a comprehensive overview of bilingual children’s abilities with regard to reference management, we have formulated three research questions, which will be investigated by means of a narrative retelling task in Greek and Italian.
• RQ1. Does bilingual children’s reference production adhere to the accessibility marking hierarchy in both of their languages?
To this end, we distinguish three discourse functions (following Bamberg, 1987; Fichman & Altman, 2019; Serratrice, 2007; Whitely & Colozzo, 2013): 1) introduction of new referents; 2) maintenance of reference and 3) reintroduction of a previously mentioned referent. We expect bilingual children to produce pragmatically sensible REs in both of their languages (Flores & Rinke, 2020; Torregrossa et al., 2019; see also the considerations at the end of Section 1.3), and thus adhere to the accessibility hierarchy. Because this is the premise for the later research questions, it needs to be established first.
• RQ2. Can distinct profiles of reference management in bilinguals be identified?
Even if children adhere to the accessibility marking hierarchy in both of their languages, there may still be inter-individual variation, especially because REs are flexible and oftentimes there is no one ‘correct’ form (Section 1.1). We will run exploratory cluster analyses to examine whether distinct profiles of reference management can be identified. This data-driven, bottom-up approach allows us to avoid examining children’s reference production based on norms or comparisons with a control group (in line with suggestions by Rothman et al., 2023).
• RQ3. How do these profiles relate to children’s language background and cognitive abilities?
If distinct profiles of reference management can be identified, these will be examined further in relation to children’s language background and cognitive abilities. Based on the background literature, we make different predictions for referent maintenance and referent reintroduction. For referent introduction, no specific predictions emerged from the literature, so the analyses will be exploratory. For referent maintenance, we predict children with more limited updating and sustained attention abilities to use explicit forms (e.g., full nouns) in contexts that would allow for the use of reduced forms (e.g., null pronouns or clitics) (in line with Torregrossa et al., 2021; see also De Cat, 2015). However, note that some effects in the opposite direction have been found in that production of underspecified REs increases with lower WM (Serratrice & De Cat, 2020).
For referent reintroduction, we predict children with lower updating abilities to use more reduced forms in contexts where the use of explicit forms would be more appropriate (in line with Lehmkuhle & Lindgren, 2024; Serratrice & De Cat, 2020; Van Rij et al., 2013). Furthermore, we predict children with lower cognitive flexibility to stick to certain referencing strategies to a greater extent than children with higher cognitive flexibility, for instance, associating the expression of certain discourse functions with specific syntactic positions (Bamberg, 1987; Bongartz & Torregrossa, 2020). Finally, we predict children with lower ToM abilities to use underspecified reduced forms (Kuijper et al., 2015). These tendencies should be observed in both of the bilinguals’ languages.
3. Methods
3.1. Participants
Thirty-seven children (mean age = 9;4, range 7;10–11;6; two dates of birth unknown; see Footnote 2) attending an Italian school in Athens (Greece) participated in the research. In the school, Italian was the main medium of instruction for both content and language learning, with a total of 24 h of instruction per week. Greek was taught as an additional language for about 5 h per week. Besides attending this school, no additional recruitment/inclusion/exclusion criteria were used. Both the parents and the teachers reported that none of the participants had previously identified speech, hearing or visual impairments. Most of the children (n = 20) were simultaneous bilinguals, exposed to Greek and Italian from birth; 15 children were successive bilinguals: six first exposed to Italian from age 3 (upon entering kindergarten), seven first exposed to Italian from age 6 (upon entering primary school) and two first exposed to Greek from age 6. We did not receive the questionnaire from two parents. Three additional children participated but did not complete the narrative retelling task in both of their languages and were therefore excluded from the analyses. Before conducting the study, the parents provided written informed consent and we reminded the children that they did not have to take part in the activity if they did not want to.
3.2. Tasks
The children performed a narrative retelling task in both Greek and Italian, as well as vocabulary (to establish language dominance), Wisconsin Card Sorting, ToM and 2-back tasks. Examples of the materials used in each task can be found in the Supplementary Material.
The narrative retelling task was the Edmonton Narrative Norms Instrument (ENNI; Schneider et al., 2005), administered in both Greek and Italian. The task contains a set of 13 pictures that present a story about four characters (stories ‘A3 – airplane’ and ‘B3 – balloon’ were used). Children listened to a recording of the story and were then asked to retell it. The retelling mode was used in order to make sure that all children understood the story plot. Otwinowska et al. (2020) showed that the retelling mode benefits the complexity of the story structure but does not affect children’s use of specific structures (e.g., specific REs).
The children completed an expressive vocabulary task in both languages (Greek and Italian). The lexical items were drawn from the Word Finding Vocabulary Test (Renfrew, 1995). We administered the task as a series of 50 pictures of objects (taken from the internet, see Supplementary Material) appearing on a PowerPoint presentation. The children got one point for each object that was named correctly either right after the presentation of the image or after a semantic cue was provided (using the same cues for all participants). We assigned 0.5 points to those items that were correctly named after a phonemic cue was provided (i.e., the first syllable of the corresponding word) (Greek mean = 35.6, range = 7–45; Italian mean = 34.3, range = 20–48). Based on these scores, each child’s dominant language was identified as the language in which the child scored higher: this language was used as the language of administration for the cognitive tests described below (25 children were dominant in Greek and 12 children were dominant in Italian).
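The scoring rule and the dominance criterion described above can be sketched as follows. The response labels are hypothetical names chosen for this illustration, and the handling of tied scores is an assumption, since ties are not discussed in the text.

```python
# Points per response type, following the scoring rule described above.
# The label strings themselves are hypothetical.
POINTS = {
    "correct": 1.0,
    "correct_after_semantic_cue": 1.0,
    "correct_after_phonemic_cue": 0.5,
    "wrong": 0.0,
    "no_answer": 0.0,
}


def vocabulary_score(responses):
    """Sum the points over one child's responses in one language."""
    return sum(POINTS[response] for response in responses)


def dominant_language(greek_score, italian_score):
    """Return the language with the higher vocabulary score.
    Assumption: a tie yields None (not specified in the text)."""
    if greek_score == italian_score:
        return None
    return "Greek" if greek_score > italian_score else "Italian"


example = [
    "correct",
    "correct_after_semantic_cue",
    "correct_after_phonemic_cue",
    "wrong",
    "no_answer",
]
print(vocabulary_score(example))        # 2.5
print(dominant_language(35.5, 30.0))    # Greek
```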
The 2-back task (Kirchner, 1958) is a task requiring WM and updating. In this task, children saw changing slides with a single digit and were asked to press a button when the current digit was the same as the digit two slides back. The task contained 60 trials, 20 of which required a button press and 40 of which required no response. A-prime scores (Zhang & Mueller, 2005) were calculated based on the number of correct and incorrect responses (hits and false alarms) (mean = 0.81, range = 0.55–0.97). Instructions for this task were verbal, but the task itself relied solely on (language-independent) digits.
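The exact formula used for the sensitivity score is not spelled out here. As a minimal sketch, the code below assumes the nonparametric A statistic proposed by Zhang and Mueller (2005), computed from the hit rate H and the false-alarm rate F, with 20 target and 40 non-target trials as in the task described above. The function names and the mirroring convention for below-chance performance (H < F) are assumptions of this illustration.

```python
def sensitivity_a(hit_rate, fa_rate):
    """Nonparametric sensitivity A (assumed: Zhang & Mueller, 2005).
    0.5 corresponds to chance and 1.0 to perfect discrimination."""
    if hit_rate < fa_rate:
        # Assumption: below-chance performance is mirrored around 0.5.
        return 1.0 - sensitivity_a(fa_rate, hit_rate)
    if hit_rate == fa_rate:
        return 0.5  # on the chance diagonal; also avoids division by zero
    base = 0.75 + (hit_rate - fa_rate) / 4
    if fa_rate <= 0.5 <= hit_rate:
        return base - fa_rate * (1 - hit_rate)
    if hit_rate < 0.5:  # F <= H < 0.5
        return base - fa_rate / (4 * hit_rate)
    return base - (1 - hit_rate) / (4 * (1 - fa_rate))  # 0.5 < F <= H


def two_back_a(hits, false_alarms, n_targets=20, n_nontargets=40):
    """Convert raw 2-back counts into rates and return the A score."""
    return sensitivity_a(hits / n_targets, false_alarms / n_nontargets)


# A child with 18 hits out of 20 targets and 4 false alarms out of
# 40 non-targets (invented numbers): H = 0.9, F = 0.1, A is about 0.94.
print(round(two_back_a(hits=18, false_alarms=4), 2))
```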
The Wisconsin Card Sorting Test (WCST, henceforth; Kongs et al., 2000) was used to measure cognitive flexibility in a relatively language-independent manner. Children were presented with four cards that contained symbols varying in shape, colour and number. They were asked to match another, novel, card to one of the four cards, without being told which dimension to match it on. Thus, children had to figure out the underlying rule through trial and error. However, the underlying rule changed throughout the task. Perseverative errors occurred when there was a rule change, but the child continued with the same response strategy as before. Such errors were thus an indication of a failure to inhibit a response and switch to a different response strategy, reflecting reduced cognitive flexibility. Perseverative errors are reported as a percentage out of the total of 128 trials (mean = 14.9, range = 0–32; lower scores indicate better performance). Once a correct rule was found, ‘failure to maintain set errors’ reflected cases in which children failed to continue with the same, correct, strategy and thus changed their strategy before this was appropriate (mean = 2.8, range = 0–8; lower scores indicate better performance). This can be taken to indicate distractibility or absence of continued attention (Figueroa & Youmans, 2013; Miles et al., 2021). One child did not perform the test because s/he was absent in the corresponding session.
Finally, ToM was tested through a task using silent videos of characters performing actions in a social situation (Devine & Hughes, 2013). The children were asked questions related to how the characters’ mental states motivated their actions. Six questions were asked about the videos; four targeting first-order ToM and two targeting second-order ToM. Children were asked the questions in their dominant language and could answer in any language they wanted. Responses were scored as 0 (incorrect or irrelevant), 1 (factually correct answers with no explicit reference to false belief) or 2 (correct answers with explicit reference to false belief), following the coding procedure in Devine and Hughes (2013). For each child, we considered the total response score (mean = 5.7, range = 1–10; max score 12).
3.3. Procedure
Children were tested individually in a quiet room in three test sessions. The second session was conducted at least 1 week after the first one, whereas the third session took place between 1 day and 1 week after the second one. In the first session, we administered the narrative task and the vocabulary task either in Greek or Italian. In the second session, we administered the narrative and the vocabulary task in the other language, counterbalancing the order of the languages between the two test sessions. We made sure that half of the children retold story A3 in Italian and half of the children did so in Greek. The same holds for story B3.
The narrative task was administered in the form of a PowerPoint presentation. The children listened to a pre-recorded version of the story (either A3 or B3) on headphones, while looking at a series of pictures. The story was told by a female voice. Then, they had to retell the story to the experimenter, while looking at the same series of pictures previously shown. The experimenter did not have access to the pictures. The children were thus encouraged to provide as many details of the story as possible (see Torregrossa & Bongartz, 2018, for the methodology). The narrative task was audio-recorded and subsequently transcribed. The expressive vocabulary task was also administered in the form of a PowerPoint presentation. The experimenter documented whether the children provided a correct answer, a correct answer after a semantic cue, a correct answer after a phonemic cue, a wrong answer or no answer, using a scoring sheet. Children’s answers were also audio-recorded, which allowed us to recheck each scoring sheet and correct any coding errors made by the experimenter during the administration of the task. Within each session, we counterbalanced the order of administration of the narrative and the vocabulary task.
In the third session, we administered the cognitive tasks in the language that was identified as the dominant one based on the administration of the vocabulary tests in the first and second session. We counterbalanced the order of administration of the three tasks. The experimenters were a native speaker of Greek and a native speaker of Italian, who were responsible for the Greek and the Italian session, respectively (with the narrative and the vocabulary task). One of the two experimenters also administered the cognitive tasks, depending on the child’s dominant language.
3.4. Coding and statistical analyses
All children were able to perform both of the narrative tasks; the number of clauses they used in their productions as well as the number of words per clause are presented in Table S2 in the Supplementary Material. Clauses were defined based on the occurrence of a finite or nonfinite verb. For each clause, we coded all REs referring to animate referents or the main inanimate referent (the little airplane in story A3 and the balloon in story B3). REs were coded based on their morphosyntactic form and their discourse function. Morphosyntactic forms which were not sufficiently frequent in the dataset (relative pronouns, possessive adjective/pronouns, demonstratives, quantifiers) were excluded; this affected 11.66% of the data. Three thousand two hundred sixty-four REs, of six forms, remained: indefinite full nouns, definite full nouns, full pronouns, clitics, nulls and nonfinites (nonfinites only in Italian; see Section 1.1). Each RE was also coded for its discourse function, which relates to the accessibility of its antecedent in the discourse: 1) introduction of new referents (intro); 2) maintenance of reference (maintain) and 3) reintroduction of a previously mentioned referent (reintro). Some examples of the narrative production data and coding in both Greek and Italian can be seen in Table 1. More details about the coding scheme as well as an example fragment can be found in the Supplementary Material.
Coded referring expressions on each line are underlined. More examples can be found in the Supplementary Material. The full dataset is provided at https://osf.io/m2kwq/?view_only=7fe2391c5b2f4ff5b64828af0f8d155c.
All analyses were carried out using R (version 3.6.2; R Core Team, 2019). First, Pearson correlations were calculated with the package Hmisc, function rcorr (Harrell Jr, 2019) to examine the relation between the additional measures of vocabulary, WM and updating (2-back task), ToM, cognitive flexibility (perseverative errors in the WCST), and sustained attention (failure to maintain set in the WCST). This was done to check whether they reflect different constructs. This is especially relevant for the various cognitive measures, which are theoretically distinct but all require some type of executive function.
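The correlation step can be illustrated with a short Python sketch; the analyses themselves were run in R, so this is not the authors' code. It computes a Pearson correlation and drops any pair in which one value is missing, which mirrors the pairwise deletion that rcorr applies and is relevant because one child did not complete the WCST. The function name `pearson_r` and all scores below are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation with pairwise deletion of missing values (None)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five children; the third child has no WCST score.
two_back = [0.81, 0.70, 0.95, 0.60, 0.88]
failure_to_maintain_set = [2, 5, None, 6, 1]

r = pearson_r(two_back, failure_to_maintain_set)
print(round(r, 2))
```

A low absolute correlation between two measures is what would suggest that they capture different constructs.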
Then, we examined the frequency of each morphosyntactic form as relating to each discourse function in Greek and Italian separately. This will provide an overview of whether the REs that the children used are what would be expected based on their function in discourse. In order to statistically confirm the mapping of morphosyntactic forms to discourse functions, we conducted cross-sectional k-means cluster analyses (one for Greek and one for Italian) with the kmeans function from the factoextra package (Kassambara & Mundt, 2020; see Hamann & Abed Ibrahim, 2017; Peristeri et al., 2022 for more examples of such cluster analyses). A cluster analysis computationally clusters data points into meaningful subsets (i.e., clusters) in an unsupervised manner, that is, without having any bias as to how the dataset should be split up and how many clusters should be present. K-means (MacQueen, 1967) is the name of the clustering algorithm applied; in this algorithm, each observation belongs to the cluster with the nearest mean. This technique was used to examine whether, from the morphosyntactic forms that the children used, their functions could be accurately predicted. The input was the percentage of usage of each morphosyntactic form for each of the three functions (introduction, maintenance, reintroduction) by each child. Thus, there are three data points per child, that is, one per discourse function per child, but each data point is an array with information about the production of each morphosyntactic form (i.e., type of RE). The output was a clustering of these data points. The optimal number of clusters was calculated based on the elbow method (Cui, 2020; using the factoextra package, fviz_nbclust function, Kassambara & Mundt, 2020).
For both languages, the elbow method indicated that the optimal number of clusters was 3. Once clustering is complete, because we know the true function of each data point, we can check the quality of the clustering by calculating its accuracy (i.e., a match between the identified cluster and the actual use of REs for a particular function). However, note that k-means cluster analyses start from randomly chosen initial cluster centres, and therefore each execution of the algorithm may lead to slightly different outcomes. In this study, to prevent the possibility of fishing for significant or desirable results, we only considered the clusters identified in the first run of the algorithm.
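The logic of this analysis can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code: the data are invented, the clusters are mapped to discourse functions by majority vote (the mapping procedure is not specified in the text), and a deterministic farthest-point initialisation replaces the random starting centres so that the example is reproducible. All function names are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic initial centres (assumption: replaces random starts)."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return [list(c) for c in centroids]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: assign to nearest mean, update means, repeat."""
    centroids = farthest_point_init(points, k)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def clustering_accuracy(assignment, labels):
    """Share of points whose cluster's majority label equals their own label."""
    correct = 0
    for j in set(assignment):
        member_labels = [l for a, l in zip(assignment, labels) if a == j]
        correct += Counter(member_labels).most_common(1)[0][1]
    return correct / len(labels)


# Invented percentages per form, in the order:
# indefinite noun, definite noun, full pronoun, clitic, null, nonfinite.
# Four hypothetical children, one data point per discourse function each.
points = [
    [80, 20, 0, 0, 0, 0], [70, 30, 0, 0, 0, 0],
    [75, 25, 0, 0, 0, 0], [65, 35, 0, 0, 0, 0],
    [0, 10, 5, 25, 60, 0], [0, 15, 0, 30, 55, 0],
    [0, 5, 5, 20, 70, 0], [0, 10, 0, 25, 65, 0],
    [0, 55, 0, 20, 25, 0], [0, 50, 5, 20, 25, 0],
    [0, 60, 0, 15, 25, 0], [0, 20, 0, 25, 55, 0],  # last child reintroduces with nulls
]
labels = ["intro"] * 4 + ["maintain"] * 4 + ["reintro"] * 4

assignment = kmeans(points, k=3)
accuracy = clustering_accuracy(assignment, labels)
print(round(accuracy, 3))
```

In this toy dataset, the last data point is a reintroduction profile that resembles maintenance, so it ends up in the maintenance cluster and lowers the accuracy below 1.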
In the next step, we examined whether distinct profiles of reference management by bilinguals could be identified. To this end, the percentage of usage of each morphosyntactic form at three ‘time points’ (for three functions: introduction, maintenance, reintroduction) for each language was used as the input for a k-means cluster analysis (of repeated-measures trajectories, using the kml3d package, kml3d function, Genolini et al., 2015). The output was a clustering of children based on their production profiles in both languages. For this analysis, the Calinski and Harabasz criterion (Calinski & Harabasz, 1974) indicated that the optimal number of clusters was 2. Again, we only considered the clusters identified in the first run of the algorithm. Because in this case, there is no correct or incorrect production profile, no accuracy measure was calculated. Instead, we examined how the two clusters related to children’s language background and cognitive abilities. For this purpose, independent samples t-tests between the groups (clusters) were performed to address RQ3 and, in particular, the hypotheses that 1) the group of children who used more full pronouns or nouns in maintenance contexts have more limited EF abilities (updating or sustained attention, based on respectively the 2-back task and failure to maintain set in the WCST); 2) the group of children who used more null and overt pronouns in reintroduction contexts have more limited updating abilities or lower ToM scores and 3) the group of children who tended to associate discourse functions with certain syntactic positions show lower cognitive flexibility (based on perseverative errors in the WCST). Because of these specific hypotheses, one-sided t-tests were used for these tasks/measures, whereas two-sided t-tests were used for the vocabulary measures.
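As an illustration of how a criterion of this kind ranks candidate partitions, the Python sketch below computes the standard Calinski and Harabasz index, that is, the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom. It is an assumption that this standard form corresponds to the variant applied in the analysis; the function name and the toy data are hypothetical.

```python
def calinski_harabasz(points, assignment):
    """Standard Calinski-Harabasz index: (BSS / (k - 1)) / (WSS / (n - k))."""
    n = len(points)
    clusters = sorted(set(assignment))
    k = len(clusters)
    overall_mean = [sum(col) / n for col in zip(*points)]

    between = 0.0  # between-cluster sum of squares (BSS)
    within = 0.0   # within-cluster sum of squares (WSS)
    for j in clusters:
        members = [p for p, a in zip(points, assignment) if a == j]
        centre = [sum(col) / len(members) for col in zip(*members)]
        between += len(members) * sum(
            (c - m) ** 2 for c, m in zip(centre, overall_mean)
        )
        within += sum(
            sum((x - c) ** 2 for x, c in zip(p, centre)) for p in members
        )
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two tight groups that are far apart along the first dimension.
toy_points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good_split = [0, 0, 1, 1]   # separates the two groups
poor_split = [0, 1, 0, 1]   # cuts across both groups

ch_good = calinski_harabasz(toy_points, good_split)
ch_poor = calinski_harabasz(toy_points, poor_split)
print(ch_good, ch_poor)
```

Higher values indicate compact, well-separated clusters, so the candidate number of clusters with the highest value is preferred.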
4. Results
4.1. Compliance with the accessibility marking hierarchy in both languages
We address RQ1 first. The descriptives of children’s use of REs (Figure 2) show that, in both languages, children used mostly indefinite full nouns for introductions of a new referent. This is as was expected based on the accessibility marking hierarchy, as new referents have a low accessibility. Some definite full nouns were also used for this function. In maintenance positions, mostly nulls and clitics were used. Maintenance contexts involve reference to a highly accessible referent, and therefore selection of forms that are towards the right end of the accessibility hierarchy is appropriate. Note that in Greek, around 20 percentage points more null pronouns were used compared to Italian; this may relate to the fact that Greek speakers use null pronouns in contexts in which Italian speakers can use nonfinite forms (see Section 1.1). Finally, in reintroduction contexts the children mostly used definite full nouns. This too is in line with the accessibility marking hierarchy, as reintroductions are found in situations in which reference is made to an antecedent of medium accessibility. Some clitics and null pronouns were also used for reintroductions, which is interesting as these could lead to ambiguous reference.
We used cluster analyses (one for each language) to investigate whether the patterns in Figure 2 are sufficiently different for each discourse function to be identified as a distinct pattern based on each child’s production of REs. For Greek, 103 out of 111 data points were classified as introduction, maintenance or reintroduction correctly, showing an accuracy of 92.8% (i.e., a match between the identified cluster and the actual use of REs for a particular function, reflecting very high classification accuracy; Dalmaijer et al., 2022). Four data points were incorrectly identified as reintroductions (these were actually introductions) and four data points were incorrectly identified as maintenance (these were actually reintroductions). For Italian, 104 out of 111 data points were clustered correctly, showing an accuracy of 93.7%. Four data points were incorrectly identified as reintroductions (three were actually introductions and one was maintenance) and three data points were incorrectly identified as maintenance (they were actually reintroductions). We refer to Table S5 in the Supplementary Material for a presentation of the three-cluster output obtained for the datasets of morphosyntactic forms in both languages.
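The reported accuracies can be retraced from the misclassification counts. The Python sketch below rebuilds the two confusion tables under the assumption that each of the 37 children contributes exactly one data point per discourse function (37 × 3 = 111); the tables are a reconstruction from the counts given in the text, not output of the original analysis, and the variable and function names are hypothetical.

```python
# Rows: actual discourse function; columns: function assigned by the cluster.
# Order of rows and columns: introduction, maintenance, reintroduction.
greek = [
    [33, 0, 4],   # 4 introductions identified as reintroductions
    [0, 37, 0],
    [0, 4, 33],   # 4 reintroductions identified as maintenance
]
italian = [
    [34, 0, 3],   # 3 introductions identified as reintroductions
    [0, 36, 1],   # 1 maintenance point identified as reintroduction
    [0, 3, 34],   # 3 reintroductions identified as maintenance
]


def accuracy_percent(confusion):
    """Correctly clustered data points as a percentage of all data points."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return round(100 * correct / total, 1)


greek_accuracy = accuracy_percent(greek)
italian_accuracy = accuracy_percent(italian)
print(greek_accuracy, italian_accuracy)  # 92.8 93.7
```

In both languages, every misclassification involves reintroduction, either as the actual or as the assigned function, which is consistent with reintroduced referents having intermediate accessibility.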
The high classification accuracy shows that the children were very consistent in their production of form–function relationships. The results can further be interpreted as showing that the children mastered the conditions of use of REs in discourse by largely following the accessibility hierarchy, using mostly indefinite full nouns for introductions, null pronouns and clitics for maintenance, and definite full nouns for reintroductions in both languages. There are some small differences between the use of REs in Greek compared to Italian, but for the purposes of the current study it seems that in both languages, children produce largely pragmatically appropriate REs.
4.2. Identifying profiles of reference management in bilinguals
We now address RQ2. Cluster analyses identified two distinct groups of children (A and B; Figure 3 and Table 2), thus reflecting distinct profiles of reference production. Group A contained the majority of the children (n = 28), whereas group B (n = 9) represented a smaller group of children with different behaviour when it comes to reference production. We will describe the general trends in the clustering data below, although some observations are based on small percentage-based differences.
In introduction contexts, cluster A produced more definite full nouns, but fewer indefinite full nouns compared to cluster B in Greek (means of 33.7% vs. 18.5% for definite full nouns and 63.8% vs. 81.5% for indefinite full nouns). Interestingly, the opposite pattern can be observed in Italian, with cluster A producing fewer definite full nouns but more indefinite full nouns compared to cluster B (means of 21.7% vs. 31.2% for definite full nouns and 77.0% vs. 68.8% for indefinite full nouns). Although both forms are used for referent introduction in both languages, it is striking that the groups of children show opposite trends in their two languages. This points towards the possibility that the two groups vary in their language skills in the two languages. This possibility will be investigated in RQ3 (Section 4.3).
In maintenance contexts, some tendencies were observed in only one language. Most notably, children in cluster B used more full pronouns in maintenance position in Italian (5.4% vs. 2.5% in cluster A) and fewer null pronouns (32.9% vs. 43.0%). Furthermore, they used more nonfinite clauses (21.5% vs. 14.8%). Thus, clusters A and B show a preference for different pronominal forms here, although they both use pronominal forms versus full nouns at similar rates. In maintenance contexts, null pronouns are usually pragmatically appropriate because their antecedent should be clear, and thus it could be argued that the use of a full pronoun is unnecessary in maintenance contexts (i.e., it may be an overspecification). In Greek, children in cluster B use more null pronouns in maintenance contexts (66.8% vs. 57.3% in cluster A). In these contexts, they tended to use fewer clitic pronouns (19.8% vs. 25.5% in cluster A). This result seems to stem from a difference in grammatical role of the produced RE (i.e., subject vs. object) rather than any under- or overspecification.
In reintroduction contexts, we can observe some interesting differences between the clusters in both Greek and Italian (even if to a lesser extent in Italian). In Greek, cluster B used up to 20% full pronouns (mean = 8.2%) in these contexts, whereas cluster A used fewer full pronouns (max. 8%, mean 0.5%). In contrast, cluster A used more null pronouns (means 26.7% vs. 19.6% in cluster B). Children in cluster B used a greater number of clitics than the children in cluster A (means 24.9% vs. 18.7% in cluster A). In Italian, these patterns are repeated although the differences are much smaller. The use of any pronoun in referent reintroduction could be seen as risky, as it is potentially ambiguous. Therefore, it is interesting and possibly more straightforward to compare the children’s use of noun phrases. Indefinite full nouns were hardly ever used for reintroductions by either group. However, children in cluster A used more definite full nouns than children in cluster B, which is visible in both Greek and Italian (mean = 53.9% in cluster A vs. 46.9% in cluster B in Greek; mean = 55.4% in cluster A vs. 50.9% in cluster B in Italian). As full nouns are typically appropriate in reintroduction contexts, this could be interpreted to indicate that cluster A tends to be more accurate in reintroduction contexts than cluster B (but see Section 5 for an interpretation of the overproduction of null subjects by cluster A based on cognitive variables).
4.3. Language background and cognitive abilities
The final research question (RQ3) asked how the profiles identified in RQ2 relate to children’s language background and cognitive abilities. Table S4 in the Supplementary Material reports the correlational analyses between the linguistic and cognitive measures considered in the present study. Italian vocabulary was significantly positively correlated with WM and updating (i.e., 2-back task performance) and ToM performance. No other significant correlations were found, and thus it seems that the different cognitive measures captured different constructs.
An overview of the children’s scores per cluster is presented in Table 3. Cluster A was on average 4 months older, had higher ToM scores, and lower WCST scores (note that for both perseverative errors and failures to maintain set, lower scores indicate better performance). Of these comparisons, only the attentional measure (failure to maintain set in WCST) reached significance. Note that the two groups of children scored very similarly on their vocabulary measures, and therefore language proficiency and/or dominance cannot account for the identified profiles of reference production in bilinguals.
* p < .05
5. Discussion
5.1. The production of discourse-appropriate referring expressions in Greek and Italian
The first result emerging from the present study is that the children were able to produce appropriate and consistent REs in discourse. In line with the accessibility marking hierarchy, in both languages, they used indefinite and definite full nouns for referent introduction, definite full nouns for referent reintroduction and null pronouns and clitics for referent maintenance (Figure 2). We also noticed some slight differences between Greek and Italian. In Greek, the children used a greater number of null pronouns for referent maintenance than in Italian. This is because Greek does not allow for nonfinite clauses. Note that nonfinites are not frequently examined as REs and one could argue that they should not be considered as such – this may be a confound in the current study.
In addition, children used more null pronouns in referent reintroduction in Greek than in Italian. This is consistent with the tendency observed in the literature that null pronouns in Greek are more inclined to be used in referent reintroduction – for instance, when shifting from object to subject position – than null pronouns in Italian (Andreou et al., 2023; Torregrossa et al., 2020). The cluster analysis reported in Section 4.1 further supported the conclusion that the children were consistent in their use of REs in both of their languages and that they showed different RE production patterns for each discourse function: The algorithm was very accurate in predicting discourse functions based on the frequency of use of one RE or the other. On the whole, these results suggest that the children had a good mastery of reference in both languages and, on top of that, showed some differences in the production of REs between their two languages.
This first result is somewhat surprising in light of previous literature on the acquisition of reference by bilingual children. However, it should be considered that the bilinguals examined in this contribution had a relatively balanced profile compared to the bilinguals considered in other studies, who tend to be dominant in the societal language (Benmamoun et al., 2013). For example, we observed that the children of the present study produced narratives of similar complexity in terms of number of clauses (Table S2 in the Supplementary Material) and their vocabulary scores were very similar across the two languages (Table 3). It is very likely that this balanced profile was related to their literacy exposure: the language which was the main medium of instruction at school (Italian) was different from the majority language in society (Greek). This suggests that under certain conditions of language and literacy exposure, bilingual/heritage speakers may be able to exhibit a good mastery even of those structures that have been shown to be particularly difficult to acquire, like REs and their conditions of use (Bongartz & Torregrossa, 2020; Rinke & Flores, 2014; Torregrossa et al., 2023). Literacy exposure may be the key reason for the ability shown by the children considered in this study to produce discourse-appropriate REs, given that studies conducted on monolingual adults revealed that literacy exposure has a positive effect on reference production and comprehension (Arnold et al., 2018). However, other factors may also have played a role in the results of this study. For example, as mentioned in Section 1.1, Greek and Italian pattern very similarly in the mapping between REs and discourse functions.
This means that cross-linguistic effects are less likely to be visible in this group of children compared to other groups of children speaking combinations of languages that are typologically more distant from each other. The possibility of cross-fertilization between the two languages cannot be excluded either, in line with observed cases of positive cross-linguistic influence in the domain of reference production (Section 1.3).
5.2. Identifying bilingual profiles of reference production
One of the main aims of this contribution was to provide a methodological approach for identifying reference-production profiles. In this way, we attempted to follow suggestions from the literature (Rothman et al., 2023) and overcome the bias in previous studies whereby reference production by bilinguals was compared to the gold standard of monolinguals. We did, however, use the accessibility marking hierarchy to provide some indication of the pragmatic appropriateness of an RE. We followed a data-driven, bottom-up approach that allowed us to identify clusters of children based on their reference behaviour. In particular, we explored the potential of cluster analyses, considering the three discourse functions (introduction, maintenance and reintroduction) as ‘time points’. This kind of analysis allowed us to consider different mappings between morphosyntactic forms and discourse functions at the same time, improving on previous studies that have examined one discourse function at a time. This methodology has the potential to be applied to assess variation among other types of speakers beyond bilinguals, also considering other types of linguistic phenomena.
The analysis identified two clusters of children, exhibiting specific features in their reference production. In a nutshell, in introduction contexts cluster A produced more definite full nouns, but fewer indefinite full nouns compared to cluster B in Greek, whereas the opposite pattern was found in Italian. In maintenance contexts, the clusters of children varied from each other in their preference for different pronominal forms (nulls vs. clitics vs. full pronouns). These preferences also varied across the two languages. Finally, the children in cluster B produced more clitics and full pronouns, but fewer null pronouns in reintroduction contexts, which held in both languages.
Because of its unsupervised nature, the analysis identified two clusters which were unbalanced in terms of number of children, with one group consisting of 28 children and the other of 9 children. On the one hand, the observation that one cluster of children had a relatively small size is consistent with our previous consideration that the children were homogeneous in their mastery of use of REs. This may be in turn related to their being relatively balanced in terms of language proficiency and experience (Section 5.1). On the other hand, the unbalanced sample size of the two groups may limit the power of the statistical analyses to be conducted in the next steps of the analysis (Section 5.3).
Before proceeding to the next section, some shortcomings of cluster analyses are worth pointing out. Firstly, since they are based on a probabilistic approach, re-running the analysis may lead to slightly different clusters than the ones identified in a previous analysis. In this study, we considered the two clusters identified in the first analysis that we ran. Our OSF materials make it possible to re-run the analysis and compare any new results with the ones reported in this study. Secondly, as stated before, because analyses of the type applied in this contribution are unsupervised, clusters can vary in size. This can limit the power of any follow-up comparisons, as it did in the current study due to the unbalanced cluster sizes. Finally, because all data points are taken into account regardless of their relation, outcomes of cluster analyses can be complex and difficult to interpret. We therefore focused our interpretation of the results mainly on the REs and discourse functions for which we had formulated concrete predictions.
5.3. Language and cognitive variables involved in bilingual reference production
The analysis of how children’s cognitive abilities related to their reference profiles was limited by the fact that the two clusters of children were unbalanced in their size, which affected the statistical power of the conducted t-tests. However, certain tendencies emerged, which were (at least partly) consistent with our hypotheses related to the connection between reference production and cognitive abilities. Children in cluster B tended to be slightly younger, exhibit lower ToM abilities, make more perseverative errors (based on the WCST) and be less able to maintain set (based on the WCST) than children in cluster A. Children in cluster A had lower updating skills (based on the 2-back task). Of these differences, only the ability to maintain set (based on the WCST; Table 3) differed significantly between the two clusters. We did not find any differences in language proficiency between the clusters.
The primary and only significant result thus concerned children’s ability to maintain set. In Section 2, we hypothesised that reduced sustained attention (failure to maintain set in the WCST) may lead to a rapid decay of the accessibility of a referent. This decay would correspond to the use of more explicit REs than required by the discourse context. The children in cluster B had lower sustained attention than the children in cluster A (Table 3). Crucially, the children in cluster B produced a greater number of full pronouns for referent maintenance than the children in cluster A. This pattern was observed only in Italian. In general, full pronouns are redundant in contexts of referent maintenance, where null pronouns or clitics could be used without leading to ambiguity. Therefore, this result supports our hypothesis related to the relationship between lower sustained attention and the production of overspecified, redundant REs. To our knowledge, no previous study has investigated this relationship, and the present study thus takes a first step towards filling this gap.
Other results show trends in the expected direction, but no significant differences between the two clusters’ cognitive measures. Therefore, these trends will only be mentioned briefly, accompanied by an explicit call for further research with larger sample sizes. First, we predicted children with lower ToM abilities to use underspecified reduced forms for referent reintroduction (Kuijper et al., 2015). Cluster B showed non-significantly lower ToM scores, which could motivate their tendency to produce a smaller number of definite full nouns in referent reintroduction across their two languages. Hendriks (2016) argues that the automatisation of ToM in reference production improves with increasing cognitive maturity. The observation that the children in cluster B tended to be younger than the children in cluster A could support this claim. However, note that age did not differ significantly between the groups either, and that the children had a wide age range, which may be a confound in the current study.
Second, we predicted that children with limited updating abilities would use more reduced forms in reintroduction contexts where the use of explicit forms would be more appropriate (in line with Serratrice & De Cat, 2020; Van Rij et al., 2013). Some evidence for this could be found in the ‘risky’ use of REs by children in cluster A, who had non-significantly lower updating scores (based on the 2-back task) and also produced more null pronouns for referent reintroduction in both Greek and Italian than the children in cluster B. In contrast, we found no evidence for our hypothesis related to the connection between reduced updating abilities and the production of overspecified REs in referent maintenance (see Torregrossa et al., 2021, for different results). However, it should be noted that the difference in updating abilities between the children in cluster A and the children in cluster B may not be large enough to affect the production of REs in maintenance contexts. The sample considered in Torregrossa et al. (2021) showed a much larger variation in updating abilities than the sample considered here.
Finally, we predicted children with reduced cognitive flexibility to stick to certain referencing strategies to a greater extent than children with higher cognitive flexibility in referent reintroduction. Children in cluster B exhibited non-significantly reduced cognitive flexibility (perseverative errors in the WCST) compared to the children in cluster A, and they also tended to express referent maintenance in subject position (by using null pronouns) in Greek to a greater extent than the children in cluster A, who expressed referent maintenance also in object position (by using clitics). More research is needed to confirm these trends.
5.4. Conclusion
In conclusion, despite the shortcomings of the methodological approach discussed in Section 5.2, these results show the potential of cluster analysis for the understanding of inter-speaker variation in reference production, without the need for comparison to a norm or monolingual control group. The method identified two groups with distinct reference patterns. Although they scored comparably on language background variables, we found that the group of children with decreased sustained attention tended to produce more overspecified REs. Overall, the results show that the children had a good mastery of reference in both of their languages, but different reference profiles could still be identified. This approach may in the future be extended to other types of speakers and to domains beyond reference production.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/langcog.2024.48.
Data availability statement
Supplementary materials as well as the data and analysis code are available at https://osf.io/m2kwq/?view_only=7fe2391c5b2f4ff5b64828af0f8d155c.