Introduction
The early years of childhood are marked by rapid changes in communication and language skills. Early language acquisition is associated with socio-emotional and cognitive development and with later academic achievement (Hohm et al., 2007). For example, childhood language is key to the development of socio-emotional regulation abilities, laying the foundation for successful social interactions later in life (Rose et al., 2018). Moreover, infants’ language within the first year of life has implications for cognitive performance and academic achievement at the end of primary school (Hohm et al., 2007). Given the importance of language development for later developmental outcomes, as well as for concurrent communication, it is important to understand the factors that predict individual differences in language acquisition from a young age.
A substantial body of literature details how language acquisition is shaped by a combination of external and internal factors (Paradis, 2011; Unsworth et al., 2011; Sun et al., 2016; Sun et al., 2018; Sun et al., 2022). External factors, such as family socioeconomic status (SES) and maternal education, are substantial predictors of children’s language development (e.g., Sun et al., 2016; Sun et al., 2018). For instance, children from higher SES households and with more highly educated parents tend to have more advanced language outcomes (e.g., Hart & Risley, 1995; Hoff-Ginsberg, 1998). In addition to SES, other external factors, such as birth order and maternal concerns, also predict child language acquisition (Reese et al., 2018; Zhang et al., 2024; Hoff-Ginsberg, 1998). Research has shown that first-born children typically excel in vocabulary size and grammatical complexity (Reese et al., 2018), whereas later-born children often demonstrate greater pronoun usage and conversational proficiency (Hoff-Ginsberg, 1998). Furthermore, parental concerns about children’s speech and hearing have been consistently linked to language development outcomes. Children whose mothers expressed concerns were more likely to experience language delays compared to those whose mothers reported no concerns (Klee et al., 2000; Zhang et al., 2024; Reese et al., 2018).
Although the role of external factors in predicting child language is evident, a growing body of research has highlighted the importance of internal child characteristics, such as temperament, in predicting language development (Dixon & Smith, 2000; Prior et al., 2008; Peterson et al., 2017a). Child temperament, an individual characteristic that relates to infants’ and children’s engagement with the environment, plays an important role in language acquisition (Rothbart & Derryberry, 1981). Temperament factors are proposed to be related to cognitive processing and to the quantity and quality of language experienced via parent-child interactions, which are critical for language learning (Salley & Dixon, 2007). Nevertheless, research into the connection between temperament and language has mainly concentrated on monolingual English speakers. Only a small number of studies have examined the associations between temperament and language acquisition for bilingual/multilingual speakers across their languages (see Laake & Bridgett, 2018; Kang & Yim, 2022). Extending this research to bilingual/multilingual speakers is crucial because there could be different patterns of prediction for non-English languages, or different patterns across their multiple languages. This extension could provide insights into the unique cognitive processes involved in bilingual/multilingual language learning and development, offering rich implications for parents, clinicians, and policymakers.
Therefore, the main objective of this study was to examine the relationship between early temperament and the language development of bilingual/multilingual speakers across two or more languages. Specifically, we were interested in the role of temperament for bilingual and multilingual English-Mandarin and English-Cantonese speakers’ vocabulary and syntax development as part of the Growing Up in New Zealand cohort (Morton et al., 2013).
Temperament and Language
Temperament is defined as “constitutional differences in reactivity and regulation influenced by heredity, maturation, and experience” (Rothbart & Derryberry, 1981, p. 37). This definition emphasises that temperament is to an extent genetically determined but that its characteristics may change as an individual interacts with their environment over time. Like temperament, language development is influenced by social-linguistic contexts as well as biological processes (Laake & Bridgett, 2018). As language evolves during early childhood, some of the individual differences in language acquisition may be explained by early temperament.
A widely used approach to temperament conceptualises three main components (Rothbart & Derryberry, 1981; Putnam et al., 2014): positive affect/surgency (PAS), negative emotionality (NEG), and orienting/regulatory capacity (ORC). PAS points to a child’s propensity to seek out and respond positively to environmental stimuli. This dimension encapsulates the regularity of a child’s expressions of joy and laughter, their pleasure derived from high-energy activities like playing on a slide, and their anticipation of future events. NEG is indicative of a child’s inclination towards experiencing various negative emotions, including feelings of anger, fear (including shyness), discomfort, and sadness. Lastly, ORC pertains to a child’s capacity to focus and regulate their attention. It also involves their preference for calm activities such as cuddling or being easily soothed and is linked with the development of effortful control later on (Rothbart & Derryberry, 1981; Putnam et al., 2014).
Emotional expression, tapped by the two dimensions of PAS and NEG, is a significant predictor of toddlers’ language development. Usai et al. (2009) found that children aged 2 and 4 with high positive emotionality (e.g., positive mood) tend to have better linguistic skills compared to those who displayed negative emotionality (e.g., difficult temperament). Higher levels of PAS at seven and ten months respectively, such as more smiles and laughter, are also associated with improved receptive language skills in ten-month-olds (Dixon & Smith, 2000) and better expressive language abilities by age 14 months (Laake & Bridgett, 2014). Furthermore, children who experienced increased expressions of joy and displays of extraversion in the first year of life had enhanced expressive language skills at approximately age 2 to 6 (Moreno & Robinson, 2005; Pérez-Pereira et al., 2016). In contrast to positive emotionality, infants displaying NEG tend to experience slower language acquisition (Salley & Dixon, 2007; Usai et al., 2009; Cioffi et al., 2021). For instance, Prior et al. (2008) observed that sociable/non-shy children outperformed shy children on two outcome measures (expressive vocabulary and pre-linguistic development) at age two, highlighting the negative role of shy temperament for language outcomes. Previous research has also indicated that infants rated as having a more difficult temperament at nine months tended to have lower global language scores at 21 months (Dixon & Smith, 2000; Salley & Dixon, 2007). This link has been supported by a recent adoption study, where the parent and child were not genetically related. Children who displayed an increase in negative emotions at nine months had lower language skills at age 2 and 3, which was predictive of their language abilities at age 7 (Cioffi et al., 2021).
Despite the weight of the evidence suggesting that positive affect is beneficial for early language development and negative affect is disadvantageous, there is a body of research that yields inconsistent results (Moreno & Robinson, 2005; Laake & Bridgett, 2014). For example, Wolfe and Bell (2007) found that higher positive affect (impulsivity and high pleasure) at eight months was negatively associated with receptive vocabulary at the age of 4 and 6. Furthermore, some studies showed that more anger and distress were associated with better vocabulary at 18 months and at ages 2 and 7 (Spinelli et al., 2018; Moreno & Robinson, 2005). One possible explanation could be that higher emotional expression (both positive and negative) might aid language development by providing opportunities for children to form bonds with others, leading to more language learning opportunities (Dixon & Smith, 2000). Nevertheless, Bloom (1990) argued from a different perspective that more effective language acquisition occurred in the presence of neutral emotional states instead of positive or negative emotional states. Early language learning might be facilitated by more time in a neutral state to enable the reflective stance necessary to construct the meaning of words.
In a similar vein, although some studies have found a positive link between ORC and language development, the overall pattern of findings remains inconsistent. Several studies have reported that higher scores in the duration of orienting and persistence during the first year of life are associated with larger vocabularies at 21 months (Dixon & Smith, 2000) and improved syntax skills by age two (Spinelli et al., 2018). In support of this result, Peterson et al. (2017a) found that children in the Growing Up in New Zealand cohort whose mothers reported better attention spans (Orienting Capacity; a measure of attentional control from the ORC dimension) concurrently showed higher communication skills at nine months. One explanation could be that children who can pay better attention to objects and people are more likely to focus on linguistically relevant cues and are thus more likely to acquire language more quickly (Spinelli et al., 2018). However, Pérez-Pereira et al. (2016) reported that the predictive power of different dimensions of attentional control for language acquisition is inconsistent. They found that soothability and low-intensity pleasure (subcategories of ORC; Putnam et al., 2014) were predictive of vocabulary production, but attentional control was not predictive of any language outcomes. This disparity in results demonstrates that the associations with language development may depend on the specific aspects of ORC being measured. The current study uses a five-factor model of temperament (Peterson et al., 2017a), which separates attentional control and soothability/low-intensity pleasure into two categories: Orienting Capacity (OC) and Affiliation/Regulation, respectively. This approach allows us to examine the associations between these specific aspects of temperament and early vocabulary acquisition and word combination skills.
Moreover, studies on the relationship between temperament and language have primarily focused on monolingual English speakers, with few studies investigating links between bilingual children’s temperament and language development. In this small body of research, however, temperament has been associated with variation in bilingual children’s language outcomes. For instance, Laake and Bridgett (2018) noted that positive affect showed a trend toward improved language development in bilingual children, but the relationship was not statistically significant due to a small sample size for bilingual children (only 14.41% of the participating families were bilingual and the dominant language of these children was not specified). Therefore, their final analyses focused on temperament in relation to children’s English receptive and expressive vocabularies, controlling for bilingualism. In a more nuanced analysis, Kang and Yim (2022) investigated the association between temperament and vocabulary development of 3–6-year-old Korean monolingual and English-Korean bilingual children. Bilingual children were defined as children raised in families where the mother’s first language was English and the father’s first language was Korean, without specifying children’s dominant language. The study revealed a positive correlation between effortful control (associated with infants’ orienting capacity) and the size of the children’s Korean vocabulary. However, no such correlation was found with the children’s English vocabulary skills. Considering the disparate findings from prior studies on different dimensions of temperament for language development, along with the research gap for bilingual samples, it is important to note that research in this area is still emerging. Therefore, this study sought to contribute to the understanding of the complex relationships between temperament and language across children’s multiple languages.
The Present Study
The present study draws upon data from Chinese-speaking children in the Growing Up in New Zealand cohort, specifically infant temperament at nine months and children’s vocabulary and word combinations at age two, to investigate how early temperament contributes to bilingual and multilingual children’s language acquisition across languages. Our first aim was to establish associations between infant temperament and children’s later vocabulary in their Chinese language, either Mandarin or Cantonese, and their word combinations. The word combination measure was across all of the children’s languages. Given that a sizeable subset of children in each group spoke English as another language (68% of Mandarin speakers and 70% of Cantonese speakers), our second aim was to investigate separately the correlations between children’s temperament and their English vocabulary. Our hypothesis, associated with both aims, was thus that Mandarin- and Cantonese-speaking children’s vocabulary scores in their Chinese language and in English would differ as a function of their temperament. Based on previous research on the role of orienting capacity and emotionality in language development (Dixon & Smith, 2000; Prior et al., 2008; Laake & Bridgett, 2018; Peterson et al., 2017a), we predicted that children with higher PAS, lower NEG, and higher OC at nine months of age would have larger vocabularies and more advanced word combinations at age two across languages. We had two exploratory hypotheses, also related to both aims, that higher levels of affiliation/regulation and higher levels of fear (see Peterson et al., 2017a) would also be associated with more advanced vocabulary and word combinations across languages.
Method
Participants
The Growing Up in New Zealand study is a comprehensive longitudinal study involving 6853 children and their families, representing a diverse range of ethnicities and socioeconomic status (Morton et al., 2013). The recruitment stage started in 2009–2010, with mothers and their partners participating before the birth of their children. When the children reached approximately two years of age, data were collected from 6327 mothers. Out of the total mother respondents who were able to fill out the Mandarin/Cantonese checklist without the need for an interpreter or interviewer, 196 (3% of the larger sample) indicated that their child understood Mandarin, whereas 71 (1%) reported that their child understood Cantonese. Analyses were performed on 158 children (81% of the Mandarin-speaking sample) and 57 children (80% of the Cantonese-speaking sample) whose mothers also reported on the temperament questionnaire (IBQ-R VSF) at a previous nine-month data wave. To classify SES in this study, we adopted the widely recognised 2006 New Zealand Index of Deprivation, which draws upon census data to assess eight key aspects of socioeconomic well-being (Salmond et al., 2007). Scores on this scale range from 1 (least deprived) to 10 (most deprived), and deprivation levels were classified as low (≤3), medium (4–7), and high (8–10).
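The banding of deprivation scores described above can be illustrated with a minimal sketch. The function name `deprivation_band` is hypothetical and is not taken from the study's analysis code; only the cut-offs (low ≤3, medium 4–7, high 8–10) come from the text.

```python
def deprivation_band(decile: int) -> str:
    """Map a deprivation score (1 = least deprived, 10 = most deprived) to a band.

    Cut-offs follow the classification stated in the text:
    low (<= 3), medium (4-7), high (8-10). The function name is illustrative.
    """
    if not 1 <= decile <= 10:
        raise ValueError(f"deprivation score must be between 1 and 10, got {decile}")
    if decile <= 3:
        return "low"
    if decile <= 7:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Print the band for scores on either side of each cut-off.
    for score in (1, 3, 4, 7, 8, 10):
        print(score, deprivation_band(score))
```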
Of Mandarin speakers, 46 (29%) were monolingual, 79 (50%) were bilingual, and 33 (21%) were multilingual. Among the bilingual and multilingual children, 108 (68%) were English-Mandarin speakers, 20 (13%) Cantonese, 3 (2%) Korean, 3 (2%) Wu, 2 (1%) Te reo Māori, 2 (1%) Japanese, 2 (1%) German, and 1% each of other languages such as Hindi, Arabic, Bahasa Indonesian, Shan, French, Min, and Bengali.
Of Cantonese speakers, 13 (23%) were monolingual, 23 (40%) were bilingual, and 21 (37%) were multilingual. Among the bilingual and multilingual children, 40 (70%) were English-Cantonese speakers, 20 (35%) also spoke Mandarin, 2 (4%) spoke Spanish, and others spoke languages such as Japanese (2%), Hindi (2%), Arabic (2%), Wu (2%), and Khmer (2%). Note that the 20 Mandarin-Cantonese speakers were the same children in each subsample.
Procedure
Mothers reported on children’s temperament at nine months in a Computer Assisted Personal Interview (CAPI) during a home visit, and on children’s language development at age two years in a CAPI during a home visit. Demographic variables, such as gender, socioeconomic status (area-level deprivation), maternal education, birth order, language status, and maternal concerns, were reported at the antenatal and two-year data waves (see Zhang et al., 2024 for more detail).
Temperament. Child temperament was measured using the Infant Behavior Questionnaire-Revised Very Short form (IBQ-R VSF; Putnam et al., 2014) when children were approximately nine months old. Mothers were presented with separate show cards with possible responses for each question during these interviews, and then the interviewer read out each question and recorded the mother’s answers on a computer. The IBQ-R VSF takes approximately 10–15 minutes to complete and has been widely used in large longitudinal studies. The IBQ-R VSF contains 37 items categorised into three broad scales: (1) Positive affect/Surgency (Activity Level, Smiling and Laughter, Vocal Reactivity, Approach, High-Intensity Pleasure, and Perceptual Sensitivity); (2) Negative Emotionality (Fear, Distress to Limitations, Sadness, and negatively loading Falling Reactivity); and (3) Orienting and Regulatory Capacity (Duration of Orienting, Soothability, Cuddliness/Affiliation, and Low-Intensity Pleasure) (Putnam et al., 2014).
Thus, the original IBQ-R VSF (Putnam et al., 2014) conceptualised temperament using three factors: PAS, NEG, and ORC. However, Peterson et al. (2017a, 2017b) found that a five-factor model, which included two additional factors (Affiliation/Regulation and Fear), is statistically and conceptually a better fit for the broad and ethnically diverse Growing Up in New Zealand cohort when examining language outcomes of infants between the ages of 23 and 52 weeks than the three-factor model used in the original IBQ-R VSF. The five factors extracted were: PAS, NEG, OC, Affiliation/Regulation, and Fear. Specifically, the two new dimensions of Affiliation/Regulation and Fear were distinct from ORC and NEG on the original IBQ-R VSF. The Affiliation/Regulation factor is composed of six items associated with the child’s soothability, cuddliness, and low-intensity pleasure, thus largely encompassing the regulatory aspect of the original ORC factor. The Fear factor is composed of nine items that were initially part of the NEG factor of the original IBQ-R VSF. Each of these elements corresponds to the child’s negative reaction when encountering an unfamiliar adult (Peterson et al., 2017a). Furthermore, no ethnic differences were found on the IBQ-R VSF between Asian and non-Asian children in the Growing Up in New Zealand sample, indicating that the five-factor temperament model was validated for New Zealand Asian children (Peterson et al., 2017b).
Therefore, final scores were calibrated from the five factors (PAS, NEG, OC, Affiliation/Regulation, and Fear) that Peterson et al. (2017a) identified for the Growing Up in New Zealand sample when children were approximately nine months old (see Table 1). The questions were the same as for the original three-factor IBQ-R VSF. Using these original questions, the updated five-factor structure was found to be a better fit for the Growing Up in New Zealand cohort’s data and was longitudinally validated (see Peterson et al., 2017a, 2017b for validation for Growing Up in New Zealand). Mothers were asked to indicate how often their baby displayed a particular behaviour over the past seven days. Responses were recorded for each item on the following scale: 1 (does not apply), 2 (never), 3 (very rarely), 4 (less than half the time), 5 (about half the time), 6 (more than half the time), 7 (almost always), and 8 (always). Although these responses are technically ordinal in nature, the temperament scores were treated as continuous variables for the purpose of our analysis. The scale exhibited a clear order and consistent gaps between values, allowing us to assume continuity (see Williams, 2020).
Expressive language. The Growing Up in New Zealand study included an evaluation of children’s expressive language skills when they were around two years old. Initially, mothers were asked to list all the languages their children could comprehend, including New Zealand English. Then for each language listed, mothers were shown a card displaying a numbered list of words specific to that language. Child vocabulary in Mandarin and Cantonese, and word combination skills (an indicator of syntax development) in any language, were measured using the two new adapted versions of the MacArthur-Bates Communicative Development Inventory (CDI) short form in Mandarin and Cantonese for New Zealand children (Zhang et al., 2024). For children who also spoke English, their English vocabulary was also assessed with the NZ CDI:II short form (Reese et al., 2018). The vocabulary assessment comprised 100 questions, each one asking parents if their child used a specific word (for example, the child says ‘water’ in Mandarin). A score of one was assigned if the child expressed the word, and a zero was assigned if they did not. The total score was calculated by adding up all individual word scores.
The measure for syntax development consisted of one question asking parents whether their children could combine words yet in any language (i.e., Has your child begun to combine words yet (in any language), such as “more banana” or “doggie bite”?). A one was given when the mother reported the child was not yet combining words, a two was given when the child sometimes combined words, and a three was given when the child combined words often.
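The two scoring rules described above (a summed vocabulary score over 100 binary items, and a three-level word combination code) can be sketched as follows. This is an illustration only: the function names and the response labels "not yet", "sometimes", and "often" are assumptions introduced here, not the wording or code used in the study.

```python
from typing import Sequence

# Hypothetical response labels mapped to the 1-3 codes described in the text.
WORD_COMBINATION_CODES = {"not yet": 1, "sometimes": 2, "often": 3}


def vocabulary_total(item_responses: Sequence[int]) -> int:
    """Sum 100 binary item scores (1 = child says the word, 0 = does not)."""
    if len(item_responses) != 100:
        raise ValueError(f"expected 100 items, got {len(item_responses)}")
    if any(response not in (0, 1) for response in item_responses):
        raise ValueError("each item must be scored 0 or 1")
    return sum(item_responses)


def word_combination_score(answer: str) -> int:
    """Convert the mother's answer to the ordinal code (1, 2, or 3)."""
    key = answer.strip().lower()
    if key not in WORD_COMBINATION_CODES:
        raise ValueError(f"unrecognised answer: {answer!r}")
    return WORD_COMBINATION_CODES[key]


if __name__ == "__main__":
    # A made-up checklist in which the child says 40 of the 100 words.
    responses = [1] * 40 + [0] * 60
    print(vocabulary_total(responses), word_combination_score("Sometimes"))
```

Keeping the word combination score as an ordered code rather than a count is what motivates the ordinal, rather than linear, regression used for that outcome.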
Analysis Plan
We used regression models to examine the relationships between temperament (PAS, NEG, OC, Affiliation/Regulation, Fear) and language outcomes. Zhang et al. (2024) investigated how various demographic predictors (e.g., gender, socioeconomic status, birth order, maternal education, language status, maternal concerns) influence vocabulary and grammatical development in these children. To ensure the robustness of our findings, we controlled for the same significant demographic predictors (p < .05) as identified by Zhang et al. (2024) in each of the final models, in the same sample of New Zealand Mandarin- and Cantonese-speaking children.
Vocabulary predictors for Mandarin speakers included language status (monolingual vs. bilingual) and maternal concerns, and for Cantonese speakers included language status (monolingual vs. bilingual), maternal education, and maternal concerns. Vocabulary predictors for English-Mandarin speakers included maternal education; there were no significant predictors for English-Cantonese speakers. Word combination predictors for Mandarin speakers included birth order, area-level deprivation, and maternal concerns, and for English-Mandarin speakers included maternal education. For the Cantonese and English-Cantonese speakers, none of the variables emerged as significant predictors of word combinations. Not every demographic variable was entered as a control in every model: each model controlled only for the variables that significantly predicted its own outcome. Language status, for example, was a significant predictor of vocabulary outcomes but was not included as a control variable in the ordinal regressions because it did not significantly predict word combination scores. Similarly, deprivation level and birth order were significant predictors of word combinations but not vocabulary outcomes, and thus were not included as control variables in the linear regressions predicting vocabulary.
Firstly, we employed multiple linear regression models to examine the relationships between temperament variables and vocabulary scores, given that the vocabulary scores are a continuous variable. Our hierarchical linear regression analyses involved entering demographic variables first (Step 1), followed by temperament variables (Step 2). Given these variables were entered hierarchically, we reported R², change in R², and F tests to determine the added contribution for each step of the analysis (Hahs-Vaughn, 2017). All analyses used a significance level (α) of .05.
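As an illustration of the two-step procedure, the following sketch computes R² for each step, the change in R², and the F statistic for that change on synthetic data. It is not the authors' analysis code: the function names, the simulated data, and the use of NumPy least squares are assumptions made here for demonstration, and the F-change formula is the standard one for nested ordinary least squares models.

```python
import numpy as np


def r_squared(X, y):
    """R-squared of an ordinary least squares fit of y on X (intercept added here)."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def hierarchical_steps(X_step1, X_step2, y):
    """Compare a Step 1 model with a Step 2 model that adds the X_step2 predictors."""
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    X1 = np.asarray(X_step1, dtype=float).reshape(n, -1)
    X2 = np.asarray(X_step2, dtype=float).reshape(n, -1)
    X_full = np.column_stack([X1, X2])
    r2_1 = r_squared(X1, y)
    r2_2 = r_squared(X_full, y)
    k_added = X2.shape[1]                 # predictors added at Step 2
    df_resid = n - X_full.shape[1] - 1    # residual df of the full model
    f_change = ((r2_2 - r2_1) / k_added) / ((1.0 - r2_2) / df_resid)
    return {
        "r2_step1": r2_1,
        "r2_step2": r2_2,
        "delta_r2": r2_2 - r2_1,
        "f_change": f_change,
        "df_change": (k_added, df_resid),
    }


if __name__ == "__main__":
    # Synthetic data only: 2 demographic controls, 5 temperament scores.
    rng = np.random.default_rng(0)
    n = 158
    demographics = rng.normal(size=(n, 2))
    temperament = rng.normal(size=(n, 5))
    vocabulary = 2.0 * demographics[:, 0] + 0.5 * temperament[:, 2] + rng.normal(size=n)
    print(hierarchical_steps(demographics, temperament, vocabulary))
```

The F statistic for the change would be compared against an F distribution with the reported degrees of freedom to obtain a p value; that step is omitted here to keep the sketch free of additional dependencies.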
Secondly, given that the word combination variable was ordinal, multivariate ordinal regression models were employed to determine the logit of higher levels of word combinations as a function of temperament variables (Osborne, 2015). Our hierarchical ordinal logistic regression analyses involved entering demographic predictors and maternal concerns first (Model 1), followed by estimating the logit of word combination scores as a function of temperament variables (Model 2). An odds ratio (OR) above 1 suggests a higher likelihood of the event in the first group, while an OR below 1 indicates a lower likelihood. Since the variables were entered in blocks, we report the Nagelkerke pseudo-R² and Wald χ² tests to determine the added contribution at each step of the analyses (Hahs-Vaughn, 2017).
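To make the link between a logit coefficient and the reported quantities concrete, the sketch below converts a parameter estimate and its standard error into an odds ratio, a Wald-type 95% confidence interval, and a Wald χ² statistic with one degree of freedom. The function name and the input values are hypothetical and are not taken from the study's tables. The sketch assumes the coefficient is parameterised so that positive values correspond to higher odds of a higher outcome category; statistical packages differ on this sign convention and on how intervals are computed.

```python
import math


def odds_ratio_summary(estimate: float, std_error: float, z: float = 1.96) -> dict:
    """Summarise a logit coefficient as an odds ratio with a Wald-type interval.

    estimate  : coefficient on the log-odds scale
    std_error : its standard error (must be positive)
    z         : normal critical value (1.96 for an approximate 95% interval)
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    odds_ratio = math.exp(estimate)
    return {
        "odds_ratio": odds_ratio,
        "ci_low": math.exp(estimate - z * std_error),
        "ci_high": math.exp(estimate + z * std_error),
        # Wald chi-square with 1 degree of freedom for a single coefficient.
        "wald_chi2": (estimate / std_error) ** 2,
        # For a continuous predictor: percent change in odds per one-unit increase.
        "percent_change_in_odds": (odds_ratio - 1.0) * 100.0,
    }


if __name__ == "__main__":
    # Hypothetical inputs chosen for illustration only.
    print(odds_ratio_summary(estimate=0.40, std_error=0.15))
```

For a continuous predictor such as a temperament score, the odds ratio is read per one-unit increase in the predictor rather than as a comparison between groups.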
Results
Results of the regression models predicting vocabulary and word combinations are first presented for children’s Chinese language (Mandarin or Cantonese) and second for children’s English language if they are English-Mandarin or English-Cantonese speakers.
Temperament as a Predictor of Children’s Chinese Oral Language
The first hypothesis of this study was that Mandarin- and Cantonese-speaking children’s vocabulary and word combination scores would vary as a function of their temperament. Four separate models were computed, with two separate hierarchical linear regression models for Mandarin and Cantonese vocabulary (see Table 3 for a summary) and two separate hierarchical ordinal regression models for Mandarin and Cantonese speakers’ word combinations (see Table 4 for a summary). The word combination variable was assessed across all of the children’s languages.
Prior to conducting the regression analyses, we checked the assumptions of linearity, independence of errors, and homoscedasticity for linear regression, and the proportional odds assumption for ordinal logistic regression. No serious violations were detected. All variables in the model had a Variance Inflation Factor (VIF) of less than 5, indicating no problematic multicollinearity. Furthermore, we conducted a test for parallel lines and found no issues. Therefore, all predictors were retained and included in the regression models.
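The multicollinearity check can be illustrated with a short sketch of the Variance Inflation Factor, computed as 1 / (1 − R²) from regressing each predictor on the remaining predictors. This is a generic implementation on synthetic data, not the authors' code; the function name and the example values are assumptions made here.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = children, columns = predictors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        # A predictor that is perfectly explained by the others has an unbounded VIF.
        vifs.append(float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return vifs


if __name__ == "__main__":
    # Synthetic predictors: the third column is nearly a copy of the first.
    rng = np.random.default_rng(0)
    a = rng.normal(size=200)
    b = rng.normal(size=200)
    c = a + 0.1 * rng.normal(size=200)
    for name, vif in zip(("a", "b", "c"), variance_inflation_factors(np.column_stack([a, b, c]))):
        print(name, round(vif, 2), "flag" if vif >= 5 else "ok")
```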
Regression analyses for Mandarin speakers. Only Mandarin speakers whose mothers responded to the temperament questionnaires (N = 158) were included in this analysis. Table 2 contains descriptive statistics for children’s Mandarin vocabulary and overall word combination skills. A hierarchical linear regression analysis was performed on Mandarin vocabulary, with language status (monolingual) and maternal concerns in the first step, and temperament in the second step (see Table 3). The first step was significant, with monolingual language status and maternal concerns significantly associated with Mandarin vocabulary, accounting for approximately 30% of the variance in Mandarin vocabulary. Thus, for Mandarin speakers, this sample size was sensitive enough to detect a large effect (Cohen’s f² = 0.31). Children who were monolingual Mandarin speakers and whose parents had fewer concerns had larger Mandarin vocabularies. None of the temperament variables made significant contributions to the model.
a. English-Mandarin and English-Cantonese analyses refer to the English vocabulary of the Mandarin and Cantonese samples.
Note. The significant demographic variables for Mandarin and Cantonese speakers were reported in Zhang et al. (2024). PAS = positive affect/surgency, NEG = negative emotionality, and OC = orienting capacity. β = standardised coefficient beta for each variable.
* p < .05.
** p < .01.
To investigate the relationship between Mandarin speakers’ temperament and word combination scores, a hierarchical ordinal regression analysis was conducted (see Table 4), with birth order, maternal concerns, and deprivation level (most deprived) entered first, and temperament variables added in a second step. The chi-square test of model fit indicated that both models were good fits to the data. Birth order (Wald χ²(1) = 6.73, p = .01), maternal concerns (Wald χ²(1) = 7.20, p = .01), and OC remained significant in the final model. An increase in OC scores was associated with a 45% increase in the odds of children combining words in any language, with an odds ratio of 1.45 (95% CI [1.09, 1.93]), Wald χ²(1) = 6.32, p = .01. The CI suggests that the true effect is likely to be positive and that the effect size could range from medium to large. Additionally, the relatively narrow width of the CI (0.84) indicates good precision in the estimate of the effect size.
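The quantities reported for OC are linked by simple arithmetic, which the following sketch illustrates. It assumes a Wald-type interval that is symmetric on the log-odds scale with a critical value of 1.96; the function name `odds_ratio_summary` is hypothetical, and because the inputs are the rounded published values, the back-calculated standard error and Wald statistic are only approximate.

```python
import math


def odds_ratio_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Back-calculate log-odds quantities from a reported OR and its 95% CI."""
    estimate = math.log(odds_ratio)  # regression coefficient on the log-odds scale
    # Standard error implied by a CI that is symmetric on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    wald = (estimate / se) ** 2  # Wald chi-square with 1 degree of freedom
    return {
        "estimate": estimate,
        "se": se,
        "wald": wald,
        "percent_change": (odds_ratio - 1) * 100,  # change in the odds, in percent
        "ci_width": ci_high - ci_low,  # width on the odds-ratio scale
    }


# Values reported for OC among Mandarin speakers.
oc = odds_ratio_summary(1.45, 1.09, 1.93)
for name, value in oc.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the implied Wald statistic comes out at roughly 6.5, close to the reported 6.32; a gap of this size is what rounding of the odds ratio and interval limits would be expected to produce. The same function applied to an odds ratio below 1 returns a negative percent change, that is, a reduction in the odds.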
Note. The significant demographic variables for Mandarin and Cantonese speakers were reported in Zhang et al. (2024). PAS = positive affect/surgency, NEG = negative emotionality, and OC = orienting capacity. Est = parameter estimate; SE = standard error; OR (95% CI) = estimate of the odds ratio with a 95% confidence interval for each parameter.
* p < .05.
** p < .01.
Regression analyses for Cantonese speakers. Only Cantonese speakers whose mothers responded to the temperament questionnaires (N = 57) were included in this analysis. Table 2 contains descriptive statistics for children’s Cantonese vocabulary and overall word combination skills. To investigate the relationship between temperament and Cantonese vocabulary, a hierarchical linear regression analysis was performed (see Table 3), with monolingual language status, maternal concerns, and maternal education level in the first step, and temperament in a second step. The first model was significant, with monolingual language status, maternal concerns, and maternal education accounting for approximately 43% of the variance in Cantonese vocabulary. Temperament, specifically Fear at nine months, explained an additional 13% of the variance in Cantonese vocabulary at two years. For Cantonese speakers, this sample size was sensitive enough to detect a large effect (Cohen’s f² = 0.56). This result suggests that higher levels of fear at nine months are associated with better Cantonese vocabulary at two years, after controlling for monolingual language status, maternal concerns, and maternal education.
To investigate the relationship between Cantonese speakers’ temperament and word combination score, a non-hierarchical multivariable ordinal regression analysis was conducted (see Table 4). The use of different modelling approaches for the Mandarin and Cantonese speakers’ word combination scores was based on the validation findings of Zhang et al. (2024). Since there were no significant demographic or maternal concern predictors for word combinations among Cantonese speakers, we added all temperament variables simultaneously to the regression model. NEG emerged as a significant predictor in the ordinal regression model, with an odds ratio of 0.50 (95% CI [0.27, 0.93]), Wald χ²(1) = 4.78, p = .03. For every one-point increase in NEG, the odds of having more advanced word combination in any language were reduced by 50%. The CI suggests that the true effect is likely to be negative and could range from a small to a large effect size. The CI of NEG (0.66) is relatively narrow, indicating good precision in effect size estimation. However, the overall model was not statistically significant, indicating that it did not significantly improve the prediction of word combinations compared to a model with no predictors. Given the sample size of 57, these results should be interpreted with caution as they may not be robust across larger samples of Cantonese speakers.
English-speaking Subsample: Temperament and Oral Language
Given that a sizeable subset of bilingual and multilingual children in each group spoke English as another language (68% of Mandarin and 70% of Cantonese speakers), we then separately examined the associations between these children’s temperament (PAS, NEG, OC, Affiliation/Regulation, Fear) and their English vocabulary and word combination scores. We predicted that variation in language scores due to temperament would also be observed among English-speaking children within the Mandarin and Cantonese samples. Again, we used two different types of regression models to examine the relationships between temperament and language outcomes: two linear regression models were used to explore the association between temperament and English vocabulary scores (see Table 5 for a summary), and two multivariable ordinal regression models were used to calculate the odds ratios between temperament and word combinations (see Table 6 for a summary). No serious VIF, collinearity tolerance, or parallel lines issues were observed.
Regression analyses for English-Mandarin speakers. Only Mandarin speakers who used English as an additional language (N = 108) were included in the analysis. Table 2 contains descriptive statistics for children’s English vocabulary and overall word combination skills. A hierarchical linear regression analysis was performed to examine the associations between temperament and English vocabulary (see Table 5), with maternal education in the first step, and temperament variables in the second step. The first model was significant, with maternal education accounting for approximately 7% of the variance in English vocabulary. In the final model, maternal education remained a significant predictor. This model accounted for approximately 17% of the variance in English vocabulary scores, an increase of 10 percentage points over the first model. For English-Mandarin speakers, this sample size was sensitive enough to detect a medium effect (Cohen’s f² = 0.17). Of the temperament factors, only Fear contributed uniquely to the variance in English vocabulary (β = .22, p < .05). Children who displayed more fear had larger English vocabularies than those who displayed less fear, after controlling for maternal education.
To investigate the relations between English-Mandarin speakers’ temperament and word combination score, a hierarchical ordinal regression analysis was conducted (see Table 6), with maternal education, birth order, maternal concerns, and deprivation level (most deprived) entered first (Model 1), and temperament variables added separately (Model 2). The chi-square test of model fit indicated that the model was a good fit for the data. Deprivation level (Wald χ²(1) = 7.75, p = .01), maternal concerns (Wald χ²(1) = 5.18, p = .02), and maternal education (Wald χ²(1) = 7.28, p = .03) all remained significant in the final model. OC was uniquely associated with a 79% increase in the odds of children combining words in any language, with an odds ratio of 1.79 (95% CI [1.20, 2.67]), Wald χ²(1) = 8.12, p < .01. The CI suggests that the true effect is likely to be positive and could range from a medium to large effect size. The moderate width of the CI for OC (1.47) suggests a tolerable amount of uncertainty in effect size estimation.
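For readers who wish to check the p-values attached to the Wald statistics, a statistic on one degree of freedom can be converted with the complementary error function, since a chi-square variable with one degree of freedom is the square of a standard normal variable. The sketch below uses only the Python standard library; the helper names `wald_p_value` and `format_p` are hypothetical, and the formatting rule (reporting values under .01 as an inequality) is a common reporting convention rather than something specified by the study.

```python
import math


def wald_p_value(wald_chi_square):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    # P(chi2_1 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2))
    return math.erfc(math.sqrt(wald_chi_square / 2.0))


def format_p(p):
    """Format a p-value to two decimals, using an inequality for small values."""
    if p < 0.01:
        return "p < .01"
    return f"p = {p:.2f}".replace("0.", ".", 1)


# Wald statistics reported for OC (English-Mandarin) and NEG (Cantonese).
for label, wald in [("OC", 8.12), ("NEG", 4.78)]:
    p = wald_p_value(wald)
    print(f"{label}: Wald = {wald}, {format_p(p)}")
```

Under these assumptions, a Wald statistic of 8.12 corresponds to a p-value of about .004, which would round to .00 at two decimal places and is therefore usually written as p < .01. The conversion applies only to tests with a single degree of freedom.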
Regression analyses for English-Cantonese speakers. Only Cantonese speakers who used English as an additional language (N = 40) were included in the analysis. We performed a linear regression analysis to investigate the association between temperament and English vocabulary (see Table 5). Given that Zhang et al. (2024) reported no demographic or maternal concern predictors of English vocabulary among English-Cantonese speakers, we added all temperament variables simultaneously to the linear regression model. For English-Cantonese speakers, this sample size was sensitive enough to detect a small effect (Cohen’s f² = 0.05). However, the model was not significant, and no individual variables were significant predictors.
To investigate the relationship between English-Cantonese speakers’ temperament and word combination score, a multivariable ordinal regression analysis was conducted (see Table 6). As Zhang et al. (2024) reported no demographic or maternal concern predictors of English word combination among English-Cantonese speakers, we added temperament variables simultaneously to the model. Both PAS and NEG contributed uniquely to children’s word combinations. An increase in PAS was associated with a 3.29-fold increase in the odds of children combining words in any language, Wald χ²(1) = 4.10, p = .04. The CI for PAS (9.36) is extremely wide, indicating a lack of precision in the estimate of the effect size. An increase in NEG was associated with a 49% decrease in the odds of children combining words in any language (95% CI [0.25, 0.94]), Wald χ²(1) = 4.62, p = .03. The true effect size is likely non-trivial in magnitude, ranging from a small to a potentially large negative effect. The CI of NEG (0.69) is relatively narrow, suggesting good precision in the effect size estimation, although more data are needed to estimate the true effect size more precisely. However, the chi-square test of model fit indicated that the model was not a good fit to the data: despite the contributions of PAS and NEG, the model did not adequately represent the data, possibly because of the small sample size. Therefore, these results should be interpreted with caution as they may not be robust across larger samples of Cantonese speakers.
Discussion
The first aim of this study was to examine the relationship between early temperament and Chinese language acquisition among NZ Mandarin- and Cantonese-speaking children. The second aim was to investigate associations between children’s temperament and their English vocabulary, specifically focusing on the subsets of bilingual and multilingual children within the sample. The hypothesis of this study was that Mandarin- and Cantonese-speaking children’s vocabulary and word combination skills (assessed by a single word-combination item) would differ as a function of their temperament, and this variation in language scores due to temperament would also be observed among English-speaking children within the Mandarin and Cantonese samples. We predicted that children with more positive affect/surgency, less negative emotionality, and higher orienting capacity at nine months of age would be more likely to acquire advanced language at age two, within and across languages. Overall, our results largely supported hypotheses for children’s word combination skills but differed for their vocabulary development.
Specifically, we found that children with higher levels of positive affect/surgency and lower levels of negative emotionality had more advanced word combination skills across all their languages in the Cantonese sample, including the English speakers in the sample. Our findings align with previous research on monolingual English speakers, suggesting that children exhibiting high levels of positive emotionality often demonstrate better linguistic skills than those who display high levels of negative emotionality (Usai et al., 2009; Dixon & Smith, 2000; Laake & Bridgett, 2014). For instance, Bruce et al. (2022) found that infants who expressed greater negative affect produced significantly less complex syntax (measured by mean length of utterance) in toddlerhood. Interestingly, our data further revealed that negative emotionality was a more consistent negative predictor of word combinations (an indicator of syntax) than positive affect/surgency was a positive predictor. Perhaps the quantity and quality of parental input are shaped more strongly by children’s displays of negative than of positive emotionality (Hoff-Ginsberg, 1998). Positive emotions are often seen as a sign that everything is going well, so they might not prompt parents to alter their input significantly. Nevertheless, this study relied on mothers’ reports of their children’s temperament rather than on direct observation. Future research should consider observing negative and positive emotionality during parent-child interactions in both naturalistic and laboratory environments to gain a more comprehensive understanding of the relationship between children’s early emotionality and language development.
Our results also supported our hypothesis that children who showed more robust orienting capacity in infancy had better word combination skills across languages in toddlerhood. Specifically, orienting capacity positively predicted word combination development among Mandarin speakers, as evidenced in both the full sample and the English-Mandarin subsample. This finding is consistent with previous research showing that infants with higher levels of orienting skills had greater language development (Dixon & Smith, 2000; Peterson et al., 2017a). One possible explanation is that these children pay closer attention to adult input, which facilitates their language development (Bloom, 1990; Spinelli et al., 2018). This finding also aligns with earlier findings from the full Growing Up in New Zealand cohort that infants’ orienting capacity correlated positively with their communicative gestures at nine months (Peterson et al., 2017a). However, we did not find a positive role for orienting capacity in Cantonese speakers’ word combinations, so future research needs to establish why this association varies across samples.
Moreover, we did not find the predicted role of positive affect/surgency, negative emotionality, or orienting capacity in children’s vocabulary acquisition. The relation between children’s temperament and vocabulary acquisition has been contested, with some studies suggesting that positive and negative emotionality might not play a significant role in vocabulary learning (Bloom, 1990). It is possible that the associations between temperament and vocabulary depend more on the content of the lexical items. For example, previous research shows that children who frequently use emotional language tend to have lower attention scores, whereas those who often use words related to perception exhibit higher scores in both positive and negative emotionality (Rollo & Sulla, 2016). However, our study used short parental reports of only 100 words in each language. Although these short-report instruments are valid indicators of children’s overall vocabulary growth, future analyses could explore the temperament-vocabulary associations with larger samples of children’s lexicons.
One of our exploratory hypotheses was that fear would also be associated with more advanced vocabulary and word combinations across languages. In line with this hypothesis, fearful temperament was positively associated with vocabulary in the Cantonese full sample and positively predicted English vocabulary acquisition in the English-Mandarin subsample. This result aligns with that of the full Growing Up in New Zealand cohort at the nine-month timepoint, in which infants’ fearful temperament was concurrently associated with greater parent-reported communication development across various cultural backgrounds (Peterson et al., 2017a). The items measuring the Fear factor on the IBQ-R VSF comprised the infant’s reaction to unfamiliar adults and the frequency with which the infant clung to a parent or resisted an unfamiliar person. Our current study extends this concurrent finding from infancy, demonstrating that the association persists in vocabulary acquisition at age two in the Chinese-speaking subsample of the larger study. Recall that the five-factor solution for the IBQ-R VSF for this large New Zealand sample separated these items indexing fear of strangers from the original Negative Emotionality factor that had been validated with US samples (e.g., Putnam et al., 2014). Researchers working with other non-US samples and with bilingual speakers may also need to explore this differentiation of negative emotionality into anger versus fear components, given that the separate Fear factor showed different associations with children’s language in our sample.
We will also need to conduct further research to better understand why a fearful temperament was positively associated with a larger vocabulary in our Chinese-speaking sample, and whether this pattern replicates for children of other ethnicities and languages in the full Growing Up in New Zealand cohort. This relationship may vary based on cultural context, which shapes how parents regulate activities and interact with their children (Hwa-Froelich & Vigil, 2004). It is possible that children who hear many fear-related directives from parents (‘Don’t touch that!’ or ‘Stay here, I don’t want you to fall.’) are more likely to acquire negative emotion words specifically. If prohibited from more active exploration, children may also be more likely to engage in quiet activities, such as book-reading and conversations, that are known to promote language development (Reese, 2019).
Strengths, Limitations, and Future Directions
The main strength of this study is that we extended to bilingual and multilingual children the findings from previous research with monolingual children (e.g., Dixon & Smith, 2000) that higher positive affect/surgency, lower negative emotionality, and greater orienting capacity are associated with more advanced language development (in terms of word combinations). We also found a novel positive association between Chinese-speaking children’s fearful temperaments and their vocabularies across languages. Critically, our study also revealed the importance of the five-factor structure of a popular temperament measure, the IBQ-R VSF, for New Zealand children. This structure provided a more nuanced understanding of orienting/regulatory capacity as containing both attentional and regulatory dimensions, and of negative emotionality as containing both anger and fear. These finer-grained dimensions of temperament exhibited different patterns of association with bilingual and multilingual children’s language development.
However, several limitations also need to be addressed. Firstly, we used adapted language inventories, which may not align perfectly with their original versions when used for bilingual and multilingual children. However, Peña (2007) highlighted the advantages of adapted inventories over direct translations, emphasising their ability to identify unique lexical knowledge across languages, which is essential for evaluating vocabulary in bilingual and multilingual children. Secondly, the use of a single question about word combinations to assess syntax is a limited measure of children’s grammatical skill, perhaps especially when evaluating bilingual and multilingual children across all of their languages. This approach was unable to identify specific associations between temperament and syntax in each of the child’s languages. Ideally, this question should be administered separately in each of the child’s languages and supplemented by mean length of utterance measures in each language, to provide a more accurate assessment of their syntax development. Relatedly, we acknowledge that the mere presence of word combinations does not necessarily indicate grammatical correctness in the target language. For example, while “more banana” may adhere to the grammatical rules of English, “banana more” would be considered incorrect. Therefore, equating word combinations with syntax development may be problematic, since children can acquire language through diverse grammatical trajectories (Bates & Goodman, 1999). Future research could supplement parent-report measures with mean length of utterance from observed samples across all of the bilingual and multilingual children’s languages.
Thirdly, because of our small sample sizes, the associations we found between temperament and language should be interpreted with caution. This is particularly the case in the English-Cantonese sample with only 40 participants, which restricts our ability to confidently associate temperament with language outcomes. Following Spinelli et al. (2018), a sample size of 61 (with six predictors) is considered appropriate for detecting medium effect sizes (Cohen’s f² = 0.15). Given our smaller sample size, our ability to detect meaningful effects in the English-Cantonese subsample (Cohen’s f² = 0.05) was especially limited. Moreover, with only 13 monolingual, 23 bilingual, and 21 multilingual participants in the Cantonese sample, separate analyses for these groups would not be powerful enough to detect the expected small- to medium-sized effects. This underscores the need for larger samples of bilingual and multilingual speakers in future studies to enable more robust analyses of these separate groups. On a positive note, we observed medium to large effect sizes for the Mandarin, Cantonese, and English-Mandarin samples, indicating robust findings.
Our findings have important implications for parents, educators, and clinicians working with Chinese-speaking children of different temperaments and language backgrounds in New Zealand. Before proceeding with further assessments and referrals, clinicians should identify children’s linguistic backgrounds and temperaments. Particular attention should be given to children’s orienting capacity and negative emotionality (with a separate assessment of fear), as these temperament dimensions may have the greatest impact on children’s language development. Furthermore, if the current results are replicated more widely, they could be used to assuage parents’ and educators’ concerns about children who experience higher levels of fear specifically; this tendency is not necessarily harmful to children’s language development.
Conclusions
In conclusion, we sought to explore the associations between temperament and language development in monolingual, bilingual, and multilingual Mandarin and Cantonese speakers in New Zealand. Our main finding was that the temperament-language association differed between vocabulary acquisition and word combination development in these children. A fearful temperament positively predicted Cantonese vocabulary among Cantonese speakers and English vocabulary among English-Mandarin speakers. Orienting capacity positively predicted word combination development for Mandarin speakers, whereas higher positive affect/surgency and lower negative emotionality predicted word combination development for Cantonese speakers. We hope that these results will extend current theories of the role of temperament in children’s language development, and will also support clinicians and practitioners working with Mandarin- and Cantonese-speaking children living in New Zealand and worldwide.
Acknowledgement
We express our sincere appreciation to the families participating in the Growing Up in New Zealand study. The study received financial support from various organisations, including New Zealand government ministries and agencies such as the Ministry of Social Development, Ministry of Health, University of Auckland, Social Policy and Evaluation Research Unit (Superu, formerly known as Families Commission), Ministry of Business, Innovation and Employment, Ministry of Education, Ministry of Justice, Sport New Zealand (previously Sport and Recreation New Zealand), New Zealand Police, Te Puni Kokiri, Housing New Zealand, Ministry of Pacific Island Affairs, Ministry of Women’s Affairs, Department of Corrections, and the Mental Health Commission.
Competing interest
The authors declare none.