Introduction
The household in which children grow up has been found to affect many spheres of their development, including language development. Language skills in turn affect academic outcomes, along with cognitive skill development and later career success (Schoon et al., 2021). Due to the importance of language development for success later in life, language development research often focuses on possible influencing factors, in an attempt to establish which factors affect language development positively and which negatively. Previous studies (e.g., Attig & Weinert, 2020 for children in Germany; Shaomei et al., 2023 for kindergarteners in western China) established that children’s language and social skill development is affected by whether or not they have a stimulating home learning environment. However, the ability of parents to create such a home environment is often affected by household composition – which includes the number of adults and children in the household – as well as household resources, amongst other factors.
Past research has focussed mainly on a single one of these factors, or on a small subset, without deliberately considering that the household is an interconnected system (Lee et al., 2020). When researchers included more factors in an exploratory manner, patterns emerged that were not evident before (see, e.g., Lee et al., 2020 in the United States of America (USA)).
Most research on child language development has been conducted amongst speakers of English (see Kidd & Garcia, 2022) and in minority-world contexts, i.e., those countries where the minority of the world’s population resides but which are generally well-resourced. Taking into account the impact of linguistic input on children’s language development (e.g., Hoff, 2003), which varies across cultures [Footnote 1] and is shaped by home environment factors, it is essential to include majority-world countries, i.e., those countries where the majority of the world’s population resides but which are generally under-resourced (see Alam, 2008), when investigating the connection between home environment and children’s language skills. As will be seen from the studies discussed below, research specifically on the topic of home environment factors and child language has also centred on minority-world contexts. The present exploratory study, based in South Africa, aims to address this gap in the literature on majority-world countries.
Household composition and resources
Household composition
Children growing up in a two-parent household (referred to hereafter as a nuclear household) show better health and well-being later in life than those who grow up in households with other compositions, such as single-parent households (for an overview, see Golombok, 2000). In contrast, being brought up in a multigenerational household, i.e., having grandparents present in the home, can be beneficial for children’s cognitive outcomes and brain development, whether the grandparents are in the home of single parents or are raising their grandchildren without the presence of parents in the household (Lee et al., 2021). Lee et al. (2021) found better cognitive functioning in adults who grew up in multigenerational households compared to those who grew up in single-parent-only homes, with this being the case regardless of the adult’s socioeconomic status (SES) and health outcomes. Grandparents also contribute to increased social interactions and language input, which can enhance children’s language development (Romeo et al., 2018).
However, a household with too many adults can be detrimental to child development: Evans et al.’s (1998) study (conducted in the USA) showed that household crowding directly affects children’s language development, even after controlling for SES. Matheny et al. (1995), in their study in the USA, found that household density and noise in the environment were related to poorer early language development. Explanations for these findings include that children may cope with the sensory overstimulation caused by overcrowding by withdrawing, thereby lowering their chances of engaging in adult–child interaction in the household, and effectively limiting their language learning opportunities. Moreover, parents in these households may have fewer opportunities to provide their children with what is often termed “rich” language input (see MacLeod & Demers, 2023). In particular, poorer maternal language input was seen in households with a higher density, and this resulted in children having less complex language skills (Evans et al., 1999).
Note, however, that it is not only the number of adults in the household but also the characteristics of these adults that could affect child language development. Children of mothers with higher levels of education have been shown to exhibit better language skills than children of mothers with lower levels of education (see, e.g., McNally et al., 2019 for Ireland; Reilly et al., 2010 for Australia; Tomblin et al., 1991 for the USA; Vogt et al., 2015 for Mozambique), although this may not be a universal finding (e.g., see the South Africa-based study of Southwood et al., 2021 for no effect). Maternal age at the time of the child’s birth is also related to language skills later in the child’s life: Children with an older mother show better social, emotional, cognitive, and language skills than peers whose mothers gave birth in their twenties (e.g., Duncan et al., 2018 for the USA; Goisis et al., 2017 for the United Kingdom (UK); Tearne, 2015). For instance, in a study in the UK, Sutcliffe et al. (2012) found that the vocabulary scores of children increased as maternal age at birth increased.
Children in the household
Several studies have found that firstborns have better language skills, including larger vocabularies at the same age, than children born into the family later. For example, Berglund et al. (2005) found that birth order was significantly related to vocabulary comprehension and production in Swedish-speaking 18-month-olds, with firstborns obtaining higher scores than laterborns. These results are in line with the resource dilution model, which proposes that siblings compete for their parents’ finite resources (their time, energy, and finances) and therefore, as the number of children in the family increases, the resources assigned to any one child necessarily decrease (see, e.g., Downey, 1995 for sibling size and educational success); this model is, however, not without controversy (see, e.g., Guo & VanWey, 1999, for findings to the contrary).
Havron et al. (2019) in a France-based study investigated whether older siblings could benefit a child in terms of language skills, hypothesising that older siblings with whom there is a larger age gap provide additional adult-like language input, thereby compensating for the diluted parental attention. Their hypothesis was refuted; the age gap between siblings did not correlate positively with language scores. In fact, Gurgand et al. (2022) found in their France-based study that the expressive vocabulary score of children with only one older sibling correlated negatively with the age gap between them and their siblings; and Havron et al. (2019) found that children spaced closer to their older siblings had higher language scores than children whose older sibling was spaced further from them.
Household resources
The finite resources available to a household can influence child development; for instance, lower household income is related to smaller vocabulary size (McNally et al., 2019). Specifically, McNally and colleagues found that household income in Ireland was a strong mediator of maternal education’s effect on children’s vocabulary size. The authors also found that children’s vocabulary size was lower in larger families, and argued that this may be a result of fewer available resources (financial but also non-material), in line with the resource dilution hypothesis.
Experiencing food insecurity in their household can negatively influence a child’s health and development in direct and indirect ways. For instance, malnourishment in the early years can affect brain development (Tanner & Finn-Stevenson, 2002), while parental stress (Johnson & Markowitz, 2018a) and/or psychological distress (Myers, 2020) due to food insecurity can lead to an impaired parent-child relationship. Hobbs and King (2018) found in their study of five-year-olds in the USA that performance on a vocabulary task was negatively associated with food insecurity. The authors also found some evidence, although mixed, suggesting an association between food insecurity and outcomes on a word-recognition task. A large-sample study conducted in the USA found that children from food-insecure households had lower reading scores in kindergarten (Johnson & Markowitz, 2018b).
Household composition in South Africa
Statistics South Africa shares data on household composition in the country, where “household” refers to people who live in the same structure and share food, money, or other resources (Statistics South Africa, 2022): The average number of individuals per household varies across the country, but national statistics show that 59% of households consist of three people, and 27% of four to five people, while households with six or more people comprise 14%. The make-up of these households varies: There are more nuclear households (42%) than extended households (34%) nationally, with differences in distribution across rural and urban areas. Although comparatively high percentages of nuclear households exist in South Africa, there is still a clear deviation from nuclear families being the norm, in contrast to many minority-world countries (see Sukach et al., 2019).
Southwood et al. (2021), in their South Africa-based study, looked at two household factors, namely the number of children in the household and number of adults in the household, in the context of language acquisition in monolingual Afrikaans, isiXhosa, South African English (SA English), and Xitsonga toddlers. The total number of adults in the household had no bearing on the total vocabulary score for any of the four languages. However, the total number of children in the household was correlated with total vocabulary score, although only in the Afrikaans and SA English samples. Interestingly, the direction of the correlation was negative in the Afrikaans sample and positive in the SA English sample, indicating that there are possibly different processes at work across these two languages. It warrants further investigation whether household composition is different in these two language groups and what other home environment factors, apart from the number of adults and number of children, may play a role in early language development, particularly early grammar development. Below, we discuss the grammar of Afrikaans, before turning to the early acquisition of English grammar, given that there is a dearth of information on the acquisition of Afrikaans grammar by very young children. Before doing so, we provide some information on speakers of Afrikaans and English in South Africa.
Due to the Group Areas Act of 1950 (Parliament of South Africa, 1950), which decreed that urban areas may no longer be racially mixed, South Africans who have Afrikaans or English as home language [Footnote 2] lived intermingled (but note that Coloured, Indian, and White [Footnote 3] South Africans were grouped in different areas, by race) but segregated from Black Africans, from at least 1950 until the Act was repealed in 1991. Sharing neighbourhoods implied being of the same SES (i.e., having a similar standard of living), with access to the same facilities (therefore, for instance, attending schools in the same neighbourhoods and frequenting the same places of leisure). Unlike most South Africans with an African language as home language – who could be said to have a community-centred, holistic worldview – speakers of Afrikaans and English are generally thought to have a Western worldview (see Leister, 1994). The shared cultural lifestyle of most Afrikaans-speaking and English-speaking South Africans causes Afrikaans-speaking and English-speaking households to be regarded as more similar to each other than to households in which African languages are the home language.
Grammar
The grammar of Afrikaans and SA English
Afrikaans and SA English are both analytic, morphologically sparse West Germanic languages with few agreement structures. SA English is highly similar to the English spoken in the USA and UK but shows some influence from Afrikaans on its phonology, syntax, and lexicon (Kruger & van Rooy, 2016). Because it is similar to other varieties of English in terms of grammar, SA English grammar will not be discussed here (see Lass, 2002 and Bowerman, 2012 for some characteristics of SA English and its grammar).
In Afrikaans, few grammatical features are realised overtly: Semantic gender (Afrikaans does not have grammatical gender) and the grammatical features number, person, case (only on singular personal and possessive pronouns), and past tense are, but agreement (in terms of number, person or grammatical gender) is not. This differs from SA English, which has overt third-person singular agreement on verbs in the present tense.
In terms of singular-plural, like in SA English, there are no bound morphemes to indicate the cardinal one, and, unlike in English, there is no single default rule for forming the plural of any noun: The occurrence of the two regular plural suffixes (–e and –s) is indeed rule-based, but there are many rules determining which suffix should be used (cf. Donaldson, 1993, pp. 69–84), and there are also many exceptions.
Present tense is indicated on modal auxiliaries in those constructions that contain these auxiliaries, which may co-occur with the infinitival form of the main verb, as in kan dit sien “can see it” (versus kon dit sien “could have seen it”). In constructions without modal auxiliaries, present tense is carried by the main verb, which (unlike in SA English) has the same form as the infinitive, regardless of the person and number features of the subject (hê “to have” and wees “to be” being the exceptions, with finite and infinitive forms realised differently), as in ek/dit/hulle sien “I/it/they see”. Past tense, in contrast, is expressed by the obligatory temporal auxiliary het in constructions not containing modal auxiliaries, as in het gesien “saw” or, emphatically, “did see”. This het co-occurs with the past participial form of the main verb which resembles the infinitive but has the prefix ge- (except in the case of verbs beginning with the derivational morphemes be-, ge-, her-, er-, ont-, or ver-, or another unstressed prefix (cf. Donaldson, 1993), as in het begin/herken/erken/onthou/verloor “started/recognised/acknowledged/remembered/lost”). When expressing past tense in constructions containing a modal auxiliary, the use of het and the past participle (ge-) form of the main verb is optional, as in kon sien “could see” versus kon gesien het “could see”, which differs from SA English in which could see cannot also be interpreted as could have seen. If these are not used, the main verb remains in its infinitival form, with the modal auxiliary taking its past tense form, as in kon dit sien (“could see it”). In other words, ek kon dit sien, ek kan dit gesien het and ek kon dit gesien het “I could see it” could have the same temporal reference.
Afrikaans has a possessive construction consisting of a determiner phrase with the structure XP se DP, as in die kind / die mense / die kinders wat daar staan se appel “the child/the people/the children who are standing there POSSESSIVE MARKER apple” (Oosthuizen & Waher, 1994). The se particle is to some extent equivalent to the English possessive -’s.
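The ge- prefixation rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than a morphological analyser: the function names are hypothetical, and unless the caller states otherwise, the presence of an unstressed prefix is guessed from spelling alone, an assumption that can misfire for verbs that merely begin with the same letters.

```python
# Unstressed derivational prefixes that block ge-, as listed in the text.
UNSTRESSED_PREFIXES = ("be", "ge", "her", "er", "ont", "ver")


def past_participle(verb, has_unstressed_prefix=None):
    """Return the past participle of an Afrikaans main verb (sketch only)."""
    if has_unstressed_prefix is None:
        # Assumption: guess from spelling; pass an explicit value to override.
        has_unstressed_prefix = verb.startswith(UNSTRESSED_PREFIXES)
    return verb if has_unstressed_prefix else "ge" + verb


def past_tense(verb, has_unstressed_prefix=None):
    """Past tense without a modal: auxiliary 'het' plus the past participle."""
    return "het " + past_participle(verb, has_unstressed_prefix)


# The verbs used as examples in the text.
for v in ("sien", "begin", "herken", "erken", "onthou", "verloor"):
    print(v, "->", past_tense(v))
```

The explicit has_unstressed_prefix argument is there so that a lexicon, rather than spelling, can decide the prefix status where one is available.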
Grammar development in young English-speaking children
There are no traceable studies on the grammar development of Afrikaans or SA English-speaking toddlers; therefore, the grammar development of other English-speaking toddlers is discussed here, specifically those aspects of early grammar contained in the MacArthur-Bates Communicative Development Inventory (CDI) (see https://mb-cdi.stanford.edu), which is the data collection instrument employed in the current study.
The use of suffixes to convey grammatical information (such as pluralisation or past tense) is “a clear sign of linguistic growth” (Fenson et al., 1994, p. 45). Suffixes appear early on in child language: Fenson et al. (1994), in a large-scale study of 1130 toddlers aged 16–30 months, found that only a few children could use grammatical suffixes at 16 months. Rapid growth was reported in this respect in the latter part of the second year of life, with most 22-month-olds and more than 90% of 30-month-olds able to use progressive tense marking -ing, pluralisation suffixes, and possessive marking -’s. Based on parent reports (Fenson et al., 1994), the acquisition order appears to be possessive, plural, progressive and past tense marking, with simple past tense marking -ed appearing more slowly. Other studies rendering similar findings include Tomasello’s (1998) single-participant study, which found that the possessive -’s appeared at 18 months; and Graves and Koziol (1971), who found that whereas English segmental [s] as in cats and [z] as in dogs appear early, production of syllabic [ǝz] in all contexts can still be a challenge at around seven years of age.
Overregularisation of plural and tense marking (e.g., foots or runned) is often observed in the grammar of young children (see, e.g., Maratsos, 2000) and has been viewed for decades as a sign of progress in the acquisition of grammatical rules (see, e.g., Berko, 1958). In the Fenson et al. (1994) parent-report study, this overregularisation had a low level of occurrence, especially in children younger than 23 months. By 30 months, more than half of the toddlers used four or fewer of the 45 noun- and verb-related overregularisations in the parent-report form.
In terms of irregular plural and tense forms (e.g., teeth and ate), Fenson et al. (1994) found that some children use adult-like irregular forms in their second year of life, but these forms are initially used infrequently, with their use increasing fairly rapidly after the second birthday: At 30 months, the toddlers were on average using 13 of the 25 items.
Moving from grammatical morphemes to utterances of more than one lexical item: Fenson et al. (1994) found that 73% of toddlers had not yet started producing utterances of two or more words by 16 months. In contrast, 94% and 6% of 30-month-olds combined words into utterances often and sometimes, respectively. Those parents who indicated that word combinations were present in their child’s language were requested to provide the longest three utterances they had recently heard their child say, and the mean length of these was calculated. Whereas utterance length increased with an increase in age from 16 to 30 months, there was great variability within age bands after 18 months of age. Fenson et al. (1994) used three indices of utterance complexity (presence of bound morphemes, function words and early complex sentence forms), and found great variation amongst the participants: Those at the 90th percentile had at least one of these three present in their productive language by 17 months but those at the 10th percentile only at 27 months.
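The mean length of the three longest reported utterances can be computed as in the sketch below. The function name is hypothetical, the example utterances are invented for illustration, and length is counted in whitespace-separated words, which is an assumption: the text above does not state whether words or morphemes were the unit.

```python
def mean_length_of_longest(utterances, n=3):
    """Mean length, in words, of the n longest utterances (sketch only)."""
    # Assumption: one whitespace-separated token counts as one unit of length.
    lengths = sorted((len(u.split()) for u in utterances), reverse=True)[:n]
    if not lengths:
        raise ValueError("at least one utterance is required")
    return sum(lengths) / len(lengths)


# Invented parent-reported utterances, for illustration only.
reported = ["more juice", "daddy go work now", "I want that one"]
print(mean_length_of_longest(reported))
```

If fewer than n utterances are reported, the mean is taken over those that are available instead of padding with zeros.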
Relationship between vocabulary and grammar development
Many scholars have reported a relationship between vocabulary and grammar development. Bates et al. (1994) found that American English-speaking toddlers with larger vocabulary sizes had more sophisticated grammar. Bates and colleagues (Bates et al., 1988; Marchman & Bates, 1994; Bates & Goodman, 1997) obtained similar results, again for English-speaking children. This has also been found for monolingual child speakers of languages other than English (e.g., Hebrew – Maital et al., 2000; Italian – Caselli et al., 1999; Icelandic – Thordardottir et al., 2002), as well as for bilinguals. For instance, Blom et al. (2012) found that, among school-aged children from a range of home language backgrounds (Cantonese, Mandarin, Romanian, Spanish) who were learning English as a second language, those with a larger vocabulary also performed better in producing the third-person singular –s. Similarly, Xu Rattanasone and Kim (2024) found that better grammar skills correlated with a larger vocabulary in each of the languages of 4- to 6-year-old English-Mandarin bilinguals, but not across the two languages.
Does language input affect grammar development as it does vocabulary development?
Most studies examining the relationship between home environment factors and language development used vocabulary size as a measure of language development (e.g., Berglund et al., 2005; Gurgand et al., 2022; McNally et al., 2019; Sutcliffe et al., 2012 – all referred to above). One other such study referred to above is Southwood et al. (2021), in which the influence of a range of individual and sociocultural factors on expressive vocabulary size of toddlers, including speakers of Afrikaans and SA English, was investigated. In this majority-world context, the maternal level of education and a composite SES score did not predict the 16- to 32-month-olds’ expressive vocabulary size. The Afrikaans and SA English data sets of Southwood et al. (2021) formed the basis of the current study, but the authors used expressive vocabulary size and composition as measures of language development, whereas we consider grammar development.
Researchers have found that aspects of English-speaking children’s grammar are related to the language input the children receive. For instance, Barnes et al. (1983) found that the quantity of speech directed to two-year-olds (with an MLU of 1.5) was strongly related to their MLU growth. Similarly, Hoff-Ginsberg (1998) concluded that firstborns (18 to 29 months old) show more rapid syntactic growth than their siblings in the early stages of word combination, due to their parents directing longer utterances at them than at their siblings in these early stages of word combination. However, as noted by Huttenlocher et al. (2002), family size for firstborns and laterborns was not equated in the Hoff-Ginsberg (1998) study, and therefore, parental input is not definitively implicated in the rate of syntactic growth. Serratrice et al. (2003) found a positive correlation between the most frequently used past tense forms by young participants and their mothers. In contrast, de Villiers and de Villiers (1973) found that the order of emergence of 14 grammatical morphemes in the 16- to 40-month-olds in their study did not correlate with the frequency of occurrence of these morphemes in parental speech. Given that the language input a child receives is associated with some home environment factors, we wanted to investigate whether these factors are related to grammar development as well.
Current study
As discussed above, it is clear that there are home environment factors that influence child language acquisition, but the available research is not necessarily representative of a wide range of contexts, including those in which a nuclear family is not the norm. Also, even in some contexts in which nuclear families used to be the norm, family relations are becoming increasingly diverse (see, e.g., Bengtson, 2001 for the USA). As Lee et al. (2020) point out, research needs to consider the diversity and complexity of the home environment to allow for generalisable conclusions to be drawn.
The current study aims to expand our knowledge of the relationship between home environment factors and the language acquisition of young children, by studying grammar development amongst Afrikaans-speaking and English-speaking children in South Africa. The study forms part of an ongoing, overarching cross-linguistic, multidisciplinary, interinstitutional project involving the creation of, and data collection with, the CDI for all 11 official spoken languages of South Africa (see Southwood et al., 2021; White et al., 2024 for a description of the methodology employed in the project). In the current study, we set out to describe the home environment factors of the participants from the Afrikaans and SA English CDI data sets, knowing from Southwood et al. (2021) that there are differences in vocabulary composition and size between toddlers learning these two languages. Our goal is to elaborate further on the home environment factors that were briefly touched upon in their vocabulary-based study, by looking specifically at toddlers’ grammar development. As described above, household composition and its relationship to vocabulary varied across languages in Southwood et al. (2021), not only in whether there was a correlation between household factors and vocabulary but also in its direction – for instance, the total number of children in the household correlated negatively with vocabulary size in the Afrikaans sample but positively in the SA English sample.
The current study therefore also looks at whether home environment factors are different across the two language groups, not only in relation to grammar skills but also independently.
Research questions
We ask two exploratory research questions (RQs), using the term “home environment factors” to encompass household composition as well as household resources.
1. Is there a relationship between home environment factors and toddlers’ grammar development?
2. Do the Afrikaans and SA English samples form distinct groups in terms of their home environment factors?
Methodology
Participants
Caregivers of 219 toddlers aged 16 to 33 months were recruited via personal and professional networks of the authors and their colleagues, and via social media: 117 toddlers for Afrikaans and 102 for SA English (see Table 1 below for descriptive statistics). Caregivers were birth or adoptive parents, grandparents, or other family members (such as aunts). The children were raised by South Africans in South Africa in their caregiver’s mother tongue. If a child had more than four hours per day of exposure to any other language, they were excluded from participation in the study [Footnote 4].
Note:
a Mother’s level of education: 1 = no formal education, 2 = primary school incomplete, 3 = completed primary school, 4 = high school incomplete, 5 = completed high school, 6 = studied beyond high school.
b Employment level: 1 = not working, 2 = employed, 3 = self-employed without employees, 4 = self-employed with employees.
c A nuclear family consists of only two parents and their children.
d 0 = No income, 1 = R1–R600, 2 = R601–R1 200, 3 = R1 201–R2 400, 4 = R2 401–R5 000, 5 = R5 001–R10 000, 6 = R10 001–R20 000, 7 = R20 001–R40 000, 8 = R40 001–R80 000, 9 = R80 001–R150 000, 10 = R150 001–R300 000, 11 = R300 001 or more.
e 0 = none, 1 = R1 – R600, 2 = R601 – R1 200, 3 = R1 201 – R2 000, 4 = R2 001 – R3 000, 5 = R3 001 – R5 000, 6 = R5 001 – R8 000, 7 = R8 001 – R12 000, 8 = R12 001 or more.
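The ordinal income coding in note d can be expressed as a lookup over band boundaries. The sketch below is illustrative only: the function and constant names are hypothetical, amounts are assumed to be whole rand, and the sketch does not reflect how respondents actually reported income (they may simply have selected a band).

```python
from bisect import bisect_left

# Upper bound, in rand, of income codes 0 to 10 as given in note d.
# Code 11 ("R300 001 or more") is open-ended and has no upper bound.
INCOME_UPPER_BOUNDS = [0, 600, 1200, 2400, 5000, 10000,
                       20000, 40000, 80000, 150000, 300000]


def income_code(amount_in_rand):
    """Map a whole-rand income amount to its ordinal code 0-11 (sketch only)."""
    if amount_in_rand < 0:
        raise ValueError("income cannot be negative")
    # bisect_left returns the index of the first band whose upper bound
    # is greater than or equal to the amount; beyond the last bound it is 11.
    return bisect_left(INCOME_UPPER_BOUNDS, amount_in_rand)


for amount in (0, 600, 601, 7500, 300001):
    print(amount, "->", income_code(amount))
```

The grocery expenditure coding in note e could be handled the same way with its own list of upper bounds.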
Materials
All participating caregivers completed a family background questionnaire and the relevant language version of the toddler form of the MacArthur-Bates CDI, which is a parent report that collects data on the vocabulary and early grammar of toddlers aged 16 to 36 months (https://mb-cdi.stanford.edu/). The Afrikaans and SA English CDIs form part of the ongoing cross-linguistic project involving the creation of, and data collection with, the CDI for all 11 official spoken languages of South Africa (see Southwood et al., 2021 for a description of the methodology employed in the adaptation of the CDIs). At the time of data collection, the Afrikaans and SA English CDIs had each been piloted with 40 participants. The data sets used for the current study are only those of the main CDI study conducted after piloting. The data collection instrument was thus intended to measure early grammar knowledge and use but is not yet a standardised, normed tool.
The family background questionnaire included questions on child health and development; childcare arrangements; household composition (asking how many people lived in the household and who they were); household income; household grocery expenditure; and parental level of education and employment (in this case, the questions asked specifically about the parent/s and not about potential other caregivers). The family background questionnaire was developed based on (i) the literature on demographic and other factors influencing language development in young children, (ii) the results of the 2011 South African census (Statistics South Africa, 2012), and (iii) the feedback provided by parents, caregivers, and fieldworkers about the clarity, ease of reading, and cultural appropriateness of the questions in each of the two languages concerned.
For both the Afrikaans and the SA English versions of the CDI, caregivers indicated on a checklist which words the toddler could produce, and which early grammatical constructions (see below for details) the toddler could produce. For the purposes of this paper, the Words section was not considered.
The Words and Sentences version of the two CDIs has grammar checklists for morphology, word combinations, and sentence complexity. We created comparable grammar sections across the two languages by including both early developmental grammatical features commonly reported for English, and those grammatical features unique to the adult form of each of the languages. We followed the American English and Lincoln toddler CDIs (Fenson et al., 1993; Meints & Fletcher, 2001), and for Afrikaans, we also consulted published literature on older children’s grammar (e.g., Southwood, 2007) and an internationally respected morphosyntactician specialising in Afrikaans (J. Oosthuizen, personal communication, January 2017). We noted the morphosyntactic structures used in natural language recordings from six children per language. In the prepilot phase, we had a longer list of items in the grammar section because we had no evidence for which items would and would not work well. This list was shortened and refined after the first pilot, before data collection for the current study commenced. For instance, in the Pilot 1 version, several items on diminutives in Afrikaans were included, given their high frequency of occurrence in the language (see Donaldson, 1993); however, all but one were omitted from the Pilot 2 version.
The Afrikaans and SA English grammar sections were harmonised before and after Pilot 1 to ensure comparability between the two languages. Each language version’s grammar section was structured similarly and had four parts, like many other language versions of the CDI (Frank et al., 2021). Part A (Small Parts of Words) asked, in the form of yes/no questions, whether the child had started using singular/plural distinctions; past and present tense marking; and progressive aspect marking (for SA English) or diminutisation (for Afrikaans). For example, for SA English, we asked, “To talk about activities we sometimes add ‘ing’ to verbs. Examples include ‘looking’, ‘running’, and ‘crying’. Has your child begun to do this?”; and for Afrikaans, Om oor iets te praat wat klein is, voeg ons dikwels ‘ie’ agteraan woorde. Voorbeelde hiervan sluit in ‘boekie’ (vir ’n klein boek), ‘huisie’, en ‘poppie’. Het u kind begin om dit te doen? “To talk about something that is small, we often add ‘ie’ to the end of words. Examples of these include ‘book-DIMINUTIVE’ (for a small book), ‘house-DIMINUTIVE’, and ‘doll-DIMINUTIVE’. Has your child started doing this?”
Part B (Word Complexity) asked about regular and irregular past tense forms in SA English, reduplications in Afrikaans, and irregular plural forms in both languages. Specifically, we asked caregivers to mark off on the list provided which adult-like multimorphemic words they had heard the child say, for example, ate and mice for English; and gou-gou “quick-quick” and beddens “beds” for Afrikaans.
Parts C (Word Combinations) and D (Sentence Complexity) asked about sentence formation. Part C asked, Has your child begun to combine words yet, such as “nother biscuit”, or “doggie bite”? and the equivalent in Afrikaans Het u kind al begin om woorde saam te voeg, soos in “nog koekie” of “woefie byt”?. Part D asked about the length and complexity of sentences by asking caregivers to choose which of two utterances sounded most like the way the child talked at that moment, for example Where mommy go? vs Where did mommy go? for SA English and the equivalent in Afrikaans, Waar mamma gaan? vs Waar mamma gegaan?.
Data collection procedure
All data were collected electronically on Qualtrics (Qualtrics, Provo, UT, https://www.qualtrics.com). For this purpose, an informed consent form, the family background questionnaire, and the CDI were combined into one online form per language. Those caregivers who could and wanted to complete the Qualtrics form independently did so. In all other cases, trained fieldworkers either assisted the caregivers in completing the Qualtrics form on a smartphone or tablet or entered the caregivers’ responses into Qualtrics on their behalf.
It took 40–60 minutes to complete the Qualtrics form, which could be completed across multiple sessions but had to be submitted within a week of commencement. Unsubmitted forms were automatically closed and submitted by Qualtrics a week after commencement but were then discarded during data cleaning if they were incomplete.
Ethical considerations
Ethical clearance for the study was obtained from the Research Ethics Committee: Social, Behavioural and Education Research at Stellenbosch University. Participation was voluntary and anonymous. Information on the study and informed consent forms were available on Qualtrics, and if consent for participation was not granted, Qualtrics did not allow the potential participant to proceed to the family background questionnaire and the CDI. Caregivers had the option to not answer certain questions and remain part of the study.
Analytic strategy
Our background questionnaire yielded 19 variables measuring home environment factors (see Table A in the Supplementary Materials). Principal component analysis (PCA) was employed as a dimensionality reduction technique for ensuing analyses. All analyses were run in R version 4.2.2 (R Core Team, 2022).
RQ1 was addressed by regression analyses that estimated whether measures of home environment, as components extracted from the PCA, predict performance on grammar measures. Three grammar measures were derived from the CDI data: Total Grammar score, which was composed of the correct use of plurals, possessives, diminutives, past tense, and manner of expression; combining words, i.e., whether the child combines two or more words not yet, sometimes, or often (coded as 0 = not yet, 1 = sometimes, 2 = often); and complex phrases, for which (as explained above) each item required the caregiver to choose which of two examples of varying complexity best resembled the child’s utterances at the time of data collection. As the number of grammar items varied across languages, grammar scores were first standardised within each language and then used in the regression models, with the principal components, child’s age, sex, and language as predictors. The categorical variables sex and language were coded as factors in the regression analysis. Interaction terms for the principal components and language were also included to account for possible differences between language groups.
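The within-language standardisation described above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the analyses reported here were run in R); the function name standardise_within_group and the toy scores are hypothetical and are not taken from the study data. The sketch assumes z-scores based on the sample standard deviation within each language group.

```python
from statistics import mean, stdev


def standardise_within_group(scores, groups):
    """Return z-scores computed separately within each group."""
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)
    # Mean and sample standard deviation for each group.
    params = {g: (mean(v), stdev(v)) for g, v in by_group.items()}
    return [(s - params[g][0]) / params[g][1] for s, g in zip(scores, groups)]


# Hypothetical raw grammar totals (NOT data from the study).
raw_scores = [4, 10, 16, 2, 5, 8]
languages = ["Afrikaans"] * 3 + ["SA English"] * 3
z_scores = standardise_within_group(raw_scores, languages)
```

After this step, a score of zero means "average for that language group", so scores from checklists of different lengths can enter the same regression model.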
To address RQ2, hierarchical cluster analysis was conducted on the individual principal component scores that were extracted for both language data sets, to determine whether home environment factors were grouped based on the language of the sample, i.e., Afrikaans or SA English. These scores were used instead of the raw data to avoid any single variable forming a cluster unless it had some shared variance. The optimal number of clusters was determined based on the average silhouette method. Squared Euclidean distance was used as the measure of dissimilarity, and the between-groups linkage method was used. A chi-square test was then run to test the significance of any observed association between the language of the sample and the identified clusters.
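The clustering procedure can be sketched as follows, in Python for illustration only. The sketch assumes that "between-groups linkage" corresponds to average linkage and that silhouette widths are computed on the same squared Euclidean distances; neither assumption is confirmed by the text. The function names and the toy component scores are hypothetical.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(points, n_clusters):
    """Agglomerative clustering; returns clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        def linkage(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            total = sum(sq_dist(points[i], points[j]) for i in a for j in b)
            return total / (len(a) * len(b))
        # Merge the two clusters with the smallest mean between-cluster distance.
        i, j = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters


def mean_silhouette(points, clusters):
    """Average silhouette width over all points (needs at least two clusters)."""
    widths = []
    for c_idx, cluster in enumerate(clusters):
        for i in cluster:
            if len(cluster) == 1:
                widths.append(0.0)  # convention for singleton clusters
                continue
            a = sum(sq_dist(points[i], points[j]) for j in cluster if j != i)
            a /= len(cluster) - 1
            b = min(
                sum(sq_dist(points[i], points[j]) for j in other) / len(other)
                for k, other in enumerate(clusters) if k != c_idx
            )
            widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)


# Hypothetical principal component scores (NOT data from the study).
points = [(-2.0, 0.1), (-1.8, -0.2), (-2.2, 0.0),
          (1.9, 0.2), (2.1, -0.1), (2.0, 0.3)]
best_k = max(range(2, 5),
             key=lambda k: mean_silhouette(points, average_linkage(points, k)))
final_clusters = average_linkage(points, best_k)
```

The number of clusters is chosen as the candidate with the highest average silhouette width, which rewards solutions in which points lie close to their own cluster and far from the nearest other cluster.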
Results
Structure of home environment factors
Table 1 shows the descriptive statistics for the measures that were to be included in the PCA. These 19 variables served to provide an overall picture of the child’s home environment. The PCA was used to extract components from these 19 variables, i.e., the latent variables that underlie them, to reduce the number of variables but also to examine the relationships between these variables. All 19 variables were standardised before further analysis.
Some data were missing, especially those pertaining to household finances, i.e., expenditure on groceries (21%) and income (18%). To account for this, a Pearson’s correlation matrix that excluded cases pairwise was created; the PCA was conducted on this matrix.
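Pairwise exclusion means that each correlation is computed from all cases that have valid values on both variables concerned, so different cells of the matrix may rest on different numbers of cases. A minimal Python sketch follows; the variable names and values are hypothetical and are not the study data.

```python
from math import sqrt


def pairwise_pearson(x, y):
    """Pearson r using only cases where both values are present (not None)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlation matrix as a nested dict, excluding missing cases pairwise."""
    names = list(columns)
    return {a: {b: pairwise_pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical coded responses; None marks a missing answer.
data = {
    "income": [3, None, 5, 7, None, 9],
    "groceries": [2, 3, None, 6, 7, 8],
    "bedrooms": [1, 2, 2, 3, 3, 4],
}
corr = correlation_matrix(data)
```

In this toy example, the income and groceries correlation uses only the three cases with both values present, whereas the bedrooms correlations use more cases. Pairwise exclusion retains more data than listwise exclusion, at the cost of a matrix whose entries are based on different subsamples.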
To determine whether PCA was viable for these data, Bartlett’s test was run on the correlation matrix which included the initial 19 variables in Table 1. Bartlett’s test showed that PCA was viable, as our correlation matrix was shown not to be an identity matrix (χ²(153) = 2604, p < .001). A Kaiser-Meyer-Olkin test was conducted to determine the Measure of Sampling Adequacy (MSA). Individual MSAs should be above 0.5 for inclusion in the PCA; household income was therefore excluded from the analysis (MSA = 0.49). This left 18 variables, with an overall MSA of 0.75 (see Footnote 5).
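For readers unfamiliar with Bartlett’s test of sphericity, the sketch below shows the standard form of the statistic, χ² = −(n − 1 − (2p + 5)/6) ln|R|, with p(p − 1)/2 degrees of freedom, where R is the correlation matrix of p variables and n is the number of cases. It is a Python illustration only; the function names, the 2 × 2 matrix, and the sample size are hypothetical. With pairwise exclusion the appropriate n is not obvious, and the text does not state which value was used.

```python
from math import log


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def bartlett_df(n_variables):
    """Degrees of freedom of Bartlett's test of sphericity."""
    return n_variables * (n_variables - 1) // 2


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    statistic = -(n_obs - 1 - (2 * p + 5) / 6) * log(determinant(corr))
    return statistic, bartlett_df(p)


# Hypothetical two-variable correlation matrix and sample size.
toy_corr = [[1.0, 0.5], [0.5, 1.0]]
toy_stat, toy_df = bartlett_sphericity(toy_corr, n_obs=100)

# For 18 variables the formula gives 153 degrees of freedom.
df_18 = bartlett_df(18)
```

An identity matrix has a determinant of one, so the statistic is zero; the more strongly the variables are intercorrelated, the smaller the determinant and the larger the statistic.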
Because the squared loadings necessarily sum to one for each principal component, we calculated the loading that each variable would have if all variables contributed equally to the principal component. If the loading of any variable exceeded that value, in this case 0.24, we regarded the variable as important for that principal component and marked it with an asterisk in Table 2. The principal components (PCs) to be retained were determined based on scree plots, in which the horizontal line indicates the values mentioned above (see Figure 1 further below), and on the cumulative variance explained exceeding 80 % (see Table 2 for factor loadings and proportion of variance explained). Three PCs were retained, explaining 87 % of the variance.
Note: The strongest loading per variable is indicated in bold. All loadings above .24 are indicated with an asterisk (*).
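The cut-off of 0.24 follows from the unit-length constraint on the loadings: if p variables contribute equally and the squared loadings sum to one, each loading equals 1/√p. A short Python illustration, assuming p = 18 (the number of variables retained for the PCA) and assuming that the comparison is made on absolute values so that negative loadings can also qualify; the example loadings are hypothetical and are not those in Table 2.

```python
from math import sqrt


def equal_contribution_loading(n_variables):
    """Loading each variable would have if all contributed equally."""
    return 1 / sqrt(n_variables)


threshold = equal_contribution_loading(18)  # roughly 0.2357, i.e. 0.24 rounded

# Hypothetical loadings on one component (NOT values from Table 2).
loadings = {
    "nuclear family": 0.41,
    "mother's education": 0.30,
    "number of bedrooms": -0.05,
}
important = [name for name, value in loadings.items() if abs(value) > threshold]
```

With 19 variables the same calculation gives approximately 0.23, so the reported cut-off of 0.24 is consistent with the 18 retained variables.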
The three retained PCs represent three latent variables that underlie the home environment variables. We will label PC1 the family stability factor because it captures aspects related to the stability and SES of the household: factors indicative of a two-parent household (positive loadings for nuclear family presence and father presence) as well as SES and resource availability (positive loadings for grocery expenditure and mother’s level of education). PC2 will be labelled the resource competition factor, as it includes negative loadings for the number of older children in the household, number of people per bedroom, and father’s employment. PC3 will be referred to as household dynamics, as it combines multiple aspects of household size (negative loadings for total number of adults, total number of people, number of bedrooms, and other family members present) and household management (negative loadings for grocery expenditure and mother’s employment, and a positive loading for mother as the primary caregiver).
RQ1: Prediction of grammar measures by home environment factors
The three components extracted from the PCA, namely family stability, resource competition, and household dynamics, were used as predictors in the subsequent regression models with total grammar, word combining, and complex phrases as the dependent variables. The descriptive statistics for the three grammar variables can be seen in Table 3 along with the maximum achievable score. Child’s age, sex, and language were included as predictors with interaction terms for language and the three aforementioned components. The full results from the regression analyses can be found in Table 4.
Note. Total grammar: F(9, 150) = 4.04, p < .001, r² = .15.
Word combining: F(9, 150) = 4.66, p < .001, r² = .17.
Complex phrases: F(9, 150) = 5.28, p < .001, r² = .20.
CI = confidence interval
SE = standard error
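The model statistics in the note can be related to one another with a short calculation. For an ordinary least squares model with an intercept, R² = F × df1 / (F × df1 + df2), and the adjusted value is 1 − (1 − R²)(n − 1)/df2, where n = df1 + df2 + 1. The Python sketch below applies these standard formulas to the reported F values; the function name is hypothetical, and the assumption of a standard OLS model with an intercept is ours.

```python
def r2_from_f(f_stat, df_model, df_resid):
    """Recover R-squared and adjusted R-squared from an overall F test."""
    r2 = f_stat * df_model / (f_stat * df_model + df_resid)
    n_cases = df_model + df_resid + 1  # includes the intercept
    adjusted = 1 - (1 - r2) * (n_cases - 1) / df_resid
    return r2, adjusted


# Reported F values and variance-explained values from the table note.
reported = {
    "Total grammar": (4.04, 0.15),
    "Word combining": (4.66, 0.17),
    "Complex phrases": (5.28, 0.20),
}
recovered = {name: r2_from_f(f, 9, 150) for name, (f, _) in reported.items()}
n_cases = 9 + 150 + 1  # 160 cases contributed to each model
```

Under these assumptions, the adjusted values agree with the reported figures to within rounding, which suggests (but does not confirm) that the note reports adjusted values. The degrees of freedom also imply that 160 of the 219 cases entered each model, which would be consistent with cases with missing predictor values being excluded.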
The first regression model with total grammar as the dependent variable was significant overall and accounted for 15 % of the variance. Family stability was found to be a significant positive predictor (β = 0.218, t(150) = 2.02, p = .045). This indicates that as the family stability score increases, total grammar score also increases, suggesting that children from more stable family environments tend to have better grammar skills. Child’s age was also a significant positive predictor (β = 0.082, t(150) = 4.79, p < .001), showing that older children have higher scores.
For word combining as the dependent variable, the model was significant and accounted for 17 % of the variance. However, only child’s age was significant (β = 0.078, t(150) = 4.54, p < .001), indicating that as children age, their ability to combine words improves. There was no evidence for or against an effect of home environment factors on word combining.
The final model, which contained complex phrases as the dependent variable, was significant and accounted for 20 % of the variance. As with the previous models, child’s age was a significant positive predictor (β = 0.089, t(150) = 5.45, p < .001). Family stability was again a significant positive predictor (β = 0.271, t(150) = 2.65, p = .009), providing some evidence that children from more stable families are more likely to use complex phrases. Additionally, there was a significant interaction between language and family stability (β = −0.180, t(150) = −2.23, p = .027). This negative interaction suggests that the positive effect of family stability on complex phrases is stronger for children speaking Afrikaans than for those speaking SA English.
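The interaction can be made concrete by computing the slope of family stability within each language group. The Python sketch below assumes dummy coding with Afrikaans as the reference level (0) and SA English as the comparison level (1); this coding is implied by the interpretation given above but is not stated explicitly. The function name is hypothetical.

```python
def simple_slopes(main_effect, interaction):
    """Slope of the predictor in the reference and comparison groups."""
    return {
        "Afrikaans (assumed reference)": main_effect,
        "SA English (assumed comparison)": main_effect + interaction,
    }


# Reported coefficients for the complex phrases model.
slopes = simple_slopes(main_effect=0.271, interaction=-0.180)
```

Under this coding, the slope is 0.271 for the Afrikaans group and about 0.091 for the SA English group: positive in both groups, but considerably smaller in the latter.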
RQ2: Home environment factors across languages
Hierarchical cluster analysis was conducted to discover whether home environment factors were grouped based on the language of the sample, i.e., Afrikaans or SA English. The cluster analysis was run on the individual principal component scores that were extracted. The optimal number of clusters was two (see Figure A in the Supplementary Materials).
Cluster analysis revealed that 95 participants were assigned to Cluster 1 and 124 to Cluster 2. A chi-square test was run to determine whether there was an association between cluster membership and language group, and a significant association was found (χ²(1) = 7.098, p = .008). This indicated that the two language groups differed with regard to their home environments.
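The reported p-value can be reproduced from the test statistic alone, because for one degree of freedom the upper-tail probability of the chi-square distribution equals erfc(√(χ²/2)). The Python sketch below also shows how the statistic is obtained from a 2 × 2 table without continuity correction; the cell counts in that table are hypothetical, since the language-by-cluster counts are not reported here, and whether a continuity correction was applied in the study is not stated.

```python
from math import erfc, sqrt


def chi2_p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2))


def chi2_2x2(table):
    """Pearson chi-square for a 2 x 2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# p-value for the reported statistic.
p_reported = chi2_p_value_df1(7.098)

# Hypothetical language-by-cluster counts (NOT the study's contingency table).
toy_table = [[10, 20], [20, 10]]
toy_stat = chi2_2x2(toy_table)
toy_p = chi2_p_value_df1(toy_stat)
```

The statistic of 7.098 corresponds to a p-value just under .008, in line with the value reported above.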
Discussion
To answer the first RQ of this exploratory study, which asked whether home environment factors affect grammar development in Afrikaans-speaking and SA English-speaking toddlers in South Africa, regression analyses were conducted on the three grammar measures, namely total grammar, word combining, and complex phrases. A wide range of home environment factors were considered for the PCA (see Table 1), and the three resulting components, namely family stability, resource competition, and household dynamics, were used in these regression analyses. Along with these, child’s age, sex, and language were included as predictors. Child’s age (but not sex or language) predicted all three grammar measures. As the child’s age increased, so did their score on the grammar measures, which has also been found for other languages in other contexts using the CDI (e.g., Fenson et al., 1994 for American English; Maital et al., 2000 for Hebrew; Simonsen et al., 2014 for Norwegian).
Family stability was a significant predictor for both total grammar and complex phrases. This component contains high factor loadings for having a nuclear family, a father being present in the home, and a grandparent present; household expenditure on groceries; mother’s age at birth; and mother’s level of education. That this component is associated with grammar development is relatively unsurprising, as the variables that contribute to it have all been found to be related, in some manner, to child development in general and/or language development in particular. Consider, for example, that mothers who have higher education levels have children with stronger language skills (e.g., Vogt et al., 2015 for Mozambique) and that being in a nuclear family (which involves having a physically present father) is beneficial for children’s overall development (Golombok, 2000).
Grandparent presence was the only variable with a negative loading on the family stability factor, which indicates a negative relationship with family stability. Recall that previous research has shown that grandparent presence can be beneficial to children’s development and can contribute to increased language input (Lee et al., 2021; Romeo et al., 2018). However, in the present study, grandparent presence was considered only as part of the family stability component; its exact contribution to child language outcomes was not investigated here, as we regarded the family as an interconnected system instead of investigating relationships between various language skills, on the one hand, and individual home environment factors, on the other. To our knowledge, our study is the first to focus specifically on home environment and grammar development, and the relationship between the two might differ from the relationship between individual home environment factors and other aspects of child development.
For all three grammar measures, there was no significant main effect of language, indicating that there is no evidence for or against a difference between Afrikaans and SA English speakers in their grammar performance. This result is consistent with the comparability of the grammar sections in the Afrikaans and SA English CDIs, aligning with the tool’s intended purpose of cross-linguistic utility.
For complex phrases, there was a significant interaction between language group and family stability, which suggests that the relationship between family stability and complex phrases differs between Afrikaans and SA English speakers. Given the negative coefficient of the interaction term, the positive relationship between family stability and complex phrases is weaker for SA English speakers than for Afrikaans speakers.
The second RQ asked whether the SA English and Afrikaans samples formed distinct groups in their home environment factors. This was not a question about the children’s language acquisition, but about whether two language groups with a long shared history of living intermingled in the same areas of the same country could be assumed to be homogeneous in terms of home environment factors. The cluster analysis showed a significant association between cluster membership and language group. The two clusters did not separate the language groups perfectly but did indicate that there is an association between language group and home environment factors. This association could indicate that the differences in language skills found between the two language groups in the current study, as well as by Southwood et al. (2021), are attributable to, amongst other things, differences in home environments. This is further reflected in the significant interaction term found in our model for complex phrases. To our knowledge, this is the first study to consider how home environment factors might differ across language groups in the same country. The finding from our exploratory study that home environment factors differ by language group, even among groups that have lived intermingled for many decades, is novel and warrants further, more nuanced investigation in future studies.
As stated above, the household is an interconnected system with various characteristics that may influence child language development, and therefore one should be cautious when investigating the effect of only one home environment factor on child language development without considering, or controlling for, the other possible influencing factors. In the same vein, our results point to potential social or cultural differences between speakers of Afrikaans and of SA English in South Africa, despite the two language communities having a long history of close contact. This highlights the importance of considering home environment factors in future studies, regardless of assumed sample homogeneity.
Conclusion
Afrikaans-speaking and SA English-speaking toddlers in South Africa clustered by language group on home environment factors, and an interaction between family stability and language was found for complex phrases. Family stability was the only home environment component to predict grammar, specifically total grammar and complex phrases. These results indicate that certain home environment factors are associated with children’s grammar abilities and thus play a role in more than vocabulary alone, the latter having been the focus of most prior studies (e.g., Sutcliffe et al., 2012; Vernon-Feagans et al., 2012).
The household in which a child grows up consists of multiple, complex, and interconnected factors. Measuring each of these factors is challenging owing to the number of variables which need to be considered. Our results show that when home environment factors are treated together, i.e., accounting for the fact that some patterns of variance are shared among the variables and contribute to the same underlying factor, they are capable of explaining variance in the grammar measures. These results emphasise the need for caution when interpreting child language outcomes from single-factor studies, as they may not account for the interactions and potential shared variance between the many variables found in a household context (see Lee et al., 2020, who concluded similarly). Considering the diversity among young children’s home environments, researchers should take care to describe their child participants in terms of these characteristics so that results pertaining to language skills can be interpreted against this background.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S0305000924000527.
Funding information
This study was financially supported by the South African Center for Digital Language Resources. Additional funding came from The National Research Foundation of South Africa (HSD170602236563). Preliminary work for this research was supported by The British Academy Newton Fund (NG160093) and the National Research Foundation of South Africa/Swedish Foundation for International Cooperation in Research and Higher Education (NRF/STINT160918188417). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors.