Introduction
Infant-directed speech (IDS) refers to the speech register that parents across many cultures and languages use when addressing their young infants (Fernald, Taeschner, Dunn, Papoušek, Boysson-Bardies, & Fukui, 1989). Some components of IDS, such as heightened pitch and positive vocal affect, play a significant role in the process of early social–emotional development by regulating infants’ emotional states and introducing them to the communicative process (Papoušek, 2007). Aside from these emotional and social benefits, IDS has also been proposed to facilitate early language acquisition (Kuhl, 2004). In this view, when parents use IDS, they modify specific speech components to produce just the type of linguistic input that is optimal for early linguistic processing in that particular language.
In comparison to adult-directed speech (ADS), IDS is characterised by exaggerated mean pitch and greater pitch variations (Cooper & Aslin, 1990; Fernald et al., 1989), positive vocal affect (Kitamura & Burnham, 2003; Singh, Morgan, & Best, 2002), simplified grammar (Soderstrom, 2007), more regular rhythmical structure (Lee, Kitamura, Burnham, & Todd, 2014; Leong, Kalashnikova, Burnham, & Goswami, 2017; Payne, Post, Astruc, Prieto, & Vanrell, 2009), distinctive facial expressions (Chong, Werker, Russell, & Carroll, 2003), and acoustically exaggerated vowels (Burnham, Kitamura, & Vollmer-Conna, 2002; Kalashnikova, Carignan, & Burnham, 2017; Kuhl et al., 1997). This study will focus specifically on three of these IDS components: exaggerated pitch, positive vocal affect, and exaggerated vowels. All three components are carried in the acoustic signal of IDS, and are available to the infant from the first months of life (e.g., Burnham et al., 2002; Kuhl et al., 1997), and all three have also been earmarked as the components of IDS that promote infants’ early language processing (Liu, Kuhl, & Tsao, 2003; Singh et al., 2002; Song, Demuth, & Morgan, 2010; Trainor & Desjardins, 2002). However, there is continuing debate over the extent to which each of these components relates to infants’ emerging language abilities (Cristia, 2013).
In this study, we investigate longitudinally mothers’ use of these three components in their IDS when their infants are 7, 9, 11, 15, and 19 months of age. It must be noted that these IDS components are present in maternal and paternal IDS (e.g., Fernald et al., 1989). However, only maternal IDS was analysed in this study since mothers accompanied their infants to the laboratory visits. The overarching aim is to investigate the degree to which heightened pitch, positive affect, and vowel hyperarticulation in IDS might each be adapted to infants’ age and linguistic proficiency, and how the strength of each component might impact upon later vocabulary development. The following introductory sections review previous research on adjustments of IDS components as a function of infant age and their impact on infants’ language development.
Effects of infants’ age on the acoustic components of IDS
IDS is the product of the dynamic interaction of variables associated with the mother (Kaplan, Danko, Cejka, & Everhart, 2015; Kaplan, Danko, Kalinka, & Cejka, 2012), the infant (Kalashnikova, Goswami, & Burnham, 2018; Lam & Kitamura, 2010), and the quality of their interaction (Lam & Kitamura, 2012; Smith & Trainor, 2008). Some dynamic adjustments of IDS are attributed to mothers’ sensitivity to their infants’ needs (Papoušek, 2007). Such adjustments are most apparent when the needs of the infant change – with infant age, or with qualitative changes in the manner in which infants perceive speech input, such as from more language-general to more language-specific speech perception (Werker, Yeung, & Yoshida, 2012). Adjustments of pitch, affect, and vowel articulation in IDS are considered in turn below.
First, pitch has been proposed to serve the function of attracting infants’ attention to the speech in their environment (Cooper & Aslin, 1990; Fernald & Simon, 1984). The pitch component of IDS is indexed by an increase in the average height and range of fundamental frequency (F0) of maternal utterances produced in IDS compared to ADS (Fernald & Simon, 1984; Fernald et al., 1989; Trainor, Austin, & Desjardins, 2000). While pitch height and range are consistently different (higher) in IDS compared to ADS, adjustments to pitch in IDS also vary as a function of infant age. Pitch height increases over infant age up to around 12 months (Kitamura, Thanavishuth, Burnham, & Luksaneeyanawin, 2002), then decreases around 16 to 30 months of age (Remick, 1976; Stern, Spieker, Barnett, & MacKain, 1983). Pitch modifications in the speaker's IDS have also been related to developmental changes in infants’ early preferences for IDS. For instance, newborn infants prefer to listen to IDS over ADS (Cooper & Aslin, 1990), but this preference wanes around 9 months, then waxes again around 12 months (Hayashi, Tamekawa, & Kiritani, 2001; Newman & Hussain, 2006). This developmental pattern is suggested to be due to infants’ early attention to the exaggerated pitch patterns of IDS at birth giving way to attention to phonetic features and then increased attention between 6 and 9 months to the specific phonemes in their native language (Hayashi et al., 2001).
The second component, positive vocal affect, functions to transmit positive emotion and regulate infants’ emotional states during early parent–infant interactions (Papoušek, Bornstein, Nuzzo, Papoušek, & Symmes, 1990). The affective component of IDS is measured by independent ratings of low-pass filtered samples of naturally produced IDS (semantic information removed but prosodic information intact), in which raters are instructed to assess the positive valence and communicative intent of the speech (Kitamura & Burnham, 2003). This component is often confounded with pitch, as changes to F0 height and contour are two of the acoustic correlates of the positive affect transmitted in the voice. However, positive vocal affect is indeed manifested and perceived independently of the pitch component, as can be seen in two ways. First, positive affect is characterised by acoustic components aside from F0, such as greater amplitude and faster speech rate (Scherer, 1986), and modifications to the second formant (F2) (attributed to adjustments to the shape of the vocal tract that result from smiling; Benders, 2013; Fagel, 2010). Second, young infants differentiate speech that is high in positive affect from speech that has increased pitch height and pitch range. Six-month-old infants prefer listening to passages of natural speech that have greater positive affect compared to passages that are low in affect, even when the pitch height and range of the passages are kept equivalent (Kitamura & Burnham, 1998). In addition, infants no longer prefer IDS over ADS if the positive affect in ADS is increased (i.e., happy ADS; Singh et al., 2002). Turning to IDS vs. ADS, Kitamura and Burnham (2003) showed that IDS is rated to have significantly higher positive affect than ADS. Importantly, this study also demonstrated that the nature of the positive affect in IDS varies according to infant age. Mothers produce speech that is more comforting to infants at 3 months, more approving at 6 months, and more directive at 9 months. Most interestingly, changes in infants’ preferences over age for speech varying in degree of rated comfort, approval, and direction match the relative predominance of these communicative intentions in IDS addressed to infants at each age. That is, from 3 to 6 months, infants’ listening preferences change from comforting to approving IDS, and from 6 to 9 months from approving to directive IDS (Kitamura & Lam, 2009).
The third component, vowel hyperarticulation, is indexed by the area of the triangle in a plot of the first and second formant (F1, F2) values of the three corner vowels /i, u, a/. Expansion of this vowel triangle has been proposed to result in clearer, more intelligible speech (Bradlow, Torretta, & Pisoni, 1996), and indeed vowel triangle area is significantly greater in IDS than in ADS (Burnham et al., 2002; Kalashnikova et al., 2017; Kuhl et al., 1997; Uther, Knoll, & Burnham, 2007). However, vowel hyperarticulation is not present in IDS addressed to all infants; it is absent in IDS addressed to infants whose auditory abilities are impaired due to cognitive (genetic risk for dyslexia; Kalashnikova et al., 2018) or sensory disorders (hearing loss; Lam & Kitamura, 2010). Thus, across all three components – pitch, affect, and vowel articulation – maternal sensitivity to infants’ linguistic and perceptual needs appears to modulate the degree to which each component is present in IDS during communicative interactions with young infants.
Unlike evidence for changes in pitch and affect in IDS as a product of infant age, the longitudinal evidence on the nature and degree of vowel hyperarticulation is mixed. Vowel hyperarticulation has most often been reported in maternal speech to infants around 6 months (Burnham et al., 2002; Kuhl et al., 1997). Further studies suggest no changes over age. Cristia and Seidl (2014) compared the degree of vowel hyperarticulation in speech to infants at 4 and 11 months and found no age differences. Similarly, Burnham, Wieland, Kondaurova, McAuley, Bergeson, and Dilley (2015) reported no effects of age on vowel production in IDS in a more comprehensive set of longitudinal analyses of IDS from 3 to 9 months, and cross-sectional comparisons at 3, 9, 13, and 20 months. However, in their tasks, mothers were asked to read to their infants, so these findings may not generalise to naturally produced IDS given that reading has an inherent didactic function compared to the emergent didactic function of speech in mother–infant interactions (Fitzgerald, Spiegel, & Cunningham, 1991). Contrary to these findings of constant degree of hyperarticulation over age, an early study by Bernstein (1983) showed that, while vowel hyperarticulation was present in speech to infants across a variety of ages, the vowel categories became selectively more distinct and less overlapping in maternal speech addressed to infants who produced higher mean length of utterances (between three and four words). Therefore, it is possible that while English-language mothers generally hyperarticulate speech sounds in IDS, the degree of hyperarticulation is a product of infants’ linguistic competence rather than their age.
The relationship among individual IDS components and infants’ emerging language skills
The relationship between IDS and linguistic competence is important and has been investigated regarding the quantity (extent of exposure) and quality (strength of particular components) of IDS. Regarding IDS quantity, the number of words directed to the infant in day-to-day interactions in the home, as well as the context of the parent–infant interaction, significantly predicts young infants’ concurrent and future vocabulary size (Cartmill, Armstrong, Gleitman, Goldin-Meadow, Medina, & Trueswell, 2013; Hurtado, Grüter, Marchman, & Fernald, 2013; Ramirez-Esparza, Garcia-Sierra, & Kuhl, 2014; Weisleder & Fernald, 2013). With respect to its quality, there is evidence suggesting that IDS supports some aspects of language acquisition. Infants are more successful in speech segmentation (Thiessen, Hill, & Saffran, 2005), familiar word recognition (Singh, Nestor, Parikh, & Yull, 2009; Song et al., 2010), and novel word learning (Graf-Estes & Hurley, 2013; Ma, Golinkoff, Houston, & Hirsh-Pasek, 2011) when speech is presented in IDS compared to ADS.
A number of studies have investigated the effect of the three IDS components of interest here – pitch, affect, and vowel articulation – on infants’ language processing. Regarding pitch, heightened pitch range and variation in pitch contours, but not pitch height, in IDS compared to ADS have been demonstrated to elicit more successful discrimination of speech sounds in infants at 6 and 7 months of age (Trainor & Desjardins, 2002). Regarding affect, Singh, Morgan, and White (2004) showed that positive compared to neutral affect in IDS did not influence word recognition performance in infants at 7 and 10 months.
In contrast, vowel hyperarticulation has been shown to facilitate performance in a linguistic task, namely lexical processing. Song et al. (2010) assessed infants’ ability to identify the referents of familiar words in a preferential looking paradigm when the words were presented in ADS, or in IDS manipulated in such a way that it only exaggerated pitch height, only a slower speaking rate, or only the vowel hyperarticulation component. Nineteen-month-old infants were more successful in this task when there was a slower speaking rate or vowel hyperarticulation, but not heightened pitch, indicating that speech rate and vowel articulation facilitate lexical processing mechanisms, whereas pitch does not. Together these studies suggest that while pitch, affect, and exaggerated vowels are all prominent components of IDS, they may serve different functions: exaggerated pitch and positive affect may successfully attract infants’ attention to the linguistic input and elicit a preference to IDS compared to ADS (Cooper & Aslin, 1990; Fernald et al., 1989; Singh et al., 2002; Trainor & Desjardins, 2002), whereas vowel hyperarticulation may serve a specific linguistic function and support early language acquisition processes (Burnham et al., 2002; Kuhl, 2004; Kuhl et al., 1997).
Despite such indications, it remains unclear whether individual differences in the manifestation of IDS components in maternal speech, specifically hyperarticulation of speech sounds, have an impact on infants’ developing linguistic competence beyond facilitating real-time performance in a laboratory task. To date, only two studies have investigated directly mothers’ degree of vowel hyperarticulation in their IDS and its relation to their infants’ speech perception ability and vocabulary size. Liu et al. (2003) measured mothers’ vowel articulation in IDS to infants at 6–8 and 10–12 months, and infants’ ability to discriminate a native consonant contrast at 7 months. They found a significant positive correlation between hyperarticulation in IDS and infants’ speech perception skills. In a later study, Hartman, Ratner, and Newman (2017) demonstrated that the degree of vowel hyperarticulation in maternal IDS to infants at 18 months was a significant predictor of infants’ receptive and expressive vocabulary size at two years of age. These findings suggest that vowel hyperarticulation not only supports the acquisition of infants’ native language phonetic categories during their first months, but also facilitates later language outcomes such as vocabulary size.
Aims of this study
Given the uncertainty about the linguistic relevance of IDS components and their role in linguistic development, this study has two aims. The first aim is to assess the manifestation of heightened pitch, positive affect, and vowel hyperarticulation components in IDS longitudinally across infant ages of 7, 9, 11, 15, and 19 months. Three alternative predictions are made. First, if these three IDS components serve only to facilitate early speech perception processes, then we predict that the degree to which they are manifested in IDS should decrease after 11 months. This age marks two milestones – the sophistication of perceptual attunement as indexed by infants’ greater attention to native over non-native speech contrasts (Werker et al., 2012), and a decrease in infants’ preference for IDS over ADS (Hayashi et al., 2001; Newman & Hussain, 2006). Second, if the manifestation of these three IDS components is related to infants’ increasing linguistic competence, then we predict an increase in the degree to which they are manifested in IDS as a function of infant age (Bernstein, 1983). Finally, if these three components of IDS support linguistic processing across domains (e.g., speech perception, speech segmentation, lexical processing, lexical acquisition), then we predict that their manifestation would remain unaltered from 7 to 19 months as infants continue to acquire these native language skills well after their first birthday (Song et al., 2010).
The second aim is to evaluate the relation between IDS components and infants’ language outcomes, specifically their expressive vocabulary size at 15 and 19 months. Measures of vocabulary size at these two ages were included in order to capture infants’ lexical development before and after the age at which infant vocabulary has been shown to undergo significant growth (Fenson et al., 1994). We predict that, if the function of IDS is solely to increase infants’ attention to linguistic input (McMurray, Kovack-Lesh, Goodwin, & McEchron, 2013), then only the degree of heightened pitch and positive affect in IDS compared to ADS will predict infants’ vocabulary size. If, however, vowel hyperarticulation in IDS serves a linguistic function, then vocabulary size will be predicted (also or only) by the degree of vowel hyperarticulation (Hartman et al., 2017; Kuhl, 2004; Liu et al., 2003; Song et al., 2010). The relative validity of these three predictions will be determined by the degree to which each of the three components predicts vocabulary size. In turn, the degree to which each of these three components in IDS at the different ages predicts vocabulary size will bear on the function of these components at the different ages.
Method
Participants
Eighteen mother–infant dyads participated in the study. All mothers (M age = 33.2 years; SD = 4.6) were native speakers of Australian English, and did not report any language or cognitive disabilities. All infants (13 female) were acquiring English as their first language, did not have any reported health problems, and were not at risk for language or cognitive disabilities. Mothers’ educational levels ranged from a higher education professional degree to a master's degree (Median = university bachelor degree). All mother–infant dyads were part of the ‘Seeds of Literacy’ five-year longitudinal study. The dyads were selected for the sample based on their availability to complete all the IDS and ADS sessions.
The IDS sessions were conducted longitudinally at the infant ages of 7 (M = 31.7 weeks, SD = 1.5), 9 (M = 39.8 weeks, SD = 1.2), 11 (M = 48.3 weeks, SD = 0.8), 15 (M = 66.2 weeks, SD = 2.2), and 19 months (M = 83.8 weeks, SD = 1.8). In addition, mothers completed an ADS session when their infants were about 12 months.
Infant-directed speech
IDS and ADS sessions were recorded in a child-friendly laboratory room. During the IDS sessions, mothers and infants were alone in the room. Mothers sat facing their infants who sat in a high chair. The sessions were video-recorded using four digital video cameras, one placed in each corner of the room, which allowed the experimenter to monitor the sessions on a four-way split screen from an adjoining room. The mothers wore a head-mounted microphone (AudioTechnica AT892) connected to Adobe Audition CS6 software via an audio input/output device (MOTU Ultralite MK3). During the IDS session, mothers were provided with a toy sheep, toy shark, and a baby shoe to elicit the target words sheep, shoe, and shark. The mothers were instructed to play naturally with their baby using these toys, so they were unaware that the specific vowels were of interest to this study. During the ADS session, a female experimenter, a native speaker of Australian English, interviewed each mother in the same room and asked questions about the IDS session, thus eliciting the same three target words. Infants were absent during the ADS sessions and were cared for by a lab research assistant. The IDS and ADS sessions each lasted for approximately five minutes.
For analyses, the IDS and ADS recordings were split into segments, with a segment defined as a period of mother's speech not interrupted by vocalisations of the infant or noises from the environment. Praat software (Boersma & Weenink, 2005) was used to identify and excise these segments. Mean pitch height was determined for each entire segment for each mother in each register (see Table 1 for the mean duration of segments used for analyses). (These segments were also used for affect ratings; see below.) Next, the target words, sheep, shoe, and shark, were identified in each segment, their onset and offset were manually determined, and then each word was excised. Next, the target corner vowels /i/, /u/, /a/ were extracted from each of these words (note that Australian English is non-rhotic). Table 1 shows the mean number of target vowels extracted from each recording in each of the registers. Praat scripts were used to measure the fundamental frequency and first (F1) and second (F2) formants for each vowel using the mean value in Hz from the 40% and 80% points of each vowel's duration (Munhall, MacDonald, Byrne, & Johnsrude, 2009). Given the logarithmic nature of pitch perception, all F0 values were converted from Hz to semitones using the formula: Semitone = 12 × LOG2(Mean F0). Mean F1 and F2 coordinates were derived for each target vowel for each mother for each register, the centroids of the clusters of values for each vowel determined, and then used to calculate the area of each of the six vowel triangles (ADS, IDS at 7, 9, 11, 15, and 19 months) for each mother.
Vowel triangle areas were calculated using the formula: Area = ABS{½ × [F1/a/ × (F2/i/ – F2/u/) + F1/i/ × (F2/u/ – F2/a/) + F1/u/ × (F2/a/ – F2/i/)]}, where F1/a/ refers to the average value in Hz of the first formant for the vowel /a/, F2/i/ to the average value in Hz of the second formant for the vowel /i/, and so forth.
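The two formulas above (the semitone conversion and the vowel triangle area) can be sketched in a few lines of Python. This is an illustrative sketch only, not the Praat scripts used in the study: the function names and the formant values are hypothetical and are not data from the study.

```python
import math


def hz_to_semitones(f0_hz: float) -> float:
    """Convert F0 in Hz to semitones via 12 * log2(F0), as in the text."""
    if f0_hz <= 0:
        raise ValueError("F0 must be positive")
    return 12.0 * math.log2(f0_hz)


def vowel_triangle_area(a, i, u) -> float:
    """Area of the /a/-/i/-/u/ triangle; each vowel is an (F1, F2) pair in Hz."""
    f1_a, f2_a = a
    f1_i, f2_i = i
    f1_u, f2_u = u
    return abs(0.5 * (f1_a * (f2_i - f2_u)
                      + f1_i * (f2_u - f2_a)
                      + f1_u * (f2_a - f2_i)))


# Hypothetical formant centroids (F1, F2) in Hz, for illustration only.
ads = {"a": (800.0, 1400.0), "i": (350.0, 2500.0), "u": (380.0, 1000.0)}
ids = {"a": (900.0, 1450.0), "i": (320.0, 2800.0), "u": (350.0, 900.0)}

ads_area = vowel_triangle_area(ads["a"], ads["i"], ads["u"])
ids_area = vowel_triangle_area(ids["a"], ids["i"], ids["u"])
print(ads_area, ids_area)
```

Because the area is wrapped in an absolute value, the order in which the three vowels are listed does not change the result. On the semitone scale, a doubling of F0 always corresponds to a difference of 12 units.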
Affect ratings
Affect rating stimuli
For each ADS and IDS recording, two segments that contained no environmental noise were selected for the affect ratings: one from the start of the recording and one three minutes into the recording, or the closest available time-point to three minutes. The selected samples were low-pass filtered at 400 Hz using Cool Edit Pro software. The two segments for each mother in each recording were then concatenated into a single string separated by five seconds of silence. This resulted in a total of 108 strings (18 mothers × 6 recordings each) for use as stimuli for the affect-rating task.
Raters and procedure
The 108 strings were randomly separated into blocks of 32–35 strings each to make the affect rating task manageable. Fifteen adult raters (undergraduate Psychology students) were asked to rate each block of strings. The low-pass filtered strings were presented via headphones. Raters were instructed to listen to each string and rate it on five seven-point Likert scales: (i) affective content, and the speaker's communicative intention to (ii) express affection, (iii) encourage attention, (iv) comfort or soothe, and (v) direct behaviour (Kitamura & Burnham, 1998, 2003).
Derivation of affect scores
Ratings on each scale for each string were averaged across raters and entered in a Principal Components Analysis, which, as has been found in previous studies (e.g., Burnham et al., 2002; Kalashnikova et al., 2018; Uther et al., 2007), yielded two components for each register, ‘expressing affection’ and ‘directing attention’ (see ‘Appendix’). Note that the ‘directing attention’ component, which reflects maternal communicative intentions in IDS, is not of interest for the present analyses, but it is reported here for purposes of completeness as it is one of the measures produced by this rating scale (Kitamura & Burnham, 2003). Averaged factor scores for the ‘expressing affection’ component constituted the ‘affect scores’ for use in the subsequent Analyses of Variance – see ‘Results’ section.
Expressive vocabulary size
All mothers completed the OZI: Australian English Communicative Development Inventory (Kalashnikova, Schwarz, & Burnham, 2016) when their infants were 15 and 19 months of age. The OZI is the Australian English adaptation of the MacArthur-Bates Communicative Development Inventory (Fenson et al., 1994). It is a vocabulary checklist containing 558 words that may be familiar to infants between 12 and 30 months, in which parents are asked to indicate the words that their infant can produce. As expected, infants’ expressive vocabulary size increased significantly from 15 (M = 14.77, SE = 4.9) to 19 months (M = 80.53, SE = 27.35) (t(16) = 2.85, p = .012, d = .811).
Results
The results are presented in two parts, relating respectively to Aim 1 (longitudinal modulation of mothers’ IDS components) and Aim 2 (prediction of infants’ vocabulary size) of the study. First, scores for the pitch, affect, and vowel articulation components are compared across infant age in repeated measures Analyses of Variance (ANOVAs). For these analyses, hyper-scores were calculated for each mother at each age by dividing her IDS pitch, affect, and vowel space area scores by the corresponding ADS scores. In this way, individual differences were reduced by having each mother's own ADS productions act as her own baseline. In addition to the ANOVAs, one-sample t-tests were used to compare each hyper-score to the value of 1 to determine whether hyper-scores were significantly > 1 (denoting hyperarticulation – heightened pitch, affect, or vowel triangle area in IDS compared to ADS), < 1 (denoting hypoarticulation – reduced pitch, affect, or vowel triangle area in IDS compared to ADS), or = 1 (denoting neither). Hyper-scores are plotted in Figure 1, and Table 2 provides a summary of t-test results.
Table 2 notes. ** p < .001; * p < .05; ^ p = .07.
Second, regression analyses were conducted to determine the variance explained by each of the three IDS components in infants’ expressive vocabulary scores at 15 and 19 months.
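The hyper-score computation and the one-sample test against 1 described above can be sketched as follows. This is a minimal illustration that uses only the Python standard library. The function names and the four mothers' values are hypothetical, and the sketch returns only the t statistic and degrees of freedom (a p value would additionally require the t distribution, e.g., from a statistics package).

```python
import math
import statistics


def hyper_scores(ids_values, ads_values):
    """Divide each mother's IDS measure by her own ADS measure (her baseline)."""
    return [ids / ads for ids, ads in zip(ids_values, ads_values)]


def one_sample_t(scores, mu0=1.0):
    """Return (t, df) for a one-sample t-test of the mean of scores against mu0."""
    n = len(scores)
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / math.sqrt(n)  # sample SD, n - 1 denominator
    return (mean - mu0) / se, n - 1


# Hypothetical vowel triangle areas (Hz^2) for four mothers; not study data.
ids_area = [520000.0, 610000.0, 450000.0, 580000.0]
ads_area = [400000.0, 500000.0, 450000.0, 400000.0]

scores = hyper_scores(ids_area, ads_area)
t_stat, df = one_sample_t(scores)
print(scores, t_stat, df)
```

A hyper-score above 1 means the IDS value exceeds the same mother's ADS value, so a positive t statistic points towards exaggeration in IDS and a negative one towards reduction.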
Hyper-pitch
As can be seen in Table 2, one-sample t-tests indicated that mothers produced hyperarticulated pitch in IDS at all five infant ages. A repeated-measures ANOVA with infant age (5) as the within-subjects factor and hyper-pitch score as the dependent variable yielded no significant main effect of age (F(4,14) = .420, p = .791, η² = .107) (see Figure 1). Thus, there was significant hyper-pitch to an equivalent degree across the five IDS ages.
Hyper-affect
Mothers significantly exaggerated the affective qualities of their speech when addressing their infants across all ages (see Table 2), and a repeated-measures ANOVA with age (5) as the within-subjects factor showed no significant differences across age (F(4,12) = .288, p = .880, η² = .088) (see Figure 1). Thus, there was significant hyper-affect to an equivalent degree across the five IDS ages.
Hyper-vowels
Figure 2 shows the vowel area triangles for IDS at 7, 9, 11, 15, and 19 months in comparison to ADS. One-sample t-tests showed that there was significant vowel hyperarticulation (hyper-vowels) in IDS to infants at 7, 9, 11, and 19 months, with IDS at 15 months just failing to reach significance (see Table 2). However, as for hyper-pitch and hyper-affect, the ANOVA showed that there was no effect of age on hyper-vowel scores (F(4,14) = .603, p = .667, η² = .147) (Figure 1). Thus, there was significant hyperarticulation of vowels at four of the five ages, and the degree of hyperarticulation at the five ages did not differ.
Regression analyses
In order to extract the predictor variables for the regression analyses, factor analyses were conducted across ages and across the three IDS measures (hyper-pitch, hyper-affect, and hyper-vowels). Two factor analyses were required: Factor Analysis 1 included IDS scores from 7 to 15 months (to predict vocabulary at 15 months), and Factor Analysis 2 included IDS scores from 7 to 19 months (to predict vocabulary at 19 months). Factor scores were computed via the regression method (see Tables 3 and 4). As can be seen, even though Factor Analysis 2 included the additional time-point of 19 months, the factor loadings were similar for the two analyses. Four factors were identified in each: (i) hyper-vowels at 7 months, (ii) hyper-vowels at the other ages (9–15 months in Factor Analysis 1; and 9–19 months in Factor Analysis 2), (iii) hyper-pitch, and (iv) hyper-affect. The factor scores for each of these four factors were then used as predictors in regression analyses to assess the degree of variance explained in infants’ expressive vocabulary scores at 15 and 19 months.
Two multiple regression analyses were conducted, with one of the four IDS factor scores entered in each block. In the Vocabulary Regression Analysis for 15 months, expressive vocabulary score at 15 months was the dependent variable and the four hyper factor scores from Factor Analysis 1 were the independent variables. In the Vocabulary Regression Analysis for 19 months, expressive vocabulary score at 19 months was the dependent variable and the four hyper factor scores from Factor Analysis 2 were the independent variables. In each regression analysis, the predictor variables were entered in four blocks in the following order: hyper-affect, hyper-pitch, hyper-vowels at 7 months, and hyper-vowels at and after 9 months. This order was determined on the basis of our prediction that if vowel hyperarticulation serves a specific linguistic function, it would explain a significant amount of variance in infants’ vocabulary scores over and above affect and pitch.
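The block-wise entry procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis reported here: the sample size, the random factor scores, the simulated vocabulary outcome, and all function and variable names are assumptions made for the example. The F change statistic uses the standard formula, (ΔR² / df1) / ((1 − R²) / df2), where R² is that of the model including the newly entered block.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added here)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def blockwise_regression(blocks, y):
    """Enter predictor blocks in order; return R^2, delta R^2, F change per block."""
    n = len(y)
    entered = np.empty((n, 0))
    r2_prev = 0.0
    results = []
    for name, block in blocks:
        block = np.asarray(block, dtype=float).reshape(n, -1)
        entered = np.column_stack([entered, block])
        r2 = r_squared(entered, y)
        df1 = block.shape[1]              # predictors added in this block
        df2 = n - entered.shape[1] - 1    # residual df of the current model
        delta = r2 - r2_prev
        f_change = (delta / df1) / ((1.0 - r2) / df2)
        results.append({"block": name, "r2": r2, "delta_r2": delta,
                        "f_change": f_change, "df": (df1, df2)})
        r2_prev = r2
    return results


# Synthetic stand-ins for the four factor scores (hypothetical data, n = 16)
rng = np.random.default_rng(0)
n = 16
factors = rng.standard_normal((n, 4))
# Simulated outcome that depends mainly on the last factor
vocabulary = 2.0 * factors[:, 3] + rng.standard_normal(n)

blocks = [
    ("hyper_affect", factors[:, 0]),
    ("hyper_pitch", factors[:, 1]),
    ("hyper_vowels_7m", factors[:, 2]),
    ("hyper_vowels_9m_plus", factors[:, 3]),
]
results = blockwise_regression(blocks, vocabulary)
for row in results:
    print(row["block"], round(row["r2"], 3), round(row["delta_r2"], 3),
          round(row["f_change"], 3), row["df"])
```

Because each block's contribution is assessed only after the earlier blocks have been entered, a significant ΔR² for the final block indicates variance explained over and above the predictors that precede it.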
As can be seen in Tables 5 and 6, only the fourth block of each model, when the hyper-vowel scores from 9 months onward were included, predicted a significant amount of variance in expressive vocabulary scores (51.9% for the 15-months vocabulary regression analysis and 55.9% for the 19-months vocabulary regression analysis). As expected, neither overall hyper-pitch nor overall hyper-affect predicted vocabulary, but unexpectedly, neither did hyper-vowel scores at 7 months.
To ensure that the order of the blocks in the multiple regressions did not influence the results, additional regression analyses were conducted in which the hyper-vowel scores from 9 months onward were entered before the hyper-vowel score at 7 months. Confirming our initial results, only the third block of each analysis (the block in which the hyper-vowel scores at 9 months and later were included) resulted in a significant prediction, so only hyper-vowel scores at 9 to 15 months and 9 to 19 months predicted a significant amount of variance in expressive vocabulary scores at 15 months (R² = .512, ΔR² = .469, F change (1,12) = 14.422, p = .003) and 19 months (R² = .642, ΔR² = .570, F change (1,11) = 22.322, p = .001). The addition of hyper-vowels at 7 months in the last block of these re-ordered regression analyses did not produce a significant R² change (ΔR² = .038 and ΔR² = .007 for the 15- and 19-month regressions, respectively). Finally, a post-hoc power analysis for the two regression models reported above was conducted to further investigate the robustness of the present findings. The observed power was high for both Model 1 (.854) and Model 2 (.965) (Soper, 2017).
Discussion
In this longitudinal study, the pitch, affect, and vowel articulation components of IDS were investigated regarding (i) their stability in mothers’ speech from 7 to 19 months, and (ii) their prediction of infants’ vocabulary size. With respect to component stability, the results show that the manifestation of the three components in IDS compared to ADS, that is, hyper-pitch, hyper-affect, and hyper-vowels, did not vary across age. This supports the prediction that these qualities of IDS have both attentional (pitch and affect) and linguistic (vowel hyperarticulation) functions, both in the first months and later, well into the second year of life. With specific respect to prediction of language development, the degree of vowel hyperarticulation (but not hyper-pitch or hyper-affect) significantly predicted infants’ expressive vocabulary scores at both 15 and 19 months. This confirms the linguistic utility of vowel hyperarticulation. Moreover, it was the degree of vowel hyperarticulation only at and after 9 months (and not at 7 months) that predicted vocabulary. These results are discussed in turn below.
Our findings dovetail with previous longitudinal investigations of these three components of IDS (Burnham et al., 2015; Cristia & Seidl, 2014; Kitamura & Burnham, 2003; Kitamura et al., 2002; Wang, Seidl, & Cristia, 2015) and extend them by showing that hyper-pitch, hyper-affect, and vowel hyperarticulation continue to be manifested in mother–infant interactions even when infants have developed more advanced expressive language skills. Previous research has demonstrated that, even though infant preference for IDS over ADS decreases around 9 months, infants continue to extract more benefit from IDS than from ADS at later ages, especially in cognitively demanding tasks such as familiar-word recognition (Singh et al., 2009; Song et al., 2010) and fast-mapping (Graf Estes & Hurley, 2013; Ma et al., 2011). In fact, Ma et al. (2011) showed that infants only become reliably successful at learning novel words from ADS at 27 months, suggesting that the specific qualities of IDS are essential for promoting linguistic processing well after the infants’ first birthday.
There is an important rider to these conclusions: factor analyses identified vowel hyperarticulation at 7 months as a separate factor from vowel hyperarticulation scores at 9 through to 19 months. Furthermore, only hyperarticulation scores recorded from IDS at and after 9 months significantly predicted infants’ vocabulary development. These results point to the intriguing possibility that, while the strength of the three components of the IDS signal is similar across this age range, either (i) there are age-related qualitative aspects (or other quantitative aspects not measured here) that differ over age; or (ii) the signal remains unchanged but infants’ perception of IDS changes due to their cognitive and linguistic development (see, e.g., the PRIMIR theory; Werker & Curtin, 2005). For instance, Kitamura and Burnham (2003) showed that the communicative intentions attributed to maternal IDS addressed to infants at 6 months were to express affection and encourage their infants’ attention, whereas the communicative intent attributed to IDS to infants at 9 months was to direct infants’ behaviour. Thus, when addressing their younger infants, mothers appear to employ IDS to maximise their infants’ engagement in the communicative process, while the more didactic intentions emerge around 9 months. In addition, infants’ linguistic capacities undergo significant changes around the age of 9 months. At this age, infants become more language-specific listeners: they recognise the phonotactic patterns of their language (Jusczyk, Cutler, & Redanz, 1993), discriminate native language phonemic contrasts but have reduced attention to non-native non-phonemic contrasts (Werker & Tees, 1984), and recognise many words of their language (Bergelson & Swingley, 2012).
Thus, it is possible that the incidence and nature of the components of IDS are tailored to the infants’ linguistic competence such that the vowel hyperarticulation in IDS to infants in their first months supports processes such as speech segmentation and acquisition of phonological categories, but in older infants, vowel hyperarticulation supports more sophisticated processes such as lexical processing and novel word learning.
This is the first study to investigate the relation between hyperarticulated pitch, affect, and vowel production in IDS and infants’ emerging vocabulary skills, and to do so across a wide range of infant ages. Because only vowel hyperarticulation (and not hyper-pitch or hyper-affect) significantly predicted infants’ expressive vocabulary at 15 and 19 months, it appears that vowel hyperarticulation may be involved in facilitating infants’ ability to use early linguistic input to build their vocabularies. The exact causal relation, status, and direction between vowel hyperarticulation in maternal speech and infants’ emerging language abilities require further research. It is, of course, possible that mothers’ tendency to produce clear speech and their infants’ advanced linguistic competence are underpinned by a shared genetic factor. This is, however, unlikely, as it has been demonstrated that mothers’ degree of vowel hyperarticulation can be experimentally manipulated; when the audibility of mothers’ speech by the infant is reduced, mothers’ degree of hyperarticulation is similarly reduced (Lam & Kitamura, 2012). Moreover, vowel hyperarticulation is absent in mothers’ speech to infants at risk for dyslexia, even when the mother herself is not dyslexic (Kalashnikova et al., 2018). These two studies suggest that the degree of vowel hyperarticulation is regulated by infant signals to the mother. While a common genetic component is still possible, more consistent with our findings is the notion that mothers are sensitive, albeit possibly unconsciously, to their infants’ developmental needs, and adjust their speech in order to facilitate their infants’ linguistic development.
The mechanisms by which vowel hyperarticulation facilitates vocabulary development require further investigation. It is possible that this relationship is mediated by the impact of vowel hyperarticulation on the development of more basic linguistic abilities such as speech perception and lexical processing. Liu and colleagues (2003) showed that mothers who hyperarticulate vowels to a greater degree have 7-month-olds with better speech perception skills, and it has been previously demonstrated that advanced speech perception is a significant predictor of vocabulary development (see Cristia, Seidl, Junge, Soderstrom, & Hagoort, 2014, for a review). Similarly, individual differences in exposure to IDS (Weisleder & Fernald, 2013) and the presence of hyperarticulation in IDS (Song et al., 2010) predict infants’ lexical processing abilities, which are also significant predictors of infants’ later vocabulary size (Fernald & Marchman, 2012). An alternative but not mutually exclusive explanation is that hyperarticulation of speech sounds in IDS enhances the clarity of maternal speech, which allows infants to tune into and extract more effectively the information that is most relevant for their particular stage of development.
A remaining question concerns vocabulary development of infants for whom vowel hyperarticulation is absent in their linguistic input – infants with impaired hearing (Lam & Kitamura, 2010) or at genetic risk for dyslexia (Kalashnikova et al., 2018). For instance, Lam and Kitamura (2010) recorded IDS produced by a mother when interacting with her twin sons, one who had normal hearing and one who suffered from hearing loss, and found that the mother only hyperarticulated vowels to the normal hearing son. Nevertheless, the vocabulary size of the infant with hearing loss was similar to, in fact slightly larger than, that of his hearing twin, so it is possible that while the mother did not hyperarticulate vowels in IDS to her son with hearing loss, there were other cues in her speech that bore upon vocabulary development. A similar argument can be proposed with respect to languages and cultures in which IDS manifests different qualities from those reported here for Australian English. For instance, differences in prosodic exaggeration have been reported between American English and other languages such as Japanese, French, Italian, and German, and also British English (Fernald et al., 1989; Floccia et al., 2016). Mothers in other cultures such as the Quiche Mayan (Guatemala; Ratner & Pye, 1984) and Kaluli (New Guinea; Schieffelin, 1979) have been shown to decrease pitch in IDS compared to ADS or not to engage their infants directly in communicative interactions.
With regard to vowel hyperarticulation, it has also been reported as absent in Dutch, Norwegian, and Japanese (Benders, 2013; Englund & Behne, 2005; Martin et al., 2015). These differences in IDS manifestations warrant further investigation; while mothers addressing infants acquiring some languages other than English or addressing infants with deficits in sensory or cognitive skills do not increase their pitch or hyperarticulate their vowels, they may emphasise other speech components (e.g., rhythmic regularity, visual hyperarticulation, language-specific vocabulary modifications, or phonetic qualities; see, e.g., Mazuka, Igarashi, Martin, & Utsugi, 2015), which could serve as facilitative factors in infant linguistic development.
Our findings support the view that IDS assists early language acquisition processes. Specifically, it is the vowel hyperarticulation component of IDS that facilitates infants’ lexical development. However, while vowel hyperarticulation is equally strong across age, from 7 to 19 months, only vowel hyperarticulation at and after 9 months is related to later vocabulary size. This suggests a change in the role of vowel hyperarticulation over age, despite no change in the degree to which it is manifested. We contend that parents’ unconscious use of the IDS register and the vowel hyperarticulation therein reflects their sensitivity to the social, cognitive, or linguistic needs of their infant and allows them to provide optimal speech quality for their infant's linguistic and communicative development.
Acknowledgements
This research was supported by Australian Research Council grant DP110105123, ‘The Seeds of Literacy’, to the second author and to Professor Usha Goswami, Centre for Neuroscience in Education, Cambridge University. We would like to thank all the parents and infants for their valuable time and interest in this research, and Maria Christou-Ergos and Scott O'Loughlin for their assistance with data collection and analyses.
Appendix
Principal Components Analysis values for the ‘express affection’ (Affection) and ‘direct attention’ (Attention) components obtained from the affect rating data for ADS, IDS 7, 9, 11, 15, and 19 months