1 Introduction
1.1 English definite article allomorphy
The definite article the is the most commonly used word in English in both written and spoken language (Leach, Rayson & Wilson 2001). It is believed to have derived from an Old English form (which originated from the masculine singular nominative form se) to become a distinct definite article (as head of the determiner phrase) by the Early Middle English period (Berg 2011, Allen 2016). The Modern English definite article has two major allomorphs: prevocalic /ðiː/ (see footnote 1) and preconsonantal /ðə/. In a corpus-based study of American English, Jurafsky et al. (1998) found that the likelihood of /ðiː/ in the definite article was 14 times that of /ðə/ in prevocalic contexts. Although the definite article is described as having two major allomorphs, natural speech is highly overlapped and coarticulated, which can lead to weak syllable reduction (see Davidson 2006, Bell et al. 2009, Seyfarth 2014). This is particularly true of high frequency function words such as the definite article and may result in a range of segmental/syllabic effects including shortened duration, segmental weakening and elision. A variant production of the definite article described as ‘vowel-less’ is common in many dialects of Northern England. In a phenomenon referred to as definite article reduction (DAR) (see e.g. Jones 2002, Rácz 2012, Roeder 2012), the vowel is elided and the consonant is variably realised as [t], a glottal stop or a voiceless dental fricative (before vowels). We will not explore DAR further but use it as an illustration of the range of variation that may occur in this highly frequent function word.
The development of definite article allomorphy in children has only been examined in a few studies. Newton & Wells (1999) found that three-year-old children from Hereford, England tended to use /ðə/ in both prevocalic and preconsonantal contexts and developed allomorphy progressively to approach adult-like production by age seven. Children from a similar location in England showed increasing use of prevocalic /ðiː/ between seven and 10 years of age (Gaskell et al. 2003). Recent research based on adult speech has found that definite article allomorphy is changing in some English varieties where it is undergoing regularisation to /ðə/ in both prevocalic and preconsonantal contexts. This change is evidenced by younger adult speakers using schwa in the prevocalic definite article (PVDA) more than older adult speakers (New Zealand: Hay et al. 2012, Meyerhoff et al. 2020; US: Todaka 1992, Keating et al. 1994; UK: Cheshire et al. 2011).
Findings also show that young people from culturally and linguistically diverse backgrounds are advanced with respect to allomorphic regularisation of the definite article (New Zealand: Meyerhoff et al. 2020; UK: Britain & Fox 2009, Cheshire et al. 2011, Fox 2015). Similar observations have been made for some speakers of (middle class) South African English (Lass 2002). This finding is consistent with Trudgill’s (2017: 144) model of language change which predicts that a variety spoken in a community ‘which has experienced considerable contact with other communities speaking other varieties which are mutually intelligible with it will also undergo a certain amount of simplification’.
Sound change arises in response to both cognitive and social pressures that contribute variability which may in turn seed and propagate change (Harrington et al. 2016). Cognitive factors relate to the ability of the listener-speaker to associate categories and signals and their response to phonetic biasing conditions (such as motor planning, gestural mechanics, and aerodynamic constraints, as discussed in Garrett & Johnson 2013). Social factors are associated with contact-based sociodemographic differences in language use across speakers. Change may be propelled within communities as a result of population movement that brings speakers of different dialects or languages together (Trudgill 2004, 2017). Questions remain as to why and how the change to the definite article allomorphy has occurred in English. The integration of cognitive and social factors in analyses may help to provide insight into the change.
1.2 Hiatus resolution
When the definite article the is followed by a word beginning with a vowel, a phonologically sub-optimal hiatus context occurs, leading to a heterosyllabic V#V sequence (e.g. the egg [ðiː eɡ̥]). Hiatus is dispreferred in many languages because it challenges the universal preference for a sonority trough between syllables (Bell & Hooper 1978). Vowel adjacency in hiatus may be resolved in English through various processes, most commonly those that reinstate sonority alternation, such as the insertion/emergence of a consonant between the two vowels or, alternatively, the use of glottalisation (Allerton 2000).
Berg (2011) describes the PVDA /ðiː/ allomorph as a repair strategy replacing schwa with /iː/ which facilitates management of the hiatus by supporting the emergence of [j] to prevent vowel adjacency. Broadbent (1991) considers consonant emergence of this type to be best modelled as glide formation whereby a consonant emerges to separate the two adjacent vowels. The characteristics of the emergent consonant are dependent on feature spreading from the vowel on the left edge of the hiatus and display phonological complementary distribution. High front vowels condition [j] (three eggs [θɹ̥iː j eɡz̥]), high non-front vowels condition [w] (two eggs [tʰʉː w eɡz̥]), and in non-rhotic varieties, non-high vowels condition [ɹ] (four eggs [foː ɹ eɡz̥]) (Allerton 2000, Casali 2011).
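The complementary distribution described above can be sketched as a small decision rule. The following is a minimal illustration only, assuming a two-feature description of the left-edge vowel (high, front) and a flag for rhoticity; the function name `emergent_consonant` and its parameters are hypothetical and are not taken from any of the cited accounts.

```python
from typing import Optional


def emergent_consonant(v1_high: bool, v1_front: bool, rhotic: bool = False) -> Optional[str]:
    """Return the hiatus-breaking consonant conditioned by the left-edge vowel (V1).

    Assumption: V1 is described only by the features [high] and [front].
    """
    if v1_high and v1_front:
        return "j"   # e.g. three eggs
    if v1_high:
        return "w"   # high non-front V1, e.g. two eggs
    # Non-high V1: [ɹ] is described only for non-rhotic varieties, so the
    # sketch makes no prediction (None) for rhotic varieties.
    return None if rhotic else "ɹ"


print(emergent_consonant(v1_high=True, v1_front=True))    # V1 as in three
print(emergent_consonant(v1_high=True, v1_front=False))   # V1 as in two
print(emergent_consonant(v1_high=False, v1_front=False))  # V1 as in four
```

On this sketch, a PVDA realised with /iː/ patterns with three (conditioning [j]), whereas a PVDA realised with schwa has a non-high left-edge vowel and so would not condition a glide.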
Davidson & Erker (2014) propose that the percept of the glides [j] and [w] arises through interpolation rather than glide formation. They found for American English speakers that the perceived glide in phrases like see otters was acoustically different from the onset glide in phrases like see yacht. They argue that the hiatus glide percept is not phonologically specified like the onset glides and can be interpreted as epiphenomenal, arising through articulatory transition (see Gick & Wilson 2006, Heselwood 2006).
It is possible that the percept of a glide in high vowel hiatus contexts may come about through the ‘trough’ effect. A trough effect describes a discontinuity that may occur between vowels in separate syllables. Articulatory and acoustic studies have found discontinuities between the two vowels during a VCV sequence involving an intervocalic labial stop (Lindblom et al. 2002, Fuchs et al. 2004, Vazquez-Alvarez & Hewlett 2007). For example, in /ibi/, the lingual activation required for the articulatory position of the first vowel is relaxed during the /b/ closure (lowering the tongue) but then re-activated (raising the tongue) for the final vowel. Lindblom et al. (2002) propose that the vowel segments are independently activated rather than continuously transitioning via a vowel to vowel diphthongal-like trajectory suggested by Öhman (1967). We might speculate that a ‘trough’ effect could occur in vowel hiatus contexts when associated with the syllable boundary, creating the percept of an inserted consonant through the changing lingual activity and related aerodynamic and acoustic characteristics.
Another common strategy related to the management of V#V adjacency in English is glottalisation (ranging from creaky phonation through to full glottal stop realisation) (Foulkes 1997; Trudgill & Hannah 2002; Uffmann 2007; Britain & Fox 2009; Mompeán & Gómez 2011; Cox, Palethorpe & Bentink 2014a; Cox et al. 2014b; Davidson & Erker 2014; Yuen, Cox & Demuth 2017, 2018). Studies of hiatus resolution in English have typically concentrated on r-sandhi contexts (i.e. insertion of /ɹ/ in V#V sequences such as in raw eggs or, in non-rhotic varieties, four eggs). Cox et al. (2014b) found increasing use of glottalisation in r-sandhi contexts across word boundaries by younger adult Australian English (AusE) speakers compared to older adults, particularly when the right-edge vowel was strong (i.e. at a foot boundary). Similarly, Yuen et al. (2018) showed greatest use of glottalisation when the hiatus was coincident with a foot boundary compared to when it was more distant from the boundary. Studies of American English have also found glottalisation to be likely at prosodic boundaries before vowel initial words (Pierrehumbert 1995, Dilley, Shattuck-Hufnagel & Ostendorf 1996, Redi & Shattuck-Hufnagel 2001, Garellek 2014, Davidson & Erker 2014), reinforcing the idea that glottalisation is a boundary-related phenomenon.
Uffmann (2007) considers glottal stop insertion as a strategy to maximise syntagmatic contrast with the surrounding vowels through the sonority differential – vowels being the most sonorous segments and glottal stops the least sonorous. Glottalised items are considered less sonorous than non-glottalised (Zec 1995). Highly sonorous glides, on the other hand, (such as the emergence of a glide in sequences containing the definite article /ðiː/ followed by a vowel-initial word) minimise contrast across syllables. Uffmann’s position is consistent with the theory of domain-initial strengthening where a glottal stop is considered the optimal hiatus breaker as it is more ‘consonantal’ (i.e. less sonorous) than the following vowel, thereby enhancing the boundary. The alternative hiatus breaker is a (more sonorous) glide which reduces the boundary percept. In support of boundary enhancement through syntagmatic contrast, Cho, Kim & Kim (2017) found that nasal segments in domain-initial position had reduced sonority, enhancing the syntagmatic contrast between the nasal and the following vowel. In domain-final position, on the other hand, the syntagmatic contrast between a final nasal and the vowel was reduced through higher nasal sonority. Their findings show that the articulatory/acoustic characteristics of the consonant depend on how the segment is syllabified with reference to a domain-edge. It is an empirical question as to why speakers may choose to either enhance or reduce the percept of a boundary through glottalisation or gliding respectively in PVDA hiatus contexts.
1.3 Changes to definite article allomorphy
As indicated above, a change in definite article allomorphy has been recently described for English. In an analysis of 242 tokens in the Prototype version of the read speech TIMIT corpus of American English (Lamel, Kassel & Seneff 1986, Zue, Seneff & Glass 1990), Todaka (1992) found that younger speakers were more likely to produce a schwa in the PVDA than older speakers, particularly those over 50 years of age who showed no evidence of schwa use in this context. Keating et al. (1994) observed anecdotally that Californian undergraduate students at the time of their TIMIT study (presumably younger than the youngest TIMIT speakers analysed) appeared to have progressed this change towards schwa in the PVDA even further. In an analysis of North London speech, Cheshire et al. (2011) found 28% (33/119) of PVDA tokens in speech data from Anglo-background 16–19-year-olds contained schwa whereas only 9% (16/187) of cases were found in the Anglo caregivers’ speech. Similarly, Hay et al. (2012) found that of the PVDA tokens (n = 820) extracted from two New Zealand English corpora recorded at the University of Canterbury, 18% and 23% respectively were realised as /ðə/, with younger speakers more likely to use /ðə/ than older speakers. Meyerhoff et al. (2020) found the same effect in a corpus of interviews from older and younger adults from three socially differentiated localities in Auckland. Younger speakers in all three communities made greater use of /ðə/ in the PVDA (between 33% and 100%) compared to older speakers (less than 33%). Hay et al. (2012: 29) argue that use of /ðə/ is unlikely to be ‘manipulated as a stylistic variable’ but they did find a class-based effect with non-professional speakers making greater use of /ðə/ than professionals.
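As a quick arithmetic check, the proportions reported for the North London data can be recomputed from the token counts given in the text. The sketch below uses only those counts; the helper name `percent` and the variable names are illustrative assumptions.

```python
def percent(count: int, total: int) -> int:
    """Percentage of `count` out of `total`, rounded to the nearest whole number."""
    return round(100 * count / total)


# Token counts as reported in the text for Cheshire et al. (2011)
adolescents = percent(33, 119)   # schwa in PVDA, Anglo 16-19-year-olds
caregivers = percent(16, 187)    # schwa in PVDA, Anglo caregivers

# Ratio of the two raw proportions (not a statistical test)
ratio = (33 / 119) / (16 / 187)

print(adolescents, caregivers, round(ratio, 1))
```

The counts reproduce the reported 28% and 9%, with the adolescents' rate a little over three times that of the caregivers.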
Studies of PVDA not only describe the vowel quality in the definite article (usually either /ə/ or /iː/), but also the strategies that speakers use to resolve the V#V hiatus. Todaka (1992) found 65/242 (27%) PVDA tokens in TIMIT contained a glottal stop and each was followed by a word beginning with an unreduced vowel. Of those 65 tokens, 32 (49%) were preceded by the high front vowel. When there was no glottal stop, the determiner contained the high front vowel 86% of the time. Keating et al. (1994) found 27% of a sample of ‘the’ tokens in the TIMIT corpus had glottalisation in the V#V sequence, mainly when the following vowel had primary stress regardless of vowel quality (see also Gaskell et al. 2003 and Raymond, Fisher & Healy 2002). Glottalisation was lowest following a PVDA containing /iː/ and highest when it contained schwa (and other non-high vowels). Hay et al. (2012) also found glottalisation more likely to occur following /ðə/ and, for glottalised tokens, /ðə/ occurred more often in less frequent collocations (defined according to whether the collocation with ‘the’ could be considered frequent in their Canterbury corpus relative to CELEX; Baayen, Piepenbrock & Gulikers 1995). Conversely, no effect of word frequency was found in the analyses of definite article allomorphy in Jurafsky et al. (1998) who analysed highly frequent function words from the three-million-word Switchboard corpus (Godfrey, Holliman & McDaniel 1992). Similarly, no frequency effect was found in Raymond et al. (2002), using Kucera & Francis (1967) lemma frequencies verified in CELEX, with high frequency items considered above 100 per million words and low frequency items selected from words occurring less than 10 times per million words.
Hay et al. (2012) propose two mechanisms by which the PVDA /ðiː/ could become /ðə/: reduction and analogy. Reduced articulatory effort could result in schwa emerging in the PVDA, particularly in low frequency utterances, leading to ‘erosion of the boundary between the words’ (Hay et al. 2012: 31). This would result in schwa but no glottalisation. Analogy on the other hand invokes the more frequent preconsonantal form /ðə/ leading to allomorphic simplification and the insertion of glottalisation to preserve the boundary. These mechanisms suggest that schwa would potentially precede glottalisation in the process of change.
Changes to the PVDA may also be impacted by the characteristics of the following vowel, although few studies have examined this factor. Meyerhoff et al. (2020) found a dissimilation effect in their analysis where speakers were more likely to use schwa in the PVDA preceding a word beginning with a high front vowel but had a lower probability of schwa preceding a short low vowel. However, their results for the long low vowel were equivocal. Regarding glottalisation, several studies have shown that low vowels are more likely to be glottalised than high vowels (Pompino-Marschall & Żygis 2010, Brunner & Żygis 2011, Malisz, Żygis & Pompino-Marschall 2013, Hejná & Scanlon 2015, Penney et al. 2018, Penney, Cox & Szakay 2021).
Aside from phonetic, phonological and lexical explanations for changes to definite article allomorphy, Britain & Fox (2009) describe the importance of sociocultural factors in progressing the change. They provide evidence that contact between speakers in diverse communities may have been the impetus for change in London English where multicultural varieties are at the forefront of simplification to the hiatus resolution system. They, along with Fox (2015), showed the influence of young Bangladeshi males in London in the spread of /ðə/ amongst their male Anglo peers. Britain & Fox (2009) propose multi-ethnic friendship groups as the catalyst for diffusion of this variant in the community. Similarly, Cheshire et al. (2011) show that change to the definite article was most advanced in non-Anglo groups (e.g. Black Caribbean, Black African, Mixed-race Anglo/Black Caribbean, Turkish) in their sample from North London, speculating that the change may be driven by the reduction of redundancy, resulting in simplification of the system (i.e. use of a single form for the definite article rather than two forms).
As described above, there is evidence for glottalisation being used to manage hiatus in favour of epenthetic ‘r’ in r-sandhi contexts in AusE (Cox et al. 2014b), particularly amongst young people. If young people generalise the deployment of glottalisation to other hiatus contexts, such as when the definite article is followed by vowel-initial words, there would be less incentive for /ðiː/ to be used because a glide would not surface. In addition, glottalisation as a cue to coda /t/ voicelessness is more common in younger than in older AusE speakers (Penney et al. 2018, 2020, 2021). These results suggest that younger speakers of AusE are making extensive and increasing use of glottalisation, providing a new tool in their phonological repertoire that can be deployed in a range of contexts.
To summarise, change to the PVDA has been documented in several English varieties. PVDA is increasingly realised as /ðə/, accompanied by glottalisation, and this is particularly the case for young people and in contact varieties such as Multicultural London English. We will now turn to AusE and why this variety might provide some insight into the change process ongoing in English.
1.4 Mainstream and non-mainstream Australian English
Australia is one of the most ethnically diverse countries in the world. According to the most recently reported census, nearly half of all Australians (49%) were either born overseas or have at least one parent born overseas (Australian Bureau of Statistics (ABS) 2016). This complexity in Australian society is attributable to changes to government policy in the 1970s which encouraged immigration from a wide range of non-English speaking countries (Joppke 2004). In response, multiculturalism has expanded rapidly over the past 50 years with immigration, particularly from Southeast Asia, China, the Middle East, and India, contributing markedly to the rich cultural landscape (ABS 2021). The demographic changes have led to increased linguistic diversity within the Australian community which boasts over 300 commonly used languages (ABS 2021) including many endangered, but some robust, indigenous languages (National Indigenous Languages Report 2020). The 2016 census found that 21% of Australians speak a language other than English at home with the next most common languages after English being Mandarin, Arabic, Cantonese, and Vietnamese.
AusE is the variety of English spoken by those who have been born and/or raised in Australia. Three main accent groups can be identified: Mainstream Australian English (Cox & Palethorpe 2007), the majority variety; Australian Indigenous Englishes (e.g. Butcher 2008, Malcolm 2013, Meakins & O’Shannessy 2016), used by many First Nations Australians; and a range of ethnocultural varieties used to express non-mainstream or ethnic identity (Warren 1999; Clyne, Eisikovits & Tollfree 2001; Kiesling 2005; Antoniou et al. 2010, 2011; Cox & Palethorpe 2011; Clothier 2019; Grama, Travis & González 2020).
Ethnicity is a key factor in language variation and change in Australia (Horvath 1985) but, despite this, understanding of the phonetic characteristics of AusE is almost exclusively based on an Anglo-centric monocultural model which fails to represent the increasingly diverse community (Warren 1999, Clyne et al. 2001, Leitner 2004). The ethnocultural varieties of AusE, often referred to as ethnolects, are native but non-mainstream varieties which may be used by second or third generation Australians who may or may not speak a heritage language. The increasing diversity of Australian society continues to challenge traditional ideas about AusE phonology. In this study we introduce a comparison between mainstream and non-mainstream AusE accent groups in order to examine whether changes to the PVDA may be associated with linguistic and cultural diversity, as has been suggested for London English and predicted by Trudgill (2017).
1.5 Aims and research questions
Although the studies discussed above have demonstrated increasing use of prevocalic /ðə/ (Hay et al. 2012, Fox 2015), few have explored the relationship between PVDA and hiatus management strategies in the progression of the change.
Here we report two analyses: a diachronic analysis of PVDA and glottalisation across a 50-year time span (∼1960s to ∼2010s), and a synchronic analysis of present-day AusE speakers who vary in terms of their linguistic and cultural background. The diachronic analysis was conducted to provide insight into the progression of change over two generations and the synchronic analysis aimed to examine the impact of a select set of sociocultural factors on present day usage.
In a novel approach, both auditory and acoustic measures are used to determine the characteristics of the vowel in the PVDA and the incidence of glottalisation. The diachronic analysis is based on careful examination of the same phrase elicited from young adults in a sentence reading task in both the 1960s and 2010s data. Such directly comparable connected speech contexts are unusual in speech analysis across this depth of time and provide a rare opportunity for detailed phonetic archaeology. The synchronic analysis provides a comparison between speakers of mainstream AusE from first-language (L1) English-speaking backgrounds and non-mainstream AusE Lebanese-heritage speakers each producing two phrases that allow us to examine specific phonetic effects in PVDA realisation. The downside of this approach is that we are unable to investigate the important issue of how variations in prosody across contexts and speakers may affect the realisation of the definite article because prosody is relatively fixed in the sentence-reading task. This remains an area to be examined in future work on PVDA realisation and hiatus resolution more generally.
Based on the findings from previous literature regarding the PVDA, we make the following predictions:
The diachronic analysis will show greater use of schwa in the PVDA and hiatus-breaking glottalisation in modern data compared to historical data indicating a change in line with observations in the UK, the US and New Zealand.
Differences in the incidence of schwa vs. glottalisation in the diachronic analysis may provide insight into the processes that initiate the change. If Hay et al. (2012) are correct in their suggestion that glottalisation is a boundary recovery strategy following reduction and analogical use of schwa in the PVDA, we would expect schwa in the PVDA to precede the use of glottalisation.
If language contact and community diversity are driving modern day change towards regularisation (simplification), in the synchronic analysis we would expect speakers of non-mainstream AusE to be more advanced in the use of schwa in the PVDA and glottalisation as a hiatus breaker compared to mainstream AusE speakers.
In both the diachronic and synchronic analyses we would expect females to be at the forefront of change in line with the suggestion from Meyerhoff et al. (2020) and studies of sound change generally that have long shown a gender effect with respect to the progression of change (Labov 2001).
In the synchronic analysis we expect the height of the vowel on the right-edge of the hiatus to affect choice of vowel in the PVDA. Meyerhoff et al. (2020) found some evidence for high front vowels conditioning PVDA schwa and Gaskell et al. (2003) made a similar observation.
2 Method and materials
2.1 Speakers and recordings
The data for this study are based on recordings of scripted sentences extracted from three corpora of AusE. One is archival (historical): Mitchell and Delbridge corpus (MD) (Mitchell & Delbridge 1965), collected in 1959 and 1960. Two are modern: Australian Voices (AusV) (Cox & Palethorpe 2008), collected between 2004 and 2016, and AusTalk (Burnham et al. 2011), collected between 2011 and 2015. These archival and modern corpora, which were collected at either end of an approximate 50-year period, provide us with the opportunity to examine changes to hiatus management and the English definite article in AusE that have occurred over half a century. All the corpora contain recordings of the same scripted sentence: The plane flew down low over the runway, then increased speed and circled the aerodrome/airfield a second time (hereafter the plane sentence). Note that in the MD corpus, the word aerodrome was produced, whereas in the more recent recordings this word was substituted with airfield (see footnote 2). This sentence provides a PVDA hiatus context: the airfield/aerodrome (see footnote 3). An additional scripted sentence that also contains a PVDA hiatus context was included from the more recent corpora: The grass was mown before the uncontrollable children came out to play (hereafter the grass sentence). The inclusion of both the plane and grass sentences allows us to explore the effect of V2 context in the V1#V2 hiatus sequence (with V2 as either /eː/ or /ɐ/) in a synchronic analysis of the modern data. These two V2 contexts represent not only two different vowel qualities but also two different levels of vocalic prominence. The /eː/ in airfield carries primary lexical stress whereas the /ɐ/ in uncontrollable can be considered to carry secondary stress as the head of the weak foot. Garellek (2014) found that prominence did not affect the degree of glottalisation at the onset of vowel initial words in intermediate phrase medial (ip-medial) contexts. Garellek’s ip-medial context equates to the contexts in which our hiatus environments occur so we do not expect vowel prominence to be a confounding factor in our study. Future work is needed to examine these and other prosodic factors such as foot structure in the realisation of hiatus (see Yuen et al. 2018).
The read speech data from the historical and modern corpora do not contain any further examples of the PVDA so our analysis is necessarily restricted to these two sentences.
2.1.1 Mitchell and Delbridge corpus
The MD corpus (https://speech.library.sydney.edu.au/) is a digitised archive of audio recordings made in 1959 and 1960 (Mitchell & Delbridge 1965). The collection comprises recordings of the speech of 7082 high school students from 327 schools spread geographically across Australia. Students who participated in the recordings were aged between 16 and 18 years and were in their final year of schooling. The recordings were conducted in schools and were facilitated by teachers, who were instructed to ensure that the recorded speech was ‘spontaneous and unprepared’, that the sampling was to be random and that it was ‘most important that the speakers should not be selected according to the teacher’s knowledge of their ability as speakers’ (The University of Sydney 1998). Recordings were made on tape reels, which were subsequently sent back to the researchers by mail. Each speaker was recorded producing spontaneous speech in the form of a brief interview, as well as reading a list of six words and two sentences. The different tasks (interview, word list, sentence 1, sentence 2) are available as separate digitised files in wav format for each speaker, though several speakers and some tasks for individuals are missing from the database. Basic demographic data – speaker sex, place of birth of the speaker and both parents, father’s occupation – was also collected, although some details are missing for some speakers (see below).
In this study, we extracted recordings of the plane sentence for 1315 speakers (female: 761; male: 554). Speakers were selected based on the following criteria: they attended high schools located in the Sydney region, were born in Australia (most were born in New South Wales, the state of which Sydney is the capital, though some were born interstate), and had at least one parent who was born in Australia. Seventy-two per cent of speakers in the sample had Australian-born parents. As is to be expected when dealing with archival data of this type, many of the available files in the collection have poor audio quality; therefore, a further criterion was that the audio file for a particular speaker needed to be of sufficient quality, as determined by trained phoneticians, to enable both auditory analysis and visual inspection of the spectrograms. In addition, two phonetically trained researchers listened to each file to ensure that the speakers used an L1 Australian English accent. For speakers from a single Sydney school, no information was available regarding the place of birth of either the speaker or their parents. Data for 22 speakers from this school were nevertheless retained after the speakers were assessed as L1 AusE; two further speakers from this school were excluded on the basis of a non-L1 accent.
2.1.2 Australian Voices
The AusV corpus is a collection of audio recordings of 373 AusE speaking university and high school students, collected between 2004 and 2016 (Cox & Palethorpe Reference Cox and Palethorpe2008). The majority of participants were recorded in a sound attenuated studio in the Department of Linguistics at Macquarie University, Sydney. The data were recorded at a 44.1 kHz sampling rate using an AKG C535 EB microphone, Cooledit 2000 audio recording software via M-Audio delta66 soundcard to a Pentium 4 PC (one participant was recorded at 48 kHz sampling rate). A subset of the participants was recorded in a sound attenuated room at Western Sydney University (35 speakers) or in a quiet location in their own homes (eight speakers) at a 44.1 kHz sampling rate using an AKG C520 headset condenser microphone to a Marantz PMD661 MK II solid-state recorder.
Data were collected from participants belonging to two accent groups: a mainstream (MS) AusE speaking group and a non-mainstream (non-MS) AusE speaking group. The participants in the non-MS group were selected from the AusV corpus on the basis of Lebanese heritage. All participants, both MS and non-MS, were born in Australia and had completed all of their schooling in Australia. Participants produced one to four repetitions of the 18 stressed vowels of AusE in the standard /hVd/ frame, as well as one to four repetitions of 10 read sentences, including the plane and grass sentences described above. Some participants additionally produced the same vowels in a combination of /hVt/, /hV/, /hVl/, and /hVn/ frames. For this study, we extracted all available repetitions of the plane and the grass sentences for 131 MS speakers (female: 100; male: 31) and 53 non-MS speakers (female: 39; male: 14) aged between 18 and 30 years. The MS AusE speakers had at least one parent born in Australia with the other parent speaking L1 English. The non-MS speakers all had at least one parent born in Lebanon and were from high language contact communities.
2.1.3 AusTalk
The AusTalk corpus is a collection of speech recordings from 861 AusE speakers, aged between 18 and 83, recorded at 15 regionally diverse locations throughout Australia using 12 standardised portable recording stations between 2011 and 2015 (Burnham et al. Reference Burnham, Dominique Estival, Jette Viethen, Robert Dale, Julien Epps, Michael Wagner, Roland Göcke, Mark Onslow, Butcher and Hajek2011, Cassidy, Estival & Cox Reference Steve, Estival, Cox, Ide and Pustejovsky2017). Participants were audio-visually recorded using an array of microphones (for details including specific equipment and hardware see Burnham et al. Reference Burnham, Dominique Estival, Jette Viethen, Robert Dale, Julien Epps, Michael Wagner, Roland Göcke, Mark Onslow, Butcher and Hajek2011). The data selected here were recorded at a 44.1 kHz sampling rate using an AudioTechnica headworn AT892c microphone through an MAudio FastTrackUltra8R digital recording interface, then downsampled to 16 kHz. Each participant took part in three separate recording sessions in which they produced a range of scripted and spontaneous speech (see Burnham et al. Reference Burnham, Dominique Estival, Jette Viethen, Robert Dale, Julien Epps, Michael Wagner, Roland Göcke, Mark Onslow, Butcher and Hajek2011 for full details). The scripted speech recordings included a sentence reading task, which elicited a single production of each of the plane and grass sentences. For this study, we extracted recordings of the plane and the grass sentences produced by 25 speakers (female: 11; male: 14) aged between 18 and 30 from Sydney, having completed all of their schooling in Sydney, and with both parents born in Australia (with the exception of four participants who had one parent who was born in another country but spoke L1 English, and one participant who had one parent born in New Zealand and one parent born in the Netherlands). The data for these speakers supplement the data for the AusV MS speakers.
2.2 Annotation and acoustic analysis
All of the data were first processed by WebMAUS (Kisler, Reichel & Schiel Reference Kisler, Reichel and Schiel2017) utilising an AusE model, which returned textgrids segmented and aligned at the level of the phoneme. In each item the phrases containing the hiatus contexts under examination (the aerodrome, the airfield, the uncontrollable) were then hand checked with reference to the corresponding waveforms and wide-band spectrograms. The V1#V2 hiatus context was delimited according to the beginning of V1 (i.e. the vowel in the PVDA) and the end of V2 (the vowel on the right-edge of the hiatus, either /eː/ or /ɐ/). Phoneme boundaries were corrected where necessary according to the following criteria:
for all sentences, the onset of V1 was labelled at the onset of strong F2 as indicated by a marked intensity change and a concomitant increase in amplitude with a clear repeating waveform pattern indicating a vowel;
for all sentences, the end of V1 was labelled as follows:
∘ for items containing glottalisation at the end of V1 but separated from V2 by a full glottal stop closure, the end of V1 was marked following the last glottal pulse prior to glottal stop closure;
∘ for items containing glottalisation throughout the hiatus, the MAUS allocated boundary was checked as coinciding with an amplitude drop;
∘ for items containing continual modal phonation throughout V1 and V2, the MAUS allocated boundary was checked as coinciding with an amplitude drop and occurring after the peak of F2 for V1;
for plane sentence items produced with aerodrome, the end of V2 (/eː/) was labelled at the trough of F3 signalling the following rhotic;
for plane sentence items produced with airfield, the end of V2 (/eː/) was labelled at the end of strong F2/F3 and the onset of noise indicating onset of the following fricative;
for grass sentence items, the end of V2 (/ɐ/) in uncontrollable was labelled at the drop of amplitude concomitant with simplification of the waveform pattern and, in some cases, visible antiformants indicating the onset of the following nasal.
The V1 was additionally labelled according to whether it produced an auditory percept of /iː/ or /ə/. Items that contained a dysfluency associated with the target phrase were excluded from further analysis (eight items from the MD corpus and 24 items from the AusV corpus). According to Fox Tree & Clark (Reference Tree, Jean and Clark1997), greater use of /ðiː/ would be expected in the definite article before a dysfluency because words immediately preceding a dysfluency are more likely to have less reduced productions (see also Bell et al. Reference Bell, Daneil Jurafsky, Girand, Gregory and Gildea2003).
In addition, the phoneme boundaries for the vowel in the word ‘speed’ in the plane sentence were also corrected according to the criteria outlined above, to provide reference values for each participant’s /iː/ vowel in non-PVDA contexts. /iː/ from ‘speed’ is used as the reference because /iː/ is the expected (and standard) vowel used in the PVDA. The data were also labelled for the presence of glottalisation in the V1#V2 hiatus. Although we observed variation in type and duration of glottalisation, ranging from a brief period of glottalised (creaky) phonation at the hiatus juncture to a full glottal stop with a closure phase analogous to a (voiceless) oral stop closure (and in all but a few cases glottalised phonation on either side of the glottal stop closure – maximum closure duration 168 ms), we do not differentiate between types of glottalisation in this study (see also Davidson & Erker Reference Davidson and Erker2014; Garellek Reference Garellek2015; Penney et al. Reference Penney, Cox and Szakay2020, Reference Penney, Cox and Szakay2021). Rather, all items that exhibited evidence of glottalisation between these two extremes were treated as items in which glottalisation was used as a hiatus resolution strategy. Figure 1 gives an example of a spectrogram showing glottalised phonation at the hiatus juncture, as seen in the sudden change from regular, modal phonation to irregular, glottalised phonation, visible in the waveform and spectrogram. Figure 2 shows an example of a file containing a full glottal stop (as well as preceding and following glottalised phonation), as seen by the complete cessation of energy between the two vowels. Items exhibiting no evidence of glottalisation showed continuous formant structure from the onset of V1 to the offset of V2. Figure 3 shows an example of such an item that is perceived to contain a glide between the two vowels at the hiatus juncture and no glottalisation.
Two phonetically trained researchers were responsible for annotating the data. Intra- and inter-annotator agreement was assessed separately for auditory judgements of the V1 (/iː/ or /ə/) and for the presence or absence of glottalisation using the irr package (Gamer et al. Reference Gamer, Lemon, Fellows and Singh2019). Kappa values were very high in all cases (V1 quality inter-annotator: k = 0.975, 0.911; V1 quality intra-annotator: k = 0.969; presence/absence of glottalisation inter-annotator: k = 0.89, 0.896; presence/absence of glottalisation intra-annotator: k = 0.877).
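As a point of reference for how a kappa statistic of this kind is obtained, the following is a minimal sketch of unweighted Cohen's kappa for two annotators. It assumes that this is the variant that was computed (the text names only the irr package, not the specific function), and the function name and toy labels are hypothetical illustrations rather than data from the study.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: proportion of items given the same label by both annotators.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal proportions, summed over labels.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical toy judgements of V1 quality (not data from the study).
annotator_1 = ["i", "i", "schwa", "i", "schwa", "i", "i", "schwa", "i", "i"]
annotator_2 = ["i", "i", "schwa", "i", "i", "i", "i", "schwa", "i", "i"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because one category (/iː/, or absence of glottalisation in the archival data) is much more frequent than the other.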
The labelled data were converted into an emu database and analysed using emuR (Winkelmann, Harrington & Jänsch Reference Winkelmann, Harrington and Jänsch2017). Formant measurements for the modern dataFootnote 4 were extracted from Praat (Boersma & Weenink Reference Boersma and Weenink2020) using the PraatR package (Albin Reference Albin2014) with default settings, apart from the following, which were identified as optimal for the data: for female speakers, we calculated the first four formants within the 0–5000 Hz frequency range; for the male speakers, we calculated the first five formants within the 0–5000 Hz frequency range. F1 and F2 measures were hand checked for all vowels in the hiatus contexts and the reference /iː/ vowel in ‘speed’. In 31 items (10 plane; 21 grass) mistracked formants were hand corrected. These were generally due to errors in F2 and, in most cases, corrections were limited to one or two points within the vowel.
We extracted F1 and F2 measures for each item at the 0.65 point of V1. The 0.65 point was selected rather than a midpoint as the /iː/ vowel in AusE exhibits onglide and therefore a delayed target (Harrington, Cox & Evans Reference Harrington, Cox and Evans1997, Cox & Palethorpe Reference Cox and Palethorpe2007, Cox et al. Reference Cox, Palethorpe and Bentink2014a, Elvin, Williams & Escudero Reference Elvin, Williams and Escudero2016). We also calculated, for each speaker, average reference F1 and F2 values for /iː/ at the 0.65 point of all available repetitions of the vowel in the word ‘speed’. For each V1 we then calculated the Euclidean distance from the speaker’s mean F1 and F2 values of the reference /iː/ vowel in ‘speed’, to serve as a measure of how /iː/-like the V1 vowels were. The greater the Euclidean distance from the /iː/ in ‘speed’, the less /iː/-like/more schwa-like the V1. This value has the potential to vary according to gender due to differences in vocal tract length. We have not normalised formant values as this process could remove important non-physiological effects (see Hay et al. Reference Hay, Pierrehumbert, Walker and LaShell2015) and, as we have few data points per speaker, normalisation would not be appropriate in this analysis. Euclidean distance provides a transparent way of understanding distributions in the data. The issue of gender in the Euclidean distance measure will be addressed in the results section below.
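The distance measure described above can be sketched as follows. This is an illustration under stated assumptions only: the function names and the formant values are hypothetical, distances are computed on unnormalised values in Hz as the text indicates, and the original analysis was carried out in R rather than Python.

```python
import math
from statistics import mean

def reference_mean(tokens):
    """Mean (F1, F2) in Hz over a speaker's reference /iː/ tokens from 'speed'."""
    return mean(f1 for f1, _ in tokens), mean(f2 for _, f2 in tokens)

def distance_from_reference(v1, reference):
    """Euclidean distance in the F1-F2 plane between one V1 token and the reference mean."""
    return math.hypot(v1[0] - reference[0], v1[1] - reference[1])

# Hypothetical (F1, F2) values in Hz at the 0.65 point (not data from the study).
speed_tokens = [(350.0, 2500.0), (370.0, 2460.0)]
ref = reference_mean(speed_tokens)

i_like_v1 = (390.0, 2440.0)      # close to the speaker's /iː/
schwa_like_v1 = (520.0, 1700.0)  # lower F2 and higher F1, i.e. more central

d_i = distance_from_reference(i_like_v1, ref)
d_schwa = distance_from_reference(schwa_like_v1, ref)
print(d_i, d_schwa)
```

Because the reference is computed per speaker, each distance is relative to that speaker's own /iː/, although the raw Hz scale still allows vocal tract differences between speakers to affect the magnitude of the distances.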
2.3 Statistical analysis
Statistical analyses were conducted using the lme4 package (Bates et al. Reference Bates, Mächler, Bolker and Walker2015) in R (R Core Team 2020). Generalised linear mixed effects regression models (GLMER) were used to analyse the categorical variable of whether glottalisation was present or not. A linear mixed effects regression model (LMER) was used to analyse the continuous variable of Euclidean distance. The choice of (generalised) linear mixed effects models is appropriate because the data include multiple tokens produced by the same speakers, and this non-independence can be accounted for in mixed models through the inclusion of random effects (Baayen, Davidson & Bates Reference Baayen, Davidson and Bates2008).
For the diachronic analysis, we examined the 1315 items of the plane sentence from the MD data (female: 761; male 554) and compared these to the 447 items of the plane sentence from the modern data (i.e. AusV and AusTalk combined) produced by MS speakers only (female: 333; male 114). The data were analysed for the presence of glottalisation using a generalised linear mixed effects (GLMER) model. The categorical dependent variable was whether glottalisation was present or absent. Fixed factors were the auditorily-determined V1 quality (/iː/, /ə/), gender (female, male), and time period (1960s, 2010s). We also included two-way interactions between gender and time period, and between gender and V1 quality. Note that models which included all two- or three-way interactions did not converge. Random intercepts were included for speaker. This was the maximal random effects structure to converge.Footnote 5
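In conventional notation, and purely as an illustrative sketch, the structure of the diachronic GLMER described above can be written as follows. The subscripts (item i, speaker j), the coefficient labels and the use of indicator variables are assumptions introduced here for exposition; the contrast coding used in the fitted model is not specified in this section.

```latex
\operatorname{logit} P(\text{glott}_{ij} = 1)
  = \beta_0
  + \beta_1\,\text{V1}_{ij}
  + \beta_2\,\text{gender}_{j}
  + \beta_3\,\text{period}_{j}
  + \beta_4\,(\text{gender}_{j} \times \text{period}_{j})
  + \beta_5\,(\text{gender}_{j} \times \text{V1}_{ij})
  + u_{j},
\qquad u_{j} \sim \mathcal{N}(0, \sigma_u^{2})
```

Here u_j is the by-speaker random intercept. The V1 quality by time period interaction and the three-way interaction are absent because, as noted above, models containing them did not converge.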
For the synchronic analysis, we examined all of the items from the modern data (i.e. AusV and AusTalk combined). This included the plane and grass sentences for both MS and non-MS speakers (plane: MS female: 333; MS male: 114; non-MS female: 84; non-MS male: 56; grass: MS female: 330; MS male: 112; non-MS female: 128, non-MS male: 53). To analyse the presence of glottalisation, we fitted a GLMER model to these data with the categorical dependent variable of whether glottalisation was present or absent. Fixed factors were gender (female, male), accent group (MS, non-MS), and V2 context (/eː/, /ɐ/). All two- and three-way interactions between these factors were included. We also included auditorily-determined V1 quality as a covariate, random intercepts for speaker, and random slopes for V2 context by speaker. This was the maximal random effects structure to converge.Footnote 6
In addition, in order to analyse the V1 quality and the extent to which it varied from a speaker’s typical /iː/ vowels, we fitted a linear mixed effects (LMER) model using the lme4 package (Bates et al. Reference Bates, Mächler, Bolker and Walker2015) with the Euclidean distance of V1 from the vowel in ‘speed’ as dependent variable. Fixed factors were gender (female, male), accent group (MS, non-MS), and V2 context (/eː/, /ɐ/). All two- and three-way interactions between these factors were also included. Random intercepts were included for speaker, and random slopes were included for V2 context by speaker.Footnote 7
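The LMER for Euclidean distance can be sketched in the same way. Again, the notation (d for the distance of item i from speaker j's reference /iː/, the coefficient labels, and the indicator variables) is an assumption made for illustration and is not taken from the fitted model output.

```latex
d_{ij}
  = \beta_0
  + \beta_1\,\text{gender}_{j}
  + \beta_2\,\text{accent}_{j}
  + \beta_3\,\text{V2}_{ij}
  + \beta_4\,(\text{gender}_{j} \times \text{accent}_{j})
  + \beta_5\,(\text{gender}_{j} \times \text{V2}_{ij})
  + \beta_6\,(\text{accent}_{j} \times \text{V2}_{ij})
  + \beta_7\,(\text{gender}_{j} \times \text{accent}_{j} \times \text{V2}_{ij})
  + u_{0j} + u_{1j}\,\text{V2}_{ij}
  + \varepsilon_{ij},
\qquad
(u_{0j}, u_{1j}) \sim \mathcal{N}(\mathbf{0}, \Sigma),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2})
```

The term u_{0j} is the by-speaker random intercept and u_{1j} is the by-speaker random slope for V2 context, which allows the size of the context effect to differ between speakers.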
When reporting effects from the models below, we interpret significant effects as those with p-values below 0.05. The p-values were calculated by Type III tests conducted on each model with the afex package (Singmann et al. Reference Singmann, Ben Bolker, Aust and Ben-Shachar2021), using likelihood ratio tests for the GLMER models and Kenward–Roger approximation for degrees-of-freedom for the LMER model. Only highest-order terms in which a factor is involved are reported (i.e. we do not report simple effects for factors involved in significant interactions). Summaries of each model are available in Appendix Tables A1, A3, and A7. Output model summaries including parameter estimates, standard error, and z/t statistics are available in Tables A2, A4, and A8. Pairwise comparisons were made for significant interaction terms using the emmeans package with Tukey HSD corrections (Lenth Reference Lenth2020).
3 Results
3.1 Diachronic analysis (MD 1960s vs. Modern Mainstream 2010s data)
We found that hiatus was more frequently resolved by glottalisation in the modern data than in the archival data: 66% of items (295/447) in the modern data were produced with glottalisation, compared to 9% (116/1315) in the archival data. There was a significant interaction between gender and time period (χ² = 23.91; p < .0001). Female speakers were more likely to produce glottalisation than male speakers in both time periods, and post-hoc comparisons confirmed that the difference was significant in both the archival and the modern data (both p < .0001). The interaction shows that female speakers displayed a larger increase in the use of glottalisation over time compared to males. Figure 4 illustrates the proportion of items in which glottalisation was used to resolve hiatus in the archival and modern data according to speaker gender.
We also found a significant interaction between gender and V1 quality (χ² = 5.44; p = .02). For items in which the V1 contained a schwa, female speakers produced glottalisation in 97% of cases (113/116); on the other hand, they produced glottalisation in 22% of items where the V1 vowel was /iː/ (215/978). For the male speakers, 60% of items produced with V1 schwa showed glottalisation (29/48), compared to only 9% of items with V1 /iː/ (54/620). Post hoc comparisons confirmed that glottalisation was more likely to be produced in conjunction with a V1 schwa for both female (p < .0001) and male (p = .015) speakers and that females produced more glottalisation than males for both V1 /iː/ and /ə/ (both p < .0001) but more so for schwa.
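The counts reported for the gender by V1 quality breakdown pool the two time periods, so they can be cross-checked against the item totals given for the diachronic analysis in Section 2.3 and the overall glottalisation counts reported above. The short sketch below does this arithmetic; the variable names are illustrative, and the figures are those reported in the text.

```python
# Counts reported for the pooled diachronic data, keyed by (gender, V1 quality):
# (items produced with glottalisation, all items).
counts = {
    ("female", "schwa"): (113, 116),
    ("female", "i"): (215, 978),
    ("male", "schwa"): (29, 48),
    ("male", "i"): (54, 620),
}

# Percentage of glottalised items in each cell.
for (gender, v1), (glottalised, total) in counts.items():
    print(f"{gender:6s} V1={v1:5s} {glottalised}/{total} = {100 * glottalised / total:.1f}%")

# Marginal totals, for comparison with the figures reported elsewhere.
total_glottalised = sum(g for g, _ in counts.values())
total_items = sum(t for _, t in counts.values())
female_items = sum(t for (gender, _), (_, t) in counts.items() if gender == "female")
male_items = total_items - female_items
print(total_glottalised, total_items, female_items, male_items)
```

The marginal totals agree with the 1315 archival and 447 modern items (761 + 333 female; 554 + 114 male) and with the 116 + 295 glottalised items reported for the two periods.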
As the model did not converge when three-way interaction terms were included, we were unable to analyse a potential three-way interaction between gender, V1 quality and time period. However, descriptive details showing all three variables are included in Tables 1 and 2. Table 1 shows the proportion of items produced with V1 /iː/ and /ə/ by female speakers in each time period according to whether glottalisation was present or absent. The 1960s females glottalised 11% of all items. Less than 1% (3 items) of the items not produced with glottalisation were produced with schwa. The 2010s females glottalised 74% of all items. All of the items not produced with glottalisation were produced with /iː/. Figure A1 in the Appendix illustrates how the various categories are distributed in the 1960s and 2010s datasets for female speakers.
Table 2 shows the proportion of items produced with V1 /iː/ and /ə/ by male speakers in each time period according to whether glottalisation was present or absent. The 1960s males glottalised 6% of all items. 3% of the non-glottalised items were produced with schwa. The 2010s males glottalised 42% of all items. Only one of the items not produced with glottalisation was produced with schwa. Figure A1 in the Appendix illustrates how the various categories are distributed in the 1960s and 2010s datasets for male speakers.
3.2 Synchronic analysis (Mainstream and Non-Mainstream 2010s data)
3.2.1 Analysis of glottalisation
Synchronic analysis of the two sentence contexts showed that hiatus was frequently resolved by glottalisation in both the MS and non-MS accent groups, but the non-MS speakers used glottalisation more frequently: MS 70% glottalised (620/889); non-MS 82% glottalised (264/321). Although there is some intraspeaker variation, many speakers use glottalisation categorically in these data (see Figure A2). We found a significant effect of V1 quality (χ² = 123.64; p < .0001), with glottalisation more likely when the V1 was identified as schwa. We also found a significant two-way interaction between gender and V2 context (χ² = 8.63; p = .003). Post hoc comparisons showed that females produced similarly high levels of glottalisation in both V2 contexts: /eː/ (airfield) and /ɐ/ (uncontrollable); males produced lower levels of glottalisation than females, particularly for /eː/ (airfield) (p < .0001). This effect is illustrated in Figure 5, which shows the proportion of items in which glottalisation was used to resolve hiatus in both V2 contexts according to speaker gender.
We also found an interaction between accent group and V2 context (χ² = 7.19; p = .007). For the MS speakers, more glottalisation was present in the /ɐ/ V2 context compared to the /eː/ V2 context, whereas for the non-MS speakers, high levels of glottalisation were similar in both V2 contexts. Post hoc comparisons confirm a significant difference between V2 context for the MS speakers (p = .0101), whereas no difference was found for the non-MS speakers. Figure 6 shows the proportion of items in which glottalisation was used to resolve hiatus in the two V2 contexts according to accent group, and illustrates the reduced rate of glottalisation in the /eː/ context for the MS speakers. There was no significant three-way interaction.
Appendix Tables A5 and A6 provide the proportion of items produced by female and male speakers respectively in the synchronic analysis with V1 /iː/ and /ə/ according to accent group, V2 context, and whether glottalisation was present or absent. Figure A3 illustrates these results.
3.2.2 Acoustic analysis of PVDA vowel (V1) quality
As detailed above, for each V1 we calculated the Euclidean distance from the speaker’s mean F1 and F2 value of /iː/ in the word ‘speed’. Figure 7 shows F1 and F2 values for all V1 realisations (in black) and for all cases of the reference vowel /iː/ (in grey), according to accent group and gender. This figure illustrates that while the vowel quality of the reference /iː/ vowel remained compact, the value of V1 in the PVDA demonstrated considerable variation in both F1 and F2.
The LMER analysis of the Euclidean distance between V1 and the reference /iː/ vowel showed a significant interaction between gender and V2 context (F(1,189.79) = 4.09; p = .044). Given differences in vocal tract size between females and males, we would expect females to show greater differences than males. Both female and male speakers produced vowels with greater Euclidean distances in the V2 /ɐ/ context than in the V2 /eː/ context, and in both contexts females produced greater Euclidean distances than males, although the difference between males and females was greater in the /ɐ/ context. Post hoc comparisons showed that the differences between females and males were significant in both contexts (/ɐ/: p = .0031; /eː/: p = .0030). Although the interaction was significant based on our criterion of p < .05, we note that it is not a strong effect, and given the significant pairwise differences between males and females in both V2 contexts, it is likely that this interaction is driven by the gender differences. Note that as expected the simple effect of gender was highly significant at p < .0001 as shown in Table A7.
We also found a significant interaction between accent group and V2 context (F(1,189.79) = 22.06; p < .0001). Post hoc comparisons showed that, within each group, Euclidean distances differed significantly across V2 contexts (both p < .0001). In addition, the two groups differed significantly from one another in both the V2 /eː/ context (p = .0012) and in the V2 /ɐ/ context (p < .0001). In both, the non-MS speakers produced substantially greater Euclidean distances compared to the MS speakers; however, this was most apparent in the V2 /ɐ/ context: both groups showed greater Euclidean distances from the reference vowel in the context of a V2 /ɐ/ vowel, which is indicative of a less /iː/-like/more schwa-like V1 quality in this context compared to the V2 /eː/ context, with the greatest values (i.e. most schwa-like vowels) produced by the non-MS speakers. This can be seen in Figure 8, which shows Euclidean distance from the reference vowel according to accent group and V2 context, with lower values representing more /iː/-like V1.
4 Discussion
The goal of our analysis was to investigate whether AusE is participating in changes to the PVDA and the management of the associated hiatus context that have been documented for other varieties of English. In doing so, we hoped to shed light on the process of change and the factors that may influence its progression. Using a restricted set of read-speech data from each end of a 50-year time span, the diachronic analysis showed that glottalisation was very infrequent in the hiatus context examined here in the speech of adolescents recorded in 1959 and 1960, at just 8.8% (116/1315) compared with 66% (295/447) of items in the MS data from the modern dataset. Use of schwa in the PVDA was also rare in the 1960s dataset, only occurring in 2.6% (34/1315) of items compared with 29% (130/447) in the modern MS dataset.
A significant interaction between gender and time period in the use of glottalisation showed, as predicted, that females were progressive with respect to this feature, vastly increasing their usage from 11% to 74% of items across the 50-year period. Males increased usage as well (from 6% to 42%) but did not reach the same level of use as females. An interaction between gender and V1 quality (the PVDA vowel) showed that glottalisation was more likely following a schwa vowel for all speakers but more so for females. For female speakers, items containing V1 schwa were glottalised in 97% of cases compared to only 22% of items containing V1 /iː/. Similarly for males, glottalisation was more likely for V1 schwa (60%) than for V1 /iː/ (9%). Glottalisation appears to be the modern solution to the management of the hiatus in these PVDA contexts.
The diachronic analysis showed greater use of glottalisation and schwa in modern data compared to archival data and a close association between these two features indicating a change to definite article allomorphy in line with observations from the UK (Britain & Fox Reference Britain, Fox, Filppula, Klemola and Paulasto2009, Cheshire et al. Reference Cheshire, Kerswill, Fox and Torgersen2011, Fox Reference Fox2015), the US (Todaka Reference Todaka1992, Keating et al. Reference Keating, Byrd, Fleming and Todaka1994) and New Zealand (Hay et al. Reference Hay, Walker, McKenzie and Nielsen2012). Interestingly, the use of glottalisation to manage the hiatus is more common than the use of schwa in the PVDA. We shall return to this finding below.
The synchronic analysis was designed to investigate four variables with respect to the deployment of glottalisation in hiatus management: V1 quality (/iː/ vs. schwa), gender (female vs. male), accent type (MS vs. non-MS) and V2 context (high /eː/ vs. low /ɐ/). Consistent with the diachronic analysis and findings from the literature (Todaka Reference Todaka1992, Keating et al. Reference Keating, Byrd, Fleming and Todaka1994, Raymond et al. Reference Raymond, Fisher and Healy2002, Gaskell et al. Reference Gaskell, Helen Cox, Grieve and O’Brien2003, Hay et al. Reference Hay, Walker, McKenzie and Nielsen2012), glottalisation was more prevalent when the vowel in the PVDA was schwa. Females again showed that they were progressive with respect to the use of glottalisation compared to males, with high levels in both V2 contexts. Males on the other hand used less glottalisation in the case of the following high vowel /eː/ compared to the low /ɐ/. An accent group by V2 interaction showed that the non-MS speaker group were more progressive than the MS speakers, with very high levels of glottalisation in both V2 contexts. The MS speakers, however, varied across V2 contexts, with greater usage when the low vowel /ɐ/ followed. These findings suggest, as predicted, that females and non-MS speakers are at the forefront of the change. The high levels of glottalisation in both V2 contexts suggest stabilised usage for non-MS and female speakers. For the MS males (who have the lowest levels of glottalisation), increased use of glottalisation is evident in the low V2 context compared to the high vowel context. Cross linguistically, glottalisation is known to favour low vowels (Pompino-Marschall & Żygis Reference Pompino-Marschall, Żygis, Weirich and Jannedy2010; Brunner & Żygis Reference Brunner, Żygis and Zee2011; Malisz, Żygis & Pompino-Marschall Reference Malisz, Żygis and Pompino-Marschall2013; Hejná & Scanlon Reference Hejná and Scanlon2015; Penney et al. Reference Penney, Cox, Miles and Palethorpe2018, Reference Penney, Cox and Szakay2021). We might speculate that glottalisation in the PVDA hiatus context could have first arisen in such low vowel contexts. Future research is necessary to explore this possibility.
In order to examine the characteristics of the vowel in the PVDA in greater detail, F1 and F2 values were extracted for V1 and tokens of the vowel /iː/ in the word ‘speed’ from the plane sentence in the modern dataset. We calculated, for each V1, the Euclidean distance from the related speaker’s mean F1 and F2 values in ‘speed’ as an index of how /iː/-like the realisation of the vowel in the PVDA was. The results from the Euclidean distance analysis show that the non-MS speakers produced less /iː/-like/more schwa-like vowels in the PVDA than the MS speakers, particularly when followed by the low vowel /ɐ/. The more schwa-like PVDA vowel in the speech of non-MS speakers was predicted on the basis of findings from varieties of English that are undergoing change in response to increased linguistic and cultural diversity and the resulting language contact environment. These results support Trudgill’s (Reference Trudgill, Alexandra and Dixon2017) model of language change which predicts that a language variety of a high contact community, such as that represented by our non-MS group, may undergo change processes that lead to simplification.
The finding that the V2 low vowel conditions more schwa-like productions in the PVDA does not support the observation in Meyerhoff et al. (Reference Meyerhoff, Alexandra Birchfield, Charters and Watson2020) that schwa was more common in contexts containing high front vowels than in contexts containing low vowels via a dissimilation process (although their findings were inconsistent for long low vowels). Gaskell et al. (Reference Gaskell, Helen Cox, Grieve and O’Brien2003) also found greater use of schwa in the PVDA when the following vowel was a high front vowel /iː/ or /ɪ/. However, an examination of their stimuli reveals that a greater proportion of words in their non-high front vowel set began with an unstressed vowel compared with the high front vowel set. Unstressed vowels are more likely to be preceded by PVDA /iː/ (Anderson et al. Reference Anderson, Jean Arnold, Christina Evans, Danae McConnel, Nielson and Walker2004, cited in Britain & Fox Reference Britain, Fox, Filppula, Klemola and Paulasto2009). Our results do not support greater use of schwa in the higher vowel context. Instead, we found that the low vowel V2 context facilitated the use of schwa in the PVDA. This could perhaps be explained as an assimilatory process whereby the unstressed vowel in the PVDA is highly coarticulated with the following low vowel. As the present analysis is restricted to just two contexts, an opportunity exists for future research to examine the following vowel context in more detail.
Returning now to the possible actuation of the change to definite article allomorphy, Hay et al. (Reference Hay, Walker, McKenzie and Nielsen2012) suggested that glottalisation could be considered a boundary recovery strategy following reduction and analogical use of schwa in the PVDA. Under this approach we would predict that schwa in the PVDA would precede the use of glottalisation in historical allomorphic change. Our findings, however, do not support this progression. Instead, glottalisation appears to precede the use of schwa, in that a greater proportion of items contained glottalisation than contained schwa in the PVDA in both datasets examined in the diachronic and synchronic analysis. In other words, items containing /iː/ co-occurred with glottalisation yet items containing schwa rarely occurred without glottalisation. This finding may provide an insight into the actuation of the change. Why might glottalisation arise in the V1#V2 hiatus context examined here? One possible explanation is that in non-glottalised items a trough may occur in the vicinity of the V1#V2 syllable boundary. The trough could result from tongue musculature deactivation (analogous to the ‘trough effect’ phenomenon that has been observed in the case of intervocalic labial stops) (Lindblom et al. Reference Lindblom, Sussman, Modarresi and Burlingame2002, Fuchs et al. Reference Fuchs, Hoole, Brunner, Inoue, Slifka, Manuel and Matthies2004, Vazquez-Alvarez & Hewlett Reference Vazquez-Alvarez and Hewlett2007). It is well known that the interrelationship between supralaryngeal and laryngeal structures affects phonation (see Chen, Whalen & Tiede Reference Chen, Whalen and Mark2021 for a review). An articulatory and/or aerodynamic trough in V1#V2 sequences may affect phonation, leading to glottalisation under certain conditions (Hanson et al. Reference Hanson, Stevens, Jeff Kuo, Chen and Slifka2001; Slifka Reference Slifka2006, Reference Slifka2007).
Of course, this suggestion for the actuation of PVDA allomorphic change remains highly speculative. We are currently examining f0 and intensity in non-glottalised hiatus contexts to explore the intervocalic trough hypothesis further.
In summary, the results of the diachronic and synchronic analysis indicate change to definite article allomorphy in AusE, bearing in mind the restricted nature of the dataset. The finding that the incidence of glottalisation was more extensive than the incidence of schwa, occurring in both PVDA schwa and /iː/ contexts, may indicate that glottalisation could have been the initiating factor in the change. It remains to be determined how glottalisation could have developed spontaneously in this context. Further examination of V#V hiatus could shed new light on the phonetic processes that may be used to create the percept of a boundary and ultimately initiate change towards glottalisation. Hiatus-breaking glottalisation provides a syntagmatic contrast between the adjacent vowels. Therefore, the use of /iː/ in the PVDA becomes redundant because the intervening glide no longer surfaces. This process may lead to regularisation of the definite article in the form of /ðə/ through analogy.
We have also shown that changes to definite article allomorphy in these data are more advanced amongst non-MS AusE speakers and in females, consistent with previous analyses from other varieties of English and highlighting the importance of linguistic diversity in language change. This change may be driven by the reduction of redundancy leading to simplification of the system through analogical levelling (see Trudgill Reference Trudgill and Hickey2010, Reference Trudgill, Alexandra and Dixon2017). Trudgill (Reference Trudgill, Alexandra and Dixon2017: 144) suggests that simplification may occur in contact situations that have required large-scale second language learning by adults and adolescents ‘under demographic and social conditions which are such that the simplification that results from the removal of linguistic L2-difficult features also becomes part of the speech of later generations of native speakers’. He also indicates that a variety spoken in high-contact communities where mutually intelligible varieties exist will undergo some simplification. This scenario is applicable to modern-day Sydney where many communities have low numbers of English-only households and where a large proportion of the population were not born in Australia. One possible explanation for the greater prevalence of glottalisation in the non-MS speaker group compared to the MS speaker group relates to community language. All of our non-MS speakers, while born in Australia, are from a Lebanese background and most would be exposed to Arabic in the home and the community, although for many of our participants English was their first (or simultaneous) language. As Arabic does not allow onsetless syllables (see e.g. Khattab Reference Khattab2013), an epenthetic glottal stop or glottalisation is predicted in this hiatus context for L1 speakers of Arabic.
If it is the case that the use of glottalisation to manage hiatus relates to this aspect of Arabic phonology, we would expect a similar strategy to be used in other hiatus contexts for this group of speakers, but further research is needed to explore this suggestion. Of course, this explanation cannot account for the increased use of glottalisation in our MS population of speakers (both in the diachronic and synchronic analyses) who would not have exposure to Arabic. Perhaps this aspect of Arabic phonology has a facilitation effect in enhancing the use of a variant in our non-MS population that is more generally present in the wider AusE speaking community.
Production of glottalisation in the management of hiatus reduces the need for the high front vowel in the PVDA, hence the use of the single form containing schwa. As indicated above, there are several possible explanations for our findings. Understanding why such a process is more advanced in the non-MS group remains a question for future research. It will be interesting to consider whether speakers from different language backgrounds also use glottalisation in the same way. Further research is needed to understand why speakers choose to either enhance or reduce the percept of a boundary, through glottalisation or gliding respectively, in these hiatus contexts. One explanation may relate to the potential for strengthening the word boundary (see Dilley et al. Reference Dilley, Shattuck-Hufnagel and Osterndorf1996) which may be more important in diverse communities where communication challenges may exist.
The items examined in the current study consisted of a small set of highly controlled contexts. The choice of data was dictated by availability in the historical archive providing a consistent but controlled context across a 50-year timespan. In order to test the generalisability of the findings it will be important to examine a more extensive set of data with variable phonetic and prosodic contexts including a range of collocational frequencies.
It is interesting that the deployment of glottalisation is becoming common in AusE in a number of contexts including the implementation of coda voicelessness (Penney et al. Reference Penney, Cox, Miles and Palethorpe2018, Reference Penney, Cox and Szakay2020, Reference Penney, Cox and Szakay2021), the use of creaky voice quality (Dallaston & Docherty Reference Dallaston and Docherty2019, White et al. Reference White, Joshua Penney, Szakay and Cox2021), and in hiatus management (Cox et al. Reference Cox, Palethorpe, Buckey and Bentink2014b; Yuen et al. Reference Yuen, Cox and Demuth2017, Reference Yuen, Cox and Demuth2018). A fruitful area for future research could be to explore the intriguing relationship between these various segmental, sociophonetic, and prosodic uses for glottalisation in the phonological toolkit.
5 Conclusion
The findings reported here support suggestions of change to definite article allomorphy in AusE. The PVDA is associated with increased use of glottalisation to manage hiatus and a concomitant increase in the use of schwa-like vowels. This is particularly true of the non-MS speakers who represent culturally and linguistically diverse communities in this study. The results raise questions about the role of linguistic diversity in language change, suggesting parallels with Multicultural London English where diversity is driving change towards definite article regularisation (Cheshire et al. Reference Cheshire, Kerswill, Fox and Torgersen2011, Fox Reference Fox2015). Further research is needed with a more extensive range of data and speakers to determine whether the observed changes are linked to widespread effects involving the management of hiatus, and the use of glottalisation more generally, and to further explore data that may help us unravel the conditions that may have led to the actuation and spread of the change.
Acknowledgements
An earlier version of this research was presented at the 2020 annual conference of the Australian Linguistic Society; we thank participants for their feedback and suggestions. We also thank Linda Buckley, Benjamin Purser, and Elliot Peck for annotation and labelling, Andy Gibson and members of the Macquarie University Phonetics Lab for their comments, Raphael Winkelmann for assistance with emuR, and Serje Robidoux for statistical advice. This work was supported by an Australian Research Council Future Fellowship grant (FT180100462) to the first author. We gratefully acknowledge the valuable and insightful comments and suggestions of Associate Editor Oliver Niebuhr and two anonymous reviewers.
Appendix. Additional material