1 Introduction
Apical vowels are syllabic segments such as the nuclei [ɹ̩] and [ɻ̩] of the syllables [sɹ̩55] ‘think’ and [ʂɻ̩55] ‘lion’ in Mandarin Chinese, which typically occur after the alveolar sibilants [s ts tsʰ] and the retroflex sibilants [ʂ tʂ tʂʰ] respectively, as illustrated in Table 1 (footnote 1). In Mandarin Chinese, these apical segments share the places of articulation of their preceding alveolar and retroflex sibilants (Chao 1930, Ladefoged & Maddieson 1996) and can be regarded as the ‘vocalized prolongation’ of their preceding consonants (Chao 1934). Like vowels such as [i] and [a], the apical vowels [ɹ̩] and [ɻ̩] in Mandarin Chinese have formant patterns (Howie 1970), and there is no visible change in formant transitions between an alveolar/retroflex sibilant and a following apical vowel [ɹ̩/ɻ̩] (Lee & Li 2003), which corresponds to their homorganicity. There has been controversy in the literature about the phonetic status of the apical vowels [ɹ̩] and [ɻ̩] in Mandarin Chinese, which have been described as syllabic consonants (Hartman 1944, Hockett 1967, Duanmu 2007), syllabic approximants (Lee & Zee 2003, Zee & Lee 2004, Lee-Kim 2014), fricative vowels (Ladefoged & Maddieson 1996), etc., based on different aspects of their phonetic properties. In this study, we refer to apical vowels using the IPA symbols [ɹ̩] and [ɻ̩], following Kong, Wu & Li (published online 15 July 2022). In terms of their phonological properties, the two apical vowels in Mandarin Chinese are in complementary distribution with the phonetic vowel [i], as allophones of the phoneme /i/.
In the current study, the term ‘apical vowel’ is used to refer to these two segments in Mandarin Chinese and their counterparts in other Chinese dialects with similar phonetic properties. Apical vowels are reported to be widely distributed across Chinese dialects (Lee & Zee 2017, Li 2017, among others), e.g. in 85% of a sample of 170 Chinese dialects in Li’s (2017) typological survey.
1.1 The articulation of apical vowels
The articulatory gestures of the apical vowels [ɹ̩]/[ɻ̩] in Mandarin Chinese are similar to those of their preceding alveolar/retroflex sibilants, and their production can be impressionistically recognized as a voiced extension of the sibilant onsets to carry the syllables (Chao 1968). Instrumental investigations of apical vowels in Mandarin Chinese and other Chinese dialects have revealed some of their general articulatory properties. First, there are slight differences between the tongue gestures of a preceding alveolar/retroflex sibilant and a following apical vowel, as observed in articulatory studies of Mandarin Chinese (Chen 2011, Lee-Kim 2014, Chen et al. 2015, Faytak & Lin 2015), Jixi-Hui Chinese (Shao 2020, Shao & Ridouane, published online 19 January 2023) and Suzhou Chinese (Ling 2009, Faytak 2018, Hu & Ling 2019). Second, different apical vowels have their own articulatory gestures (Lee-Kim 2014, Faytak & Lin 2015), as observed in the X-ray study of the two apical vowels in Mandarin Chinese (Zhou & Wu 1963), which showed that the alveolar [ɹ̩] involves a more front constriction and the retroflex [ɻ̩] a more back one. Bao (1984) further noted that the articulation of the two apical vowels in Mandarin Chinese is characterized by a concavity in the tongue shape. For apical vowels in Ningbo Chinese, Hu’s (2005) study showed that their articulation involves the tongue apex as well as the tongue dorsum. Third, inter-speaker variation has been observed in previous studies, in particular for the apical [ɹ̩] and [ɻ̩] in Mandarin Chinese, with a wide variety of lingual adjustments such as tongue dorsum lowering, tongue blade lowering, and tongue raising (Chen 2011, Faytak & Lin 2015, Huang, Hsieh & Chang 2021).
1.2 Apical vowels in Hefei Mandarin
This study focuses on apical vowels in Hefei Mandarin, which is a branch of Jianghuai Mandarin spoken in Anhui Province, China. As in Mandarin Chinese (illustrated in Table 1) and many other Chinese dialects, the two apical vowels [ɹ̩] and [ɻ̩] in Hefei Mandarin are both in complementary distribution with the vowel [i] and occur after homorganic alveolar and retroflex sibilants respectively. In other respects, however, apical vowels in Hefei Mandarin have different phonological properties (Meng 1962, 1997; Li 1997), as illustrated in Table 2 (footnote 2).
First, while many other Chinese dialects have one or two apical vowels (e.g. [ɹ̩] and [ɻ̩] in Mandarin Chinese), Hefei Mandarin has three phonetic apical vowels: the alveolar unrounded [ɹ̩], the alveolar rounded [ɹ̩ʷ], and the retroflex unrounded [ɻ̩] (footnote 3). With its three apical vowels, Hefei Mandarin was recognized as one of the Chinese dialects with the largest number of apical segments (Baron 1974) (footnote 4). The rounded alveolar apical [ɹ̩ʷ], as in [sɹ̩ʷ213] ‘allow’, is relatively rare across Chinese dialects. For example, it is observed in only two of the 88 Chinese dialects in the survey of Lee & Zee (2017), although its counterparts exist in a number of Wu dialects (Zhu 2004). Suzhou Chinese, for instance, has contrastive unrounded and rounded apical vowels, phonetically realized as syllabic fricatives [z] and [zʷ], with a loose degree of constriction (Faytak 2018) (footnote 5).
Second, the alveolar apical [ɹ̩] in Hefei Mandarin appears after homorganic consonants (e.g. [sɹ̩213] ‘dead’) as well as after non-homorganic ones (e.g. [pɹ̩213] ‘to compare’) as seen in Table 2. It thus differs from the apical [ɹ̩] in Mandarin Chinese (see Table 1) which occurs only after a homorganic alveolar sibilant such as in [sɹ̩55] ‘think’.
Third, the unrounded [ɹ̩] and the rounded [ɹ̩ʷ] are phonologically contrastive in Hefei Mandarin (e.g. [sɹ̩213] ‘dead’ vs. [sɹ̩ʷ213] ‘allow’), as seen in Table 2, while apical vowels in other languages are usually in complementary distribution with each other, e.g. the two apical vowels [ɹ̩] and [ɻ̩] in Mandarin Chinese appear respectively after alveolar and retroflex sibilants such as in [sɹ̩55] ‘think’ and [ʂɻ̩55] ‘wet’.
In addition to these properties, the three apical segments in Hefei Mandarin are all in complementary distribution with the high front vowel [i], as seen in Table 2, although there is a phonological contrast between the unrounded [ɹ̩] and the rounded [ɹ̩ʷ].
Previous studies have shown that the apical vowels in Hefei Mandarin have clear formant structure (Li 1997, Meng 1997, Hou 2007, Kong et al., published online 15 July 2022), similar to their counterparts in Mandarin Chinese, and frication noise, as observed in studies such as Hou (2007) and Kong, Wu & Li (2019). Figure 1 gives an illustration of the waveforms and spectrograms of the syllables [sɹ̩213] ‘wash’, [sɹ̩ʷ213] ‘to allow’, [ʂɻ̩213] ‘arrow’, [pɹ̩213] ‘to compare’, [tsɹ̩213] ‘to squeeze’, and [zɹ̩213] ‘gift’, produced by a male speaker of Hefei Mandarin (M02) (footnote 6). As illustrated in Figure 1f, the apical [ɹ̩] differs from the voiced alveolar fricative [z] by having a clear formant structure, which is relatively consistent when [ɹ̩] appears after different consonants, e.g. [s] [ts] [z] [p]. As shown in Figure 1a–c, [ɹ̩] [ɹ̩ʷ] [ɻ̩] have different formant patterns, with the F2 of [ɻ̩] slightly higher than those of [ɹ̩] and [ɹ̩ʷ]. Based on data from three female and three male speakers, Hou (2007) reported F1 and F2 values of the three segments in Hefei Mandarin, as shown in Table 3. This table also includes mean F1 and F2 values of the two apical vowels [ɹ̩] and [ɻ̩] in Mandarin Chinese as reported in Lin & Wang (1992) (footnote 7). As shown in Table 3, F1 and F2 values of [ɹ̩] and [ɻ̩] in Hefei Mandarin are similar to those of their counterparts in Mandarin Chinese across the male and female speakers; the mean F1 and F2 values of the rounded alveolar [ɹ̩ʷ] in Hefei Mandarin are slightly lower than the corresponding values of the unrounded alveolar [ɹ̩] for both female and male speakers. F3 of [ɻ̩] in Hefei Mandarin was reported to be lower than that of [ɹ̩] while being close to that of [ɹ̩ʷ] (Kong et al., published online 15 July 2022).
Note: HF = Hefei Mandarin, data from Hou (2007) based on three female and three male speakers; MC = Mandarin Chinese, data from Lin & Wang (1992) based on four female and four male speakers.
In contrast to the many acoustic studies, there have been few articulatory studies on the apical vowels in Hefei Mandarin, which is representative of an under-studied group of languages differing drastically from Mandarin Chinese in the inventory and phonotactics of apical vowels. In particular, the articulatory properties of the typologically rare rounded apical [ɹ̩ʷ] have not been investigated, with the exception of a handful of studies of other Chinese dialects, namely Suzhou Chinese (Ling 2009) and Ningbo Chinese (Hu 2005). Both of these dialects differ from Hefei Mandarin in not having the retroflex [ɻ̩] in their inventories. A case study on the apical vowels in Hefei Mandarin is therefore expected to contribute new material to the investigation of apical vowels and syllabic segments in general. Given the special properties of Hefei Mandarin as compared with many other languages, this study aims to address the following issues:
1. What are the tongue gestures involved in the production of the apical [ɹ̩] [ɹ̩ʷ] [ɻ̩] relative to [i]?
2. What are the lip gestures involved in the production of the apical vowels [ɹ̩] [ɹ̩ʷ] [ɻ̩]?
3. Is there an articulatory difference between the consonant and the following vowel in the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩]?
4. Is the apical [ɹ̩] articulated differently or similarly when it appears after a homorganic sibilant (e.g. [s]) and after a non-homorganic consonant (e.g. [p])?
To answer these questions, we employed ultrasound imaging to examine the tongue gestures and video recordings to examine the lip gestures in the production of the apical vowels in Hefei Mandarin. Ultrasound is a non-invasive and low-cost technique used to observe the real-time movement of the tongue in speech production. It has been applied in studying apical vowels in Mandarin Chinese (Chen 2011, Lee-Kim 2014, Chen et al. 2015, Faytak & Lin 2015, Huang et al. 2021) and the alveolar vs. retroflex consonant contrast in Mandarin Chinese (Luo 2020) and Jixi Chinese (Shao 2020, Shao & Ridouane, published online 19 January 2023).
2 Method
2.1 Participants
Native speakers of Hefei Mandarin were recruited as participants based on the following criteria. The speakers should have grown up in Hefei, with Hefei Mandarin being their mother tongue as well as that of their parents. They must not have resided in a place other than Hefei for more than three months in the past 12 months. Finally, they should have had no speaking or hearing impairments. Based on these criteria, three female speakers and three male speakers from the old city area of Hefei were identified as the participants, aged 25 to 39 (median = 26) at the time of the recording (footnote 8). After the data collection, it was found that the ultrasound images of participant M03 were of inferior quality, such that tongue contours could not be extracted from them with EdgeTrak or other contour-based methods; his data were therefore excluded.
2.2 Materials
The stimuli for the ultrasound study included nine disyllabic sequences, as shown in Table 4 (footnote 9). The first syllable of each word in (a), (b), and (c) contained one of the three apical vowels, while the first syllable of the word in (d) contained the vowel [i] as a baseline for comparison. Following Lee-Kim (2014), we selected stimuli containing [x] as the onset of the second syllable. This is because, among the consonants of Hefei Mandarin, the fricative [x] is expected to have a minimal influence on the articulation of its preceding vowel in the first syllable (footnote 10). In addition, Hefei Mandarin has a small number of syllables with the onset [x], and the second syllables [xɔ213] ‘good’ in (a)–(c) and [xu31] (an adjective affix) in (d) are two of the most frequently used morphemes, which are expected to facilitate the speakers’ natural articulation of the disyllabic sequences more than other, less frequent [x]-initial syllables would. In terms of tone, the target syllables all bear the same lexical tone /213/, except for [ɕi], which bears a /24/ tone. In Hefei Mandarin, a sequence of two /213/ tones triggers tone sandhi, i.e. /213 + 213/ → [24 + 213], by which the first tone surfaces as [24] (Kong et al., published online 15 July 2022). Thus, across the stimulus words in (a)–(c) in Table 4, the first syllable bears a [24] tone in its actual phonetic form, which is the same tone as that of the first syllable [ɕi] in (d) in Table 4. This design aimed to filter out potential laryngeal differences in producing different tones (footnote 11).
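The tonal design can be made concrete with a short sketch. The Python fragment below is only an illustration of the sandhi rule as stated above; the function name `surface_tones` and the representation of tones as strings are hypothetical, and only the single rule /213 + 213/ → [24 + 213] is modeled.

```python
def surface_tones(first, second):
    """Return the surface tones of a disyllabic sequence.

    Hypothetical helper: models only the sandhi rule described in the
    text, /213 + 213/ -> [24 + 213]; any other sequence is left unchanged.
    """
    if first == "213" and second == "213":
        return ("24", second)
    return (first, second)


# Stimuli in (a)-(c): a /213/ target syllable followed by [xɔ213]
print(surface_tones("213", "213"))
# Stimulus in (d): [ɕi24] followed by [xu31]
print(surface_tones("24", "31"))
```

Under this rule, the first syllable surfaces with a [24] tone in all four stimulus types, which is the property the design relies on.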
2.3 Procedure
Following previous studies (Lee-Kim 2014, Westerberg 2016, among others), we focused on the midsagittal contour of the tongue in producing the target segments, as it displays the relative backness, height, and slope of the tongue, as well as on the lip gestures (footnote 12). Before the recording, the participants were presented with the written forms of the stimuli in simplified Chinese so that they could familiarize themselves with the stimulus words. Following the common practice of ultrasound studies (e.g. Epstein & Stone 2005), the participants were asked to swallow water before the production so that the palate trace could be extracted. During the ultrasound data collection, the stimuli were displayed on a teleprompter one meter in front of the participant at eye level, and the disyllabic words were presented in a random order. The participants read each disyllabic word in the carrier sentence 我读___第个词 ‘I read___this word’ [o213 tuəʔ4__ti53 kə53 tsʰɹ̩24] in Hefei Mandarin. For each of the nine target stimulus words in Table 4, five tokens were recorded with ultrasound imaging and audio recording. Thus, the five participants gave a total of 225 tokens (= 9 words × 5 repetitions × 5 participants).
When collecting the ultrasound data, the midsagittal sections were recorded at 40 fps using a SonixTablet ultrasound system in a sound-attenuated booth at Shanghai Normal University, Shanghai, China. The audio was recorded using a lavalier microphone through an Mbox mini audio interface, at the sampling rate of 44100 Hz, which was synchronized into an AVI file with the ultrasound video using an Epiphan USB capture card. Figure 2 illustrates the ultrasound probe and the ultrasound stabilization helmet. The probe was held in place under a participant’s chin using a stabilization helmet (Articulate Instruments Ltd. 2008), adjusted to maximize the freedom of movement of the mandible while maintaining full contact of the probe with the participant’s skin. This aims to avoid the movement of the probe relative to the head and to ensure that ultrasound videos are controlled with respect to the orientation of the probe to provide a positional reference in quantitative assessment.
When recording the lip gestures in producing the apical vowels, the speakers were invited to read the same materials as used for the ultrasound recording. Each participant produced three tokens for each target word and the five participants gave a total of 135 tokens (= 9 words × 3 repetitions × 5 participants). A built-in camera in a Xiaomi smartphone was set up at 23 cm directly in front of a speaker’s mouth when recording the front view and at 30 cm away from the mouth when recording the side view. The recording was done at a resolution of 640 × 368 pixels at a sampling rate of 44,100 Hz. An audio recording was made simultaneously with the video recording of the lip gestures, which was used to track the time course correspondence between the lip gestures and the produced segments.
2.4 Data analysis
In processing the ultrasound data, the onset and offset of each onset consonant and each vocalic segment were identified following the practice in previous studies (e.g. Iskarous, Shadle & Proctor 2011, Lee-Kim 2014, among others). Frames from the midpoints of the target segments were selected using the software Adobe Premiere, referring to the acoustic landmarks in the time-aligned audio. When a segment had an even number of frames, the first frame after the middle was used (Tabain & Beare 2018). The tongue surface contours were extracted using EdgeTrak (Li, Kambhamettu & Stone 2005). These contours are assumed to represent the genuine tongue shapes for the relevant vocalic segments. Figure 3 shows a sample waveform and spectrogram of the syllable [sɹ̩] and the corresponding ultrasound imaging frames at which the vocalic articulation reached its maximal constriction, and Figure 4 shows an extracted tongue contour.
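The frame-selection convention described above can be sketched as follows. This is a minimal illustration rather than the procedure actually carried out in Adobe Premiere; the function name `midpoint_frame` and the use of inclusive integer frame indices are assumptions made for the example.

```python
def midpoint_frame(first, last):
    """Return the index of the midpoint frame of a segment.

    Hypothetical helper: `first` and `last` are the inclusive indices of
    the first and last ultrasound frames of the segment. With an odd
    number of frames the exact middle frame is returned; with an even
    number, the first frame after the middle is returned.
    """
    n_frames = last - first + 1
    if n_frames <= 0:
        raise ValueError("segment must contain at least one frame")
    return first + n_frames // 2


# A vowel spanning frames 12-18 (seven frames) and one spanning 12-17 (six frames)
print(midpoint_frame(12, 18))
print(midpoint_frame(12, 17))
```

In both calls the function returns frame 15: the exact middle of the seven-frame segment, and the first frame after the middle of the six-frame segment.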
To assess the tongue shapes of the apical vowels and the vowel /i/, a Smoothing Spline ANOVA (SSANOVA) in polar coordinates was adopted as the statistical procedure, using the R code tongue_ssanova.r by Mielke (2017) (footnote 13). Polar coordinates were adopted as they are expected to be more appropriate than Cartesian coordinates for comparisons involving tongue retraction or advancement, especially in vowels (Mielke 2015). The horizontal coordinate x and the vertical coordinate y of each point in the traces were converted from Cartesian coordinates into polar coordinates with the angular coordinate θ and the radial coordinate r. The origin was determined according to the x and y values of the highest point and the lowest point of all the traces in the samples. The x coordinate of the origin was the x value of the highest point of the traces, while the y coordinate was 1% less than the y value of the lowest point of the traces. The x and y values entering the polar conversion were the differences between each point in a trace and the origin. The SSANOVA does not return an F value; instead, the smoothing parameters of the components of the equation are compared to determine their relative contributions (Gu 2002, Stone 2005, Davidson 2006). Using this method, smoothing splines that best fitted the five repetitions for the stimuli were obtained. Analyses were conducted with the R statistical package version 3.1.2 (R Core Team 2014). Figures were created using the ggplot2 package in R (Wickham 2009).
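The coordinate conversion described above can be illustrated with a minimal sketch. It is written in Python for illustration and is not the tongue_ssanova.r code; the function names `polar_origin` and `to_polar` are hypothetical. Two assumptions are made: y values are positive and increase upward, and ‘1% less than the y value of the lowest point’ is read as 0.99 times that value.

```python
import math


def polar_origin(traces):
    """Compute the origin from all traces (each a list of (x, y) points).

    Assumption: y is positive and increases upward, so the highest point
    has the largest y and the lowest point has the smallest y.
    """
    points = [point for trace in traces for point in trace]
    highest = max(points, key=lambda p: p[1])
    lowest = min(points, key=lambda p: p[1])
    # x of the highest point; y set 1% below the lowest point (assumed reading)
    return (highest[0], lowest[1] * 0.99)


def to_polar(point, origin):
    """Convert one Cartesian point to (theta, r) relative to the origin."""
    dx = point[0] - origin[0]
    dy = point[1] - origin[1]
    return (math.atan2(dy, dx), math.hypot(dx, dy))


# Two toy traces; the values are invented for illustration only
traces = [[(10.0, 20.0), (30.0, 50.0)], [(40.0, 10.0)]]
origin = polar_origin(traces)
print(origin)
print([to_polar(p, origin) for p in traces[0]])
```

Because the origin lies slightly below every trace point, all radial values are positive and the angles fall between 0 and π, so each trace can be treated as a function r(θ) for the spline fit.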
In the analysis of lip gestures, the video recordings and the audio recordings were examined to find the time points corresponding to the midpoints of the apical vowels. With the limited number of speakers and the relatively small amount of video recording for lip gestures, visual inspection was adopted to generalize the qualitative property of the lip gestures when producing the apical vowels.
3 Results
In this section, we present the articulatory properties of the apical vowels and relevant comparisons following the order of the four research questions in Section 1.2: the tongue gestures involved in the production of the three apical vowels vs. the vowel [i] (Section 3.1), the lip gestures of the three apical vowels (Section 3.2), the tongue gestures involved in the production of the onset vs. vowel in the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩] (Section 3.3), and the tongue gestures of [ɹ̩] after homorganic vs. non-homorganic consonants (Section 3.4).
3.1 Tongue gestures of the apical vowels and the vowel [i]
Figure 5 presents the smoothing spline estimates of the tongue gestures associated with the production of the vowels [ɹ̩] (solid line), [ɹ̩ʷ] (dashed line), [ɻ̩] (dotted line) and those of the vowel [i] (dash-dotted line) by the five speakers, with 95% Bayesian confidence intervals, based on five repetitions of each syllable. Across the female and male speakers, the strategies to produce [i] appeared to be relatively invariant while those of the three apical vowels showed some commonalities as well as certain variations, especially in the tongue blade region. First, the most obvious difference between [i] and the apical vowels is that the latter three have a clearly more retracted tongue root or tongue dorsum, although to different degrees. For example, the most retracted tongue root occurred in [ɻ̩] (dotted line) in speakers F01, F03, and M02, while it occurred in [ɹ̩ʷ] (dashed line) in speakers M01 and F02. Thus, tongue root retraction is likely to be a defining articulatory property of Hefei Mandarin apical vowels, in contrast to [i]. Second, as compared with [i], the three apical vowels had a lower tongue body and generally a lower tongue blade, with the exception of F01, for whom [ɹ̩] (solid line) and [ɹ̩ʷ] (dashed line) had a slightly more raised tongue blade, and of M02, for whom [ɻ̩] (dotted line) had a more raised tongue blade. Third, the vowel [i] (dash-dotted line) was articulated with an obvious front bunching while [ɹ̩] [ɹ̩ʷ] [ɻ̩] were generally not as front-bunched as [i] across the five speakers, as revealed by the separation of [ɹ̩] [ɹ̩ʷ] [ɻ̩] on the left and [i] on the right. Fourth, the constriction location of [i] was closer to the hard palate while those of the apical vowels were at a more anterior position. It is likely that the tongue tip, which was not visible in the images, was involved in making the constriction of apical vowels along with the tongue blade.
The alveolar apical vowel [ɹ̩] (solid line) generally had a lowered tongue body and a more front tongue root as compared with the other two apical vowels, with the exception in F01 and M02. For F01, the tongue root of [ɹ̩] overlapped with that of [ɹ̩ʷ] (dashed line) and, for M02, it was slightly more retracted than [ɹ̩ʷ]. As a special case, the [ɹ̩] (solid line) of speaker F01 had a tongue concavity, which was not obvious in other speakers. The tongue blade of the vowel [ɹ̩] (solid line) was generally lower than [ɻ̩] (dotted line) with the exception of F01 and M01.
The alveolar apical [ɹ̩ʷ] is recognized in the literature as a rounded counterpart of the unrounded [ɹ̩], which usually concerns the gestures of lips (to be detailed in Section 3.2). As shown in Figure 5, [ɹ̩ʷ] (dashed line) differed from [ɹ̩] (solid line) in having a more retracted tongue root in speakers F02, F03, and M01; for speakers F01 and M02, [ɹ̩ʷ] and [ɹ̩] generally overlapped in the tongue root area with M02’s [ɹ̩ʷ] even a bit more front than [ɹ̩]. Close to the area of the alveolar ridge, the rounded [ɹ̩ʷ] involved a reduced degree of constriction, as compared with [ɹ̩], which is most obvious in F03 and M01.
The retroflex apical [ɻ̩] (dotted line) generally had a higher tongue body and a more retracted tongue root than [ɹ̩], with the exception of M01, whose [ɻ̩] had a more front tongue root than [ɹ̩]; for speakers F02, F03, and M02, the tongue root of [ɻ̩] had a similar degree of retraction to that of [ɹ̩ʷ]. Focusing on the tongue blade, a raising gesture in [ɻ̩] could be observed for speakers F01, F02, and M02, while no obvious raising was observed for speakers F03 and M01. Close to the alveolar ridge region, the retroflex apical [ɻ̩] showed a reduced degree of constriction across the five speakers, which was most obvious for M02 and similar to [ɹ̩ʷ] for F02.
3.2 Lip gestures of the apical vowels
To examine the lip gestures associated with the three apical vowels, the images corresponding to the temporal midpoints of the vowels were obtained from the video recording, referring to the corresponding audio recording. A visual inspection showed that the speakers were generally consistent in their lip gestures when producing the three apical vowels respectively. Below, Figure 6 gives an illustration of the lip gestures by a female speaker (F01) and sample lip gestures of the other speakers are provided in appendix Figure A1.
When producing the unrounded alveolar [ɹ̩], the speakers consistently had unrounded lips with a moderate aperture; when producing the rounded alveolar [ɹ̩ʷ], in contrast, the speakers generally had a clear rounding of their lips with a small aperture. As shown in Figure A1, the lip gestures for [ɹ̩ʷ] of speakers F03 and M01 were similar to those of F01 as shown in Figure 6, and those of F02 and M02 involved an even stronger rounding than F01. This is consistent with the impressionistic description of the two apical vowels in the literature as an unrounded vowel and a rounded vowel respectively (Hou 2007).
When producing the retroflex apical vowel [ɻ̩], the speakers generally had a larger vertical aperture than for [ɹ̩]. A lip protrusion (out-rounding) was generally involved for [ɻ̩] across the speakers, which was the most obvious for F02 and the least obvious for M01, as shown in Figure A1. The out-rounding in the apical [ɻ̩] of Hefei Mandarin is reminiscent of similar gestures in English rhotic segments and postalveolar obstruents, which is presumably an enhancement to lower F3 or center of gravity (King & Ferragne 2019, Smith et al. 2019) (footnote 14).
Across the five speakers, the lip gestures for the three apical vowels generally differed from each other the same way as in speaker F01, although some speakers had a stronger lip rounding when producing [ɹ̩ʷ] or a stronger lip protrusion when producing [ɻ̩].
3.3 Onset consonants vs. vowels in the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩]
To examine the potential articulatory difference between the onset consonants and vowels in the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩], the tongue gestures across the syllables were extracted, as illustrated in Figure 7 using the data of a female speaker (F01). On average, five to seven frames were obtained for a consonant and seven to nine frames for a vowel. For each speaker, Figure 8 presents the smoothing spline estimates of the curves of the onset consonants (dashed line) and the vowels (solid line) in the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩], each based on five tokens, with the palate traces imposed. The curves of the onset consonants and vowels were modeled respectively from the middle points of the segments.
As shown in Figure 8, across the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩], the onset consonants and the vowels deviated in their tongue positions, to different degrees across the speakers. Specifically, the lowering of the tongue dorsum/blade and the retraction of the tongue root existed as adjustments of the tongue position from the onset /s/ and /ʂ/ to the apical vowels across the speakers. The exceptions to this pattern occurred in the [sɹ̩ʷ] and [ʂɻ̩] by M01, for which the smoothing spline estimates showed largely overlapping curves of the onset consonants and the vowels, and the [sɹ̩] by F03, for which the tongue root was less retracted in the vowel.
In the syllable [sɹ̩], the onset [s] had a slightly higher tongue blade than the apical vowel [ɹ̩] across the speakers, with the smallest difference in F03, whose [s] and [ɹ̩] overlapped in the tongue blade. This is consistent with the recognition that the onset sibilant [s] has a tighter constriction, and thus more frication noise, than its following vocalic [ɹ̩]. In terms of the tongue root, [s] is more front than [ɹ̩] in speakers F01 and F02, while the reverse seems to be true for F03 and M02, indicating inter-speaker variation in this aspect.
In the syllable [sɹ̩ʷ], the tongue body and tongue blade in the onset [s] were higher than those in the apical [ɹ̩ʷ], with the exception of M01, whose [s] and [ɹ̩ʷ] generally overlapped in the tongue blade. The tongue root of the vocalic [ɹ̩ʷ] was more retracted than that of the onset [s], which was also observed in the syllable [sɹ̩] as stated above. Across the five speakers, the tongue dorsum was generally more raised in [ɹ̩ʷ] than in [s]. In addition, [ɹ̩ʷ] also involved a concavity curve as compared with [s], more obvious in F01 and F03 than in M02, by which the tongue blade of [ɹ̩ʷ] was a bit more distant from the palate than the onset [s].
In the syllable [ʂɻ̩], the onset [ʂ] and the vocalic [ɻ̩] differed in their gestures across the five speakers with diverse patterns in tongue blade, tongue body, or tongue root. More specifically, [ʂ] had a higher tongue body than [ɻ̩], which was more obvious in F01 than in F02; [ʂ] had a more front tongue dorsum than [ɻ̩] in F01, F03 and M02; [ʂ] had a slightly more retracted tongue root than [ɻ̩] in M01, for whom the two almost overlapped in the tongue blade. The relative diversity in the tongue gestures of [ɻ̩] presumably indicates that, for speakers of Hefei Mandarin, with lip protrusion in position, the tongue gesture has more flexibility as long as it differs from the onset [ʂ] with a higher tongue blade or a less retracted tongue dorsum.
In general, the above shows different tongue gestures in the onset consonants and the vowels of the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩], with [s] vs. [ɹ̩] differing primarily in tongue blade and tongue body, [s] vs. [ɹ̩ʷ] mainly in tongue body and tongue root, and [ʂ] vs. [ɻ̩] with diverse patterns. This is reminiscent of the observed adjustment to tongue position in the apical vowels in other Chinese dialects, in terms of dorsum or blade lowering, relative to the homorganic onset sibilant (Lee-Kim 2014, Faytak & Lin 2015). Put differently, the three apical vowels in Hefei Mandarin seem to involve articulatory gestures that are not necessarily the same as those of their preceding consonants. In addition, it is again possible that the articulation of the apical vowels and the onset consonants in the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩] involves the tongue tip together with the tongue blade. Given the limitation of ultrasound imaging, such articulatory details are not clearly visible, and this may explain some of the inter-speaker variation discussed above.
3.4 Apical /ɹ̩/ after the alveolar and bilabial consonants
The alveolar apical vowel [ɹ̩] in Hefei Mandarin may appear after a homorganic alveolar sibilant, as in [sɹ̩213] ‘wash’, as well as after a non-homorganic bilabial consonant, as in [pɹ̩213] ‘to compare’ and [mɹ̩213] ‘rice’. An SSANOVA analysis was performed for the tongue gestures of the apical vowel [ɹ̩] over the time course of the vowel when it appears after the bilabial consonants, as in [pɹ̩213] [mɹ̩213], as opposed to when it appears after alveolar consonants, as in [sɹ̩213]. The results of SSANOVA splines are presented in Figure 9.
As Figure 9 shows, focusing on a single speaker, the tongue gestures of the apical vowel [ɹ̩] were generally similar when the preceding consonants were homorganic vs. non-homorganic, with the exception of M01. In other words, for a given speaker, the gesture of the [ɹ̩] after an alveolar fricative /s/ or /z/ (blue line) largely overlapped with that of the [ɹ̩] after a bilabial consonant /m/ or /p/ (brown line), despite slight discrepancies in the tongue blade or tongue body. The discrepancies differed across the speakers. The tongue gestures of [ɹ̩] after an alveolar /s/ or /z/ involved a tongue blade raising (M01 and M02) or a tongue body raising (F02 and F03); they also involved a lesser degree of tongue root retraction (M01 and F03) as compared with the [ɹ̩] after a bilabial /m/ or /p/. In general, the data seem to suggest a relative intra-speaker articulatory uniformity in producing the apical [ɹ̩] in Hefei Mandarin across different consonantal contexts, whether the onset consonants are homorganic or non-homorganic.
4 General discussion
Apical vowels in Chinese dialects are generally recognized as vocalic segments homorganic to their preceding sibilants, with controversies arising as to whether these vowels have their own intrinsic articulatory gestures independent of the preceding consonants. In terms of articulatory gestures, some studies showed virtually identical gestures for an apical vowel and its preceding consonant, e.g. the apical [ɹ̩] in Suzhou Chinese (Faytak 2018). Other studies observed that an apical vowel may have different gestures relative to its preceding consonant, e.g. a slight lowering of the tongue body in [ɻ̩] in Mandarin Chinese (Lee-Kim 2014), a raising of the tongue blade in [ɻ̩] in Mandarin Chinese (Chen et al. 2015), or a lowering of the tongue dorsum in [ɹ̩] in Jixi Chinese (Shao 2020, Shao & Ridouane, published online 19 January 2023). The observations in the current study about apical vowels in Hefei Mandarin are consistent with the view that an apical vowel may differ from its preceding consonant in its articulatory gesture. As reported in Section 3, the apical [ɹ̩] [ɹ̩ʷ] [ɻ̩] involved different degrees of tongue root retraction and tongue body lowering relative to the vowel [i], and their constrictions occurred at a more anterior position than that of [i]. Within a speaker, they each had a relatively consistent tongue gesture that cannot necessarily be attributed to the influence of a preceding consonant. In particular, the tongue gestures in producing [ɹ̩], in terms of tongue root retraction or tongue dorsum lowering across the speakers, were relatively similar whether the vowel appeared after a homorganic or a non-homorganic consonant.
The results reported in Section 3 suggest that, for apical vowels in Hefei Mandarin, the retraction of the tongue root or tongue dorsum is likely to be a defining articulatory property. This finding is reminiscent of reports by Lee-Kim (2014), Faytak & Lin (2015) and Huang, Hsieh & Chang (2021). For example, a slightly retracted tongue root was observed for [ɹ̩] and a slightly lowered tongue body for [ɻ̩] in Mandarin Chinese (Lee-Kim 2014); similarly, the tongue dorsum when producing [ɹ̩] and [ɻ̩] was reported to be as low and retracted as in [a] (Faytak & Lin 2015). The results for Hefei Mandarin showed that, within the syllables [sɹ̩] [sɹ̩ʷ] [ʂɻ̩], the onset sibilants and the apical vowels deviate to varying degrees in their tongue gestures, with each apical vowel involving a retracted tongue root, a lowered tongue dorsum or blade, or both, relative to its onset fricative. This is reminiscent of observations of apical vowels in Jixi-Hui Chinese, i.e. a lower tongue dorsum in the apical vowel than in [s] on the mid-sagittal plane (Shao 2020), as well as in Mandarin Chinese, i.e. an articulatory change from a homorganic onset to an apical vowel (Chen et al. 2015). That being the case, it needs to be noted that other studies have reported little displacement between a homorganic consonant and an apical vowel in Mandarin Chinese (Chen 2011, Faytak & Lin 2015) and Suzhou Chinese (Ling 2009, Faytak 2018). The similarities and differences between the results of our study and those in the literature suggest a future direction for cross-linguistic investigation of apical vowels. It is possible that the tongue gestures involved in the articulation of apical vowels in a particular language show commonalities as well as idiosyncrasies.
In addition, ultrasound data cannot provide detailed information on gestures involving the tongue tip, which might be relevant to the production of apical vowels. Such details might be revealed in future research using other experimental methods such as EMA.
The unrounded apical [ɹ̩] and the rounded apical [ɹ̩ʷ], which differ in their lip gestures, contrast with each other in Hefei Mandarin. In terms of tongue gestures, [ɹ̩] generally has a more retracted tongue root while [ɹ̩ʷ] involves a reduced degree of constriction close to the alveolar ridge. Such a difference echoes previous studies showing that unrounded vs. rounded vowels may differ in their tongue gestures (Raphael et al. 1979, Wood 1986, Radisic 2014). For example, Wood (1986) observed that the rounded [y] has a raised tongue blade and a slightly lower tongue body relative to its unrounded counterpart. Radisic (2014) observed that unrounded vowels in Turkish are articulated higher than rounded ones in the front region but lower in the back region. Raphael et al. (1979) observed that front rounded vowels may involve a lower tongue height relative to their unrounded counterparts. Our results for different speakers are consistent with one or another of these observations. For example, for speakers M01 and F03, the unrounded [ɹ̩] was further front than the rounded [ɹ̩ʷ]; for the same two speakers, the unrounded [ɹ̩] was higher than the rounded [ɹ̩ʷ] in the more front region while the reverse was true in the more back region, which also held true for speaker F02. On the other hand, the tongue gestures of [ɹ̩] vs. [ɹ̩ʷ] by speakers F02, F03, and M01 differed from the pattern observed in Wood (1986) in that the rounded [ɹ̩ʷ] involved a higher tongue body than its unrounded counterpart [ɹ̩]. Across different speakers, the rounded vowel [ɹ̩ʷ] had a lower tongue body relative to the unrounded vowel in some cases, while it had a higher tongue body in other cases. This echoes the observations in Perkell et al. (1993) and Chen (2011) that some speakers’ rounded vowels were associated with tongue body raising relative to unrounded vowels while others’ were associated with tongue body lowering. Furthermore, for speaker F01, there seemed to be no obvious difference between the unrounded [ɹ̩] and the rounded [ɹ̩ʷ], as shown by their overlap in Figure 5.
5 Conclusion
This study examined the articulatory properties of apical vowels in Hefei Mandarin, which represents an under-studied type of language in terms of its inventory of apical vowels and their phonological status. While the apical vowels in languages such as Mandarin Chinese follow a homorganic consonant and are in complementary distribution with [i], the apical vowels in Hefei Mandarin may follow non-homorganic consonants, thus allowing us to dissociate an apical vowel from a homorganic consonant onset and examine its own articulatory gesture, in particular in terms of tongue position. Our results showed that an apical vowel involves distinct articulatory targets that cannot simply be attributed to the influence of its preceding consonant, consistent with some observations in the literature. The commonalities in producing apical vowels in Hefei Mandarin include a retracted tongue root, a lowered tongue dorsum or blade, or both, in addition to a coronal constriction implemented with the blade and/or the tip; lip gestures are also involved in distinguishing the apical segments. We acknowledge that the observations in the current research were based on a relatively limited number of speakers, although the speakers are representative of Hefei Mandarin. Finally, as the focus of this study was on the articulatory properties of apical vowels, we did not explore the connection between their articulation and acoustics, something that would be important to address in future work.
Acknowledgements
The authors would like to thank Professor Zhongmin Chen for his generous help. We are greatly indebted to the anonymous reviewers and the editors of Journal of the International Phonetic Association, whose comments greatly improved this research. The research was partly supported by the Philosophical and Social Science Grant of Anhui Province (Grant No. AHSKY2019D099), the Hong Kong Baptist University Faculty Research Grant (FRG) Category II [Grant No. FRG2/17-18/076], and the Hong Kong Baptist University Research Committee’s Start-up Grant for New Academics.
Appendix