Phonological and acoustic properties of ATR in the vowel system of Akebu (Kwa)

Nadezhda Makeeva; Natalia Kuznetsova

doi:10.1017/S0952675723000222

Published online by Cambridge University Press: 12 January 2024

Nadezhda Makeeva: Department of African Languages, FSBIS Institute of Linguistics of the Russian Academy of Sciences, Moscow, Russia
Natalia Kuznetsova*: Dipartimento di Scienze linguistiche e Letterature straniere, Università Cattolica del Sacro Cuore, Largo Gemelli 1, Milano, Lombardia 20123, Italy; Department of the Languages of Russia, Institute of Linguistic Studies of the Russian Academy of Sciences, Saint Petersburg, Russia
*Corresponding author: Natalia Kuznetsova; Email: [email protected]

Abstract

This study examines phonological and phonetic properties of ATR contrasts in the vowel system of Akebu (Kwa). The sum of descriptive evidence, including vowel harmony, vowel distribution in non-harmonising contexts, vowel reduction and typological and etymological considerations, indicates a rare vowel inventory with an ATR contrast in front/back vowels but a height contrast in the three redundantly [−ATR] central vowels /ᵻ, ə, a/. This analysis was checked against four common acoustic metrics of ATR: F1 and F2 frequencies, spectral slope and F1 bandwidth size (B1). As expected, the results for the last three metrics were variable across speakers and vowel types, and are therefore inconclusive. The results for F1 were consistent but do not distinguish between ATR and vowel height. Two results nonetheless suggest the [−ATR] status of central vowels: they occupy the same belt of F1 frequencies and show the same position of observed-over-predicted B1 values as front and back [−ATR] vowels.

Keywords

Akebu; ATR feature; vowel harmony; central vowels; interior vowels

Type: Article
Information: Phonology, Volume 39, Issue 4, November 2022, pp. 641–678

DOI: https://doi.org/10.1017/S0952675723000222
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2024. Published by Cambridge University Press

1. Introduction

Akebu (Kebu; ISO 639-3 keu) is a language of the Kebu-Animere subgroup in the Ka-Togo group within the Kwa family of the Niger-Congo phylum (Heine 1968, 2017; Blench 2009). It is spoken mainly in the prefecture of Akebu in Togo (West Africa) by ca. 70,000 people (Gblem-Poidi & Kantchoa 2012; Eberhard et al. 2019). According to our field findings, the vowel system of Akebu consists of 11 vowels, as shown in Table 1. Vowels contrast for height, place (to which lip rounding is correlated) and advanced tongue root (ATR).

Table 1 The Akebu vowel system.

This system is highly asymmetrical and typologically unusual in several respects. Vowel harmony patterns, discussed below, suggest that Akebu contains a cross-linguistically unique feature – three harmonically non-neutral central vowels which do not contrast in [±ATR]. These patterns also suggest the relevance of a distinction between interior and peripheral vowels (see Table 1), rarely accounted for in the literature. The main harmony rules, however, allow for more than one interpretation of the ATR properties of central vowels. Still, as we discuss in §§1.5 and 4.1, there is also less direct phonological and other evidence for the [−ATR] specification of central vowels.

In order to gather additional evidence for the [±ATR] status of Akebu vowels, we studied their phonetic properties via four typically investigated acoustic correlates of ATR: F1 and F2 frequency, spectral slope and F1 bandwidth size. Special attention was paid to the acoustic properties of the potential height contrast in central vowels vs. the [±ATR] contrast in other vowels. The phonetic results were partially inconclusive, but still provided some arguments in favour of the analysis of central vowels as [−ATR].

As the Akebu vowel system is typologically unusual, our descriptive and acoustic data from the language contribute to our understanding of the cross-linguistic typology of ATR within the context of the languages of Africa (ATR outside Africa manifests partially different properties and is not considered here). In the remainder of this section, we present our field findings on Akebu phonology and morphology, placing the ATR-related phonological data against the typological background of the Niger-Congo and Nilo-Saharan language phyla. Our acoustic study is described in §2, and results are presented in §3. We discuss our findings in §4.

1.1. General features of Akebu phonology

In addition to 11 vowels, Akebu has 24 consonants /p b t d ʈ ɖ c ɟ k g kp gb m n ɲ ŋ f v s z h l y w/ and 3 level tones: low (L), mid (M) and high (H) (Makeeva 2022a). The tone-bearing unit (TBU) in Akebu is the syllable. Contour tones represent sequences of level tones realised on a single TBU and are a result of contextual changes of lexical tones under tonal rules. Akebu has no metrical stress.

Four syllable types are possible: V, CV, C₁C₂V and ŋ. The syllabic nasal /ŋ/ can bear a tone different from the tones of adjacent syllables. Akebu also manifests the so-called featural foot – a phonotactic unit with a considerable degree of internal phonetic and phonological cohesion in terms of vocalic, consonantal and tonal restrictions (Green 2015; Vydrin 2020). Monosyllabic (V, ŋ, CV), disyllabic (VV, ŋŋ, CVV, CV₁V₂, C₁C₂VV, CVŋ, C₁C₂Vŋ, C₁VC₂V) and trisyllabic (CV₁V₂V₂, C₁VC₂Vŋ) feet are attested. In the C₁C₂ clusters, C₁ can only be /b, p/; C₂ is always /ʈ/. In the V₁V₂ clusters, V₁ can only be /i, ɪ, u, ʊ/. The overall tonal contours on feet are restricted to H, M, L, HL, ML, LH, LM.
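The cluster and tone restrictions just listed can be written down as simple membership checks. The following is only a minimal sketch: the function and constant names are hypothetical, and the requirement that V₁ and V₂ differ is an assumption about what the V₁V₂ notation (as opposed to VV) is meant to convey.

```python
# Illustrative sketch of the phonotactic restrictions stated in the text.
# All identifiers are hypothetical; they are not part of the authors' analysis.

C1_ALLOWED = {"b", "p"}             # first member of a C1C2 cluster
C2_ALLOWED = {"ʈ"}                  # second member of a C1C2 cluster
V1_ALLOWED = {"i", "ɪ", "u", "ʊ"}   # first member of a V1V2 cluster
FOOT_CONTOURS = {"H", "M", "L", "HL", "ML", "LH", "LM"}


def valid_consonant_cluster(c1: str, c2: str) -> bool:
    """C1 must be /b/ or /p/ and C2 must be /ʈ/."""
    return c1 in C1_ALLOWED and c2 in C2_ALLOWED


def valid_vowel_cluster(v1: str, v2: str) -> bool:
    """V1 must be a high peripheral vowel.

    Assumption: V1V2 denotes two different vowels, so v1 != v2 is required.
    The text places no restriction on V2, so none is imposed here.
    """
    return v1 in V1_ALLOWED and v1 != v2


def valid_foot_contour(contour: str) -> bool:
    """The overall tonal contour of a foot must be one of the seven listed."""
    return contour in FOOT_CONTOURS


print(valid_consonant_cluster("p", "ʈ"))
print(valid_vowel_cluster("ʊ", "a"))
print(valid_foot_contour("MH"))
```

Keeping each restriction in its own function leaves room for refining one of them without touching the others.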

1.2. Differences between the proposed vowel inventory and earlier interpretations

The analysis of the Akebu vowel system in Table 1 is based on data collected by the first author during two field trips in 2012 and 2019 to the village of Djon in the prefecture of Akebu. It partially differs from the descriptions in the prior literature on Akebu (Wolf 1907; Heine 1968; Koffi 1984; Storch & Koffi 2000; Djitovi 2003; Adjeoda 2008; Amoua 2011; Sossoukpe 2012, 2017; Shluinsky 2020).

1.2.1. Vowel nasality

First, most previous work (except for Wolf 1907 and Sossoukpe 2012, 2017) considers nasalised vowels separate phonemes. We analyse them as allophones of the corresponding oral vowels since they appear, often optionally, only after the nasal consonants /m, n, ɲ, ŋ/ or before a syllable-final /ŋ/, as in the verbs in (1).
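The conditioning environment for nasalised allophones can be stated as a short predicate. This is a hedged sketch: the function name and its two parameters are hypothetical, and it reports only where nasalisation is possible, since the text describes it as often optional.

```python
# Illustrative sketch of the environments in which a nasalised allophone may appear.
# The identifiers are hypothetical.

NASAL_CONSONANTS = {"m", "n", "ɲ", "ŋ"}


def nasalisation_possible(preceding: str = "", following: str = "") -> bool:
    """Return True if the vowel stands after a nasal consonant
    or before a syllable-final /ŋ/.

    preceding: the consonant immediately before the vowel ("" if none)
    following: the syllable-final segment immediately after it ("" if none)
    """
    return preceding in NASAL_CONSONANTS or following == "ŋ"


print(nasalisation_possible(preceding="m"))
print(nasalisation_possible(preceding="t", following="ŋ"))
print(nasalisation_possible(preceding="t"))
```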

1.2.2. The number of ATR contrasts

Second, in a number of studies (Wolf 1907; Heine 1968; Koffi 1984; Amoua 2011), one or two peripheral high [−ATR] vowels are missing from the system. However, the independent phonological status of all peripheral [±ATR] vowel pairs outlined in Table 1 is confirmed by numerous contrasts, as illustrated by the verbs in (2).

In this sense, the Akebu vowel inventory belongs to the so-called /2IU-2EO/ type of ATR systems, which have this contrast in both high and mid vowels. Typically, such an inventory includes the following 9 or 10 vowels: /i, e, ɪ, ɛ, (ə), a, u, o, ʊ, ɔ/. It is opposed to the other two widespread types: /2IU-1EO/, which lacks the [±ATR] contrast in mid vowels, and /1IU-2EO/, which lacks it in high vowels. These systems instead have seven or eight vowels (depending on whether /ə/ is present): /i, ɪ, ɛ, (ə), a, u, ʊ, ɔ/ or /i, e, ɛ, (ə), a, u, o, ɔ/, respectively (Casali 2016; Rose 2017).
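The three inventory types can be told apart by checking which [±ATR] pairs an inventory contains. The sketch below assumes that a contrast counts as present only when both the front and the back pair are in the inventory. The function name is hypothetical, and the Akebu vowel set is assembled from the vowels named in the text rather than taken from Table 1.

```python
# Illustrative classifier for the inventory types named in the text.
# Identifiers are hypothetical; the pairings follow the typical inventories cited above.

HIGH_PAIRS = ({"i", "ɪ"}, {"u", "ʊ"})
MID_PAIRS = ({"e", "ɛ"}, {"o", "ɔ"})


def atr_inventory_type(vowels: set) -> str:
    """Classify an inventory by where it shows a [±ATR] contrast."""
    high_contrast = all(pair <= vowels for pair in HIGH_PAIRS)
    mid_contrast = all(pair <= vowels for pair in MID_PAIRS)
    if high_contrast and mid_contrast:
        return "2IU-2EO"
    if high_contrast:
        return "2IU-1EO"
    if mid_contrast:
        return "1IU-2EO"
    return "unclassified"


# Assumed 11-vowel set: eight peripheral vowels plus the central /ᵻ, ə, a/.
AKEBU = {"i", "ɪ", "e", "ɛ", "u", "ʊ", "o", "ɔ", "ᵻ", "ə", "a"}

print(atr_inventory_type(AKEBU))
print(atr_inventory_type({"i", "ɪ", "ɛ", "a", "u", "ʊ", "ɔ"}))
print(atr_inventory_type({"i", "e", "ɛ", "a", "u", "o", "ɔ"}))
```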

1.2.3. Interior vowels

The third way our analysis differs from previous studies concerns the number of interior vowels and their place in the Akebu vowel system. Interior vowels, as opposed to the peripheral ones, are those within the interior regions of the vowel space, including front rounded vowels, unrounded non-low back vowels and non-low central vowels (Rolle et al. 2020). A distinction of interior vs. peripheral is absent from standard and more recent generative feature geometries (e.g., McCarthy 1988; Lahiri 2018), yet there is evidence that this distinction might play a role in some vowel patterns, as the Akebu vowel harmony data show.

Akebu is typologically unusual in that it manifests both ATR (contrasts and harmony) and the presence of interior vowels, which were shown to be antagonistic in a survey of 681 languages of the Macro-Sudan belt by Rolle et al. (2020). These authors placed Akebu among the 29 languages which show a co-occurrence of a ‘complete’ ATR system with the presence of phonemic interior vowels. Their ‘complete’ systems included /2IU-2EO/ inventories with cross-height ATR harmony in both static and dynamic patterns (cf. §1.2.2). Also categorised as ‘complete’ were /1IU-2EO/ and /2IU-1EO/ systems which showed partially phonetic cross-height harmony due to the variation of positional [±ATR] allophones in high or mid vowels, respectively (Casali 2003).

In most work, only one interior vowel was mentioned for Akebu. Rolle et al. (2020: 142) cite Akebu as a very rare example of a language where the interior vowel /ə/ did not contrast for [±ATR] with /a/. Storch & Koffi (2000) consider both vowels neutral; Djitovi (2003) considers both to be [−ATR]; and Amoua (2011) describes /ə/ as [−ATR].

The existence of a high central vowel in Akebu was first claimed by Sossoukpe (2012, 2017). However, both Sossoukpe and Shluinsky (2020) analyse both interior vowels as [+ATR]. No explicit grounds for this are given, but apparently this analysis was dictated by Dynamic Rule 1 of vowel harmony, discussed below in §1.4.3 (cf. Sossoukpe 2017). The vowel /a/ is classified as [−ATR] by Sossoukpe and as neutral by Shluinsky. Shluinsky (2022), in turn, considers all central vowels as [−ATR] based on the preliminary results of our study.

The high central vowel was also not mentioned and had no separate symbol in the existing orthography manuals, which distinguished only between two central vowels: /a/ and /ə/ (with ⟨ə⟩ indicating both /ᵻ/ and /ə/; M’boma 2012; Sossoukpe 2014). High and mid central vowels, however, distinguish minimal pairs, as in the verbs in (3).

They also contrast with other similar vowel qualities, as shown in (4).

Akebu central vowels do not manifest [±ATR] pairs of the same height, unlike front and back vowels, so there is no straightforward answer to whether they are [−ATR], [+ATR] or neutral.

Rolle et al. (2020) do not formulate any detailed typological expectations with respect to the ATR quality of interior vowels for those few systems with ATR which contained such vowels. They only draw a distinction between those languages with two central vowels which contrast for [±ATR] and all other types of interiority. Our preliminary observations on the quality of interior vowels in ATR systems with more than two central vowels, based on the sources reported in Rolle et al. (2020), indicated two recurring patterns:

• in languages with an even number of central vowels (four), at least two interior vowels form a [±ATR] pair (e.g., Daloa Bete, Koyo, Tima);
• in languages with an odd number of central vowels (three or five), high and mid (i.e., interior) vowels form [±ATR] pairs, while the low vowel /a/ usually remains unpaired and harmonically neutral (e.g., Lama, Gagnoa and Guiberoua Bete, Kabiye).

However, Akebu was cited there as a rare language with no [±ATR] contrast in central vowels at all. The question of the [±ATR] specification of central vowels is further discussed in the following sections.

1.3. General features of Akebu ATR harmony

ATR harmony patterns vary cross-linguistically along several dimensions. However, there is not yet an established consensus about the set of these parameters and their exact values. Phonological, phonetic and morphological criteria remain partially intermingled. Existing typological generalisations are reported below based on Casali (2003, 2008a, 2016), Rose (2017), Rolle et al. (2020) and Makeeva (2022b), where further references and details can be found.

1.3.1. Static vs. dynamic harmony

The first parameter distinguishes between ‘static’ and ‘dynamic’ ATR patterns (following the terminology of Rolle et al. 2020) on the basis of the morphological content of the harmony domain. Static patterns, usually described for roots, are realised as vowel co-occurrence restrictions only within morphemes. Dynamic harmony patterns are implemented across morphemic boundaries through vowel alternations in morphemes as a result of co-occurrence restrictions within a phonological domain comprising several morphemes. This distinction made for ATR touches upon general issues about the relation between morpheme-internal and intermorphemic phonological patterns (Kenstowicz & Kisseberth 1977; Kiparsky 1982). If dynamic harmony patterns are attested in a language, it also exhibits static patterns, but not necessarily vice versa. Akebu contains both types of patterns. The remaining parameters concern only dynamic harmony.

1.3.2. Root-controlled vs. dominant–recessive harmony

The second parameter draws a distinction between root-controlled (or stem-controlled) harmony and dominant–recessive harmony. This parameter concerns the type of the harmony ‘trigger’ (the vowel which causes assimilation of other vowels, called harmony ‘targets’). In root-controlled harmony, affix vowels assimilate in ATR to root vowels of either value, as in Waja, Anii or Tima. In dominant–recessive harmony, vowels with the recessive ATR value are subject to harmony in the vicinity of a vowel with the dominant value, as in Kinande or Maa. This harmony happens within a given phonological domain regardless of the morpheme type to which the trigger belongs. The first type of harmony is often considered in terms of morphological asymmetry (while the two values of ATR are seen as symmetrical), whereas the second type rather displays a featural asymmetry between the [+ATR] and [−ATR] values (while different types of morphemes are considered as symmetrical as regards ATR). Both dynamic ATR harmony rules in Akebu (described in §§1.4.3 and 1.4.4) are classified under stem-controlled harmony and show the two values of ATR as phonologically symmetrical.

This binary division maintains the generally attested correlation between the two types of symmetry/asymmetry, phonological and morphological. However, it does not reflect numerous deviations observed in individual ATR harmony systems. Casali (2008a: 516–517) argues that in the dominant–recessive type of harmony, where the root–affix distinction is usually claimed to be irrelevant, prefix vowels still always behave as recessive and acquire the ATR value of root vowels, while suffix vowels can often be [+ATR]-dominant, as in Nkonya, Kirangi or LuBwisi. On the other hand, Casali (2003) also shows that, in languages with root-controlled harmony, such as Kɔnni and Ngiti, ATR values are not entirely symmetrical either. One of them can usually still be considered marked and the other unmarked, if ATR harmony is understood in a broader sense. Akebu data rather support this second observation (see §1.5).

Therefore, instead of a single opposition between root-controlled harmony, established on morphological grounds, and dominant–recessive harmony, distinguished on phonological grounds, one might speak about two highly correlated but independent parameters. The morphological parameter regards the sensitivity of ATR harmony spread to various types of morphological boundaries. The phonological parameter concerns the degree of phonological symmetry between the two values of ATR (including their dominance/recessivity and markedness relations). Such a distinction refers to a general relevance of different hierarchical strata in phonology and morphology (e.g., in Lexical Phonology and Stratal OT; see Kiparsky 2018).

Since neither value of ATR exhibits dominance in harmony patterns in Akebu (see §1.4), we will not consider cross-linguistic variability along this parameter (see Casali 2003). As for the non-harmonic parameters helping to establish the phonological markedness relations between ATR values, in §1.5 we consider the distributional restrictions outlined by Casali (2016). In certain contexts, one of the values of ATR – seen as the marked one – is largely avoided in favour of the unmarked one. These contexts may include non-harmonising affixes, as well as independent pronouns and some classes of function words, such as prepositions, postpositions and conjunctions (Casali 2016: 114–116).

1.3.3. Directionality

Finally, harmony patterns vary along the parameter of directionality, which concerns the direction of harmony spread across the domain, from the trigger to the target(s). Harmony can be regressive (spreading leftwards), progressive (spreading rightwards) or bidirectional (spreading in both directions). Formal accounts of the directionality of harmony and other types of vowel harmony asymmetry have been given by, among others, Archangeli & Pulleyblank (2002) and van der Hulst (2018). Harmony rules in Akebu propagate either regressively or bidirectionally.

1.4. Akebu harmony rules

Akebu manifests one static (intra-morphemic) harmony pattern and two dynamic (cross-morphemic) ones. We describe only these main patterns and do not attempt an exhaustive description of Akebu vowel harmony with all its irregularities here, because the phonetic and phonological analysis of our data is still ongoing.

The three harmony patterns indicate several different types of vowel groupings: apart from [±ATR], these include groupings by the features [±high], [±central] and [±front], and also by an atypical parameter of [±peripheral]. Additionally, Akebu vowel harmony rules are restricted to specific types of morphemes (stems, prefixes and suffixes), and one of the two dynamic rules occurs only in inflection, while the other one is, in contrast, typical of word-formation.

1.4.1. Basic word structure

To provide morphological context for the description of the harmony patterns, we will briefly present the basic word structure of Akebu (see Makeeva & Shluinsky 2018; Makeeva 2022a; Shluinsky 2022). In Akebu, nouns are marked for class, and verbs, numerals and pronouns agree in class with nouns. The general structure of Akebu finite verbs and nouns is given in (5).

In the verbal structure in (5a), CPN stands for an obligatory class-person-number marker. These markers are grouped in a series, which cumulatively expresses verbal agreement with a noun by class/person/number and also verbal TAM and negation (in cases when the overt markers for the latter two are absent from the form). Optional slots include NEG for the overt negation marker, TAM1 and TAM2 for the overt tense-aspect-modality markers and MPURP for the motion-with-purpose marker. The verbal stem is either basic (lexical) or more complex (for the case of the factative, see §1.4.4). The lexical stem may consist simply of the root, but in the case of valency-changing derivation it is more complex (see §1.4.4).

In the nominal structure in (5b), obligatory NC (noun class) prefixes and suffixes mark one of the seven nominal classes, labelled by the first consonant of the object pronoun agreeing with each of them as Ŋ, p, Ʈ, w, y, k, kp (these pronouns are listed in §1.5). The plural is formed by conversion into another noun class (pluralia tantum keep their class): Ŋ → p; p → p; Ʈ → y or p; w → y; y → y; k → y or kp; kp → y. Stems of adjectives and qualitative verbs can be incorporated between the nominal stem and the class suffix, which is marked as [QUAL] in (5b). The nominal stem consists of the root (or several in compounds), but in the case of verbal nominalisation the stem can be more complex (see §1.4.4).
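The class conversions used for plural formation amount to a small lookup table. The sketch below simply transcribes the mapping given in the text; the names are hypothetical, and nothing here says which of two possible plural classes a particular noun takes.

```python
# Illustrative lookup for plural class conversion, transcribed from the text.
# Identifiers are hypothetical.

PLURAL_CLASS = {
    "Ŋ": ("p",),
    "p": ("p",),
    "Ʈ": ("y", "p"),
    "w": ("y",),
    "y": ("y",),
    "k": ("y", "kp"),
    "kp": ("y",),
}


def plural_classes(noun_class: str) -> tuple:
    """Return the class or classes into which a noun of this class may convert."""
    return PLURAL_CLASS[noun_class]


for noun_class in PLURAL_CLASS:
    print(noun_class, "->", " or ".join(plural_classes(noun_class)))
```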

In the examples below, the grammatical markers mentioned in (5) are separated from the stem with hyphens (-), while the formants within the stem (see §1.4.4) are delimited by equals signs (=).

Many markers and formants are segmental, but there also exists zero marking, as well as marking by stem vowel lengthening, by partial reduplication, by tones, and by initial consonant voicing. Some of the segmental markers and pronouns (both are always monosyllabic) contain non-harmonising vowels, while others harmonise their vowel to the first or the last stem vowel, in the case of prefixes and suffixes, respectively.

1.4.2. The Static Rule

Three groupings of vowels are distinguished by their root-internal co-occurrence patterns: [+ATR, +peripheral], [−ATR, +peripheral] and [+central]. Consider the examples of nouns in (6), where roots are marked in bold (other morphemes are NC markers). The feature [+central] already identifies the last group exhaustively, so its [±ATR] specification is in fact redundant.

In nominal compounds, each root has its own harmonic specification, as in (7). The Static Rule is not preserved in some loanwords, such as those in (8).
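The Static Rule can be read as a requirement that all vowels of a root come from a single harmonic group. The sketch below is one possible reading, not the authors' formalisation: the names are hypothetical, and /a/ is listed in both the [−ATR, +peripheral] and the [+central] group on the assumption, based on the statement in §1.5, that /a/ combines with interior vowels and also with [−ATR] peripheral vowels.

```python
# Illustrative check of root-internal vowel co-occurrence (Static Rule).
# Identifiers are hypothetical; group membership of /a/ is an assumption (see lead-in).

PLUS_ATR_PERIPHERAL = {"i", "e", "u", "o"}
MINUS_ATR_PERIPHERAL = {"ɪ", "ɛ", "ʊ", "ɔ", "a"}
CENTRAL = {"ᵻ", "ə", "a"}
HARMONIC_GROUPS = (PLUS_ATR_PERIPHERAL, MINUS_ATR_PERIPHERAL, CENTRAL)


def obeys_static_rule(root_vowels) -> bool:
    """True if every vowel of the root belongs to one and the same group."""
    vowels = set(root_vowels)
    return any(vowels <= group for group in HARMONIC_GROUPS)


# Vowel sequences only; these are not attested Akebu roots.
print(obeys_static_rule(["i", "o"]))
print(obeys_static_rule(["ᵻ", "a"]))
print(obeys_static_rule(["e", "a"]))
```

Compounds and loanwords would need to be checked root by root, or exempted, since the text notes that each root in a compound carries its own specification and that some loanwords do not follow the rule.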

1.4.3. Dynamic Rule 1

Dynamic Rule 1, given in (9), is a regressive assimilatory harmony pattern according to which the ATR and place values of the first (or only) vowel of the root spread to the prefix vowel(s), which can only be [−high] (in case of /a/, also [+central]). In this rule, the vowels of the first two groups are [+ATR, +peripheral] and [±front]; those of the third group are [−ATR, +peripheral]; and those of the fourth one are [−peripheral]. Again, /a/ is [−ATR], but the [±ATR] specification of the interior vowels is redundant.

This pattern is applied mostly to inflectional prefixes: to the NC nominal prefixes of classes p, y, k exemplified in (9), but also to many overt TAM1 and TAM2 markers (e.g., perfective /lV̄-/, prohibitive /pV̀V̄-/ and past habitual /kV́ŋ̀/; V indicates vowel(s) harmonising according to Dynamic Rule 1), and to most verbal CPN markers. For example, in the latter case, in class-y agreement, the marking will be /yᵻ̀-lV̄-/ or /yV̄-/ in the perfective and /yᵻ̀-lV̄V̀-/ or /yV̄V̀-/ in the prospective. Some numerals also have class agreement prefixes of the same type. Possessive pronouns (different for each nominal class) before the noun also contain vowels which harmonise to the nominal stem (e.g., the class-y pronoun is /yə̄ lV́/ or /yV́/).

With stems that do not conform to the Static Rule, prefixes harmonise to the first root or the first vowel of the stem, as in the plural forms of (7) and (8), shown in (10) and (11).
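Dynamic Rule 1 can be sketched as a mapping from the first stem vowel to the quality of the harmonising prefix vowel. The mapping below follows the distribution summarised in §1.5 (/e/ with front [+ATR] triggers, /o/ with back [+ATR] triggers, /a/ with [−ATR] peripheral triggers and /a/, /ə/ with interior triggers). The names are hypothetical, and tone and vowel length are ignored.

```python
# Illustrative sketch of Dynamic Rule 1 (regressive prefix harmony).
# Identifiers are hypothetical; only vowel quality is modelled.

PREFIX_VOWEL = {
    "i": "e", "e": "e",                                 # [+ATR, +peripheral], front
    "u": "o", "o": "o",                                 # [+ATR, +peripheral], back
    "ɪ": "a", "ɛ": "a", "ʊ": "a", "ɔ": "a", "a": "a",   # [−ATR, +peripheral]
    "ᵻ": "ə", "ə": "ə",                                 # [−peripheral] (interior)
}


def rule1_prefix_vowel(stem_vowels) -> str:
    """Return the prefix vowel quality selected by the first vowel of the stem.

    Only the first vowel matters, which also covers stems that do not
    conform to the Static Rule.
    """
    return PREFIX_VOWEL[stem_vowels[0]]


# Vowel sequences only; these are not attested Akebu stems.
print(rule1_prefix_vowel(["i", "ɔ"]))
print(rule1_prefix_vowel(["ʊ"]))
print(rule1_prefix_vowel(["ᵻ", "a"]))
```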

1.4.4. Dynamic Rule 2

Dynamic Rule 2 is bidirectional: the ATR and place values of the verbal root spread leftward or rightward to a high target vowel. The only exception is /a/ in the rightward direction, where it triggers regular /ᵻ/ after a nasal consonant, but a non-high /ə/ after a non-nasal consonant. In this pattern, the vowels of the first four groups are [±ATR] and [+peripheral] and [±front], respectively. Vowels of the last group are [+central], with an indication of the relevance of [±peripheral] in the case of rightward spreading. Again, their ATR specification is redundant.

This pattern seems to apply only to formants within a verbal or a nominal (in case of verbal nouns) stem, although more data are needed here. Leftward spreading is exemplified in (12) with partial reduplication used in the formation of the stems of verbal nouns, e.g., /kūŋ̄/ ‘give’ > /kū=kūŋ̄-wə̀/ ‘smth given’. Rightward spreading is illustrated in (13) with the stems of causative verbs, derived from lexical stems with one of the three formants /-ʈV, -lV, -nV/ (often together with the stem vowel lengthening and tone raising), e.g., /yī/ ‘get up’ > /yíí=lì/ ‘raise’, /cɪ́ɪ́/ ‘be dirty’ > /cɪ́ɪ́=ʈɪ̄/ ‘make dirty’. Valency-changing word formation is in general quite irregular in Akebu and as yet understudied. For example, some intransitive verbs contain the same formants used in transitive verbs and differ from the latter only in tones: /sáá=nᵻ́/ ‘rise, wake up’ – /sáá=nᵻ̄/ ‘raise’.

The same rule seems to apply to the formant of the factative verbal stem, used in some types of verbs. The main strategies of factative stem formation are root vowel lengthening and tone raising. However, for some roots of the CV and CVŋ structure, the formants /-lV, -nV/ are used, e.g., /tàŋ̀/ ‘stop’ > /tā=nᵻ̄/, /kə̄ŋ̄/ ‘be high’ > /kə̄=nᵻ́/, /mɔ̀/ ‘laugh’ > /mɔ̄=nʊ̄/, /tɛ̄ŋ̄/ ‘say’ > /tɛ̄=nɪ́/, /yò/ ‘cook’ > /yō=lū/, /wè/ ‘cry’ > /wē=lī/, /rī/ ‘eat’ > /rī=lí/. No verbs have yet been found which would exemplify rightward harmony to /a/ after a non-nasal consonant and to /i, ɪ/ in the factative.
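Dynamic Rule 2 can be sketched as a function from the root vowel to the quality of the target vowel, with the one exception for /a/ in rightward spreading. The names are hypothetical, tone and length are ignored, and the leftward outcome /ᵻ/ for a root in /a/ is an assumption read off the statement that the exception concerns only the rightward direction.

```python
# Illustrative sketch of Dynamic Rule 2 (bidirectional harmony to a high target).
# Identifiers are hypothetical; only vowel quality is modelled.

NASAL_CONSONANTS = {"m", "n", "ɲ", "ŋ"}

HIGH_TARGET = {
    "i": "i", "e": "i",
    "ɪ": "ɪ", "ɛ": "ɪ",
    "u": "u", "o": "u",
    "ʊ": "ʊ", "ɔ": "ʊ",
    "ᵻ": "ᵻ", "ə": "ᵻ", "a": "ᵻ",
}


def rule2_target_vowel(root_vowel: str, direction: str = "right",
                       formant_consonant: str = "") -> str:
    """Return the target vowel quality for a given root vowel.

    direction: "left" (e.g. partial reduplication) or "right" (e.g. /-ʈV, -lV, -nV/)
    formant_consonant: consonant of the rightward formant, used only for root /a/
    """
    if (root_vowel == "a" and direction == "right"
            and formant_consonant not in NASAL_CONSONANTS):
        return "ə"  # non-high exception after a non-nasal consonant
    return HIGH_TARGET[root_vowel]


# Checks against forms cited in the text (tones omitted):
print(rule2_target_vowel("a", "right", "n"))   # cf. /tā=nᵻ̄/
print(rule2_target_vowel("ɔ", "right", "n"))   # cf. /mɔ̄=nʊ̄/
print(rule2_target_vowel("o", "right", "l"))   # cf. /yō=lū/
print(rule2_target_vowel("u", "left"))         # cf. /kū=kūŋ̄-wə̀/
```

The sketch deliberately leaves out the reduction of leftward /ʊ, ɪ/ to /ᵻ/ reported for some reduplicated forms, since the text treats it as a separate, still unstudied tendency.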

1.5. Less direct evidence of the [−ATR] value of central vowels

As discussed above, harmony patterns do not give a straightforward answer about the [±ATR] specification of central vowels. Interior vowels combine with each other and with /a/, while /a/ combines also with [−ATR] peripheral vowels. Three phonological scenarios are theoretically possible here:

Scenario (14a) would yield a more elegant generalisation for Dynamic Rule 1: the [−ATR] vowels and /a/ would trigger the vowel /a/ in prefixes, while for the [+ATR] vowels the distribution would be as follows: /e/ for front vowels, /o/ for back vowels, /ə/ for central vowels. With such an interpretation, as an anonymous reviewer notes, the feature [±peripheral] would also become redundant.

In this scenario, the vowel /a/ must be seen as neutral. Unpaired /a/ is indeed often claimed to be neutral in languages with ATR harmony, since it may co-occur with vowels of both the [+ATR] and the [−ATR] set.

Casali (2008a: 529) points out, however, that vowel neutrality can be understood in several different senses. We will distinguish between phonological and phonetic neutrality: the phonological behaviour of vowels with respect to ATR harmony vs. their physical ATR properties as attested by acoustic and articulatory studies. Further, vowels which are phonologically neutral can be phonetically non-neutral. Casali (2008a) distinguishes between two types of systems with phonologically neutral /a/: those in which it is phonetically [−ATR] and those in which it is phonetically neutral, that is, having positional [−ATR] and [+ATR] allophones. We address the phonological issue first, before considering the phonetic correlates.

The phonological neutrality of /a/ under scenario (14a) would manifest exceptional properties unattested in other ATR systems. Phonological neutrality in static harmony usually implies few or no co-occurrence restrictions within roots comprising vowels of both ATR categories. Akebu provides little evidence of this sort of neutrality, as /a/ does not co-occur with the vowels clearly specified as [+ATR] (peripheral [+ATR] vowels).

In dynamic harmony, the function of a neutral vowel can be either a target or a trigger. As a target, Akebu /a/ rather manifests the [−ATR] value, as it is triggered by clearly specified [−ATR] vowels (apart from /a/ itself) in Dynamic Rule 1. In turn, /a/ itself triggers only interior vowels. If the interior vowels were [+ATR], the phonological neutrality of /a/ in dynamic harmony would be, again, extremely unusual. Phonologically neutral vowels are expected to have wide distribution and to co-occur with both ATR values within the same harmony rules, not across different patterns. In Akebu, however, the situation would be that /a/ is a target for the [−ATR] vowels in one rule but a trigger for the [+ATR] vowels in another rule. This kind of phonological neutrality is, as far as we know, unattested.

The same considerations are valid also for scenario (14b), where all three central vowels would be neutral. In such a scenario, the neutral interior vowels would co-occur only with another neutral vowel /a/, so their distribution would be extremely restricted, which is also untypical of neutral vowels.

In turn, scenario (14c), under which all three central vowels are [−ATR] (and the [−ATR] value is seen as phonologically unmarked in Akebu), does not create any of the complications described above. Moreover, such an interpretation accords well with other data coming from the distribution of Akebu vowels in non-harmonising contexts and from typological generalisations (both types of arguments are discussed below), as well as with etymological considerations (see §4.1).

Cross-linguistically, there are certain correlations between vowel inventory types and ATR harmony patterns (Casali 2003, 2016). The main difference appears to be between /2IU/ systems (as mentioned in §1.2.2, these include the /2IU-2EO/ type, to which Akebu belongs, and the /2IU-1EO/ type) and the /1IU/ (actually /1IU-2EO/) systems.

In /2IU/ languages, the [+ATR] value generally functions as marked, whereas the opposite markedness pattern is found in the /1IU/ systems. Low central vowels also have partially different properties in the two types. In the /2IU/ systems, the vowel /a/ can be relatively safely classified as [−ATR]: it has a wide distribution and triggers the [−ATR] allomorphs of harmonising affixes. In the /1IU/ systems, /a/ can behave either in the same way or as [+ATR] (Casali 2016: 126).

In any case, /a/ as a trigger vowel is never neutral: it requires [−ATR] allomorphs of alternating affixes in /2IU/ languages and [−ATR] or [+ATR] allomorphs in /1IU/ languages. In Akebu, as mentioned before, /a/ triggers /ᵻ/ and /ə/. Both the vowel /a/ and the vowels it triggers would be strongly expected to be [−ATR] in a /2IU/ language such as Akebu, based on the typological distributions outlined above.

There exist additional internal phonological indications of the typologically expected [−ATR] value of the Akebu interior vowels. Akebu contains nominal and verbal prefixes, suffixes and pronouns which do not harmonise for ATR. Such items can only contain the interior vowels /ᵻ/ and /ə/ and, in pronouns, also the high [−ATR] vowels /ɪ, ʊ/. This restriction is consistent with Casali’s (2003) claim that the values of ATR are never entirely symmetrical even in root-controlled harmony systems (see §1.3.2).

Examples of nominal class suffixes and prefixes that always contain the vowel /ə/ or /ᵻ/ regardless of the ATR value of the root are given in (15); other examples of nominal suffixes (/-wə̀, -ʈə̀, -yə̀, -kə̀, -pǝ̀/) are cited in (6)–(12).

An even stronger argument for the [−ATR] value of /ᵻ/ and /ə/ is provided by the distribution of vowels in two series of pronouns, the object and the contrastive ones. These pronouns never harmonise for ATR and contain either the high peripheral [−ATR] vowels /ʊ, ɪ/ or the interior vowels /ᵻ, ə/. For example, in the object pronouns, which occur in postposition to nouns, each of the third person forms /ŋʊ̀, pᵻ̄, ʈᵻ̄, wʊ̄, yɪ̄, kᵻ̄, kpᵻ̄/ replaces a noun of a certain nominal class, see §1.4.1 (classes p and y are plural, as are the pronouns /pᵻ̄, yɪ̄/). The distribution of high vowels in these forms depends on the type of their initial consonants, and it would be logical to suppose that all these vowels share their ATR value, that is, that they are [−ATR]. In addition to the third person forms, there are object locutor forms, in which /ə/ also occurs: /mɨ́/ 1sg, /lə̀/ 2sg, /lə́/ 1pl, /nɨ́/ 2pl. The contrastive pronouns are derived from the object ones and contain the same distribution of vowels (Makeeva 2022a). These examples strongly suggest the [−ATR] value of /ᵻ, ə/.

We will cite one more piece of tentative internal evidence. Akebu manifests a trend for certain kinds of vowel reduction and loss both at the left and the right word edges. Suffix reduction is discussed by Shluinsky (2020). Reduction to the left of the root is still unstudied, but we attested a shift of the [−ATR] high vowels /ʊ, ɪ/ towards the interior high vowel /ᵻ/ in some cases of partial reduplication in verbal nouns (see §1.4.4): /yᵻ̄=yɔ̄ɔ̄-wə̀/ ‘something tender’, /kpᵻ́=kpɛ́ɛ́lɪ̄-wə̀/ ‘something white’ (instead of expected /*yʊ̄=yɔ̄ɔ̄-wə̀/, /*kpɪ́=kpɛ́ɛ́lɪ̄-wə̀/). High [+ATR] vowels did not manifest any trend of this sort.

A set of distinctive features and their hierarchy in Akebu, which would fully account for the various vowel groupings defined by the harmony rules and other phonological patterns outlined above, is proposed in (16).

Phonetic evidence which would support or contradict our phonological analysis of the Akebu vowel inventory was sought in the experimental study, set out below, on several acoustic ATR correlates of Akebu vowels. Especially important was the question of the phonetic properties of central vowels, considered in the context of the [±ATR] pairs of front and back vowels.

Before proceeding to the phonetic study, we will briefly consider the question of the phonetic neutrality or non-neutrality of the Akebu central vowels.

If we interpret all the central vowels either as [−ATR], as is proposed in this paper, or as phonologically neutral, then their phonetic neutrality cannot be checked, because /a/ co-occurs only with the peripheral [−ATR] vowels or with the interior vowels.

Under the scenario in (14a), where the interior vowels are seen as [+ATR], the phonetic neutrality of the central vowels could potentially be checked in a phonetic study. For example, the acoustic correlates of ATR could be studied for /a/ in the pairs where it occurs after a [−ATR] peripheral vowel vs. after an interior vowel, e.g., /sɪ̄ká-yə̀/ ‘money’ – /lᵻ̄là-wə̀/ ‘something torn’. If any significant differences in the acoustic correlates of ATR were discovered for /a/ between these two contexts, this might potentially be an argument towards considering the interior vowels as [+ATR] and /a/ as phonetically and phonologically neutral. Such a test remains for future research on Akebu.

2. Acoustic study: background, methods, research questions

2.1. Overview of acoustic and articulatory properties of ATR

Despite the close attention of linguists to ATR since the 1960s, its acoustic features and articulatory basis are not yet fully understood. A recent holistic Laryngeal Articulator Model, developed on the basis of laryngoscopic studies (summarised in Esling et al. 2019), gave primacy to an aryepiglotto-epiglottal constriction in the production of the ATR contrast. All other mechanisms proposed earlier as relevant in the production of ATR – tongue root position (Stewart 1967), the volume of the pharyngeal cavity (Painter 1973; Lindau 1975, 1979; Tiede 1996) and larynx height (Lindau 1975, 1979) – were considered in this model as parts of a synergistic and hierarchical system of laryngeal articulations. However, more articulatory data are needed to test the model.

A number of acoustic measures have been found to correlate with ATR, but only to a certain degree. The only exception is the frequency of the first formant (F1), which was the most stable correlate of ATR across studies, languages, vowel types and speakers (Lindau 1975; Fulop et al. 1998; Guion et al. 2004; Starwalt 2008; Olejarczuk et al. 2019).

However, acoustic changes in F1 can be attained both by tongue root gestures and by tongue body height, so additional indicators of ATR are needed. Apart from the frequency of the second formant (F2), a common secondary metric is spectral slope, or relative intensity of F1 to F2 (A1 minus A2). Another parameter closely related to spectral slope is F1 bandwidth (B1). The constriction in the laryngeal tube and the friction damping during the production of [−ATR] vowels cause an increased damping of F1 and an increase in F1 bandwidth. At the same time, isometric tension of the pharyngeal walls during the production of [+ATR] vowels (Tiede 1996) protects F1 from the damping effects caused by cavity wall coupling (Fulop et al. 1998: 93–94). Such articulatory activity is supposed to result in lower relative intensity of the first two formants and a wider F1 bandwidth in the [−ATR] vowels as compared to the [+ATR] ones.

These four metrics (F1, F2, spectral slope and F1 bandwidth) were tested in our acoustic study and are discussed in detail below.¹ Other possible metrics include centre of gravity (Starwalt 2008; Edmondson 2009; Ivanova 2021), spectral emphasis (Remijsen et al. 2011), cepstral peak prominence, harmonics-to-noise ratio (Olejarczuk et al. 2019) and vowel duration (which tested as insignificant in Hess 1992; Guion et al. 2004; Przezdziecki 2005; Starwalt 2008; Kirkham & Nance 2017). These will be investigated in a further phase of this project.

2.2. Data recordings

Our experiment is the first detailed acoustic study of Akebu vowels (cf. a brief pilot study in Shluinsky 2020). Phonetic data were collected by the first author from six male Akebu speakers: AD (b. 1996), HO (b. 1990), YT (b. 1991), MA (b. 1986), AK (b. 1967) and BO (b. 1958), all of whom were born in and have been more or less permanently residing in the village of Djon, in the prefecture of Akebu. Data from HO were obtained in 2012, and from other speakers in 2019, mainly based on the data collected from HO.

The set of 110 carrier words included 10 words for each studied vowel. This wordlist contained items exemplifying all Akebu vowels after all types of non-nasal consonants, where possible. The preferred word structure was CV or CVV. However, due to the existing structural limitations on the positions of some vowels, it was not always possible to obtain entirely comparable sets of structures for all vowels. Speakers were presented with a French equivalent and asked to translate it into Akebu and to say the isolated item three times, with pauses in between. The full qualitative and quantitative structure of the dataset, which contains 1,980 vowel tokens (30 productions for each of 11 vowels from each of 6 speakers), is given in Table 1s in the Supplementary Material. The recordings were made with a Marantz PMD-660 digital recorder and an external AKG 1000 microphone in .wav format at a sampling rate of 48 kHz with 16-bit quantisation.

2.3. Research questions

Our phonetic study had two main objectives:

RQ1: We checked how Akebu data matched existing predictions as regards the four acoustic metrics of the ATR contrast related to F1 and F2. It was expected that the F1-related metric would provide more robust results across speakers and vowel types than the F2-related one. The tested hypotheses are outlined in §§2.3.1–2.3.4.
RQ2: As discussed in §1, Akebu contains an unusual subsystem of central vowels. The phonological ATR properties of central vowels potentially allow for more than one interpretation, but the totality of the evidence indicates [−ATR] as their most plausible value. We investigated whether the potential [−ATR] status of central vowels is supported by acoustic data. Specific hypotheses are described in §2.3.5.

2.3.1. F1 as a correlate of ATR and vowel height

Studies show that [−ATR] vowels have higher F1 than [+ATR] vowels. The increase in F1 for the [−ATR] vowels is thought to be caused by aryepiglotto-epiglottal constriction and laryngeal raising (Edmondson & Esling 2006a; Edmondson 2009). However, F1 frequency can also be modulated by tongue height. Lower vowels have higher F1 than higher vowels, as they are pronounced with a lower degree of palatal narrowing and a higher degree of pharyngeal narrowing. Articulatory movements responsible for ATR and vowel height are apparently interlinked in a complex manner. On the one hand, pulling forward the tongue root by the genioglossus muscle is one of the articulatory mechanisms controlling tongue height (Lindau et al. 1972; Lindau 1975). On the other hand, the tongue root movement is one of the mechanisms controlling the volume of the pharyngeal cavity in the case of ATR (Edmondson & Esling 2006b).

Vowels belonging to different ATR categories can indeed additionally differ in tongue height. However, where this was observed, the height difference was non-significant or inconsistent across vowel types and speakers, or opposite to the expected one, due to compensatory articulations (Ladefoged et al. 1972: 72; Lindau et al. 1972: 82; Allen et al. 2013: 194; Hudu 2014: 43; Kirkham & Nance 2017: 76–77). Moreover, the genioglossus muscle, which controls the tongue root mechanism in ATR, did not contribute significantly to the tongue height in such cases (Lindau 1975). Tongue height was attained by other articulatory mechanisms, such as tongue lifting and jaw opening.

Our hypotheses about the difference in the magnitude of the (normalised) first formant for ATR and for vowel height are set out in (17). Controversies regarding the distinction between the two features with respect to F1 are followed up in §4.

2.3.2. F2 as a potential ATR correlate

There is no consensus about the relevance of F2 as an acoustic correlate of ATR, as languages exhibit no uniform pattern. On the one hand, tongue retraction and descent should result in lower F2 for [−ATR] vowels (Edmondson & Esling 2006b: 181–182). This effect was observed by Ladefoged & Maddieson (1996: 304–306) in Igbo, Ijo, Ebira, Ateso and DhoLuo. For some languages, this effect was demonstrated only for front vowels, for example, Kinande (Starwalt 2008: 134), Oroko (Starwalt 2008: 199, 209), Akan (Kirkham & Nance 2017: 71) and Ethiopian Komo (Olejarczuk et al. 2019). In other languages, including Degema (Fulop et al. 1998: 87–88), Ife (Starwalt 2008: 168) and Dibole (Starwalt 2008: 179), [+ATR] vowels showed more peripheral F2 values than their [−ATR] counterparts: front [+ATR] vowels had higher F2 values and back [+ATR] vowels had lower F2 values. The latter pattern is similar to the one found for the contrast of tense vs. lax vowels (Lindau 1978: 558; Ladefoged & Maddieson 1996: 304–306). Finally, in some studies, the effect of ATR on F2 was insignificant (Guion et al. 2004: 531).

Based on these somewhat controversial findings, two competing hypotheses were set as regards the difference in the magnitude of the (normalised) second formant:

2.3.3. Spectral slope (A1 minus A2) as a potential correlate of ATR

Spectral slope gave cross-linguistically inconsistent results across vowel types and speakers, although all studies used the same normalisation procedure (see §2.4.2). In Degema, the vowel pairs /ə – a/, /e – ɛ/ and /u – ʊ/ were not distinguished by spectral slope at all, and the expected pattern was found only for /i – ɪ/ and /o – ɔ/ (Fulop et al. 1998). For Maa, the results were the opposite: the effect of spectral slope was insignificant for /i – ɪ/ and /o – ɔ/ but significant for /e – ɛ/ and /u – ʊ/ (Guion et al. 2004: 533–534). Starwalt (2008) considered 11 Niger-Congo languages with three different types of ATR inventories (see §1.2.2). Only in LuBwisi were the vowels in each ATR pair distinguished by spectral slope (in two of five speakers). Otherwise, a tendency for [+ATR] vowels to have a significantly steeper spectral slope was found only in the mid vowels of some speakers in Foodo, Ikposo, Kinande, Ekiti-Yoruba, Ife and Mbosi. In other cases, differences were either insignificant or in the direction opposite to the expected one.

The general hypothesis for the normalised spectral slope/flatness (relative intensity of F1 to F2) was set as in (19). However, a high degree of heterogeneity both across the vowel types and across speakers was expected.

2.3.4. F1 bandwidth (B1 and ΔB1) as a potential correlate of ATR

The parameter of B1 appears to be a bit more consistent than that of spectral slope, although earlier studies differ as to whether a raw or a normalised metric was used.

Hess (1992), who first reported the effect of B1 in ATR, compared raw B1 for those Akan vowels which overlap in F1 magnitude, namely /ɪ, e/ and /ʊ, o/, as she wanted to avoid the correlation between the values of formant bandwidths and formant frequencies. Those vowel pairs showed a significant effect, with [−ATR] vowels having a wider B1.

The effect of raw B1 was also studied in three variants of Yoruba (Przezdziecki 2005) and in Kabiye (Edmondson 2009). In both cases, the comparison was conducted for all ATR pairs despite the fact that the F1 values of the ATR vowel pairs significantly differ. In Standard Yoruba, B1 was not significant, while in Moba Yoruba, with the same vowel inventory (/1IU-2EO/), mid vowels were distinguished by B1. In Akure Yoruba with the /1IU-2EO/ inventory and allophonic ATR variation in high vowels, B1 was significant only for high vowels in some speakers. In Kabiye (/2IU-2EO/), the effect of B1 was significant for all ATR pairs. All those results, however, suffer from collinearity, because B1 values are highly predictable from F1 values (Starwalt 2008: 91).

Another approach to studying B1 normalises this value, which produces more reliable results. In a study of Akan vowels, Hess (1992: 486) compared observed B1 values to those predicted by a formula elaborated by Fant (1972: 47); cf. §2.4.3. Observed B1 values for all [+ATR] vowels, apart from /i/, were close to the predicted ones. In turn, all [−ATR] vowels had B1 values on average 50 Hz greater than the predicted ones. Starwalt (2008: 415) also noted a cross-linguistic tendency for raw B1 to be at or above predicted values for [−ATR] vowels, at or below predicted values for [+ATR] mid vowels, and either below or above predicted values for [+ATR] high vowels. These observations were used as a hypothesis for our RQ2 (see §2.3.5).

Starwalt (2008) also compared distances between observed and predicted B1 values (ΔB1). ΔB1 followed the pattern of spectral slope in distinguishing mid vowels by ATR more consistently than the high ones. However, the distinction was found in many more languages and in more speakers within each language than for spectral slope. All languages showed a distinction in at least one vowel pair. Moreover, unlike the spectral slope, ΔB1 also distinguished high ATR vowel pairs in at least some speakers of Ikposo, Kinande, LuBwisi and Ekiti-Yoruba. The effect of ΔB1 was also significant for all ATR pairs in the /2IU-2EO/ inventory of Igbo (Ivanova 2021).

The hypothesis about the difference in the normalised F1 bandwidth size was set as in (20), with expected variability across vowel types and speakers.

2.3.5. RQ2: Central vowels

It was expected that the three central vowels /ǝ, ᵻ, a/ would significantly differ in F1 (see (17)). However, this difference alone does not offer a way to distinguish between ATR and vowel height (see §2.3.1), so we sought additional metrics which might support or contradict our classification of all central vowels as [−ATR], differing only in height.

The metrics of F2, A1−A2 and ΔB1 could be checked for central vowels only in an exploratory manner, because no robust predictions for these parameters in the case of vowel height were available. However, the cross-linguistic observation by Starwalt (2008), cited in §2.3.4, about the direction and the magnitude of displacement of observed B1 values (B1_obs) from the curve of B1 values predicted by Fant’s formula (B1_pred) for [−ATR] vs. [+ATR] vowels could be taken as a working hypothesis in (21). If the central vowels were [−ATR], they would be expected to behave similarly to [−ATR] peripheral vowels:

2.4. Measurements

Measurements were conducted in Praat (Boersma & Weenink 2017). Segmentation was carried out on the basis of waveforms and wideband spectrograms. All types of analyses were done on a single window of the central 40% of a vowel. For the formant analysis, the built-in LPC method (Burg algorithm) of Praat was used.

2.4.1. Normalised formant values (F1 and F2)

In most earlier studies on ATR, F1 and F2 were not normalised (with the exception of Olejarczuk et al. 2019). We used an optimisation procedure based on Escudero et al. (2009: 1382) which makes it possible to adapt the formant ceiling to the speaker and the vowel. Following this procedure, for all tokens, the means of the first two formants were measured 41 times for all ceilings between 4,000 and 6,000 Hz in steps of 50 Hz. Only the most robust formant values were taken into consideration.² Of the 41 ceilings, the one which yielded the lowest sum of standard deviations of F1 and F2 through the 30 tokens was identified as the optimal ceiling. During the formant ceiling optimisation procedure, 25 tokens (about 1%) produced implausible formant values with every formant ceiling and were eliminated from the dataset. This procedure was carried out for each vowel from each speaker, which resulted in 66 optimal ceilings (11 vowels × 6 speakers).
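The ceiling selection step can be illustrated with a minimal sketch. It assumes that the F1 and F2 means have already been measured in Praat at every candidate ceiling; the function names and the data layout are hypothetical and are not taken from the scripts used in the study.

```python
from statistics import stdev


def candidate_ceilings(low=4000, high=6000, step=50):
    """Return the formant ceilings (Hz) to try: 4000, 4050, ..., 6000."""
    return list(range(low, high + step, step))


def optimal_ceiling(measurements):
    """Pick the ceiling with the lowest SD(F1) + SD(F2) across tokens.

    `measurements` maps a ceiling (Hz) to a list of (F1, F2) tuples in Hz,
    one tuple per token of a single vowel from a single speaker
    (hypothetical layout, assumed for this illustration).
    """
    def spread(tokens):
        f1_values = [token[0] for token in tokens]
        f2_values = [token[1] for token in tokens]
        return stdev(f1_values) + stdev(f2_values)

    # min() returns the first ceiling in iteration order if several tie.
    return min(measurements, key=lambda ceiling: spread(measurements[ceiling]))
```

Applied once per vowel and speaker, such a function would return one optimal ceiling for each vowel-speaker combination.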

Formant frequency data were then subjected to a normalisation procedure in order to render the formant frequencies comparable across vowels and speakers. The CLIH2 (Constant Logarithm Interval Hypothesis 2) method proposed by Nearey (1978) was chosen. In order to get the normalised value of a certain formant in a particular token for a certain vowel of a given speaker, (a) the formant value was converted to its natural logarithm; and (b) the mean of the natural logarithms of that formant’s frequencies across all the tokens of the speaker was subtracted from the value obtained in (a).

The normalisation procedure was carried out for all tokens of each speaker. The normalisation formula was $F*_{N[V]sr} = G_{N[V]sr} - G_{N[.]s}$ (Nearey 1978: 138), where:

• $F_{N[V]sr}$ is the frequency in hertz of the Nth formant of the rth token of vowel V for subject s;
• $F*_{N[V]sr}$ is the normalised value for $F_{N[V]sr}$ ;
• $G_{N[V]sr}$ is the natural logarithm (ln) of $F_{N[V]sr}$ ;
• the dot [.] indicates averaging over a particular subscript.
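The normalisation defined above can be sketched for one formant of one speaker as follows. The function name and the input layout are illustrative assumptions; only the arithmetic (natural logarithm minus the speaker's mean log value for that formant) follows the formula.

```python
import math


def nearey_normalise(formant_hz):
    """Log-mean normalise one formant for one speaker.

    `formant_hz` holds the values (Hz) of a single formant (e.g. F1)
    for all tokens of all vowels produced by one speaker.
    """
    logs = [math.log(value) for value in formant_hz]  # G values
    mean_log = sum(logs) / len(logs)                   # speaker mean of G
    return [g - mean_log for g in logs]                # F* values
```

By construction, the normalised values of each formant sum to zero within a speaker, which is what makes them comparable across speakers.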

2.4.2. Normalised spectral slope (A1 minus A2)

The values for the amplitudes of F1 and F2 were obtained with a Praat script by Grawunder (2010). Amplitudes vary as a function of formant frequency due to vocal tract properties and are additionally reinforced when formants remain close together. Therefore, a normalising procedure was needed to allow for a comparison of the spectral slopes of vowels with different formant frequencies (Guion et al. 2004: 526).

A normalisation procedure proposed in Fulop et al. (1998; see also Starwalt 2008: 90–93) was carried out for the spectral slope value of every token from each speaker. The observed values of F1 and F2 in each token were used to obtain the two modelled amplitudes (A1 and A2). The modelled spectral slope A1−A2, which served as a baseline, was then subtracted from the measured A1−A2 value.

The pair of modelled A1 and A2 values for each token was computed with the help of formulas in (22)–(24), where f is the measured value of the formant of interest (F1 for A1 and F2 for A2). First, the modelled contributions of the first three formants at frequency f were calculated using the formula in (22) three times, filling in the frequencies of F1, F2 and F3 as the value for F; b is the formant bandwidth, set at a constant value of 30 Hz for F1, 80 Hz for F2 and 150 Hz for F3. Then the modelled contributions of higher formants (F4 and F5) to the amplitude at frequency f were computed using the formula in (23). Finally, the modelled effects of the glottal pulse shape and lip radiation were computed using the formula in (24), where g is a variable representing phonation type, the value 1.0 being set for modal voice.

To summarise, A1 and A2 were obtained from the results of (22)–(24) as follows, where the notation F1/F2 for f means F1 when computing A1 and F2 when computing A2 (not a ratio): A1/A2 = (22) [ $F=\textrm{F1}, f=\textrm{F1}/\textrm{F2}, b=30$ ] + (22) [ $F=\textrm{F2}, f=\textrm{F1}/\textrm{F2}, b=80$ ] + (22) [ $F=\textrm{F3}, f=\textrm{F1}/\textrm{F2}, b=150$ ] + (23) [ $f=\textrm{F1}/\textrm{F2}$ ] + (24) [ $g=1, f=\textrm{F1}/\textrm{F2}$ ].
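The way the terms are assembled can be sketched as below. Since the formulas in (22)–(24) are not reproduced here, they are passed in as callables (`formant_term`, `higher_term`, `source_term`); these names, like the function names, are hypothetical placeholders, and the sketch shows only the summation and the final subtraction.

```python
def modelled_amplitude(f, formants, formant_term, higher_term, source_term,
                       bandwidths=(30.0, 80.0, 150.0), g=1.0):
    """Modelled amplitude (dB) at frequency f.

    formants     -- measured (F1, F2, F3) of the token, in Hz
    formant_term -- hypothetical callable (F, f, b) standing in for (22)
    higher_term  -- hypothetical callable (f) standing in for (23)
    source_term  -- hypothetical callable (g, f) standing in for (24)
    bandwidths   -- constant bandwidths (Hz) used for F1, F2 and F3
    """
    total = sum(formant_term(F, f, b) for F, b in zip(formants, bandwidths))
    return total + higher_term(f) + source_term(g, f)


def normalised_slope(a1_measured, a2_measured, formants,
                     formant_term, higher_term, source_term):
    """Measured A1-A2 minus the modelled A1-A2 baseline (both in dB)."""
    f1, f2 = formants[0], formants[1]
    a1_model = modelled_amplitude(f1, formants, formant_term,
                                  higher_term, source_term)
    a2_model = modelled_amplitude(f2, formants, formant_term,
                                  higher_term, source_term)
    return (a1_measured - a2_measured) - (a1_model - a2_model)
```

Keeping the three terms as parameters makes it explicit that the same model is evaluated twice per token, once at f = F1 and once at f = F2.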

2.4.3. Normalised F1 bandwidth (ΔB1)

The values for the F1 bandwidth (B1_obs) were obtained with a modification of the Praat script by Grawunder (2010). All values higher than 300 Hz were removed as unreliable, following Fulop (2011). Since B1 has been shown to correlate with the changes in F1 frequency (see §2.3.4), a normalisation procedure proposed by Starwalt (2008) was used to compare the bandwidth values of vowels with different formant frequencies. The measured B1 value in each token was compared to the B1 value predicted for this token by Fant’s formula in (25). This formula makes it possible to predict the B1 value of a vowel based on its measured F1 and the damping effects of cavity wall losses, surface losses, and radiation losses, represented in the three formula terms, respectively.

Normalised B1 values (ΔB1), used for the comparison of the ATR pairs of peripheral vowels in §3.2.4, were obtained by subtracting the predicted B1 values from the observed ones. In §3.3, the positions of observed B1 values were evaluated with respect to the curve of predicted B1 values (at, above or below the curve).
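The subtraction and the 300 Hz reliability cut-off can be sketched as follows. The predicted values are taken as given input, because the formula in (25) is not reproduced here; the function name and the tuple layout are assumptions made for the illustration.

```python
def delta_b1_values(tokens, max_b1=300.0):
    """Return ΔB1 (observed minus predicted B1, in Hz) for reliable tokens.

    `tokens` is an iterable of (b1_observed, b1_predicted) pairs in Hz.
    Tokens whose observed B1 is higher than `max_b1` are dropped as
    unreliable before the subtraction.
    """
    return [observed - predicted
            for observed, predicted in tokens
            if observed <= max_b1]
```

A positive ΔB1 corresponds to an observed value above the predicted curve and a negative one to a value below it.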

2.5. Statistical analysis

A linear mixed-effects regression model for each studied ATR correlate (F1, F2, A1−A2 and ΔB1) in each speaker was fitted by the second author with lme4 (Bates et al. 2015) in R (R Core Team 2020). In order to account for the structural heterogeneity of the carrier words, the intercept for item was included as a random effect (Baayen et al. 2008).
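One way to write such a per-speaker model is sketched below. The notation is assumed for illustration and is not taken from the analysis scripts: $y_{ij}$ is the value of the metric for token $i$ of carrier word $j$, $v(i)$ is the vowel type of that token, and $u_j$ is the random intercept for the item.

$$y_{ij} = \beta_0 + \beta_{v(i)} + u_j + \varepsilon_{ij}, \qquad u_j \sim N(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2)$$

Under this reading, vowel type is the only fixed effect, and differences between carrier words are absorbed by $u_j$.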

Due to the small number of speakers (six) and various types of structural across-speaker imbalances in the sample (see §2.2), we could not fit reliable overall models containing speaker-specific random slopes for the studied parameters (Bolker 2015: 315–316 predicts this for random effects with fewer than eight levels). High interspeaker variability was expected at least for A1−A2 and ΔB1, so instead of fitting overall models without speaker-specific slopes, we chose to run speaker-specific models. In this, we also followed the general practice for the ATR-related measurements, where speakers are usually considered separately. Such results do not allow direct inferences about the general Akebu-speaking population, but the degree of interspeaker variability is directly comparable to most of the earlier studies.

The effect of vowel type was significant in every model (Table 2s in the Supplementary Material), and post-hoc pairwise comparisons were conducted with lmerTest (Kuznetsova et al. 2017) for the vowel pairs opposed by ATR (or by height, for the central vowels in §3.3) within each speaker for each of the four parameters.

The first author conducted a Wilcoxon test for two related samples in SPSS 11.01.1 to compare the observed B1 values and the B1 values predicted by Fant’s formula.

3. Results

3.1. Vowel space

The distribution of the Akebu vowels in a non-normalised F1–F2 space for each speaker in Hz, obtained with a script by Stanley (2018), is given in Figures 1–6. The vowel type means, shown by the phoneme symbols, are computed over all tokens of a given vowel in a given speaker. The ellipses mark confidence intervals of $\pm 1$ standard deviation (67%).

Figure 1 Vowel space: Speaker AD.

Figure 2 Vowel space: Speaker HO.

Figure 3 Vowel space: Speaker YT.

Figure 4 Vowel space: Speaker MA.

Figure 5 Vowel space: Speaker AK.

Figure 6 Vowel space: Speaker BO.

Visual examination of Figures 1–6 shows that peripheral [−ATR] vowels always have higher F1 than their [+ATR] counterparts, and the pairs hardly overlap. In turn, [−ATR] peripheral higher vowels often overlap in F1/F2, even significantly, with lower [+ATR] vowels (especially /ɪ – e/ and /ʊ – o/). The central vowels typically show more scattered distribution along the F1 and/or F2 dimensions than the front and the back ones, but not in all speakers. The F1 of the two interior vowels stay close to each other and partially overlap in most speakers.

Importantly, the interior vowels occupy the F1 belt typical of the [−ATR] peripheral vowels rather than that of the [+ATR] vowels. However, /ᵻ/ often has higher F1 than /ɪ, ʊ/, while /ə/ is lower in F1 than /ɛ, ɔ/, which indicates a centralising tendency for the interior vowels. Some centralisation is observed also in the F2 of the peripheral [−ATR] vowels, especially in the front and mid vowels, as compared to their [+ATR] counterparts, but for /u – ʊ/ and, more rarely, /i – ɪ/, the opposite relation holds.

Table 2 Differences in F1 between members of [±ATR] pairs.

Table 3 Differences in F2 between members of [±ATR] pairs.

Table 4 Differences in A1−A2 between members of [±ATR] pairs.

Table 5 Differences in ΔB1 between members of [±ATR] pairs.

Statistical results for the normalised F1 and F2 values follow in the next sections.

3.2. Pairwise comparisons for ATR

Tables 2–5 show the results of post-hoc pairwise comparisons across the four peripheral [ $\pm $ ATR] pairs /ɪ – i/, /ɛ – e/, /ʊ – u/ and /ɔ – o/ within each speaker for each ATR correlate measured. Significance codes for p-values throughout are as follows: *** indicates $p<0.001$ , ** $p<0.01$ and * $p<0.05$ ; when $p>0.05$ , the cell is left blank. An exclamation point (!) appended to the significance code indicates that the value for a [−ATR] vowel was lower than the value for the corresponding [+ATR] vowel; the absence of this mark signifies the opposite relation. Detailed results of all ATR tests are given in Table 3s in the Supplementary Material.
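The coding convention used in the tables can be restated as a small sketch. The function name and the boolean flag are hypothetical; treating p = 0.05 exactly as a blank cell is an assumption, since the convention above specifies only p < 0.05 and p > 0.05.

```python
def significance_code(p, minus_atr_lower=False):
    """Return the table code for a p-value.

    `minus_atr_lower` is True when the value for the [-ATR] vowel was
    lower than that for the corresponding [+ATR] vowel; an exclamation
    point is then appended to the stars. Non-significant results give
    an empty string (a blank cell).
    """
    if p < 0.001:
        stars = "***"
    elif p < 0.01:
        stars = "**"
    elif p < 0.05:
        stars = "*"
    else:
        return ""
    return stars + ("!" if minus_atr_lower else "")
```

For instance, a comparison with p = 0.03 in which the [−ATR] vowel had the lower value would be shown as `*!`.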

The last column in Tables 2–8 cites the hypothesis tested in each case (see §2.3); ‘yes’ indicates that the relation between the values fully complies with the hypothesis and ‘no’ indicates that it does not.

Table 6 Differences in A1−A2 between pairs of central vowels.

Table 7 Differences in ΔB1 between pairs of central vowels.

Table 8 Difference between predicted and observed B1 values in all vowels.

3.2.1. F1

The F1 values of the [+ATR] vowels were significantly lower than those of the [−ATR] vowels in every pair for every speaker, as predicted by Hypothesis H1A-1 in (17a). In speaker MA, who showed more scattered F1/F2 vowel distributions in a more compact space than average (Figure 4), F1 differences were less significant than in other speakers.

3.2.2. F2

There was a more or less significant difference in F2 for two or three ATR pairs in four speakers. Speaker AK showed no significant differences in F2 at all. In this speaker, F2 distributions were relatively scattered in all vowels (Figure 5).

Front and back mid vowels complied with Hypothesis H1B-2 in (18b): /e/ had higher F2 values than /ɛ/, while /o/ had lower values than /ɔ/. Therefore, mid [+ATR] vowels can be considered as more peripheral than their [−ATR] counterparts. High vowels were not as consistently distinguished by F2 (only in one or two speakers). Unlike the mid back vowels, the high back vowels rather complied with Hypothesis H1B-1 in (18a), as /ʊ/ showed slightly lower F2 values than /u/ (either at $p<0.05$ (*) or insignificantly).

3.2.3. A1 minus A2

Results mostly complied with Hypothesis H1C in (19), with one exception (the /ɪ – i/ pair in speaker AD). In general, mid vowels were distinguished by the spectral slope better than high vowels (/ɔ – o/ being the most consistently distinguished pair). The high vowels were distinguished less consistently across speakers and with a lower significance.

3.2.4. ΔB1

Results generally complied with Hypothesis H1D in (20), with two exceptions (the /ɛ – e/ pair in speaker BO and the /u – ʊ/ pair in speaker AD). Similarly to spectral slope, mid vowels were distinguished by F1 bandwidth better than high vowels, and /ɔ – o/ was the most consistently distinguished pair. High vowels showed lower and less consistent significance across speakers. In fact, contrary to Starwalt (2008), ΔB1 did not produce more or better distinctions of vowel pairs across speakers than spectral slope.

3.3. Central vowels

3.3.1. F1 in central vowels

The F1 of /ə/ was significantly higher than that of /ᵻ/ and significantly lower than that of /a/ in every speaker ( $p<0.001$ ; ***), as predicted by H1A-2 in (17b). The differences between /ǝ/ and /a/ were still much greater (|t|-values from 6.5 to 15) than between /ǝ/ and /ᵻ/ (|t| from 4 to 7; see Table 4s in the Supplementary Material). Note that for F1 of the [ $\pm $ ATR] pairs, |t|-values were in a range from 5 to 17.5 (apart from speaker MA, who showed much smaller differences; see Table 4s and §3.2.1). In other words, the distance between the F1 values of the two interior vowels /ǝ/ and /ᵻ/ was the smallest of all the vowel pairs, which, as mentioned in §3.1, indicates their tendency towards centralisation.

3.3.2. Simulation of ATR metrics for central vowels: F2, A1−A2, ΔB1

Given that F1 alone does not distinguish between ATR and vowel height, additional metrics were used. First, we conducted a simulation experiment in which we applied the tests and hypotheses formulated in §2.3 for ATR contrasts to the pairs of central vowels. The simulation was designed to show whether the pairs of central vowels, which we posited to be distinguished by height, would differ in these metrics from the confirmed ATR pairs. The logic here is similar to the experiment reported by Kirkham & Nance (2017), where ATR tests were also applied to the [+tense] English vowels, even though no prior expectations as regards the feature of tenseness were available in some cases (e.g. for spectral slope).

For each simulation, the lower vowel in each pairing was treated as the potentially [−ATR] member of the opposition, and the higher vowel as potentially [+ATR]. This meant that /ǝ/ was treated as potentially [+ATR] in comparisons with /a/, but as potentially [−ATR] in comparisons with /ᵻ/. An exclamation point appended to the significance codes in Tables 6–7 means that the first member of comparison had a lower value than the second one.
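The role assignment described above can be sketched in a few lines of Python. This is only an illustration of the pairing logic, not the script used in the study: the ASCII vowel labels (`barred_i` for /ᵻ/, `schwa` for /ə/), the `HEIGHT_RANK` table and the function name are hypothetical.

```python
# Phonological height of the three central vowels (1 = highest), as posited in the text.
# The ASCII labels stand for /ᵻ/, /ə/ and /a/; they are illustrative names only.
HEIGHT_RANK = {"barred_i": 1, "schwa": 2, "a": 3}


def simulated_roles(vowel_1, vowel_2):
    """Return (potentially [+ATR] vowel, potentially [-ATR] vowel) for a central pair.

    The higher vowel of the pair is treated as potentially [+ATR],
    the lower one as potentially [-ATR].
    """
    if HEIGHT_RANK[vowel_1] == HEIGHT_RANK[vowel_2]:
        raise ValueError("the two vowels must differ in height")
    higher, lower = sorted((vowel_1, vowel_2), key=HEIGHT_RANK.__getitem__)
    return higher, lower


# The two pairings used in the simulation.
for pair in [("schwa", "a"), ("schwa", "barred_i")]:
    plus_atr, minus_atr = simulated_roles(*pair)
    print(f"{pair}: potentially [+ATR] = {plus_atr}, potentially [-ATR] = {minus_atr}")
```

The same vowel therefore changes its simulated role depending on its partner, which is why the direction of each comparison has to be tracked separately.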

The test for F2 (simulation of Hypothesis H1B in (18)) showed no significant differences in the pairs of central vowels at all. This is in dramatic contrast to the [±ATR] pairs, where quite a number of differences were expected and found (see §2.3.2).

The results of the spectral slope (A1−A2) and F1 bandwidth (ΔB1) tests applied to the pairs of central vowels are given in Tables 6 and 7.³ They are similar to those obtained for the [±ATR] pairs (Tables 4 and 5). Both simulated hypotheses were generally confirmed, but not in all speakers, and with varying significance, and in one case (ΔB1 for speaker BO) the very significant result was contrary to the expected one. Note that the /ə – a/ pair was better distinguished by both tests than the pair of interior vowels /ə – ᵻ/. This correlates with a much bigger difference in F1 in the first pair than in the second one (see §3.3.1).

3.3.3. Comparison of predicted and observed B1 values in all vowels

The position of the observed B1 values with regard to the curve of the B1 values predicted by Fant’s formula in (25) is visualised in Figures 7–12. The results of their statistical comparison by the Wilcoxon test are shown in Table 8 (see also Table 5s in the Supplementary Material for more detailed data).⁴

Figure 7 F1 bandwidth: Speaker AD.

Figure 8 F1 bandwidth: Speaker HO.

Figure 9 F1 bandwidth: Speaker YT.

Figure 10 F1 bandwidth: Speaker MA.

Figure 11 F1 bandwidth: Speaker AK.

Figure 12 F1 bandwidth: Speaker BO.

Hypothesis H2 in (21), formulated on the basis of cross-linguistic observations by Hess (1992) and Starwalt (2008) (see §2.3.5), generally held for the Akebu vowels in that [+ATR] vowels typically stayed below, at, or immediately above the curve, while [−ATR] vowels remained at or above the curve. However, not only were the high [+ATR] vowels above the curve for some speakers (/i/ in AD and /u/ in AD and HO), as H2 predicts, but the mid [+ATR] vowels also turned out to be above it (/e/ in AD, HO, BO and /o/ in AD). In fact, in AD and HO, two of the three youngest speakers, no vowel remained significantly below the curve, although two [+ATR] vowels in HO were at the curve (see Table 8 for statistical results).

In other speakers, for whom [+ATR] peripheral vowels generally remained at or below the curve, all central vowels stayed significantly above or, in some cases (/ᵻ/ in YT and /ǝ/ in BO), at the curve, like the [−ATR] peripheral vowels. Note that the lower the phonological height of a central vowel, the higher its observed values over the predicted curve generally stayed.
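The comparison reported in this subsection can be sketched as follows. The sketch assumes that the Wilcoxon test was run as a signed-rank test on the token-wise differences between observed and predicted B1, and it uses a plain normal approximation without tie or continuity corrections; both points are assumptions, as are the function names and the 0.05 threshold. The predicted values are taken as input, since Fant’s formula in (25) is not reproduced here.

```python
import math


def wilcoxon_z(observed, predicted):
    """Z statistic of a Wilcoxon signed-rank test on observed minus predicted values."""
    # Zero differences are dropped, as in the classic form of the test.
    diffs = [o - p for o, p in zip(observed, predicted) if o != p]
    n = len(diffs)
    if n == 0:
        raise ValueError("all differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    # Sum of ranks of the positive differences, standardised under the null hypothesis.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean_w) / sd_w


def position_relative_to_curve(observed, predicted, alpha=0.05):
    """Classify a vowel as 'above', 'at' or 'below' the curve of predicted B1 values."""
    z = wilcoxon_z(observed, predicted)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    if p_two_sided >= alpha:
        return "at"
    return "above" if z > 0 else "below"
```

Under these assumptions, a vowel counts as being "at" the curve whenever the difference between observed and predicted values is not significant, which matches the three-way wording (below, at, above) used in the text.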

4. Discussion and conclusions

4.1. The phonology of ATR in Akebu

Our field-based analysis of Akebu revealed some new phonological facts about ATR contrasts and vowel harmony in this Kwa language. This allowed us to propose a vowel system which differs both from previous descriptions of Akebu and from other known vowel systems with ATR contrasts and harmony.

Lexical contrasts showed that the Akebu inventory contains two interior (central non-low) vowels /ᵻ/ and /ə/, and, together with /a/, three central vowels in total. Such a system is rare: a recent study of 681 languages by Rolle et al. (2020) showed the ATR feature and the presence of interior vowels to be antagonistic patterns in the Macro-Sudan belt. Additionally, among the 29 languages which manifested a co-occurrence of these two patterns as phonemic in their sample, very few had more than one interior vowel. Our inspection of their dataset showed that none of the languages with more than one interior vowel showed a lack of the [±ATR] contrast in central vowels (see §1.2.3).

The main ATR harmony rules in Akebu could not provide a straightforward answer with respect to the [±ATR] specification of central vowels. The vowel /a/ combined with [−ATR] peripheral vowels, but the interior vowels combined either with /a/ as [+central], or with each other as [−peripheral], so their [±ATR] specification remained redundant. Still, various pieces of less direct evidence suggest that all central vowels are non-neutral [−ATR], contrasting with one another only in height. This evidence includes the harmonic behaviour of the central vowels as harmony targets and triggers; the parallel occurrence of the high interior vowel /ᵻ/ and the [−ATR] high peripheral vowels /ʊ, ɪ/ in non-harmonising contexts; and the shift of the latter vowels towards /ᵻ/ in vowel reduction. Typological considerations also rather supported the [−ATR] status of /a/ and the vowels it conditions in harmony (/ᵻ, ə/), and, consequently, the unmarked status of the [−ATR] value in the language, as such a situation was attested in the /2IU/ typological category to which Akebu belongs.

Additional hints of the [−ATR] status of the Akebu interior vowels can be drawn from their history. Some indications about their possible historical origins are found in the closely related language Animere. Heine (1968) postulated an eight-vowel system with two central vowels for Animere: /i, ɪ, ɛ, ə, a, u, ʊ, ɔ/, while Casali (2006b, 2008b) analyses it as having a nine-vowel inventory with a cross-height ATR contrast: /i, e, ɪ, ɛ, a, u, o, ʊ, ɔ/. The vowel /ə/, attested in Heine’s data only twice, was considered by Casali to be the result of reduction: centralisation of /ɪ/ and raising of /a/ in prefixes. As stated in §1.5, the Akebu speakers who provided our data also manifested reduction (and centralisation) in the partial reduplication of verbal roots, but Akebu reduction still needs more research. The Animere vowel /a/ is described by Casali as a harmonically neutral target vowel both in static and in dynamic harmony and exhibiting two phonetic allophones, [+ATR] and [−ATR]. However, it alternates with [+ATR] /e/ in the third person plural subject prefix (/ba-/ ~ /be-/) and triggers [−ATR] allomorphs of alternating affixes, thus manifesting the phonological [−ATR] value.

Casali (2006a, 2008b) provides a list of 200 Animere words, including a 100-word Swadesh list. In Table 9, we list several Akebu words containing interior vowels and their attested Animere cognates.

Table 9 Examples of Akebu interior vowels and their cognates in Animere.

In most of these cases, Akebu /ə/ and /ᵻ/ correspond in Animere either to [−ATR] vowels or to sequences of a vowel and a glottal stop (/ə/ = /ɪ/, /ɛ/, /aʔ/; /ᵻ/ = /ʊ/, /ɔʔ/, /iʔ/). In only one case does Akebu /ᵻ/ correspond to Animere [+ATR] /i/. It is worth mentioning that the glottal stop in Animere occurs only word-finally in certain words (both in Heine 1968 and in Casali 2008b). Esling et al. (2019: 176–177) link the development of [−ATR] vowels to the tense state of the larynx, which corresponds to vocal fold adduction, epilaryngeal constriction, larynx raising and the resulting creaky voice.

In sum, all existing comparative and internal Akebu evidence rather supports the phonological interpretation of the Akebu interior vowels as [−ATR]. In such a scenario, the Akebu vowel inventory would typologically stand alone as a unique system with a [±ATR] contrast in front and back vowels, but with a total of three central harmonically non-neutral [−ATR] vowels lacking this contrast and opposed by height.

4.2. The phonetics of ATR in Akebu

In the second part of our study, we searched for acoustic evidence which could provide some phonetic details on ATR in Akebu and shed more light on our phonological analysis of the central vowels. For the [±ATR] pairs of peripheral vowels, we checked four acoustic metrics which have been commonly associated with the [±ATR] contrast, especially in languages of Africa: the frequencies of the first and the second formant (F1 and F2), spectral slope/flatness (A1−A2) and the size of F1 bandwidth (ΔB1). In all cases, the data were normalised to render them maximally comparable across speakers and to avoid various types of collinearity. Still, as in all previous research, only a difference in F1 emerged as a consistent and robust correlate of the ATR contrast, significant for all vowel qualities in every speaker. In agreement with all previous studies, the [−ATR] vowels had higher F1 than the [+ATR] vowels.

Other metrics provided varying and partially contradictory results. In earlier studies, they were not always significant in all speakers and/or in all vowel types, and in some cases even showed significant differences in the opposite direction. Our results were entirely in line with this and confirmed that these metrics should not be considered as conclusive acoustic indicators of the [±ATR] contrast in Akebu either.

Some trends are still worth highlighting. For F2, we checked two competing hypotheses: (a) whether [+ATR] vowels would always have higher F2 values, or (b) whether [+ATR] vowels would be more peripheral than their [−ATR] counterparts. In Akebu, mid vowels were better contrasted by F2 than high vowels, and the [+ATR] mid vowels turned out to be more peripheral than the [−ATR] ones, in agreement with the second hypothesis. High vowels, in turn, rather followed the first hypothesis: the [−ATR] vowels generally had lower F2 than the [+ATR] ones.
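The two competing F2 hypotheses differ only in what they predict for back vowels, which a small Python sketch makes explicit. The labels `H1B-1` and `H1B-2` follow the hypothesis names used in §3.2.2; the function name and the `"front"`/`"back"` labels are illustrative assumptions.

```python
def expected_f2_of_plus_atr(backness, hypothesis):
    """Return whether the [+ATR] vowel is expected to have 'higher' or 'lower' F2
    than its [-ATR] counterpart under the given hypothesis."""
    if backness not in ("front", "back"):
        raise ValueError("backness must be 'front' or 'back'")
    if hypothesis == "H1B-1":
        # (a) [+ATR] vowels always have higher F2.
        return "higher"
    if hypothesis == "H1B-2":
        # (b) [+ATR] vowels are more peripheral:
        # higher F2 for front vowels, lower F2 for back vowels.
        return "higher" if backness == "front" else "lower"
    raise ValueError("unknown hypothesis")


for backness in ("front", "back"):
    for hypothesis in ("H1B-1", "H1B-2"):
        print(backness, hypothesis, expected_f2_of_plus_atr(backness, hypothesis))
```

For front vowels the two hypotheses coincide, so only the back pairs /u – ʊ/ and /o – ɔ/ can separate them, which is where the mid and high vowels of Akebu diverged.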

Mid vowels were also better distinguished by spectral slope (A1−A2) and F1 bandwidth (ΔB1) than high vowels, which was consistent with the results of a cross-linguistic study on eleven Niger-Congo languages by Starwalt (2008). The best results across our speakers for both metrics were obtained for the pair of mid back vowels [o – ɔ]. Unlike Starwalt’s results, ΔB1 did not perform better than A1−A2, however.

Still, one speaker (AK) did not show any F2 differences between the [±ATR] pairs, while another (BO) did not reveal any spectral slope differences. As discussed by Olejarczuk et al. (2019), this does not necessarily mean that speakers would rely on different acoustic cues in their perception of ATR contrasts, as some of the observed significant differences might still be too small to be perceived by the human ear. A separate study would be needed to understand the correlation between the acoustic and perceptual cues to ATR in Akebu.

It is also of interest that some peripheral higher [−ATR] vowels significantly overlapped in F1/F2 with lower [+ATR] vowels (/ɪ – e/, /ʊ – o/). Given that native speakers (and the first author) distinguish between such pairs well, there should still exist other important perceptual cues for the ATR contrasts.

The problem of F1 as the most robust acoustic correlate of ATR discovered to date comes to light especially clearly in our sub-study on the Akebu central vowels. The most consistent correlate distinguishing the three central vowels was again F1. However, F1 can be modulated by the articulatory mechanisms responsible for both ATR and vowel height (see §2.3.1). Given that similar acoustic effects can be attained by different articulatory mechanisms, the question arises to what extent acoustic data alone are enough for determining the phonological feature of ATR, as opposed to vowel height and possibly also to tenseness. Akebu data demonstrate that although F1 is the most robust measure for both the [±ATR] contrast and the vowel height contrast, it is not sufficient to capture the acoustic difference between the two features.

Our simulation testing of the ATR-related hypotheses for the pairs of central vowels showed that spectral slope and F1 bandwidth did not allow vowel height and ATR to be distinguished either (possibly because these metrics partially correlate with F1, especially ΔB1). Interestingly, also for central vowels, the consistency of results across speakers depended on vowel height: the lower pair /ǝ – a/ was much better distinguished by both metrics than the higher pair /ǝ – ᵻ/. Interior vowels also had very close and partially overlapping F1 values, which might indicate their tendency towards centralisation.

Paradoxically, the only metric which gave significantly different results for the height contrast in central vowels and the [±ATR] contrast in other vowels was F2 frequency. In the case of ATR, F2 showed partially contradictory and heterogeneous results across speakers and vowel types, but in general it did distinguish between the [±ATR] pairs. In the case of the height contrast in central vowels, no F2 differences between any vowel pair in any speaker were found. This result, however, might partially stem from the position of central vowels in the articulatory space. If there is any articulatory mechanism creating a centralising tendency which makes the acoustics of [−ATR] vowels less peripheral than that of [+ATR] vowels, this tendency might not affect central vowels, which are already in the centre of the vowel space (see also the height-opposed pairs of peripheral vowels /i – e/, /u – o/, /ɪ – ɛ/, /ʊ – ɔ/ in Figures 1–6, which often showed a centralising tendency for lower vowels). More comparative research on F2 in ATR contrasts vs. height contrasts is needed to see whether our results reflect any inherent acoustic and articulatory differences between the two features in this respect.

However, some phonetic results provided acoustic support for classifying the three central Akebu vowels as [−ATR]. First, the interior vowels /ᵻ/ and /ə/ had much higher F1 frequencies compared to their [+ATR] peripheral counterparts of the same height (/i – u/ and /e – o/, respectively; see §3.1). Central vowels are not necessarily expected to have significantly raised F1 in comparison with the corresponding peripheral vowels – see, e.g., results on /ᵻ/ and /ə/ in Malaysian Hokkien (Southern Min Chinese) in Huang et al. (2011) – although this requires further research. The fact that the Akebu interior vowels occupied approximately the same belt of F1 frequencies as their [−ATR] counterparts of the same height (/ɪ – ʊ/ and /ɛ – ɔ/, respectively) rather speaks in favour of their analysis as [−ATR].

The second argument comes from the comparison of observed F1 bandwidth values (B1_obs) against those predicted by Fant’s (1972) formula in (25); see §3.3.3. Starwalt (2008) noted that observed values of [+ATR] vowels tended to stay at, below or (for high vowels) immediately above the curve of predicted values, while the observed values of [−ATR] vowels tended to remain at or above it. In our study, this observation was rather supported across speakers and vowel types. Notably, all three central vowels remained at or above the curve, as expected for the [−ATR] vowels, in all speakers.

4.3. Outlook and conclusion

In a few comparative phonetic studies, the ATR contrast in an African language was opposed to the tenseness contrast in English vowels (Tiede 1996; Kirkham & Nance 2017). However, in these studies, the investigation of differences between [±ATR] and [±tense] primarily used articulatory methods or coupled these with acoustic ones. Olejarczuk et al. (2019: 35), the authors of the most recent acoustic study on African ATR (in Ethiopian Komo), who used many sophisticated acoustic metrics of ATR, still evaluate their results as ‘merely suggestive, not conclusive’, and call for more articulatory studies on the ATR feature. In that study, three more metrics (coded as H1*−H2*, CPP and HNR), in addition to F1, consistently distinguished between the [±ATR] vowel pairs. However, unlike F1, the results of distinguishing between [−ATR] and [+ATR] were not stable across speakers even for these three metrics (and an interaction between the speaker and the particular vowel type was not investigated there).

The acoustic metrics of ATR which were not part of our study should, of course, be tested for Akebu, and perceptual studies on various artificially manipulated ATR correlates should also be conducted. Likewise, the phonological analysis of Akebu vowels in terms of distinctive vowel features might be adjusted once all the particularities and irregularities of the vowel harmony system and the phonetic details of vowel reduction at the left and right word edges are determined (such research is in progress in our project).

However, it seems that all known acoustic correlates of ATR, apart from the F1 frequency, are still not consistent across speakers, and that F1 alone does not allow for distinguishing between ATR and vowel height or tenseness. The case of such a rare vowel system as that of Akebu, which likely manifests the [±ATR] contrast in peripheral vowels but not in the three central ones, makes the need for an articulatory study in support of the proposed phonological analysis especially clear. It might be that articulatory data can provide the only entirely reliable empirical evidence in such a case.

Supplementary material

The supplementary materials include the following data:

1. Table 1s represents the full qualitative and quantitative structure of the dataset, which contains 1,980 vowel tokens (30 productions per 11 vowels per 6 speakers). Column A contains the wordforms with the notation of underlying tonemes, whereas column B shows the surface realisation of these tonemes. Columns C and D contain the glosses and translations of the wordforms, respectively. Abbreviations used in glosses are given below the table. Column E shows the particular vowel(s) within the wordform which were analysed. In cases when only one of the two identical vowels of a wordform was studied, the order of the studied vowel is mentioned in parentheses (e.g., ‘ʊ (1st)’), and this vowel is additionally highlighted in bold in the wordform in column A. Columns F to K provide the number of tokens for each studied vowel in each of the six speakers.
2. Table 2s presents the statistical details on linear mixed regression modelling regarding the influence of the vowel type on each of the studied variables (F1, F2, A1−A2, ΔB1) in each speaker. Each model looked like this: [a studied variable] ~ V + (1|word), data = [a subset on a given speaker].
3. Table 3s shows the detailed results of post-hoc pairwise comparisons run on the models from Table 2s across the four peripheral [±ATR] pairs /ɪ – i/, /ɛ – e/, /ʊ – u/, /ɔ – o/ within each speaker for each studied ATR correlate, respectively. The abridged version of these results is provided in Tables 2–5 in the main text (see §3.2). Significance codes for p-values throughout are as follows: *** indicates $p<0.001$, ** $p<0.01$ and * $p<0.05$; when $p>0.05$, the cell is left blank. An exclamation point (!) appended to the significance code indicates that the value for a [−ATR] vowel was lower than the value for the corresponding [+ATR] vowel; the absence of this mark signifies the opposite relation.
4. Table 4s shows the simulation of the ATR metrics (F1, F2, A1−A2, ΔB1) for central vowels (see the description of Table 3s above). The shortened version of these results is explicated in §§3.3.1 and 3.3.2 (including Tables 6 and 7) in the main text.
5. Table 5s explicates the detailed results of the statistical comparison by the Wilcoxon test between the observed B1 values of vowels and those predicted by Fant’s formula (see (25) in the main text) in each speaker. N refers to the number of tokens, and Z to the results of the Wilcoxon test. The abridged version of this comparison is provided in Table 8 of the main text, and the visualisation of this comparison is given in Figures 7–12 (§3.3.3).
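The significance codes described for Table 3s can be expressed as a small Python helper. This is a sketch of the notation only: the function and parameter names are hypothetical, and the treatment of a p-value of exactly 0.05 as a blank cell is an assumption, since the text specifies only $p<0.05$ and $p>0.05$.

```python
def significance_code(p_value, minus_atr_lower=False):
    """Map a p-value to the star notation used in the tables.

    minus_atr_lower: True if the value for the [-ATR] vowel was lower than
    the value for the corresponding [+ATR] vowel (marked with '!').
    """
    if p_value < 0.001:
        stars = "***"
    elif p_value < 0.01:
        stars = "**"
    elif p_value < 0.05:
        stars = "*"
    else:
        # Non-significant cells are left blank, with no direction mark.
        return ""
    return stars + ("!" if minus_atr_lower else "")


print(significance_code(0.0004))        # ***
print(significance_code(0.03, True))    # *!
print(repr(significance_code(0.2)))     # ''
```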

Acknowledgements

We express our gratitude to all our Akebu consultants. We are also grateful to the editors-in-chief, an associate editor and three anonymous reviewers for their valuable feedback, and to Cormac Anderson for proofreading and comments.

Author contributions

NM: design and background of the study, data collection and processing, first draft, statistical analysis and visualisation (§§3.1 and 3.3.3). General responsibility for §§2 and 3.
NK: final text writing and editing, responses to reviewers, corresponding author, statistical analysis and visualisation (§§2.5, 3.2 and 3.3.2). General responsibility for §§1 and 4.

Competing interests

The authors declare no competing interests.

Footnotes

1 The measurements and scripts used in this study are available at https://osf.io/wtxfn/.

2 If the formant values yielded a standard deviation of greater than 50 Hz for F1 and/or greater than 100 Hz for F2, then both formants were assigned a zero value. This procedure is not intended to remove outliers, but to avoid false optimal ceilings for the unstable formant values which result from a set of highly dispersed values; see Escudero et al. (2009).

3 Not enough tokens of /a/ from speaker MA were available to support statistical testing of the contrast with /ə/; see footnote 4.

4 An exclamation point appended to the significance code means that B1_obs < B1_pred. For speaker MA’s production of the vowel /a/, only two valid tokens <300 Hz were available for analysis after removing all unreliable values (see §2.4.1), so the statistical tests were not conducted in this case. However, an approximate placement of the F1 bandwidth of MA’s /a/ on the basis of these two tokens is included in Figure 10.

References

Adjeoda, Dzifa (2008). Eléments de morphosyntaxe du kebu, langue dite résiduelle du Togo. Master’s thesis, Université de Lomé.

Allen, Blake, Pulleyblank, Douglas & Ajíbóyè, Ọládiípọ̀ (2013). Articulatory mapping of Yoruba vowels: an ultrasound study. Phonology 30. 183–210.

Amoua, Kwamivi (2011). Le système nominal du kebu. Master’s thesis, Université de Lomé.

Archangeli, Diana & Pulleyblank, Douglas (2002). Kinande vowel harmony: domains, grounded conditions and one-sided alignment. Phonology 19. 139–188.

Baayen, Rolf Harald, Davidson, Doug J. & Bates, Douglas M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59. 390–412.

Bates, Douglas M., Mächler, Martin, Bolker, Ben & Walker, Steve (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67. 48 pp.

Blench, Roger (2009). Do the Ghana-Togo Mountain languages constitute a genetic group? Journal of West African Languages 36. 19–35.

Boersma, Paul & Weenink, David (2017). Praat: doing phonetics by computer. Computer program. Version 6.0.33. http://www.praat.org/.

Bolker, Benjamin M. (2015). Linear and generalized linear mixed models. In Fox, Gordon A., Negrete-Yankelevich, Simoneta & Sosa, Vinicio J. (eds.) Ecological statistics: contemporary theory and application. Oxford: Oxford University Press. 309–333.

Casali, Roderic F. (2003). [ATR] value asymmetries and underlying vowel inventory structure in Niger-Congo and Nilo-Saharan. Linguistic Typology 7. 307–382.

Casali, Roderic F. (2006a). Animere phonetic word list. Ms, SIL International. Available (May 2023) at https://www.rogerblench.info/Language/Niger-Congo/GTML%20Website/GTMLanguagepages/Animere/Animere%20word%20list.pdf.

Casali, Roderic F. (2006b). Preliminary observations on the phonology and noun class system of Animere. Paper presented at the 37th Annual Conference on African Linguistics, University of Oregon.

Casali, Roderic F. (2008a). ATR harmony in African languages. Language and Linguistics Compass 2. 496–549.

Casali, Roderic F. (2008b). A phonological overview of Animere. Ms, Canada Institute of Linguistics.

Casali, Roderic F. (2016). Some inventory-related asymmetries in the patterning of tongue root harmony systems. Studies in African Linguistics 45. 96–99.

Djitovi, Afi (2003). English and Akebu phonologies: a comparative analysis. Master’s thesis, Université de Lomé.

Eberhard, David M., Simons, Gary F. & Fennig, Charles D. (2019). Ethnologue: languages of the world. 22nd edition. Dallas, TX: SIL International. Online version: http://www.ethnologue.com.

Edmondson, Jerold A. (2009). Correspondences between articulation and acoustics for the feature [ATR]: the case of two Tibeto-Burman languages and two African languages. In Fant, Gunnar, Fujisaki, Hiroya & Shen, Jiaxuan (eds.) Frontiers in phonetics and speech science. Beijing: Commercial Press. 179–189.

Edmondson, Jerold A. & Esling, John H. (2006a). The laryngeal basis for the feature [ATR]. Paper presented at the 37th Annual Conference on African Linguistics, University of Oregon.

Edmondson, Jerold A. & Esling, John H. (2006b). The valves of the throat and their functioning in tone, vocal register and stress: laryngoscopic case studies. Phonology 23. 157–191.

Escudero, Paola, Boersma, Paul, Rauber, Andréia Schurt & Bion, Ricardo A. H. (2009). A cross-dialect acoustic description of vowels: Brazilian and European Portuguese. JASA 126. 1379–1393.

Esling, John H., Moisik, Scott R., Benner, Allison & Crevier-Buchman, Lise (2019). Voice quality: the laryngeal articulator model. Cambridge: Cambridge University Press.

Fant, Gunnar (1972). Vocal tract wall effects, losses and resonance bandwidths. Speech Transmission Laboratory Quarterly Progress and Status Report 13. 28–52.

Fulop, Sean A. (2011). Speech spectrum analysis. Signals and Communication Technology. Berlin: Springer.

Fulop, Sean A., Kari, Ethelbert & Ladefoged, Peter (1998). An acoustic study of the tongue root contrast in Degema vowels. Phonetica 55. 80–98.

Gblem-Poidi, Honorine Massanvi & Kantchoa, Laré (2012). Les langues du Togo: état de la recherche et perspectives. Paris: L’Harmattan.

Grawunder, Sven (2010). formant_fba_intv_1c.praat. Praat script. Available (May 2023) at https://github.com/sgraw/praatscripts/blob/main/formant_fba_intv_1c.praat.

Green, Christopher R. (2015). The foot domain in Bambara. Lg 91. e1–e26.

Guion, Susan G., Post, Mark W. & Payne, Doris L. (2004). Phonetic correlates of tongue root vowel contrasts in Maa. JPh 32. 517–542.

Heine, Bernd (1968). Die Verbreitung und Gliederung der Togorestsprachen. Berlin: Dietrich Reimer.

Heine, Bernd (2017). Some reflections on genetic relationship in a group of West African Niger-Congo languages. STUF – Language Typology and Universals 70. 273–281.CrossRef Google Scholar

Hess, Susan (1992). Assimilatory effects in a vowel harmony system: an acoustic analysis of advanced tongue root in Akan. JPh 20. 475–492.Google Scholar

Huang, Ting, Chang, Yueh-chin & Hsieh, Feng-fan (2011). An acoustic analysis of central vowels in Malaysian Hokkien. In Lee, Wai-Sum & Zee, Eric (eds.) Proceedings of the 17th International Congress of Phonetic Sciences. Hong Kong: City University of Hong Kong. 914–917.Google Scholar

Hudu, Fusheini (2014). [ATR] feature involves a distinct tongue root articulation: evidence from ultrasound imaging. Lingua 143. 36–51.CrossRef Google Scholar

Ivanova, Margarita (2021). Akusticheskije korrel’aty priznaka prodvinutosti/otodvinutosti korn’a jazyka (

$\pm$ ATR) u glasnyh v jazyke igbo [Acoustic correlates of the advanced vs. retracted tongue root feature in Igbo]. Language in Africa 2. 45–79.CrossRef Google Scholar

Kenstowicz, Michael & Kisseberth, Charles (1977). Topics in phonological theory. New York: Academic Press.Google Scholar

Kiparsky, Paul (1982). Explanation in phonology. Dordrecht: Foris.CrossRef Google Scholar

Kiparsky, Paul (2018). Formal and empirical issues in phonological typology. In Hyman, Larry M. & Plank, Frans (eds.) Phonological typology. Berlin: De Gruyter Mouton. 54–106.CrossRef Google Scholar

Kirkham, Sam & Nance, Claire (2017). An acoustic-articulatory study of bilingual vowel production: advanced tongue root vowels in Twi and tense/lax vowels in Ghanaian English. JPh 62. 65–81.Google Scholar

Koffi, Yao (1984). Sprachkontakt und Kulturkontakt: eine Untersuchung zur Mehrsprachigkeit bei den Akebu in Togo. PhD dissertation, Universität des Saarlandes.Google Scholar

Kuznetsova, Alexandra, Brockhoff, Per B. & Christensen, Rune H. B. (2017). lmerTest package: tests in linear mixed effects models. Journal of Statistical Software 82. 26 pp.CrossRef Google Scholar

Ladefoged, Peter, DeClerck, Joseph, Lindau, Mona & Papcun, George (1972). An auditory-motor theory of speech production. UCLA Working Papers in Phonetics 22. 48–75.Google Scholar

Ladefoged, Peter & Maddieson, Ian (1996). The sounds of the world’s languages. Oxford: Blackwell.Google Scholar

Lahiri, Aditi (2018). Predicting universal phonological contrasts. In Hyman, Larry M. & Plank, Frans (eds.) Phonological typology. Berlin: De Gruyter Mouton. 229–272.CrossRef Google Scholar

Lindau, Mona (1975). Features for vowels. Los Angeles, CA: University of California, Los Angeles.Google Scholar

Lindau, Mona (1978). Vowel features. Lg 54. 541–563.Google Scholar

Lindau, Mona (1979). The feature expanded. JPh 7. 163–176.Google Scholar

Lindau, Mona, Jacobson, Leon & Ladefoged, Peter (1972). The feature advanced tongue root. UCLA Working Papers in Phonetics 22. 76–94.Google Scholar

M’boma, Komlavi Malambo (2012). Précis d’écriture de la langue akébou dans le sillage de la construction linguistique. PhD dissertation, Evangel Christian University.Google Scholar

Makeeva, Nadezhda (2022a). Floating low tone and consonant voicing in Akebu. Language in Africa 3. 272–288.CrossRef Google Scholar

Makeeva, Nadezhda (2022b). Fonetičeskije i fonologičeskije svojstva priznaka prodvinutosti korn’a jazyka v jazykah Afriki [Phonetic and phonological properties of the advanced tongue root feature in African languages]. Voprosy jazykoznanija 1. 120–150.CrossRef Google Scholar

Makeeva, Nadezhda & Shluinsky, Andrey (2018). Noun classes and class agreement in Akebu. Journal of West African Languages 45. 1–26.Google Scholar

McCarthy, John J. (1988). Feature geometry and dependency: a review. Phonetica 45. 84–108.CrossRef Google Scholar

Nearey, Terrance Michael (1978). Phonetic feature systems for vowels. Bloomington, IN: IULC.Google Scholar

Olejarczuk, Paul, Otero, Manuel A. & Baese-Berk, Melissa M. (2019). Acoustic correlates of anticipatory and progressive [ATR] harmony processes in Ethiopian Komo. JPh 74. 18–41.

Painter, Colin (1973). Cineradiographic data on the feature ‘covered’ in Twi vowel harmony. Phonetica 28. 97–120.

Przezdziecki, Marek A. (2005). Vowel harmony and coarticulation in three dialects of Yoruba: phonetics determining phonology. PhD dissertation, Cornell University.

R Core Team (2020). R: a language and environment for statistical computing. Computer program. Version 4.0.0. https://www.R-project.org.

Remijsen, Bert, Ayoker, Otto G. & Mills, Timothy (2011). Shilluk. JIPA 41. 111–125.

Rolle, Nicholas, Lionnet, Florian & Faytak, Matthew (2020). Areal patterns in the vowel systems of the Macro-Sudan belt. Linguistic Typology 24. 113–179.

Rose, Sharon (2017). ATR vowel harmony: new patterns and diagnostics. In Gallagher, Gillian, Gouskova, Maria & Yin, Sora Heng (eds.) Proceedings of the 2017 Annual Meeting on Phonology. Washington, DC: Linguistic Society of America. 12 pp.

Shluinsky, Andrey (2020). Reduction of noun class suffixes in Akebu. Language in Africa 1. 65–88.

Shluinsky, Andrey (2022). Incorporation as a nominal attribute strategy in Akebu. Language Sciences 93. Article 101484.

Sossoukpe, Jacques (2012). Proposition d’orthographe du kǝkpǝǝ-kǝ (Akébou). Lomé: SIL Togo.

Sossoukpe, Jacques (2017). Effet voisant du ton bas flottant sur les obstruantes en akebou. In Ahoua, Firmin & Elugbe, Benjamin Ohi (eds.) Typologie et documentation des langues en Afrique de l’Ouest: les actes du 27e Congrès de la Société de Linguistique de l’Afrique de l’Ouest (SLAO). Paris: L’Harmattan. 139–146.

Sossoukpe, Marthe (2014). Kә lә kә, kә la ŋʊrʊ kәkpәә-kә: guide pour aider les scolarisés en français à lire et écrire l’akébou: (guide de transition français - akébou). Lomé: SIL Togo.

Stanley, Joey (2018). Making vowel plots in R (part 1). Blog post. https://joeystanley.com/blog/making-vowel-plots-in-r-part-1.

Starwalt, Coleen G. A. (2008). The acoustic correlates of ATR harmony in seven- and nine-vowel African languages: a phonetic inquiry into phonological structure. PhD dissertation, University of Texas, Arlington.

Stewart, J. M. (1967). Tongue root position in Akan vowel harmony. Phonetica 16. 185–204.

Storch, Anne & Koffi, Yao (2000). Noun classes and consonant alternation in Akebu (Kә̀gbә̀rә̄kә). In Meißner, Antje & Storch, Anne (eds.) Nominal classification in African languages. Köln: Rüdiger Köppe. 79–98.