1 Introduction
The symbol r is common across linguistic descriptions, including the phonetic and phonemic inventories of language grammars. However, we suggest that it is prone to misinterpretation because it is commonly and implicitly used as a generic symbol rather than as a specific one representing the alveolar trill /r/. As a consequence, many practitioners (within and outside linguistics) take for granted that alveolar trills are common cross-linguistically (e.g. Winter et al. 2022). For example, about 42% (1332/3182) of the inventories in the PHOIBLE database (Moran & McCloy 2019) are reported as containing a trill. This proportion climbs to about 57% (359/629) of the inventories in PBASE (https://pbase.phon.chass.ncsu.edu/query). It is about 29% (183/625) in LAPSyD (Maddieson et al. 2013), which is arguably a more conservative database. In this paper, we show that the widespread impression that the alveolar trill is rather frequent cross-linguistically should be revisited, by carefully studying the use of r in the description of more than 200 linguistic varieties. Our results suggest that r is commonly used to denote a phonemic default rhotic, while the actual [r] (the alveolar trill) production is much less common than that of another possible allophone, namely the apical tap [ɾ].
While a trill is a sound that results from the vibration of a mobile articulator (most commonly, the tip of the tongue) during a continuous airstream, the tap is a sound that results from a short contact of the tip of the tongue with the alveolar ridge. Thus, trilling usually involves at least several short contacts (although one-contact trills are sometimes reported as well as approximants), while tapping only involves a single short contact (see the discussion in Section 2.2).
In this paper, which, we emphasize, is not an articulatory or an acoustic study of rhotics, we report the results of quantitative and qualitative analyses based on a systematic review of the Illustrations of the IPA published in the peer-reviewed Journal of the International Phonetic Association (JIPA). We also highlight the way in which the various (implicit or explicit) practices of language description may influence inferences concerning the cross-linguistic occurrence of trills (both phonetically and phonemically), as well as their nature: in other words, this paper addresses the question of how the symbol r and the speech sound described as a trill are related.
We argue here that the widespread perception that trills are cross-linguistically common results, in part, from the ambiguity between the grapheme <r> and the symbol used in the International Phonetic Alphabet (IPA) to represent the voiced alveolar trill [r], leading to the artificial over-representation of the phoneme /r/ as the default segment associated with the grapheme <r>.
Although the awareness that the perceived abundance of trilling might be artificially inflated is far from new (Whitley 2003), our contribution here provides quantitative and qualitative support based on the systematic assessment of a large sample of diverse languages. For example, Lindau (1985: 161) stated that trilling is not as common as it ‘might be expected from descriptions of languages’, and Ladefoged, Cochran & Disner (1977: 46) said that ‘most languages do not trill any articulator’. While reporting, based on UPSID, that 36.4% of languages have a trill, Maddieson (1984: 89 endnote 2) was well aware that either the UPSID data suggest ‘that trills are not in fact particularly rare or that very many erroneous reports of trills occur in the literature’. Since then, many studies have focused on detailed aspects of the articulation, acoustics, aerodynamics and acquisition of the alveolar trill (Solé 1998, 2002; Recasens & Pallarès 1999; Boyce, Hamilton & Rivera-Campos 2016). While these studies usually highlighted the alveolar trill’s complexity at multiple levels, they have rarely adopted a broad cross-linguistic perspective, potentially underestimating the variation in trills among languages. This might be especially the case for languages that are less well studied than, for example, Spanish or Dutch. Here we shed some light on these issues by focusing on a very rich and diverse dataset represented by all the articles published under the heading ‘Illustrations of the IPA’.
Before delving into the details, we need to clarify several terms, as defined by the International Phonetic Association itself. In its Handbook of the International Phonetic Association (IPA 1999: 27), the association defines a phoneme ‘as an element in an abstract linguistic system … which has to be realized in the physical world by an acoustic signal produced by vocal activity’, and an allophone of a phoneme as one of its variant realizations. Rhotics are phonemes that can have several allophones. It is important to note that the definitions of the ‘phoneme’ and ‘allophone’ concepts we use here do not necessarily reflect the current debates surrounding them. We refer the interested reader to Ladd (2009, 2011) for an overview of current discussion concerning the nature of the phoneme and two assumptions of the International Phonetic Association, namely (i) segmental idealization and (ii) the universal categorization assumption. However, we have explicitly chosen to adopt here a point of view broadly based on these two assumptions so as to remain consistent with the view that the JIPA is the ‘gold standard’ in what concerns the use of written symbols for language description. The information contained therein may be used as a reference for typologists (especially those working with quantitative data), field linguists (especially in what concerns the meaning of symbols used for phones and phonemes in grammars), and phoneticians (e.g. concerning the production and perception of trilling in different languages). We therefore hope that our analyses here will help raise awareness of the complexity surrounding the ‘nature’ and distribution of the alveolar trill.
The paper is structured as follows. We first present an overview of past conventions for symbols representing ‘r-like’ sounds as used in two transcription traditions: the Americanist Transcription System and the International Phonetic Association. We then describe trills and taps from phonetic and cross-linguistic perspectives. These are followed by a detailed description of our dataset (the Illustrations of the IPA, hereafter Illustrations), the data collection strategy, the coding and counting of the trills and taps, and the description of the two types of narrative transcription (phonetic and phonemic) in Illustrations that were used to analyze the use of r and ɾ. We then present the general results of our data collection with regard to the year of publication, the available informant characteristics, as well as the macroarea, language family and speaker population size of the different varieties covered. We further present the results of our main study on the use of r and ɾ in Illustrations, showing that there is more variability in the transcriptions than expected given the existing literature and ‘received wisdom’. We also show that /r/ is more frequent in transcriptions compared to [r], while /ɾ/ is conversely less frequent in transcriptions compared to [ɾ]. Finally, in Section 5, ‘Discussion and conclusions’, we suggest that care should be taken when taking the symbol r at face value in grammars, databases and other linguistic resources without further clarification.
As a corollary, our findings might signal the need to re-evaluate already published research concerning the frequency of the alveolar trill within and between languages (Maddieson 1984), its areal and genealogical patterning, and the forces influencing these patterns (Moran, Lester & Grossman 2021), its acquisition (McLeod & Crowe 2018, Stemberger & Bernhardt 2018) and its extra-linguistic associations (Winter et al. 2022), among others. However, we emphasize that our work should not be taken as negative, but as a positive, constructive contribution to the establishment of clearer transcription guidelines, ensuring better consistency between large cross-linguistic databases, and promoting the use of statistical methods that better handle the ambiguity of most existing resources.
2 What are trills and taps?
2.1 Symbols for ‘r-like’ sounds in the context of transcription traditions
A preliminary analysis shows that the choice of symbol(s) for ‘r-like’ sounds, and of r in particular, in various transcription traditions is far from trivial and uncontroversial.
The Americanist Transcription System (ATS) of 1916 gives the following instructions: ‘All rolled consonants (r-sounds), whether markedly trilled or not, are to be indicated by r or r-like characters’ (Boas et al. 1916: 13). The ATS used the term ‘intermediate stop consonants’, which might be interpreted as taps, using the (small) capital grapheme to represent them (the lower-case d being reserved for the voiced apical stop).Footnote 1 Notably, the tap is considered an allophone of the plosives /t d/, and is not represented by an r-based segment as in the Europeanist tradition. In the latter, the symbol ɾ is found to represent taps both as allophones of rhotics and as allophones of plosives. In 1949, the Europeanist tradition, in its Principles of the International Phonetic Association (IPA 1949), notes for r:
r. Rolled r as in Scottish English, Italian, Spanish, Russian. The letter is also used whenever possible to denote flapped r ( ), fricative r ( ) [sic], lingual frictionless continuant r ( ), uvular rolled r ( ), uvular fricative r ( ) [sic] or the uvular frictionless continuant ( ). (IPA 1949: Section 26)
There are also some clarification notes there:
(d) The letter may denote the fully rolled sound with two or more flaps of the uvula or the single-flap sound. In a language containing the two sounds as separate phonemes, the notation is recommended for the fully rolled sound.
(e) The letter r may, when convenient, replace , or in the transcription of a language containing one of these three sounds but not a rolled lingual r.
(f) The flapped sound can generally be represented by r. In a language such as Spanish, where a single flap and the fully rolled sound occur as separate phonemes, the notation rr is recommended for the fully rolled sound.
(IPA 1949: Section 27)
Thus, the symbol r can be used to represent no fewer than seven different phones with different manners and places of articulation (diacritics were not yet a norm at that time, so as to represent a more open fricative or approximant, with the fricative being next to and , and s and z in the 1949 table). The convention to duplicate the symbol does not seem clearly specified, but is left to the authors’ interpretation. The use of a single symbol for these phones may be based on the idea that these sounds really belong to a single phonological class, called rhotics, which has been widely used in the literature (Barry 1997, Scobbie 2006, Magnuson 2007, Chabot 2019) but without reaching a universal and consensual definition. While reflecting a certain theoretical stance, this ‘homonymy’ can lead to misunderstandings and to a potential bias favoring the interpretation of trill (or rolled) r as the typical and most common phoneme represented by the symbol r despite a more nuanced situation.
The 1999 edition of the Handbook of the International Phonetic Association (IPA 1999) says:
Trills are sounds like [r] in Spanish perro ‘dog’ in which the air is repeatedly interrupted by an articulator (in this case the tongue tip) vibrating in an airstream. A very short contact, similar in duration to one cycle of the vibration of a trill, is called a tap, such as the [ɾ] in Spanish pero ‘but’. (IPA 1999: 8)
Note: most forms of English, French, German, Swedish do not have trills except in over-articulated speech, for instance when trying to be clear over a poor telephone line. (IPA 1999: 19)
In this edition, there is no explicit guideline on using r as a generic symbol to denote different phones, but there is no indication to avoid this practice, either.
In conclusion, the IPA permits a generic use of the r symbol that blurs what was actually analyzed: while it conveys the broad idea that this is a rhotic, it is left to the author(s) to somehow clarify the place(s) and manner(s) used. This imprecision, unfortunately, can be quite misleading for typological, comparative work, especially when large databases and quantitative methods are used (but even more focused, manual comparisons might be affected as well).
2.2 ‘Trills’ and ‘taps’
The phone alveolar trill is usually described in terms of articulation, acoustics and aerodynamics, and most of the authors refer to Spanish (or other Romance languages such as Catalan; Recasens 1991) for its characterization. In the Handbook of the International Phonetic Association (IPA 1999), the trill is described articulatorily, and the tap is described as similar to the trill but with one cycle only (using Spanish as reference).Footnote 2 One can wonder whether trills are produced in the same way across all languages that use them, and not just in those Romance examples. While not explicitly addressed, we may expect that [r] should refer to roughly the same speech sound in languages across the world.
On the phonemic level, it is cross-linguistically infrequent to find contrasts between two trills with different places of articulation (Ladefoged et al. 1977, Maddieson 1984), but some languages, such as Russian (Yanushevskaya & Bunčić 2015) and Toda (Spajić, Ladefoged & Bhaskararao 1996), are reported to contrast an alveolar trill without secondary articulation and a palatalized alveolar trill. From an articulatory point of view, ‘trills can be produced with any of the three : the lips, the tongue, the uvula’ (Ladefoged et al. 1977: 49; emphasis added by the present authors), but only the last two are usually considered to belong to the rhotic class, as r-like segments, written with an <r> in the Latin-based alphabet. The lingual trill can be laminal or apical: for example, in the Illustrations of the IPA, laminal trills (also called fricative trills there) can be found in Mbarrumbathama (Verstraete 2019) and in Czech (Dankovičová 1997, Šimáčková, Podlipský & Chládková 2012). Apical trills are found, for example, in Spanish, and are generally assumed to be the prototypical trills; they can be linguolabial, dental, alveolar or postalveolar, but linguists, in general, only consider the dental and the alveolar places (Ladefoged et al. 1977). Solé’s (1998, 2002) studies of the aerodynamics of the trilling action made with the tip of the tongue in Spanish suggest that the initiation phase is critical:
The conditions for initiating lingual trilling involve (i) muscle contraction of the tongue to assume the position, shape and elasticity requirements and (ii) a sufficient pressure difference across the lingual constriction. Once trilling is initiated, tongue-tip vibration is maintained as a self-sustaining vibratory system. Articulatorily, trills exhibit more predorsum retraction than taps, thus leaving more room for the vertical movements of the tongue tip and blade, and more retracted alveolar closure. In addition, the tongue body is more highly constrained for the trill than for the tap and the former articulates less with neighboring vowels. (Solé 1998: 404)
Two parts of the tongue are involved in the alveolar trill: the tip of the tongue, and the root of the tongue. Phonetic trills are produced with the tongue body more constrained and the predorsum more lowered than for taps (Recasens & Pallarès 1999). The predorsum retraction leading to a pharyngeal constriction may be a characteristic of some rhotic segments including at least the trill (Boyce et al. 2016) and could explain the late acquisition of trills observed cross-linguistically (McLeod & Crowe 2018).
Phonemic trills are no exception when it comes to allophony. Solé (1998: 412) notes that in Spanish, taps, approximants and fricatives occur as allophones, and there is variation between dialects (Penny 2000). Variation in production is found in other well-described languages. Rennicke (2015: 239 Figure 8.1) observes that in Brazilian Portuguese (which contrasts two rhotics), the rhotic allophonic family can be very large, as judged from productions ([ ]). Sebregts (2014) reports twenty possible variants for Dutch and highlights the fact that taps, which are highly frequent, are not ‘failed trills or even successful trills with a single contact’ but intended as such. He favors the hypothesis that taps might historically originate from trills without excluding the possibility of a ‘reverse directionality’ (Sebregts 2014: 179). The articulatory reduction and the reinterpretation of the allophonic alveolar trill can result in the emergence of allophones; for example, the change of place of articulation from alveolar to uvular, or the emergence of fricatives as ‘“failing” trills’ (Solé 1998: 412–413). The interactions and relationships between different variants of the same phoneme /r/ are also studied, for example for Greek (Baltazani & Nicolaidis 2013), for Dutch (Sebregts 2014), or for South-Tyrol Italian (Spreafico & Vietti 2016), in an attempt to understand the linguistic and social factors that may play a role in the variation found on the surface, and the place of the trill [r] in this allophone network.
Some authors suggest that, from an acoustic point of view (which is, to state the obvious, not necessarily the same as an articulatory point of view), a trill is basically a sequence (or series) of taps and, conversely, that a tap is a trill with a single period (Lindau 1985, Ladefoged & Maddieson 1996). However, there does not seem to be a consensus on the number of periods that an allophonic trill should have, varying from two or three closures up to between six and eight, and it may depend not only on the variety, but also on the context and the speaker (Henriksen & Willis 2010). Moreover, the aforementioned references suggest that in a linguistic system one can observe a coexistence of a single-closure allophonic trill with a genuine tap, leading to another source of confusion. The phonetic alphabet does not capture these fine differences in the number of closures in part because there are no conventions on how to represent them.
One can note several conditioning factors that affect the distribution of the allophones mentioned above. The most important are probably the phonetic/phonological environment and the broader context (i.e. the discourse settings and the communicative intentions). The phonetic and phonological environment includes the position within the syllable and the word, with intervocalic and final positions favoring taps, while word- and syllable-initial positions favor trills; trills are also more frequent in stressed syllables (Eades & Hajek 2006, Iskarous & Kavitskaya 2010, Mooney 2014). There is indeed an association between stress and the increase in subglottal pressure (Lieberman 1966), with a minimum difference of two to three centimeters H2O for voiced trills across the glottis (Solé 2002). The wider context most frequently refers to careful or emphatic speech, and to the use of citation forms, all favoring the production of trills (Breen & Dobson 2005, Baker 2016). Inter-speaker variation in the production of allophonic trills and taps is also somewhat overlooked despite being widely accepted: ‘taps are not produced in the same way in different languages nor are they always produced in the same way by different speakers of the same language’ (Lindau 1985: 161). This ambiguous recognition status is probably related to a lack of studies addressing this type of variation. When considering the phonemic trill, an additional (sociolinguistic) source of inter-speaker variation is that individuals who fail to master the trill in languages that normatively use it are usually considered as having a speech disorder (Romano 2013), and may undergo speech therapy. Such individuals may systematically use sounds that are similar to the target trill but might be easier to articulate (e.g. alveolar approximants), but these substitution patterns seem to differ across languages and speakers (e.g. the allophonic palatal approximant [j] studied in Alfwaress et al. 2015 for native Arabic speakers).
2.3 The cross-linguistic occurrence of ‘trills’
For several reasons, which we detail below, we concur with the following assessment in Lindau (1985: 161):
An actual trill realization of an /r/ is not as common as might be expected from descriptions of languages, where an /r/ is often labeled as a ‘trill’. Even in languages where a possible realization is a trill, not all the speakers use a trill, and the speakers that do, have a tap and approximant allophones as well as the trill.
In fact, the orthography used by the IPA – based on the Latin alphabet extended with additional symbols – probably had a huge impact: some symbols from the extended set might appear ‘stranger’ than others, encouraging the preponderant use of the ‘basic’ Latin IPA characters. It seems that some authors may have a preference towards the use of the symbol r even if the actual sound is realized with one closure only, while others use the same symbol to report all ‘r-like’ sounds (r in some cases referring to speech sounds made at the uvular place of articulation). The historical development of the IPA itself may have contributed to this situation, as there was no symbol for the tap before 1908, with ɹ being used for the ‘untrilled lingual r’ (the alveolar approximant). Even after its proposal in 1908, the symbol ɾ had to wait until 1928 before appearing in Le Maître Phonétique, in the IPA chart included in the reprint of Daniel Jones’s article ‘Das System der Association Phonétique Internationale’ (Weltlautschriftverein; Jones 1928).
While in some languages, such as Spanish and Catalan, the presence of phonemic trills and taps has been intensely studied, for many others this is still not the case. This is further complicated by contextual, dialectal, sociolectal and idiolectal variation, as well as by language contact and ‘rarity’: for example, in Bearnais (Gascon), a standard French [ʁ] might be produced as an allophone of the apical rhotics /r/ and /ɾ/ (Mooney 2014); and in Isthmus (Juchitán) Zapotec, an alveolar trill is present in ‘less than a half dozen words’ (Pickett, Villalobos Villalobos & Marlett 2010: 366). In some language varieties, an alveolar trill allophone of the phonemic rhotic may be stigmatized, as in, for example, Japanese (Vance 2008, Labrune 2012, Ooigawa 2015) and French (Premat & Boula De Mareüil 2018). The allophonic tap can also be a source of stigmatization, as in Cagliari Sardinian, where it is a ‘stigmatized’ allophone of the alveolar plosives but not of the intervocalic /r/ (Mereu 2020). Scottish English is a good illustration of the discrepancy between reality and a metalinguistic representation biased in favor of trills: it turns out that trills are not common in Scottish English. Lawson, Scobbie & Stuart-Smith (2014: 57) report that ‘many speakers, when questioned, will say that a typical Scottish /r/ is a trilled /r/, even though this is rarely the case nowadays’, and many recent studies systematically show the rarity of trilling in Scottish English (Pukli 2006, Stuart-Smith, Lawson & Scobbie 2014, Jauriberry 2016).
The large variation observed in the phonemic trill may be in part explained by the fact that allophonic trills are articulatorily more complex than allophonic taps (the latter being acquired earlier than the former – McLeod & Crowe 2018) and require more energy and precise control of the parameters involved in their aerodynamics, which can lead to trilling failure.
Therefore, it seems that more than a century after the publication of the first international guidelines on phonetic transcription, the way the symbol r is used throughout the linguistic community is still neither consistent nor universally shared. We argue here that this situation is highly detrimental and significantly hinders the capacity to develop a typology of the [r] sound and of the rhotics from a phonetic perspective. Answering questions such as whether or not [r] is rare becomes unnecessarily complicated, and querying databases needs extra care to identify and control for the noise induced by the (sometimes implicit) use of different guidelines and expectations concerning the way sounds should be represented.
3 Data and method
To address the problem with the generic use of r, we conducted a full analysis of all the articles published in the peer-reviewed Journal of the International Phonetic Association in the collection Illustrations of the IPA.Footnote 3 The journal began at the end of the 19th century, initially titled Le Maître Phonétique and fully written in the Alphabet Phonétique International (API), or International Phonetic Alphabet (IPA) in English, changing its name to JIPA only in 1971, and moving away from articles written in IPA (in different languages) and towards standard orthographies. The Illustrations of the IPA section first appeared in Volume 20 of JIPA, in 1990, following a decision of the 1989 Kiel Convention (Roach 1989: 77–80; IPA 1990). Since then, Illustrations has been compiled in two volumes, in 1995 and 1999. Prior to the creation of the Illustrations section, transcriptions of ‘The North Wind and the Sun’, dating back to the earliest days of Le Maître Phonétique, were published variously in The Principles of the International Phonetic Association (IPA 1949) and particular volumes under various sections, such as (Specimen).
The original aim of Illustrations was to offer a linguistic foundation to the written symbols, or graphemes, chosen to represent the sounds of spoken languages, with each illustration focusing on a single variety (‘language’, dialect or sociolect). The idea was and still is that each such contribution should define the phonemic inventory (or, if not possible, at least sketch a first version) using the IPA symbols. Some illustrations also contain a summary of the studies on the variety, and a brief sociolinguistic sketch (including speaker population size, socioeconomic status of the informants, bi-/multilingualism, and relationships to an official variety). Over the years, Illustrations has provided invaluable insights into the sound systems of a large number of languages and varieties.
Each illustration provides a narrative which is, in general, based on a single speaker who tells the fable ‘The North Wind and the Sun’ in their own variety, although some authors chose to include a different story. This narrative is recorded and then transcribed. The most recent instructions for contributors (IPA 2021) indicate that the transcription must be ‘phonemic’ (highlighting the contrasts in the language) but that a ‘narrow’ transcription can also be added (highlighting the phonetic specificities of the variety and/or of the idiolect(s) of the speaker(s)).
3.1 Data collection
The primary data consists of all the relevant information extracted from illustrations available on the JIPA website. Due to a change of style after 2000, when Cambridge University Press took over the production of JIPA (IPA 2000), we split them into those published before 2000 (42 illustrations), and those published between 2000 and 2020 (168 illustrations). We checked the consistency of the pre-2000 illustrations with their re-edited version in the Handbook of the International Phonetic Association: A guide to the use of the International Phonetic Alphabet in 1999 (IPA 1999), and we included two additional illustrations present only in this source but absent from the JIPA website (American English by Ladefoged 1999: 41–44 and Portuguese (European) by Cruz-Ferreira 1999), resulting in a total of 213 illustrations, including 46 published before 2000.
We compiled the extracted information in a table (in CSV format, available in the Supplementary Materials accompanying this paper), with the following information:
-
the title of the illustration (which corresponds to the name of the lect/variety)
-
the year of publication of the illustration (for those recent illustrations included in the FirstView section of the JIPA website (Arvaniti 2019: 472), we used the year the illustration appeared online, which does not necessarily coincide with the year of publication)
-
the author’s/authors’ name(s), affiliation(s), e-mail address(es)
-
the geographical area, the country/countries, and the specific region(s) in which the variety is spoken
-
the size of the speaker population, if available
-
if available, information concerning the informant(s), such as the sex, age and socio-economic status
-
the section(s) of the narrative transcription which allowed us to infer if the transcription is ‘phonemic’ (i.e. using a phoneme representation, usually known as a ‘broad transcription’) or rather ‘phonetic’ (i.e. using phonetic representation, usually known as a ‘narrow transcription’) (IPA 1999: Section 5). While some illustrations have one transcription only, a few of them include both phonemic and phonetic transcriptions (for more details, see Section 3.4 below).
We manually added the location where the variety is spoken, and, when there was no indication of the actual geographical coordinates, we used Google Maps and OpenStreetMap to estimate the latitude and longitude, starting from the reported name of the place. We used a broad classification of macroareas based on that proposed by Glottolog (Hammarström et al. 2020).
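As a rough illustration of how one row of such a table could be represented and round-tripped through CSV, consider the minimal Python sketch below. It is not the format of the supplementary file: the class name, the column names and the choice to store unavailable values as empty cells are all assumptions made for the example.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class IllustrationRecord:
    # Hypothetical column names; the supplementary CSV may use different ones.
    title: str                      # name of the lect/variety
    year: int                       # year of (online) publication
    authors: str
    macroarea: str                  # broad Glottolog-style macroarea
    countries: str
    transcription_types: str        # e.g. "phonemic", "phonetic" or "both"
    speaker_population: Optional[int] = None   # None when not reported
    informant_info: str = ""        # sex, age, socio-economic status if given
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def to_csv(records: List[IllustrationRecord]) -> str:
    """Write records as CSV text; unavailable values become empty cells."""
    buffer = io.StringIO()
    names = [f.name for f in fields(IllustrationRecord)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for record in records:
        row = asdict(record)
        writer.writerow({k: "" if v is None else v for k, v in row.items()})
    return buffer.getvalue()


def from_csv(text: str) -> List[IllustrationRecord]:
    """Read CSV text back into records, turning empty cells into None."""
    def optional(cell, convert):
        return convert(cell) if cell != "" else None

    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append(IllustrationRecord(
            title=row["title"],
            year=int(row["year"]),
            authors=row["authors"],
            macroarea=row["macroarea"],
            countries=row["countries"],
            transcription_types=row["transcription_types"],
            speaker_population=optional(row["speaker_population"], int),
            informant_info=row["informant_info"],
            latitude=optional(row["latitude"], float),
            longitude=optional(row["longitude"], float),
        ))
    return records
```

Keeping ‘not available’ distinct from zero (here via `None`) matters for fields such as speaker population size, which are reported only for some illustrations.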
3.2 Defining and coding ‘r-like’ sounds
We decided to focus on two characters, r and ɾ (respectively, lower-case R and fish-hook R; Pullum & Ladusaw 1996), and the characters derived from these through the addition of diacritic(s). In our analysis we refer to the lower-case R as the trill, and to the fish-hook R as the tap.
We looked for their presence in the consonant chart, and the transcription(s) of a narrative given at the end of the illustrations. We then considered the ‘descriptive labels’, the way authors refer to symbols, following, in most cases, the phonetic categories given to the symbols in the consonant chart found at the beginning of each illustration (place of articulation, manner of articulation and voicing). For some older illustrations no such chart is given, leading us to retrieve the labels from the indications present in the text. Our analysis is focused on the alveolar trill and the alveolar tap, leaving the other rhotics for future studies, but we nevertheless listed the other segments mentioned in the illustrations when they can potentially be considered as ‘rhotics’ based on Magnuson’s (2007) classification (for example , , , , ), to provide an overview of the rhotics and ‘r-like’ sounds present in the illustrations and to facilitate future work.
3.3 Counting ‘r-like’ sounds
As we are mostly interested in the use of trill and tap in the transcribed narrations of Illustrations here, we manually counted them specifically focusing on:
-
the number n[r] of tokens of r in the phonetic transcription
-
the number n/r/ of tokens of r in the phonemic transcription
-
the number n[ɾ] of tokens of ɾ in the phonetic transcription
-
the number n/ɾ/ of tokens of ɾ in the phonemic transcription
-
the number nR of rhotic tokens: this is based on the segments we considered as rhotics in the transcriptions given by the authors. We looked for the different symbols that were used for the trill, tap, and other r-based segments, checking what the authors wrote about whether we should consider them as rhotics or not (for example, we excluded from this count the [ɾ] in phonetic transcriptions when they were an allophone of a plosive). In some cases, we looked at the orthographic transcription provided by the authors for clues about <r> segments that could be omitted or realized differently from what was expected. When several transcriptions are provided, we used the highest number nR across the different transcriptions.
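The counts in the list above were made manually. Purely as an illustration of what each count refers to, the sketch below counts symbol tokens in two invented transcription strings. The set `RHOTIC_SYMBOLS`, the function names and the example strings are assumptions made for this example, and the code does not reproduce the manual, case-by-case exclusion of taps that are allophones of plosives.

```python
# Hypothetical inventory of symbols treated as rhotics in this example.
RHOTIC_SYMBOLS = ("r", "ɾ", "ɹ")


def count_tokens(transcription: str, symbol: str) -> int:
    """Count occurrences of one symbol; following diacritics are ignored."""
    return transcription.count(symbol)


def count_rhotics(transcription: str, symbols=RHOTIC_SYMBOLS) -> int:
    """Count all tokens of the symbols treated as rhotics."""
    return sum(transcription.count(symbol) for symbol in symbols)


# Invented phonemic and phonetic transcriptions of the same short passage.
phonemic = "para tere riro"
phonetic = "paɾa teɾe riɾo"

n_r_phonemic = count_tokens(phonemic, "r")      # tokens of r, phonemic
n_r_phonetic = count_tokens(phonetic, "r")      # tokens of r, phonetic
n_tap_phonetic = count_tokens(phonetic, "ɾ")    # tokens of ɾ, phonetic

# With several transcriptions, keep the highest rhotic count.
n_R = max(count_rhotics(phonemic), count_rhotics(phonetic))
```

In this toy passage the phonemic transcription has four tokens of r, while the phonetic one has a single r and three taps, which is the kind of asymmetry the separate counts are meant to capture.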
An additional logical variable denoting the presence or absence of <r> in the orthographic system of the language or in the transliteration was added when this information was available. For example, in some illustrations where the language described uses a Cyrillic alphabet, the transliterations use <r> for Cyrillic <р>, in which case we consider that the language does contain <r> in its orthography.
Although this data was collected by a single coder (RA), this was done in several separate ‘rounds’, which enabled quality checking and, if necessary, correction of earlier rounds in later ones. The first round involved the collection of metadata for the illustrations and segments of interest; the second involved the manual addition of geographic locations. The collection of quantitative data was done over two rounds, which consisted of counting r tokens in the transcriptions, followed by counting ɾ tokens.
3.4 Counting all segments nSeg
Finally, in order to normalize the number of occurrences of rhotics across the illustrations, we also counted the total number of segments in each transcription. Although all the transcriptions come from Illustrations, there were inconsistencies in the format in which we recovered them:
-
some transcriptions are available in text format (.txt) from Baird, Evans & Greenhill (2022) (available at https://github.com/SimonGreenhill/jipa)
-
when possible, transcriptions were copy-pasted from the PDF of the illustrations or from the Illustrations web page hosted by the JIPA
-
some transcriptions from which copy-pasting was not possible were submitted to optical character recognition (OCR) using the R package tesseract (Ooms 2021) and corrected for potential errors, or were manually typed
If there was more than one transcription in an illustration (for example, one phonemic and one phonetic transcription, or for different varieties), we counted the number of segments in the broadest transcription available, or by averaging the counts found in the transcriptions when they had the same level of phonetic precision. We wrote an R script to automatically count the alphabetical segments, using regular expressions excluding numbers, diacritics, second manners of articulation, suprasegmental symbols and punctuation.
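The counting step can be sketched as follows. This is not the R script mentioned above: it is a minimal Python approximation that replaces regular expressions with Unicode general categories, under the assumption that a segment is any base letter (categories Ll, Lu, Lo) left after canonical decomposition. The name `count_segments` and the example string are invented.

```python
import unicodedata

# Unicode general categories counted as segments: ordinary letters.
# Modifier letters (Lm, e.g. ʰ, ʷ, ː, ˈ), combining diacritics (Mn),
# digits, punctuation and spaces fall outside this set and are skipped.
SEGMENT_CATEGORIES = {"Ll", "Lu", "Lo"}


def count_segments(transcription: str) -> int:
    """Count base letters after splitting off combining diacritics (NFD)."""
    decomposed = unicodedata.normalize("NFD", transcription)
    return sum(
        1 for char in decomposed
        if unicodedata.category(char) in SEGMENT_CATEGORIES
    )


# Invented example with stress, aspiration, nasalization, digits,
# punctuation, a pause mark and a length mark.
example = "ˈtʰaɾã 12, | riː"
n_seg = count_segments(example)
```

One consequence of this simplification is that an affricate written with two letters and a tie bar is counted as two segments, so the choice of what counts as one segment would need to be adapted to the conventions of each transcription.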
3.5 Phonetic and phonemic transcriptions
One of the most important and tedious steps was to classify the transcriptions according to their degree of detail. For this, we decided to look specifically at the information provided by the authors, mostly in the section headers, without any interpretations or inferences of our own. Because of the great variability in the naming of the section headers, it was nevertheless necessary to harmonize them. Therefore, we defined a binary distinction between ‘phonemic’ and ‘phonetic’ transcriptions, as follows. A phonetic transcription is inferred for section headers containing the terms ‘phonetic’, ‘narrow’, ‘allophonic’, ‘semi-narrow’, and/or ‘detailed’, while a phonemic transcription corresponds to the sections containing the terms ‘phonemic’ and/or ‘broad’. In some cases, the authors specify that their transcription was either ‘broad phonetic’ or ‘narrow phonemic’ [sic]; here, we only kept the terms ‘phonetic’ and ‘phonemic’ to characterize these transcriptions.
Seventy-three transcriptions were still difficult to classify due to a lack of information, in which case we systematically considered the transcriptions phonetic by default. This stems from the assumption that the purpose of Illustrations is to convey the phonetic realization of the language, as it is explicitly mentioned in the report on the 1989 Kiel Convention that the transcription should represent ‘what is actually recorded on the tape rather than an idealization of what might have been uttered’ (Roach 1989: 77), even if this representation may be rather broad. This methodological choice is furthermore coherent with an approach in which the burden of proof is ours: apparent trills are considered as trills as long as we cannot provide any strong arguments in favor of a non-trilled interpretation.
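The harmonization rule described in this section can be summarized by the sketch below. The classification itself was done by reading the headers, so the function `classify_header`, its keyword matching and the example headers are assumptions made for illustration. The order of the checks (explicit ‘phonemic’ first, then explicit ‘phonetic’) is also a choice of this sketch, made so that ‘narrow phonemic’ and ‘broad phonetic’ are resolved by the terms ‘phonemic’ and ‘phonetic’ alone.

```python
PHONETIC_TERMS = ("phonetic", "narrow", "allophonic", "semi-narrow", "detailed")
PHONEMIC_TERMS = ("phonemic", "broad")


def classify_header(header: str) -> str:
    """Map a section header to 'phonemic' or 'phonetic'."""
    text = header.lower()
    # Explicit labels win over 'broad'/'narrow' qualifiers.
    if "phonemic" in text:
        return "phonemic"
    if "phonetic" in text:
        return "phonetic"
    if any(term in text for term in PHONETIC_TERMS):
        return "phonetic"
    if any(term in text for term in PHONEMIC_TERMS):
        return "phonemic"
    # No usable information: treated as phonetic by default.
    return "phonetic"


# Invented headers covering each branch of the rule.
labels = {
    header: classify_header(header)
    for header in (
        "Broad phonetic transcription",
        "Narrow phonemic transcription",
        "Semi-narrow transcription",
        "Broad transcription",
        "Transcription of recorded passage",
    )
}
```

The final `return` encodes the default discussed above: a header that gives no indication is classified as phonetic, so an apparent trill in such a transcription is counted as a trill.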
4 Results
4.1 General results
4.1.1 Year of publication
It can be seen in Figure 1 that the Illustrations section started to be systematically published each year in 1990, with rather sporadic occurrences before this date. In total, we collected 213 illustrations covering 51 years (from 1971 to 2021), ranging between one and 22 per year with an average of 5.3. There is a difference between the date of publication on the Cambridge website and the date of publication in a JIPA issue, usually of about one year (see Section 3.1 for details concerning the date of appearance in FirstView vs. the publication date).
4.1.2 Informant characteristics
For our purposes here, there might be relevant speaker-related differences, such as in age and gender (Figure 2). It is not clear whether many authors are aware of any idiosyncrasies their informant(s) may have: this may be because it is not the purpose of their illustration, or because they do not consider it an important aspect when capturing the overall phonology/phonetics of the variety. In some cases, the authors explicitly state that they want speakers as ‘representative of the variety’ as possible (Namboodiripad & Garellek 2017: 109).
Finally, since the authors are themselves academics, they tend to select speakers that are familiar with this environment, so most informants are considered as (well) educated and with a ‘correct’ pronunciation (according to the authors). Because the accessibility of the informants is an important factor, it is not uncommon for them to be students, researchers (some being the authors themselves) or teachers.
4.1.3 Geographic distribution, families and population size
There is an over-representation of Eurasian languages, in particular Indo-European languages, and just a few Australian languages (see Figures 3 and 4). Among the illustrations reviewed, a few languages are spoken by large populations and a few have very small speaker populations: the population of a variety covered by Illustrations is about 12,000,000 speakers on average, with a median of 175,000 (Figure 5).
4.2 Transcriptions
4.2.1 Phonetic and phonemic transcriptions
Of the 213 illustrations, most had only one type of transcription (120 only had a phonetic transcription and 38 only a phonemic one) while 54 had both (phonetic and phonemic). The latter are the most informative for comparing the choice of the phonemes with their actual phonetic realizations. Finally, a single illustration did not have any transcribed narrative: ‘Notes of a Westmeath dialect’, published in 1971 in the first volume of the Journal of the International Phonetic Association.
4.2.3 Orthographic transcription and transliteration
Among the illustrations, there are several orthographic systems besides the Latin alphabet, and some of the studied languages of oral tradition do not have any orthographic system. Of the 213 illustrations, there are 134 (63 $\%$ ) where we found an <r> in the orthography or in the transliteration. Among these, there are 47 (35 $\%$ ) languages where a double <rr> was used (we do not exclude the possibility that other languages could also use <rr> in general, but the short length of the narrative does not allow us to provide a definite answer). There are 13 (6 $\%$ ) illustrations where the language does not have an <r> in the orthography, and 66 (31 $\%$ ) where either the language has no written tradition, or does not use the Latin alphabet and does not provide any transliteration in the Latin alphabet.
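The percentages in this paragraph use two different denominators, which the following arithmetic check makes explicit: the share of illustrations with <rr> is computed over the 134 illustrations with <r>, while the other shares are computed over all 213 illustrations. The counts are those given in the text; the variable names and the `percent` helper are only illustrative.

```python
TOTAL = 213            # all illustrations
WITH_R = 134           # <r> found in the orthography or transliteration
WITH_RR = 47           # subset of WITH_R in which <rr> was also found
WITHOUT_R = 13         # orthography without <r>
NOT_APPLICABLE = 66    # no written tradition or no Latin transliteration


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


shares = {
    "with <r>": percent(WITH_R, TOTAL),
    "with <rr>, among those with <r>": percent(WITH_RR, WITH_R),
    "without <r>": percent(WITHOUT_R, TOTAL),
    "not applicable": percent(NOT_APPLICABLE, TOTAL),
}
```

The three top-level groups (134, 13 and 66) add up to the 213 illustrations, so the three corresponding percentages partition the whole sample.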
4.3 ‘r-like’ sounds
4.3.1 r in transcriptions
We focus on the illustrations where the symbol r is present: there are 136 illustrations (64 $\%$ ) where r is present in the consonant chart (105 illustrations) or where there is at least one [r] or /r/ in one of the narrative transcriptions (31 illustrations). The analyses presented in the rest of the article are based on this corpus. Among the 105 illustrations of the first type, 84 associate r with the ‘trill’ or ‘rolled’ manner of articulation, based on the descriptive label derived from the consonant chart. In 16 cases, the usage of r does not match an actual trill description. An overview of the consonant charts shows that r is used for several manners of articulation (‘approximant’, ‘tap or flap’, ‘plain tap’, ‘flap’, ‘fricative or approximant’) and different places of articulation (ranging from ‘dental’ to ‘velar’), or is sometimes simply (under)specified as a ‘rhotic’. As an example, the Shipibo illustration (Valenzuela, Márquez Pinedo & Maddieson 2001: 282) states that ‘[t]he symbol /r/, chosen for its simplicity, also represents a highly variable segment’. There are also five cases where no descriptive label can be derived from the consonant chart or the illustration, preventing us from inferring what r stands for. We did not change the symbols when r did not match an actual trill description.
We therefore divide the illustrations into three categories according to the type of transcription they include, and we present below the results per category, starting with the most informative with regard to phonetic substance.
4.3.1.1 Illustrations with both phonemic and phonetic transcriptions
The first category includes 37 illustrations where the authors provide both phonemic and phonetic transcriptions, and which include at least one occurrence of an r either in one of the transcriptions or in the consonant chart. Having both these transcriptions allows us to compare the phonemic transcription (which is the result of the linguistic analysis of which contrasts are important in the language) and the phonetic transcription (which arguably should be closer to the phonetic reality of what the speakers actually produced).
We show a summary of these 37 illustrations in Figure 6. There are some interesting asymmetries between the use of r in the phonetic and the phonemic transcriptions. For example, 24 illustrations contain an /r/ but no /ɾ/ in their phonemic transcription (left). All of these 24 illustrations include an r in their consonant chart (left → middle). Of these 24 illustrations, ten have [r] but no [ɾ] in their phonetic transcription, 11 have both [r] and [ɾ], one has neither [r] nor [ɾ], and two have [ɾ] but no [r] (middle → right). Also, there are two illustrations where both segments (trill and tap) are absent from the consonant chart (left → middle): in both cases, the phonemic transcriptions do not contain the segments, but the phonetic transcription contains [r] for one illustration and both [r] and [ɾ] for the other.
As shown in Figure 6, some authors do not mention r in the consonant chart but do use the symbol in one of the transcriptions. In other cases, the use of r in a phonemic transcription can be associated with [r], with [ɾ], with both, or even with neither of these two segments. These results show that the presence of [r] cannot be directly associated with an /r/ phoneme: in some cases the phonetic trill is associated with /r/, but in others with /ɾ/, with both segments, or even with neither.
We use the varieties of English as an example of how the use of the symbol r instead of the symbol ɹ for the alveolar approximant is driven by simplicity of use and not by phonetic or phonological reasons. This is clearly the case for Australian English and British English Received Pronunciation, where r is found in the phonemic transcription while [ɹ] is the main allophone found in the phonetic transcription. Things are more complicated for Liverpool English, where [r], [ɹ] and [ɾ] are used in the phonetic transcription while the author only mentions [ɹ] and [ɾ] as allophones, making the symbol r difficult to interpret since it is not specified in the illustration as a possible allophone of /r/.
The use of the symbol r is not necessarily transparent in phonetic transcriptions. The symbol r is found for Zurich German but its interpretation is made difficult by the fact that while /r/ may have alveolar and uvular allophones, [r] is not specified as one of them. It is therefore not possible to say with certainty which allophone has been represented.
The use of r does not necessarily imply the occurrence of [r] in phonetic transcriptions. The Sasak, Meno-Mené dialect illustration uses the symbol r for the phoneme in the phonemic transcription, but the symbol is absent from the phonetic transcription. The authors do specify that ‘/r/ is sometimes produced as an alveolar tap’ (Archangeli, Tanashur & Yip 2020: 97) while an actual trilled realization occurs in word-initial and word-final positions. All occurrences of /r/ in the transcription appear in intervocalic positions and are realized as [ɾ], with one occurrence of /r/ in a pre-nasal context being omitted. [ɾ] seems to be a common realization of /r/ when the latter has more than one allophone in the phonetic transcription.
In Table 1, the non-colored (i.e. white background) rows represent the illustrations where the number of r tokens is lower in the phonetic transcription than in the phonemic one. The last column indicates how the segments are realized and hence highlights the variation in the realizations of the rhotics that the authors reported. The grayed rows in Table 1 highlight the cases where there are more r tokens in the phonetic transcription than in the phonemic one: except for Itunyoso Trique (where there are two more occurrences of [r]), the difference for these illustrations consists of one more [r].
Finally, it is important to highlight the red rows in Table 1 (11 illustrations), which mark the cases where the same number of r tokens is present in both transcriptions. This can be interpreted either as there being no explicitly reported variation among rhotics in these illustrations, or as all the trills in the phonemic transcription being realized as phonetic trills. In three of these cases, there is a contrast between /ɾ/ and /r/, and in eight cases (including these three), the transcription is said to be ‘narrow’, ‘semi-narrow’ or ‘detailed’, supporting the second interpretation, namely that /r/ is realized as [r]. For example, in Kalabari-Ijo, there are 50 tokens of /r/ in the phonemic transcription, realized as 50 tokens of [r] in the narrow transcription also provided. As no allophones other than the trill are mentioned, we infer that all tokens of /r/ are trilled.
4.3.1.2 Illustrations with phonetic transcriptions only
Illustrations containing only an r in their consonant chart show different patterns in their transcriptions (Figure 7). In the first case, only [r] is present in the transcription (27 illustrations); based on the author’s transcription, we can consider that speakers trill all the instances of their ‘r’. In the second case, both [ɾ] and [r] are present (11 illustrations), both phones being expected to be allophones of /r/. The third case corresponds to the few illustrations which contain only [ɾ] in their transcription although an r is present in the chart (five illustrations). Finally, there are two illustrations with neither [r] nor [ɾ] in the transcription although the segment is present in the consonant chart. Regarding the 10 illustrations where both r and ɾ are present in the consonant chart, there is no [r] in the transcription for two of them, while both phones [r] and [ɾ] are found for the other eight.
4.3.1.3 Illustrations with a phonemic transcription only
Finally, it is important to also take into account the illustrations that are grouped in our third category, where we find only a phonemic transcription (Figure 8). Although their transcriptions refer to a level more abstract from the phonetic reality of the speaker’s production, one can still see differences between what is expected from the consonant chart and the symbols that are present in the transcription. In all the illustrations of this category except one, there is at least one /r/ in the phonemic transcription. In the remaining illustration, /r/ is absent from the phonemic transcription while /ɾ/ is present, in contrast with the consonant chart where there is only r and not ɾ.
4.3.2 ‘r-like’ sound frequencies
We then looked at the full transcriptions of the illustrations where there was at least one r, in order to get a first approximation of their frequency while taking all rhotic segments (i.e. all allophones of a rhotic phoneme) into account. Since the total number of segments and the number of rhotics differ across the illustrations, it is important to consider both in order to be able to meaningfully compare the frequency of r among them.
4.3.2.1 Overall rhoticity
The overall rhoticity, defined as freqrhot = nR/nSeg, was computed based on all the possible segments that could be considered as rhotics in the transcription. Critical analysis of ambiguous segments (like the [ɾ], which can be an allophone of a plosive, or a segment that can behave like a fricative but not like a rhotic) was done on the basis of the full illustration, so as to only select those segments that can be considered rhotics. For languages with two or more contrastive rhotics, this measure conflates all rhotic phonemes: it does not contain information on the relative frequencies of each rhotic phoneme individually.
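As a worked example of this normalization, the sketch below computes the ratio of rhotic tokens to segments for two invented transcriptions. The function name and all counts are hypothetical; they were chosen only to show that a long text can have a lower rhoticity than a short one.

```python
def rhoticity(n_rhotic: int, n_segments: int) -> float:
    """Return the share of rhotic tokens among all segments (nR / nSeg)."""
    if n_segments <= 0:
        raise ValueError("n_segments must be positive")
    if not 0 <= n_rhotic <= n_segments:
        raise ValueError("n_rhotic must lie between 0 and n_segments")
    return n_rhotic / n_segments


# Invented counts: a long transcription with few rhotics and a
# shorter one with proportionally more.
long_text = rhoticity(n_rhotic=30, n_segments=1000)
short_text = rhoticity(n_rhotic=36, n_segments=400)
```

Here the longer text has a rhoticity of 0.03 and the shorter one 0.09, so raw token counts (30 vs. 36) would understate the difference that the normalized measure reveals.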
Unsurprisingly, Figure 9 illustrates that not all transcriptions have the same length (minimum: 210 segments, mean: 477.5 segments, median: 449 segments, SD (standard deviation): 148.5, IQR (interquartile range): 122.5, and maximum: 1,210 segments). By dividing by the number of segments we control for text length in order to have a normalized estimate of rhoticity. There is variation in the rhoticity, freqrhot (minimum: 0.0, mean: 0.048, median: 0.05, SD: 0.023, IQR: 0.035, and maximum: 0.1). Figure 9 shows that a transcription containing many segments does not necessarily have a high overall rhotic frequency. For example, the transcription of the illustration of Shipibo (Valenzuela et al. 2001) contains 1,116 segments while its rhoticity is about 0.03; on the other hand, the illustration of Standard Georgian (Shosted & Chikovani 2006) contains 432 segments but its rhoticity is about 0.09 (one of the highest). A rhoticity of 0.0 corresponds to cases where there was an r in the illustration but no rhotic in the narrative, possibly due to its very low frequency. This measure of rhoticity can be compared with the estimates of trill and tap frequencies reported below.
4.3.2.2 Trill frequency
Figure 10 illustrates that the trill [r] frequency varies between a maximum of 0.084 (the dialect of Hasselt) and a minimum of 0.0 (for 12 illustrations that do not contain [r] in their transcription), with a mean of 0.024, a median of 0.018 (both lower than for overall rhoticity), a standard deviation of 0.023, and an interquartile range of 0.031. A Wilcoxon test shows that the trill frequency is significantly lower than the mean rhoticity of 0.048 (p < .0001). This can be explained by the fact that in many cases the trill is not the only allophone used in the transcription to represent a rhotic segment (there are 43 illustrations where there is only one rhotic segment in the consonant chart, 28 where there is more than one rhotic segment in the consonant chart, and two illustrations where there is no rhotic segment in the consonant chart).
Figure 11 illustrates that the trill /r/ frequency varies between a maximum of 0.088 and a minimum of 0.0 (four illustrations where there is no /r/ in the transcription of the narrative), with a mean of 0.037, a median of 0.035 (both lower than for overall rhoticity), a standard deviation of 0.025, and an interquartile range of 0.04. It can be seen that the average frequency of the phonemic trill (0.037) is lower than the average rhoticity (0.048), but higher than that of the phonetic trill (0.024), suggesting that trills are more present in the phonemic transcriptions than in the phonetic ones. Figures 10 and 11 show that trill [r] and /r/ frequencies have different distributions when normalized by the total number of segments in the text. A Kruskal–Wallis test shows significant differences between the frequency values of phonemic trill, phonetic trill and rhoticity (H(2) = 57.549, p < .0001), and a post-hoc Dunn test shows that all the pairwise differences are statistically significant (p < .01).
4.3.2.3 Tap frequency
We move now from the illustrations with at least one r to those with at least one ɾ in either of the transcriptions or in the consonant chart. This new sample was composed of 95 illustrations with on average 488 segments per transcription, and an average rhoticity of 0.048. A maximum of 2,410 segments was obtained for the illustration of Dàgáárè (Central).
When looking at the taps in phonetic transcription (Figure 12), the tap [ɾ] frequency varies between a maximum of 0.083 and a minimum of 0.0 (six illustrations where there is no [ɾ] in the transcription of the narrative), with a mean of 0.034, a median of 0.032 (both lower than for overall rhoticity), a standard deviation of 0.024, and an interquartile range of 0.045. On the other hand, in phonemic transcription (Figure 13), the tap /ɾ/ frequency varies between a maximum of 0.07 and a minimum of 0.0 (22 illustrations where there is no /ɾ/ in the transcription of the narrative), with a mean of 0.02, a median of 0.0 (both lower than for overall rhoticity), a standard deviation of 0.024, and an interquartile range of 0.039. The mean frequency of [ɾ] (0.034) is lower than the rhoticity average (0.048), but higher than the mean frequency of /ɾ/ (0.02): phonetic taps are more frequent than phonemic taps. A Kruskal–Wallis test shows significant differences between the frequency values of phonetic tap, phonemic tap and rhoticity (H(2) = 43.5, p < .0001), and a post-hoc Dunn test shows that all the pairwise differences are statistically significant (p < .01).
Finally, we compared the mean frequency of trills, taps and overall rhoticity: the Kruskal–Wallis test shows significant differences between the frequency values (H(4) = 82.2, p < .0001). A post-hoc Dunn test showed that the frequency of phonetic trills [r] was not significantly different from that of phonemic taps /ɾ/, and that the frequency of phonemic trills /r/ was not significantly different from that of phonetic taps [ɾ], but that /r/ is more frequent than /ɾ/ (p < .01) and [r] is less frequent than [ɾ] (p < .05).
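For readers unfamiliar with the test, the following sketch shows how a Kruskal–Wallis H statistic is obtained from pooled ranks, using the standard formula with a correction for ties. It is not the analysis code used for the results above; the function names and the toy data are invented, and the p-value (which requires the chi-square distribution with k − 1 degrees of freedom) is not computed.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Return (H, degrees of freedom) for two or more groups of values."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    return h / correction, len(groups) - 1


# Toy data: three fully separated groups of three values each.
h_stat, dof = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

The degrees of freedom equal the number of groups minus one, which is why a comparison of three frequency measures is reported as H(2) and a comparison of five measures as H(4).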
It can be seen that, compared to trills, taps are under-represented in phonemic transcriptions, but they are over-represented in phonetic transcriptions. The perceived abundance of trills is mainly due to their presence in the phonemic transcriptions, where the r symbol encompasses, in fact, a multitude of possible phonetic realizations. One of these possible realizations is a tap, and even if phonetic trills are indeed present, they are less frequent than phonetic taps. These results do not explain why linguists choose certain graphic representations of the phonemes they describe, as the illustrations do not provide information on what makes a phonemic trill different from a phonemic tap. In some of the illustrations containing phonemic trills, we found both phonetic trills and taps, and similarly in some illustrations with phonemic taps, we found both phonetic taps and trills. An unequivocal analysis of a lect sound system based solely on its illustration may be unreachable in some cases due to linguist-specific graphemic conventions.
5 Discussion and conclusions
Illustrations of the IPA is an invaluable source of information for many types of studies in linguistics and the language sciences in general, as they tend to offer a standardized description of the sounds of the languages concerned. These articles stand on their own in providing the reader with the keys to understanding the conventions used in the phonetic and phonemic transcriptions. Importantly, they are also instrumental in linguistic studies – such as comparative studies – beyond their primary illustrative goal.
While, for some phonemes, transcription conventions are rich and accompanied by detailed descriptions of their different allophones, their context, and even, in some cases, by acoustic measurements, it appears that for rhotics such detailed information is often missing. This contributes to a lack of clarity in the literature concerning their characteristics and cross-linguistic frequency. While this situation seems to be improving in more recent illustrations, this lack of clarity was the reason for our study here. Rhotics are often represented through the symbol r, but behind this seemingly simple choice there is a bewildering heterogeneity and complexity of realizations, as has been clearly shown for several languages (Spanish: Blecua 2002, Bradley & Willis 2012, Henriksen 2015, Vigil 2018; Dutch: Sebregts 2014; Portuguese: Rennicke 2015; English: Stuart-Smith et al. 2014, Jauriberry 2016; Italian: Romano 2013; Japanese: Magnuson 2008; and Persian: Rafat 2010).
Our article quantifies, through a careful analysis of 213 Illustrations of the IPA, the impression, shared by several linguists, that trills might be over-represented in the literature. We find that although the phoneme /r/ is not rare (105 illustrations do have an /r/), [r] seems to become less and less frequent as one approaches the phonetic reality (here, as captured to various degrees by the phonetic transcriptions present in some of the illustrations). In other words, even with a transcription material as short as ‘The North Wind and the Sun’, we find consistent clues that the r trill symbol tends to be used as the default symbol for rhotics, echoing for instance the position of Sebregts (2014):
[T]he high cross-linguistic frequency of trill phonemes as reported in UPSID may not in fact be at odds with the low frequency of “actual trills” mentioned by Ladefoged et al. (1977), as long as the term “trill phoneme” in the sense of the UPSID is taken to mean a phoneme that is potentially realised as a trill, for which the number of actual trill realisations may be quite low. (Sebregts 2014: 157)
As explained in this paper, we consider that this ambiguity results from several factors that led the linguistic community to tolerate a blind spot which mirrors the difficulty of handling rhotics in general and trills in particular. We consequently suggest that clarifying the guidelines on rhotics would improve the gold standard against which Illustrations are published, as already proposed by Whitley (2003: 84), by ‘[a]dopt[ing] the macron for specifying a trill [i.e. r̄], leaving plain r entirely free for its widely understood value of “any rhotic”’. Beyond this aspect, we would like to emphasize that the illustrations are also integrated ‘as they are’ in secondary sources such as phonological databases. There has indeed been a recent explosion in the availability and use of such databases (Moran & McCloy 2019, Mortensen et al. 2020), and in their relevance to the exploration of large-scale typological and historical questions. Such databases usually compile data from different sources, among which we can find papers published in JIPA (as, for example, in PHOIBLE and ALLOVERA). Irrespective of their intended nature (phonemic, allophonic or phonetic), they make use of the symbols proposed by transcription systems, and the symbols of the International Phonetic Alphabet (IPA) are omnipresent and currently almost universally used in linguistics.
While they are arguably robust enough for large-scale studies concerning broad phonemic classes and their distribution across language families and geographical areas, our findings highlight that taking them at face value might not be sufficient for research aiming at uncovering subtle effects involving actual acoustic and/or articulatory features (such as those involving sound symbolism, the acoustic adaptation to the environment or weak effects of normal variation in the anatomy of the vocal tract). We hope that, in the near to medium future, this type of large-scale study will be able to better take into account the uncertainty concerning the actual sounds purportedly described by a symbol (and, of particular importance here, the trill), either through the availability of finer-grained data or of ways to probabilistically map a given symbol to a set of possible realizations in a given context.
Thus, our work should help better understand the current theories concerning the visual representation of languages, to judge their advantages, disadvantages and limits (Whitley 2003, Esling 2010, Anderson et al. 2018).
To conclude, while we do not think there is a single ‘silver bullet’ that can solve all these issues, we suggest exercising extra caution in the way phonetic symbols are interpreted, especially for older sources, and contextualizing them relative to the various transcription systems in use at the time and even relative to the reforms of the International Phonetic Association.
Acknowledgements
We wish to thank Ioana Chitoran, Bob Ladd and James Kirby for feedback on earlier versions of the manuscript. We are grateful to the JIPA Associate Editor Alexei Kochetov and the reviewers for their suggestions and comments, and to Jayden Macklin-Cordes for proofreading the manuscript. The authors RA and DD were funded by the IDEXLYON Fellowship Grant 16-IDEX-0005. The authors are grateful to the ASLAN project (ANR-10-LABX-0081) of the Université de Lyon for its financial support within the French program ‘Investments for the Future’ operated by the French National Research Agency (ANR).
Author contributions
RA and DD designed the study. RA collected and coded the data, and performed the quantitative analysis. RA drafted the paper. All authors discussed the results and contributed to the paper’s final text.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S0025100322000238