1. Overview and structure of the volume
Prosody and Prosodic Interfaces, edited by Haruo Kubozono, Junko Ito and Armin Mester, is a collection of 15 chapters on tone, stress, intonation and prosodic domains. Prosody is a vast topic, and the chapters accordingly cover a diverse range of sub-topics, including typological categorisation; metricality; representations; the interactions of tone, intonation and stress; diachrony; diagnosing prosodic domains; the phonetic realisation of prosodic features; the nature of the syntax–phonology interface; and the comparison of specific formalisms that derive prosodic domains. Most chapters focus on prosodic patterns found in a single language or across dialects of a single language.
The volume is structured into three parts. Part I contains six chapters on word- and phrase-level prosody in individual languages. These chapters are primarily descriptive, but many also make a typological, historical or representational contribution. Part II includes five chapters on tone and intonation. These chapters are primarily descriptive. Part III (four chapters) focuses on the syntax–prosody interface and is the most theoretically oriented part of the volume, with all four of its chapters making specific claims about the architecture of grammar, in particular the derivation of prosodic structures.
Across the volume, there is broad areal coverage, with chapters on prosodic systems of European languages (Basque, Serbian, Swedish and English), African languages (Bantu, Rere and Nubi), American languages (Uspanteko and Choguita Rarámuri), a Papuan language (Poko) and Asian languages (Persian and Japanese). Japanese is comparatively overrepresented (chapters 9, 10, 14 and 15). Some parts of the world that are not well represented are Australia and Oceania, the Macro-Sudan Belt of Africa and East Asian languages outside of Japanese, many of which have well-documented tonal systems, but where there is much less extant work on the interaction of tone and intonation. One highlight of this volume is the careful description of many lesser-studied languages and varieties, which makes for an important empirical contribution.
2. Content summary
2.1. Part I: Word and phrase prosody
Laura McPherson’s chapter (ch. 1) on word tone in Poko (Skou, Papua New Guinea) sets the tone for Part I by revisiting Pike’s (1948) classification of word-level prosodic systems into syllable-tone and word-tone categories, where syllable-tone languages contrast tone on every syllable, while word-tone languages have only one tone-bearing syllable per word, or have pitch patterns or melodies that apply to an entire word rather than a syllable or mora. McPherson shows that apparent word tone in Poko is epiphenomenal and can be attributed to constraints on initial and final tones. The broader typological implication is that there is not a binary typology between word-tone and syllable-tone languages, but rather a spectrum of restrictions on tonal distribution. She proposes that a property-driven typology of prosodic systems be adopted rather than a binary categorisation system.
Many of the following chapters also suggest that a continuum, rather than binary categorisation, is a better way to conceptualise the typology of prosodic systems, including José Ignacio Hualde’s chapter (ch. 2) on accent shift in Basque, and Sara Myrberg’s chapter (ch. 4) on South Swedish. All Basque varieties have an accent contrast that could be categorised as word tone; however, there are many distinct patterns among Basque varieties, suggesting that a property-driven typology would be more informative than a binary categorisation system. Hualde compares synchronic patterns across Basque varieties, which differ in whether position, contour shape and presence/absence of tone are lexically contrastive. He reconstructs the accent system (which can be analysed as tone, as stated on page 34) of Old Common Basque, arguing that phonological and phonetic ambiguity has led to reanalysis in the modern-day systems. Myrberg shows that Swedish and Norwegian dialects fall on a continuum between two-peaked and one-peaked accents, rather than showing a binary distinction. Myrberg analyses the South Swedish systems as derived from H drift and truncation of a H applying to two-peaked contours, which then can surface as one-peaked. West Swedish and East Norwegian fall in the middle of the continuum (showing drift but not truncation); South Swedish is the least two-peaked (with drift and truncation both applying) and Central Swedish is the most two-peaked.
Draga Zec and Elisabeth Zsiga (ch. 3), like Hualde and Myrberg, also use synchronic dialect variation to draw diachronic conclusions about the development of prosodic systems. They focus on the interactions of tone and stress across Stokavian dialects of Serbian, showing that two distinct dialect continua, one for tone and one for stress, interact and lead to dialect variation. They propose that stress is a word-level phenomenon while tone is a phrase-level phenomenon, and develop an Optimality-Theoretic analysis in which the same set of constraints applying at distinct levels of grammar can derive the surface patterns across dialects. Dialect differences arise in response to tonal crowding at the right edge, which involves a lexical H and the declarative intonational boundary L%. Generalisation to smaller domains that lack the original trigger (L%) occurs to a different extent in different dialects: declarative-final retraction is generalised to retraction from a word edge, which is generalised to retraction from a foot edge. Each stage of this derivation exists synchronically in some dialect. The data support domain generalisation as a possible driver of change, which is important for our understanding of sound change, since very few clear cases of domain generalisation are attested, though it is predicted by Becker (1977), Hyman (1978) and others.
In chapter 5, Larry Hyman discusses prosodic differences between nominal and verbal phrases in Bantu languages, building on Smith (2011) and others, who point out phonological differences between nouns and verbs in which nouns tend to license and preserve more contrasts than verbs across languages. Most work to date has focused on words rather than phrases, so Hyman considers whether these differences generalise to the phrasal level. He finds that verb roots tend to show fewer prosodic contrasts than nouns in Bantu, with a historical pathway to complete loss of contrast, as in Lulamogi. However, verbal words and nominal phrases have much more tonal activity than nominal words or verbal phrases. He hypothesises that this might be due to a tendency for more inflection on verbal words than on nouns, and an often-fixed order of nominal modifiers that resembles a morphological template, distinct from verbal phrases, which have more flexible word order. His findings call into question the generalisability of Smith’s (2011) finding that nominal words tend to license more contrasts than verbal words.
In chapter 6, Carlos Gussenhoven concludes that underlying metrical structure (such as metrical grids or trees) does not help to explain metrical patterns better than non-metrical structure. This chapter describes the metrical systems of Persian, English and Nubi, an Arabic-lexifier creole spoken in Uganda and Kenya. It is not clear exactly why these three languages were chosen rather than others, or how far the typological conclusions generalise beyond these three languages. The very helpful chart on page 161 (Table 6.1) shows which types of information (lexicon, default accent, morphology, syntax, phonology and focus) co-determine surface pitch accent in each language. Gussenhoven finds that morphosyntax plays a stronger role than phonology in determining accent. The influence of morphosyntax on accent placement suggests that accent should not be post-lexical, since post-lexical information should not have access to morphosyntactic structure. This means that the morphosyntactic generalisations could be translated into prosodic generalisations (as analysed in Part III of this volume), or pitch accents could be morphemes themselves. Gussenhoven concludes that there is no evidence for a metrical grid or metrical tree in phonological representations and suggests considering the prosodic hierarchy without strong/weak labels, default heads, and metrical grids or trees, thus trimming down the prosodic theoretical apparatus. While this is a somewhat extreme proposal, Gussenhoven still assumes the need for the prosodic hierarchy and the ability to reference prosodic categories such as feet.
2.2. Part II: Lexical tone and intonation
Chapter 7, by Ryan Bennett, Robert Henderson and Megan Harvey, gives a nice overview of the Uspanteko language (Mayan, Guatemala) and a careful description of complicated phonetic and phonological facts. Uspanteko is ‘one of the few Mayan languages to have innovated a robust, grammaticised system of lexical tone’ (p. 188), where tone interacts with stress, vowel length, and morphology. The presence of tone is lexically or morphologically conditioned, but its position is determined phonologically. There is at most one tone per word, with H tone – if it is present – on the penultimate mora. $\textrm {F}_0$ differences between contrastive tone heights are small but stable, and tone bears a low functional load. This chapter features the results of a production study on word-level prosody, run with 12 native speakers, which explores the effect of word-final intonational contours on the phonetic realisation of word-level prosody. The results show declination across the utterance, focus associated with pitch raising and stressed vowel lengthening, and preservation across conditions of tonal distinctions on long and short vowels. This chapter serves as a nice model for running phonetic studies with understudied minority language communities.
In chapter 8, Gabriella Caballero, Yuan Chai and Marc Garellek discuss the interactions of stress, tone and intonation in Choguita Rarámuri (Uto-Aztecan, Northern Mexico). Stress appears within an initial three-syllable window, and the three lexical tones HL, L and H surface only on stressed syllables, with one lexical tone per word and no toneless words. A right-edge H% tone surfaces if the intonational-phrase-final lexical tone is H or L but not HL. Phrase-finally, HL tones are rearticulated and L tones are lengthened. The authors ran a production study with four native speakers, in which they found that H tones spread to the post-tonic vowel even if that vowel is in the following word. The fall of a HL is realised over the post-tonic vowel, and pretonic positions are often low-pitched before H and HL tones and high-pitched before L tones. Tone has a high functional load in the language, but, as described for Uspanteko in chapter 7, there are very small (but stable) $\textrm{f}_0$ differences between the three lexically contrastive tone heights. Phonation helps to cue tonal differences for three of the four speakers. $\textrm{F}_0$ effects on surrounding syllables, along with phonation differences, seem to help enhance the tonal contrasts.
Chapters 9 and 10 focus on lexical and phrasal tones in Japanese. In chapter 9, Haruo Kubozono shows that when word-final lexical accents and boundary tones coincide, with the possible result of tonal crowding, different varieties of Japanese resort to different repairs, including lengthening the final vowel to host both tones, or retracting the H tone to penultimate position resulting in a possible neutralisation with lexical penult Hs. In Tokyo Japanese, there are three possibilities: final lengthening, deletion of the lexical tone or tonal coalescence. Whenever a tone deletes in tonal crowding contexts, it is always the lexical tone rather than the phrasal tone. This is consistent with an approach where phrasal tone applies later in the derivation than lexical tone, overriding tones or tonal requirements present at a previous stage. In chapter 10, Yosuke Igarashi discusses prosody in accentless varieties of Japanese. Some varieties of Japanese have no lexical contrast in accent; the pitch peak at the word level varies. The accentual phrase in some of these varieties shows a long rising $\textrm{f}_0$. Focus is marked structurally at the left edge in these varieties and not by phonetic salience; the left edge is the structural focus position despite the pitch peak surfacing at the right edge. The author concludes that this supports the arbitrariness of sound/meaning relationships: intonation is not automatic or predictable from meaning, but is grammatical.
Chapter 11 wraps up Part II with a look at prosody in declaratives and questions in Rere, also called Koalib, a Niger-Congo language of the Nuba Mountains in Sudan. The authors, Yuan Chai, Titus Kubri Kajo Kunda, Alejandro Rodríguez and Sharon Rose, describe the Rere tone system, which contrasts H, L and HL tones and displays extensive grammatical tone (e.g., the remote perfect is all-L). They consider how lexical and grammatical tone interact with intonation in declaratives and questions. In declaratives, there is a steady pitch range, with no drift, declination or boundary tones; however, there is a final fall, with lexical or grammatical H being lower utterance-finally than elsewhere. Final Ls lower only optionally. Multiple linked Hs are all lowered, even across word boundaries, showing nice evidence for the Obligatory Contour Principle. Regular raising of successive Hs within nominal phrases, or upsweep, is found. In polar questions, there is a final low-toned [=à] which blocks final H lowering because the final tone-bearing unit is L. Pitch raising is also found in polar questions, where the rightmost H of the final word raises. Wh-in-situ questions have the same prosodic contour as declaratives. Ex-situ questions have initial Wh-words marked with a H at the left edge. A grammatical H tone at the left edge of verbs in A′-extraction contexts is posited. This left-edge H overwrites grammatical L tones but has no effect on grammatical/lexical H. While this chapter does not draw major theoretical conclusions, it is a clear exposition of using differences in tone/intonation interaction to diagnose whether specific pitch effects are morphosyntactic (like the left-edge H in A′-extraction contexts) vs. intonational (like pitch raising and boundary effects, which affect all Hs or all tones), assuming that this is a meaningful split.
2.3. Part III: The syntax–prosody interface
In chapter 12, Seunghun J. Lee and Elisabeth Selkirk formalise an indirect mapping approach between morphosyntactic structure and phonological constituency, in which the interface between morphosyntactic output (MSO) and phonological input (PI) is the sole locus of structure-to-prosody building. In this approach, MatchPhrase can be understood as a constraint on spell-out of MSO as PI. Languages differ in whether all syntactic phrases (as in Xitsonga) or just lexical ones (as in Irish) are matched to prosodic phrases. Later, phonological and prosodic markedness constraints may manipulate prosodic structure, resulting in mismatches between syntactic and prosodic structure. One possibility not considered in this chapter is that the language-specific difference in whether all phrases or only lexical phrases correspond to prosodic phrases may be due to a difference in what gets spelled out. That is, perhaps languages differ in how much syntactic structure is spelled out at one time, and spell-out domains are matched to prosodic phrases. One point that remains unclear in the analysis is where morpheme-specific prosodic requirements, such as Bennett et al.’s (2018) prosodic smothering, are situated in the grammar.
Gorka Elordieta and Elisabeth Selkirk (ch. 13) use the MSO–PI prosodic mapping model introduced in chapter 12 to account for mismatches between syntax and prosody in Lekeitio Basque (LB). In LB, every prosodic phrase must contain at least one accented word. They show that the morphosyntactic structure is preserved in the prosodic phrasing except when that would lead to an entirely unaccented prosodic phrase. In the MSO–PI framework, where there are mismatches in prosodic and syntactic structure, these mismatches should be phonologically optimising, which is argued to be the case in LB. It seems clear that the MSO–PI proposal introduced in chapter 12 will continue to be developed and tested in other languages and may prove to be a viable competitor to the similar yet distinct Match Theory.
In chapter 14, Shinichiro Ishihara shows that while the constraint MatchClause predicts syntactic clauses to correspond to intonational phrases, there is no evidence for intonational phrases that correspond to clauses in Japanese. Instead, he concludes that intonational phrases are not mapped from syntax but are due to discourse–prosody mapping using Match(illocutionary clause, ι) or Match(speech act, ι); only words and phrases are mapped from syntax. It is not clear at what point in the architecture of grammar the discourse–prosody mapping takes place. I am sympathetic to the view of an anonymous reviewer, mentioned in §14.3.3, that the lack of clear evidence for clauses mapped to intonational phrases in certain varieties of Japanese does not mean that this is not a useful theoretical mechanism for deriving prosodic facts in other languages. Many languages make little to no reference to moras, for example, and Hyman (1983, 2011) has argued that in Gokana there is no need to refer to syllables. Perhaps all steps on the prosodic hierarchy are accessible to speakers of any language but not all are employed in every language.
The final chapter of the book, by Jennifer Bellik, Junko Ito, Nick Kalivoda and Armin Mester, reanalyses Japanese syntax–prosody matches and mismatches, considering all possible prosodic parses using automatic candidate generation and evaluation in SPOT (Syntax–Prosody in Optimality Theory), a tool developed by the authors and colleagues and available online. They conclude that Match and Align approaches are independently insufficient to derive the Japanese facts, but a framework that allows both Match and Align constraints is adequate. Conceptually, Match reflects the syntactic importance of constituency and Align reflects the phonological importance of edges, justifying the use of both. The SPOT tool and the proposal for a model that incorporates both Match and Align constraints have important implications for prosodists.
3. Overall evaluation
In sum, Prosody and Prosodic Interfaces is an important collection of descriptive and analytical work on prosody. This book is relevant to a wide range of scholars, including syntacticians, morphologists, phonologists, prosodists, typologists and historical linguists. It is more useful as a resource for prosody researchers than as a reference volume, though individual chapters may be useful for students studying prosody at the graduate level. Perhaps the most significant contributions are the thorough descriptions of tone, stress, intonation and prosody in many under-studied languages and varieties. This fits with the stated goals of the editors in the introduction to the volume: to provide a platform for cross-linguistic study of prosody. Additionally, contributions to the typological, diachronic and formal understanding of prosody are made throughout. Due to the breadth of the topic of prosody, these analytical contributions tend to be a bit scattered, without a single focus throughout the volume, though the editors have done a good job of grouping similarly themed chapters together. Surely the contributions in this volume will have an impact on the prosody literature moving forward.