1. Introduction
Most constraint-based frameworks embrace Richness of the Base (Prince & Smolensky [Reference Prince and Smolensky1993] 2004, §9.3) – the assumption that no interesting generalisations are stated as constraints on the lexicon (a.k.a. morpheme structure constraints, or MSCs). The main argument against MSCs is that they introduce duplication into the theory. When the same constraints define the shapes of morphemes and restrict derived words, surface-oriented constraints should be sufficient. Unlike MSCs, surface-oriented constraints are less abstract, and are independently necessary. This echoes earlier criticisms of MSCs (e.g., Shibatani Reference Shibatani1973, Clayton Reference Clayton1976): they are redundant and abstract.
This article revisits MSCs in the context of positional neutralisation. As I demonstrate, positional neutralisation presents an analytic problem when the affected contrasts are gapped. Analysing such neutralisation without MSCs runs into duplication. My specific focus is on Russian voicing, which was Morris Halle’s (Reference Halle1959) original battleground against structuralism – which he, incidentally, also criticised for having a duplication problem. By treating contrastive oppositions differently from non-contrastive ones, structuralism fails to capture the generalisation that Russian voicing assimilation works on all obstruents alike, whether they contrast for voicing phonemically (/b/ vs. /p/) or are obligatorily voiceless (e.g., /ʧ/). My concern is not the undergoers; rather, it is the lack of certain contrasts predicted by the popular Positional Faithfulness account of voicing neutralisation in Optimality Theory (Lombardi Reference Lombardi1999; Steriade Reference Steriade1999; Padgett Reference Padgett2002; Rubach Reference Rubach2008; Beckman et al. Reference Beckman, Jessen and Ringen2009). I show that even though this account captures the phonetics and typology of voicing contrasts, it has a problem with Russian: certain consonants need to be handled twice in the analysis. As an alternative, I argue for MSCs against those consonants in the lexicon.
Unlike other analyses, my account explains facts such as the handling of loanword [ʤ], which is borrowed as a CC cluster in Russian, and which behaves as though it is never represented as an affricate in the system. Activity in loanword adaptation is sometimes presented as an argument against MSCs (e.g., by Clayton Reference Clayton1976): if loanwords are adapted to avoid some configuration, there must be a rule; an MSC alone would not be enough. For the Russian case, this argument does not quite work. As is common, Russian loanword adaptation employs rules different from anything evidenced in the native phonology. Moreover, the patterns of adaptation are inconsistent, defying a uniform grammatical account. I argue that the patterns support the existence of an MSC in the lexicon, but should not be connected to grammatical rules for resolving violations.
The key facts are in (1)–(3). Russian has a voicing contrast in most obstruents, and this contrast is neutralised word-finally (devoicing):Footnote 1
Russian also has regressive assimilation in obstruent sequences. Affricates are obligatorily voiceless except in assimilation contexts; thus, [ʤ] never occurs in presonorant position, but it does occur as an allophone of /ʧ/ in assimilation (see (2d)). The alternation [noʧ] $\sim $ [noʤ] only ever goes in the direction shown in (2); there are no morphemes that have the alternation [noʧ] $\sim $ *[noʤ-am], which would require the UR */noʤ/.
When [ʤ] is borrowed, it is usually split into two segments, [d] and [ʐ] (see (3)). The sequence is heterorganic, with [d] being dental and [ʐ] retroflex. I argue that analytically, the sequence must be two underlyingly voiced obstruents, just like /zg/ in /mozg/ ‘brain’: [dʐ] devoices word-finally as any two-consonant sequence would, and in presonorant position and in regressive assimilation position, its parts voice or devoice as other consonant sequences would. The sequences behave as if they are never represented as affricates.
The remainder of the article starts with an overview of the problem in the context of complementary distribution vs. positional neutralisation. Then I turn to Halle’s argument (§3.1). I analyse voicing neutralisation using MSCs in §3.2. I then consider the complexities of the Russian consonant inventory, and in particular the behaviour of [ʣ], [ɣ] and [ʒʒ] (§3.3). The goal is to make a theoretical contribution as well as an empirical one: Halle’s (Reference Halle1959) presentation of the facts is incomplete and does not reflect the present state of the language. Moreover, some of the facts prove problematic for alternative analyses. §4 presents the loanword adaptation facts and argues that they are problematic for the rich base assumption. In §5, I turn to alternatives, which include specific markedness constraints (Hall Reference Hall and Radišić2007), comparative markedness and Stratal OT (Mackenzie Reference Mackenzie2024).
2. Neutralisation and morpheme structure constraints
In rule-based phonology, one of the roles of MSCs is to define the phonemic inventory of a language (Halle Reference Halle1959; Stanley Reference Stanley1967; et seq.). It is commonplace for certain sounds to occur as allophones but to not have contrastive status. Rule-based analyses exclude such sounds from URs, which implies MSCs (even if they are not overtly stated). Rules then introduce restricted sounds only in certain positions. Another role of MSCs in analysing neutralisations is that they allow rules to be simpler. Since rules often encode instructions for input-dependent changes, the more restricted the inputs, the broader the rules can be.
By contrast, in OT, many common distributions do not require MSCs, as I show next. It is positional neutralisation that presents the analytic problem, and only when the affected contrasts are gapped.
2.1. Distributions that do not require MSCs
Consider the textbook example of vowel nasalisation in which nasal vowels occur only in assimilation environments but are banned otherwise: [pa], [mã] vs. *[pã], *[ma] (see, e.g., McCarthy Reference McCarthy2002a: 83). In a rule-based account following Chomsky & Halle (Reference Chomsky and Halle1968), nasal vowels might be assumed to be systematically absent (i.e., banned) from URs (/pa, ma/, but not */mã/) and derived by a contextual nasalisation rule. In OT, UR bans are not needed for analysing such cases. Surface distributions are a matter of markedness and faithfulness rankings only. The OT grammar in (4a) derives the surface inventory [pa, mã] from the rich base /pa, ma, pã, mã/. The general schema for complementary distribution is in (4b):
One attraction of this approach is that it uses formally simple constraints: the general constraint legislates feature co-occurrence, and the specific constraint governs bigram sequencing. Another attraction is that both constraints are perceptually grounded and receive robust typological support. A third is that the ranking tells an intuitive story: ‘no nasal vowels, except adjacent to nasal consonants’. In contrast to rule-based treatments that insist on setting up a unique UR, the account in (4) does not have to decide which vowel is the underlying one, sidestepping the often difficult logic that is usually external to the analysis. In a rule-based framework, it is just as easy to set up a rule deriving oral vowels from nasal-only URs, so the decision to posit oral UR vowels usually recruits considerations such as typology.
Pertinent to the concerns of this article, the analysis in (4) also avoids the duplication problem, because it does not need to assume a constraint against nasal vowels in the UR, and it also does not need to sneak in limitations on inputs via representational assumptions (e.g., ‘only consonants can be specified for nasality contrastively in the input’). There is no duplication here; the analysis works and is elegant.
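The logic of this kind of analysis can be illustrated with a minimal sketch in Python. It is not the tableau in (4): the three-way ranking *NVoral ≫ *Ṽ ≫ Ident[nasal] is an assumed reconstruction based on the prose description, and the constraint functions, the toy GEN (which only toggles vowel nasality) and all names are hypothetical.

```python
from itertools import product

ORAL, NASAL = "a", "\u00e3"  # oral [a] and nasal [ã]

def star_nv_oral(inp, out):
    """*NV_oral: one violation per oral vowel right after a nasal consonant."""
    return sum(1 for c, v in zip(out, out[1:]) if c == "m" and v == ORAL)

def star_nasal_v(inp, out):
    """*V_nasal: one violation per nasal vowel, in any context."""
    return out.count(NASAL)

def ident_nasal(inp, out):
    """Ident[nasal]: one violation per segment that differs from the input."""
    return sum(1 for i, o in zip(inp, out) if i != o)

# Assumed ranking: specific markedness >> general markedness >> faithfulness.
RANKING = [star_nv_oral, star_nasal_v, ident_nasal]

def gen(inp):
    """Toy GEN: candidates differ from the input only in vowel nasality."""
    slots = [(ORAL, NASAL) if s in (ORAL, NASAL) else (s,) for s in inp]
    return ["".join(c) for c in product(*slots)]

def optimal(inp):
    """Strict domination is lexicographic comparison of violation profiles."""
    return min(gen(inp), key=lambda out: tuple(c(inp, out) for c in RANKING))

# Rich base: all four logically possible inputs are fed to the grammar.
rich_base = ["pa", "ma", "p" + NASAL, "m" + NASAL]
mapping = {inp: optimal(inp) for inp in rich_base}
surface_inventory = set(mapping.values())
```

Under these assumptions, all four rich-base inputs converge on the two attested surface forms, with no restriction stated on the inputs themselves.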
Positional neutralisation is similarly unproblematic under the rich base assumption, provided the affected contrasts do not have gaps. For example, in Nancowry, stressed syllables contrast oral and nasal versions of [i, e, ɛ, æ, u, ə, a, u, o, ɔ], but unstressed syllables allow only [i, e, ə, a, u] (Radhakrishnan Reference Radhakrishnan1970: 19). McCarthy (Reference McCarthy2002a: 88) posits the following ranking for Nancowry:
Under this ranking, inputs such as /batã/ and /bata/ will map faithfully (assuming final stress), but /bãta/ will neutralise to [batá]. There is no need to restrict the distribution of nasal and oral vowels in the input, or to guess what becomes of the hypothetical nasal vowels in unstressed positions. Since we do see vowels in those positions, and they are always oral, the direction of neutralisation is determined by the system. As we will see in the next section (and also in §5.5), this is not true when the system contains gaps.
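The three mappings just described can be checked with a small sketch. The ranking used here, Ident-σ́[nasal] ≫ *Ṽ ≫ Ident[nasal], is an assumption about the general shape of a positional faithfulness ranking, not a quotation of (5); final stress is assumed as in the text, and the function names and the nasality-only GEN are hypothetical.

```python
from itertools import product

ORAL, NASAL = "a", "\u00e3"  # oral [a] and nasal [ã]
VOWELS = (ORAL, NASAL)

def stressed_index(form):
    """Index of the final vowel (final stress is assumed)."""
    return max(i for i, s in enumerate(form) if s in VOWELS)

def ident_stressed(inp, out):
    """Positional faithfulness: the stressed vowel keeps its input nasality."""
    i = stressed_index(inp)
    return int(inp[i] != out[i])

def star_nasal_v(inp, out):
    """General markedness: one violation per nasal vowel."""
    return out.count(NASAL)

def ident_nasal(inp, out):
    """General faithfulness: one violation per changed segment."""
    return sum(1 for i, o in zip(inp, out) if i != o)

RANKING = [ident_stressed, star_nasal_v, ident_nasal]  # assumed ranking

def gen(inp):
    """Toy GEN: candidates differ from the input only in vowel nasality."""
    slots = [VOWELS if s in VOWELS else (s,) for s in inp]
    return ["".join(c) for c in product(*slots)]

def optimal(inp):
    return min(gen(inp), key=lambda out: tuple(c(inp, out) for c in RANKING))

results = {inp: optimal(inp) for inp in ["bat" + NASAL, "bata", "b" + NASAL + "ta"]}
```

In this sketch, the nasal vowel survives only under stress, and the unstressed nasal vowel neutralises to its oral counterpart; nothing about the inputs has to be stipulated.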
2.2. Positional neutralisation with gaps
The problem arises in cases where a contrast involves a gap that is filled in assimilation environments. A classic case is nasal place assimilation, which often creates allophones confined to assimilation contexts: [m] and [n] have a free distribution except in place-assimilated clusters (*mt, *np, etc.), but [ŋ] occurs only before [k, g]. Languages with this pattern include Standard Italian (Bertinetto & Loporcaro Reference Bertinetto and Loporcaro2005), Lithuanian (Kenstowicz & Kisseberth Reference Kenstowicz and Kisseberth1979: 216) and Turkish (Kornfilt Reference Kornfilt2013: 486). As with the vowel nasalisation example, analysing the basic distribution does not require MSCs, as illustrated in (6).
In the absence of alternations, we may speculate that in non-assimilation environments, underlying /ŋ/ maps to, say, [n], as in (7):
But this analysis is incomplete: it is silent on the direction of assimilation. Why not change the oral consonant’s place instead, as in /pan-ka/ $\rightarrow $ [panta]? Jun (Reference Jun1995) argues that plosives carry place cues better than nasals, since the plosive is in prevocalic position. This perceptual property is encoded in a positional faithfulness constraint Ident-Onset (Beckman Reference Beckman1997, among others). Ident-Onset must outrank *[ŋ] to allow assimilation to create velar nasals – otherwise assimilation would be progressive if and only if the onset is underlyingly velar, as in candidate (8c).
The problem is that this ranking is incompatible with the assumption that /ŋ/ maps to [n] except in assimilation contexts. Once the analysis is augmented with positional faithfulness, onset [ŋ] sneaks back in from the rich base:
The obvious way to save the situation is to plug the gap with more positional markedness, for example, by ranking a constraint against onset [ŋ] above Ident-Onset. On the upside, *[ŋ]/onset might be independently necessary: [ŋ] often has a distribution different from other nasals; English is a familiar example (Chomsky & Halle Reference Chomsky and Halle1968 et seq.). On the downside, this approach does not generalise. Nasal place assimilation often creates segments such as [ɱ] or [ɲ], which have less subtlety to their distribution than [ŋ]. They are absent except in assimilation environments. This analysis would have to use multiple markedness constraints just for the gapped segments in the environment protected by the positional faithfulness constraint, and rank them in an order that mirrors the positional $\gg $ general faithfulness order:
If the argument for the rich base is that it avoids the duplication problem, then cases like this defeat it; surely it is more elegant to have a single MSC against velar nasals at the UR level. One of the selling points of Optimality Theory is that ‘the constraints provided by Universal Grammar are simple and general; interlinguistic differences arise from the permutations of constraint-ranking’ (Prince & Smolensky [Reference Prince and Smolensky1993] 2004: 6). But in this case, constraints on restricted allophones cannot be simple or general. Intuitively, the distribution is simple: [ŋ], etc., are banned except where nasal place assimilation requires them. In this analysis, however, the prohibitions on these segments must be stated twice: both below and above positional faithfulness.
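The duplication can be made concrete with a toy evaluation. The sketch below is an illustration under stated assumptions, not a reproduction of the tableaux in (6)–(10): it uses a three-place inventory, treats any prevocalic consonant as an onset, lets GEN change place only, and uses hypothetical constraint functions and ranking names (BASIC, PATCHED).

```python
from itertools import product

PLACE = {"p": "lab", "m": "lab", "t": "cor", "n": "cor", "k": "vel", "ŋ": "vel"}
NASALS, STOPS, VOWELS = "mnŋ", "ptk", "a"

def prevocalic(form, i):
    """Simplifying assumption: a consonant is an onset iff a vowel follows."""
    return i + 1 < len(form) and form[i + 1] in VOWELS

def nas_place_assim(inp, out):
    """One violation per nasal+stop sequence that disagrees in place."""
    return sum(1 for a, b in zip(out, out[1:])
               if a in NASALS and b in STOPS and PLACE[a] != PLACE[b])

def ident_onset(inp, out):
    """Positional faithfulness: onsets keep their input place."""
    return sum(1 for i, (a, b) in enumerate(zip(inp, out))
               if a != b and prevocalic(out, i))

def star_eng(inp, out):
    """General markedness: *[ŋ]."""
    return out.count("ŋ")

def star_eng_onset(inp, out):
    """Positional markedness: *[ŋ] in onset position."""
    return sum(1 for i, s in enumerate(out) if s == "ŋ" and prevocalic(out, i))

def ident_place(inp, out):
    """General faithfulness to place."""
    return sum(1 for a, b in zip(inp, out) if a != b)

def gen(inp):
    """Toy GEN: each consonant may change place but not manner."""
    slots = [NASALS if s in NASALS else STOPS if s in STOPS else s for s in inp]
    return ["".join(c) for c in product(*slots)]

def optimal(inp, ranking):
    return min(gen(inp), key=lambda out: tuple(c(inp, out) for c in ranking))

BASIC = [nas_place_assim, ident_onset, star_eng, ident_place]
# The patch states the ban on [ŋ] a second time, above positional faithfulness.
PATCHED = [nas_place_assim, star_eng_onset, ident_onset, star_eng, ident_place]

assimilated = optimal("panka", BASIC)     # regressive assimilation
leaked = optimal("ŋa", BASIC)             # rich-base onset /ŋ/ surfaces
patched_onset = optimal("ŋa", PATCHED)    # repaired, but to [m] or [n]?
patched_cluster = optimal("panka", PATCHED)
```

Two points are worth noting under these assumptions. The patched ranking mentions [ŋ] in two separate markedness constraints, and the repair for onset /ŋ/ is underdetermined: the candidates with [m] and [n] tie, so the choice between them is a guess.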
It is a well-known feature of positional faithfulness constraints that they determine positions of static contrast as well as direction of assimilation. This is a feature, not a bug, and trying to solve the problem by getting rid of positional faithfulness is unlikely to work (see §§5.4 and 5.5.3). Nor is it likely that the answer lies in splitting Ident[place] into separate constraints for different manners of articulation, as suggested by a reviewer: NasPlaceAssim, Ident[place]obstruent $\gg $ *[ŋ] $\gg $ Ident[place]nasal. There are various issues that this alternative would have to resolve. The generalisations about the direction of place assimilation in NC clusters hold robustly even when the consonants in the clusters are not faithful to their underlying manner features. For example, in Tswana, onsets determine the place of nasals even when the onsets are unfaithful to their underlying manner: [ɸula] ‘shoots’ $\sim $ [m-pʰula] ‘shoot me’, [rut’a] ‘teaches’ $\sim $ [n-tʰut’a] ‘teach me’ (Gouskova et al. Reference Gouskova, Zsiga and Tlale2011). In Japanese, nasal codas assimilate to following onsets even when the codas are underlyingly non-nasal: /job-/ [job-u] ‘call-pres’ $\sim $ [jon-da] ‘call-past’, cf. [ʃin-u] ‘die-pres’ $\sim $ [ʃin-da] ‘die-past’ (Martin Reference Martin1975). Onsets can even determine the place of a following nasal, as in German [geb-] ‘give’, [tʀɑ:g-ŋ̍] ‘carry’ (Wiese Reference Wiese1996). Positional faithfulness to onsets predicts all of these patterns.
Is there a way to reconcile the analysis of complementary distribution and this positional gap problem? The alternative I advocate is to bring back MSCs. Removing /ŋ/ from the input in an Italian-style grammar removes the need to guess as to its fate in the system, and it also simplifies the analysis of neutralisation. In the analysis in §3, adding MSCs allows for an elegant and unified analysis of Russian and Polish: they have the same input–output grammars (regressive assimilation, final devoicing), even though their inventories differ.
While I will devote effort to motivating MSCs in the case of positional neutralisation with gaps, I will not offer a definitive proposal on whether MSCs should be used in other cases, such as complementary distribution or positional neutralisation without gaps (though see Rasin & Katzir Reference Rasin and Katzir2016). MSCs are justified whenever the analyst has to guess about the fate of unseen segments. But they are not necessary in Nancowry-style positional neutralisation, since the grammar suggests the direction of neutralisation. In §5.5.3, I discuss root–affix positional asymmetries, whose analysis also benefits from entertaining hypothetical inputs of affixes that contain segments seen in roots (a limited rich base). Likewise, MSCs are not needed for complementary distribution, even if they do no harm. Indeed, if we assume that MSCs are drawn from the pool of plausibly universal markedness constraints, they solve a problem that lacks a principled solution in rule-based analyses: how to decide which segments to rule out from URs. Typologically, nasal vowels are marked, justifying both the MSC and the surface markedness constraint *Ṽ. If URs are limited to oral vowels only, /pa, ma/, the ranking *NVoral $\gg $ Ident easily derives the distribution [pa, mã, *pã, *ma]. The difference between MSCs in rule-based theories and this updated approach to MSCs is that the MSCs can be drawn from the set of markedness constraints. If a constraint is an MSC, it bans certain things from the lexicon, without specifying how to remove them. If it is a constraint in the grammar, it interacts with faithfulness.
2.3. Morpheme structure constraints, not rules
Early discussions of MSCs debated whether they should be construed as rules or constraints. Rules supply instructions for removing the offending structure. Constraints simply ban the structure, leaving multiple avenues for removing it. Halle (Reference Halle1959) and Stanley (Reference Stanley1967) disagreed on this point: in Halle’s original formulation, morpheme structure rules were formally similar to regular phonological rules that legislated redundancies (e.g., if [+strident], then [–voiced]). But Halle’s MSCs could also be feature-changing (e.g., if [+strident, +voiced], then become [–voiced]). Stanley proposed to formulate all MSCs as ‘redundancy rules’, which are essentially constraints: they specify what feature combinations and sequences may occur, but not instructions for removing offending structures. Translating this into OT terms, we would say that MSCs are markedness constraints that do not interact with faithfulness, because they hold at a level where no mappings happen.
An alternative to this view is Stratal OT (§5.5), where the role of MSCs is subsumed into the stem level. There, markedness and faithfulness constraints filter the rich base. My approach and this alternative agree on the need to filter the input before neutralisation applies. The disagreement lies in whether the analysis takes a guess about what happens to the illicit segments. My analysis is agnostic of their fate: I would argue that neither the analyst nor the learner has a way of figuring out what happens to /ŋ/ in Standard Italian, or to /ʤ/ in Russian. The evidence for learners not knowing the fate of ROTB inputs is inconsistent behaviour in borrowing, which I attribute to conventional rules (§4). Russian cannot lexicalise a loanword with /ŋ/, but it also does not have a single set of grammatical instructions for removing it. As I show in §5.5, however, the specifics of Russian preclude an internally consistent analysis in Stratal OT. Both the native phonology and the loans involve heterogeneous sets of segments; no one mapping can be justified analytically. There is evidence for a constraint, but no evidence that it lives in a coherent ranking.
3. Plugging the gaps in positional neutralisation
I now turn to Russian. As explained in §3.1, Russian voicing supplied the earliest argument for MSCs and was instrumental in Halle’s framing of the duplication problem. I present my update to the basic OT analysis of Russian in §3.2. §3.3 delves deeper into the distribution of the gapped segments, both to provide an accurate description of contemporary Russian and to clarify which facts the alternatives have to confront.
3.1. Halle’s argument
The duplication problem in Russian voicing phonology was first spotted by Halle in his criticism of the structuralist phonemic level (Halle Reference Halle1959; Anderson Reference Anderson1985). As shown in (11), Russian voicing assimilation affects all obstruents, whether they contrast for voicing (e.g., /k, g/) or not (e.g., /ʧ/). In a structuralist treatment, contrastive distinctions are represented differently from non-contrastive ones. So, Halle points out, the voicing assimilation rule must be stated twice: once at the phonemic level, to cover contrastively voiced phonemes, and then again, to cover non-contrastively voiceless segments such as /ʧ/. A simple generalisation is missed:
Halle’s (Reference Halle1959: 61–63) analysis includes a morpheme structure rule that requires /ʦ, ʧ, x/ to be unspecified for contrastive voicing at the UR level, as well as a phonological rule of voicing assimilation for obstruent clusters. My analysis is similar to Halle’s: I also assume that at the UR level, certain segments are banned (viz., */ʣ, ʤ, ɣ, ʒʒ/), although this claim requires some caveats (see §§3.3.3 and 3.3.4). Two-consonant sequences such as /dz, dʐ/ are allowed in URs.
Halle’s argument for MSCs is strikingly similar to the one advanced by Prince & Smolensky ([Reference Prince and Smolensky1993] 2004) for Richness of the Base (and by extension, against MSCs). MSCs pose a duplication problem in cases where morphemes obey the same restrictions as words. As I will show, though, the popular account of Russian voicing in OT also runs into a duplication problem. The difference lies in where the duplication happens: in the constraint/rule system, or between the rule system and the MSCs.
My analysis is cast in OT for several reasons. First, within rule-based analyses, the need for MSCs is usually taken for granted, and the argument would be preaching to the converted. Not so in OT approaches, which either take Richness of the Base for granted or explicitly argue for it (Davidson et al. Reference Davidson, Jusczyk, Smolensky, Kager, Pater and Zonneveld2004; Jarosz Reference Jarosz2006a, Reference Jarosz, Wicentowski and Kondrak2006b; Tessier & Jesney Reference Tessier and Jesney2014). Second, the OT analysis of Russian voicing is appealing: it has robust typological support and is grounded in the phonetics of contrast. Consequently, it is a rare point of near-universal agreement in OT. And yet this analysis has a problem with facts known since Halle (Reference Halle1959), something that has largely gone unnoticed in the literature on voicing neutralisation (one notable exception is Hall Reference Hall and Radišić2007, discussed in §5.2). One of my goals is to revive the argument, and to contribute some descriptive depth to it. Ultimately, I believe the point about MSCs holds regardless of whether the grammar of mappings is construed in rule- or constraint-based terms.
3.2. A constraint-based analysis of Russian voicing with MSCs
The analysis of voicing neutralisation in Russian follows similar lines in many OT treatments (Lombardi Reference Lombardi1999; Steriade Reference Steriade1999; Padgett Reference Padgett2002; Rubach Reference Rubach2008; Beckman et al. Reference Beckman, Jessen and Ringen2009; Padgett Reference Padgett, Borowsky, Kawahara, Sugahara and Shinya2012). It is uncontroversial that the voicing contrast is limited to presonorant position.Footnote 2 This arises through the interaction of *ObsVoice (defined in (12a)) with a positional and a general Ident (defined in (12b)–(12c)).
The ranking is in (13a); voiced stops contrast in presonorant position, as in (13b), but devoice word-finally, as in (13c).
To enforce assimilation, Agree[voice] must dominate Ident[voice] and *ObsVoice, assuming that each obstruent in a cluster such as [db] violates *ObsVoice (see (15)). Agree is undominated: voicing always assimilates. Ident-Preson[voice] is also undominated: it ensures that the presonorant obstruent determines voicing of the cluster. The ranking derives the right results for sequences of multiple obstruents (/mozg bɨ/ $\rightarrow $ [mozg bɨ] ‘brain irr.’; /vosk bɨ/ $\rightarrow $ [vozg bɨ] ‘wax irr.’; /mozg to/ $\rightarrow $ [mosk to] ‘brain topic’) without embellishments.
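The ranking Agree[voice], Ident-Preson[voice] ≫ *ObsVoice ≫ Ident[voice] can be run over these forms in a short sketch. Everything beyond the ranking itself is an assumption made for illustration: the toy inventory, the function names, a GEN that only toggles obstruent voicing, and the decision to ignore word boundaries (which suits the clitic examples here but would not capture devoicing at the edge of a phonological word before a sonorant-initial enclitic).

```python
from itertools import product

VOICELESS, VOICED = "ptksf", "bdgzv"       # toy obstruent inventory
PAIR = dict(zip(VOICELESS, VOICED)) | dict(zip(VOICED, VOICELESS))
SONORANTS = set("aoeiuɨmnrl")

def segs(form):
    """Drop spaces: word boundaries are ignored in this toy model."""
    return form.replace(" ", "")

def voiced(s):
    return s in VOICED

def agree(inp, out):
    """Agree[voice]: adjacent obstruents must match in voicing."""
    o = segs(out)
    return sum(1 for a, b in zip(o, o[1:])
               if a in PAIR and b in PAIR and voiced(a) != voiced(b))

def ident_preson(inp, out):
    """Ident-Preson[voice]: presonorant obstruents keep input voicing."""
    i, o = segs(inp), segs(out)
    return sum(1 for k in range(len(o) - 1)
               if o[k] in PAIR and o[k + 1] in SONORANTS and i[k] != o[k])

def star_obs_voice(inp, out):
    """*ObsVoice: one violation per voiced obstruent."""
    return sum(1 for s in segs(out) if voiced(s))

def ident_voice(inp, out):
    """Ident[voice]: one violation per obstruent whose voicing changed."""
    return sum(1 for a, b in zip(segs(inp), segs(out)) if a != b)

RANKING = [agree, ident_preson, star_obs_voice, ident_voice]

def gen(inp):
    """Toy GEN: every obstruent may surface voiced or voiceless."""
    slots = [(s, PAIR[s]) if s in PAIR else (s,) for s in inp]
    return ["".join(c) for c in product(*slots)]

def optimal(inp):
    return min(gen(inp), key=lambda out: tuple(c(inp, out) for c in RANKING))

outputs = {inp: optimal(inp) for inp in ["mozg bɨ", "vosk bɨ", "mozg to", "mozg"]}
```

In this sketch the presonorant obstruent fixes the voicing of the whole cluster, and a cluster with no presonorant obstruent devoices throughout, as in the final-devoicing case.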
Where I depart from Lombardi (Reference Lombardi1999), Padgett (Reference Padgett, Borowsky, Kawahara, Sugahara and Shinya2012) and everyone else is in accounting for forms such as [noʤ bɨ]. I argue that inputs with /ʤ/ are disallowed by the morpheme structure constraint in (16), */ʤ/, which prohibits voiced affricates in the lexicon.
As explained in §2.3, the MSC */ʤ/ does not interact with faithfulness or markedness constraints. It might seem intuitive that */ʤ/ is in conflict with Agree[voice], but Agree does not dominate */ʤ/. The affricate [ʤ] is created by assimilation on the surface, where the MSC does not apply. The only way to satisfy */ʤ/ is not to have lexical entries with the proscribed segments (we will see this idea applied in §4). I show below why it cannot be the case that /ʤ/ maps to [ʧ] or to [dʐ] in hypothetical forms such as /ʤip/. The claim is that the grammar does not know how to map such forms to outputs; all it knows is that they are not legal inputs.
In my analysis, there is just one constraint on voiced affricates in the grammar of Russian, and it holds of lexical entries only, where it is fully satisfied. This analysis would need to be augmented by some additional constraints such as */ɣ/ and */ʒʒ/ to be complete; I consider those in §3.3.
The factorial typology of this constraint set predicts the seven systems in Table 1 (cf. Lombardi Reference Lombardi1999, Petrova et al. Reference Petrova, Plapp, Ringen and Szentgyörgyi2006).Footnote 3
MSCs are by nature language-specific, since lexicons are language-specific. But MSCs are not completely typologically inert, despite not being part of any factorial typology. The presence of the MSC will have consequences, restricting the distribution of the segment in question. While Russian and Polish have the same input–output mappings, the presence of the MSC */ʤ/ in Russian means [ʤ] is not freely distributed. Polish lacks an analogous MSC, and so it has a full two-way voicing contrast in stridents (Gussmann Reference Gussmann2007, among others). Limiting the input changes the surface contrast possibilities but does not affect the alternation patterns predicted for voicing.
My analysis includes just one constraint against /ʤ/, but suppose there is a redundant set of markedness constraints identical to MSCs. If so, the output constraint *[ʤ] must be dominated by Agree and Ident-Preson[voice] – otherwise /noʧ bɨ/ could not map to [noʤ bɨ] (as shown in (18)). As the tableau shows, *[ʤ] favours no winners and does no work in the analysis. There is no reason to include it in the analysis if the MSC is assumed – the statement about gaps only needs to be made once.Footnote 4
Crucially, however, the ranking in (18) would yield the wrong result if a rich base were assumed. Agree[voice] and Ident-Preson[voice] must dominate *[ʤ]: voiced affricates are tolerated when the alternative is voicing disagreement or devoicing a presonorant obstruent. This ranking works for the inputs considered so far, but under standard OT assumptions it rules in the wrong forms: input /ʤ/ is predicted to map faithfully in presonorant position, which is incorrect.
The problem is a general one for Richness of the Base. When a structure does not occur on the surface in a language, the usual OT explanation is to say that markedness outranks faithfulness. But in the Russian case, the native faithfulness ranking derives the wrong outcome: the wrong thing is ruled in. If we rejigger the ranking to map /ʤ/ to something else – say, a sequence such as [dʐ], which is how it is pronounced in loanwords (see §4) – then it breaks the account of native assimilation, since the option to decompose the affricate into a CC sequence now needs to be ruled out for native voiced affricates in contexts where they are derived by regressive assimilation.
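The contrast between the two approaches can be sketched in code. This is a toy illustration only: the placement of *[ʤ] directly below Agree[voice] and Ident-Preson[voice], the inventory, the voicing-only GEN, the treatment of word boundaries as invisible, and names such as `satisfies_msc` are all assumptions introduced here, with /ʤip/ used as the hypothetical rich-base input.

```python
from itertools import product

VOICELESS, VOICED = "ptksfʧ", "bdgzvʤ"     # toy inventory with affricates
PAIR = dict(zip(VOICELESS, VOICED)) | dict(zip(VOICED, VOICELESS))
SONORANTS = set("aoeiuɨmnrl")

def segs(form):
    return form.replace(" ", "")           # word boundaries ignored

def voiced(s):
    return s in VOICED

def agree(inp, out):
    o = segs(out)
    return sum(1 for a, b in zip(o, o[1:])
               if a in PAIR and b in PAIR and voiced(a) != voiced(b))

def ident_preson(inp, out):
    i, o = segs(inp), segs(out)
    return sum(1 for k in range(len(o) - 1)
               if o[k] in PAIR and o[k + 1] in SONORANTS and i[k] != o[k])

def star_dzh(inp, out):
    """Output markedness *[ʤ], dominated by Agree and Ident-Preson."""
    return segs(out).count("ʤ")

def star_obs_voice(inp, out):
    return sum(1 for s in segs(out) if voiced(s))

def ident_voice(inp, out):
    return sum(1 for a, b in zip(segs(inp), segs(out)) if a != b)

RANKING = [agree, ident_preson, star_dzh, star_obs_voice, ident_voice]

def gen(inp):
    slots = [(s, PAIR[s]) if s in PAIR else (s,) for s in inp]
    return ["".join(c) for c in product(*slots)]

def optimal(inp):
    return min(gen(inp), key=lambda out: tuple(c(inp, out) for c in RANKING))

def satisfies_msc(ur):
    """MSC */ʤ/: a static check on lexical entries; it maps nothing."""
    return "ʤ" not in ur

derived = optimal("noʧ bɨ")        # assimilation creates the voiced affricate
rich_base_leak = optimal("ʤip")    # rich-base /ʤ/ surfaces faithfully
lexicon = [ur for ur in ["noʧ", "ʤip"] if satisfies_msc(ur)]
```

Under these assumptions, the same ranking that correctly derives the assimilated affricate also lets a rich-base /ʤ/ through in presonorant position, whereas the MSC simply keeps such an entry out of the lexicon without specifying any repair.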
The usual avenues for saving a theory are to question the facts (e.g., by questioning the phonetic reality of assimilation) or to modify the theory by complicating it (e.g., by adding elaborate constraints, or entire derivational levels as in Stratal OT). The first strategy does not seem promising. All sources that discuss the assimilation pattern in sufficient detail agree on the facts (Halle Reference Halle1959; Comrie et al. Reference Comrie, Stone and Polinsky1996; Garde Reference Garde1998, among others): Russian has non-contrastively voiced allophones in assimilation contexts (see also Figures 1–4). This contrasts with other claims made about Russian voicing, such as sonorant transparency, which has engendered far more controversy (see Kulikov Reference Kulikov2013 for a review). The second strategy of complicating the theory comes in many flavours, as I discuss in the section on alternatives in §5. The problem with these, I argue, is that they either introduce the duplication problem or do not succeed on OT terms. Before discussing alternatives, I consider the full complexity of the Russian facts: the other inventory gaps and loanword adaptation.
3.3. Russian voicing assimilation in detail
The consonant inventory of Russian is in Table 2. (Orthographic representations are given in angle brackets where Russian/Slavicist readers might find them useful.) There is a [ $\pm $ back] (palatalisation) contrast for most consonants (Rubach Reference Rubach2000; Padgett Reference Padgett2003). There is also a voicing contrast for most obstruents.Footnote 5 The strident and velar series have gaps in voicing and backness (in the shaded regions of the table). Both affricates are voiceless, and they lack same-place backness counterparts: [tsˠ, tʃʲ] but not *[tsʲ, ʈʂˠ]. The consonant [ʃʃʲ] lacks a robust voiced counterpart; [ʒʒʲ] is at best marginal (see §3.3.5). Finally, there are no retroflex/velarised affricates, *[ʈʂ ɖʐ] (this is unlike Polish, whose strident affricate and fricative contrasts have no gaps in place or voicing; see Gussmann Reference Gussmann2007; Padgett & Żygis Reference Padgett and Żygis2007; Żygis & Padgett Reference Żygis and Padgett2010).
Table 2 includes transcription details that are often omitted, and I suppress them in the remainder of the article (thus I follow the convention of marking palatalisation but not velarisation, and only where there are robust contrasts; see Padgett Reference Padgett2003; Iskarous & Kavitskaya Reference Iskarous and Kavitskaya2018; Gouskova & Stanton Reference Gouskova and Stanton2021).Footnote 6 I transcribe vowels phonemically, except for [i] and its back allophone, [ɨ], which occurs after velarised consonants. (I ignore the analogous retraction of [e], [kˠófʲe] ‘coffee’ vs. [kˠafˠé̠] ‘cafe’.)
The paradigm in (20) shows assimilation patterns for a fuller range of consonants. The unpaired consonants appear in the last four sets of forms, in (20e)–(20h).
3.3.1. An aside on morphology and prosodic phrasing
The examples in (20) show voicing assimilation to clitics. These clitics have a syntactically determined distribution, occurring roughly in second position of the clause (attached to the topicalised or focused constituent). Clitics may occur later in the clause, too, and they can be multiply instantiated. They are not selective as to the category of their hosts, occurring on nouns, verbs, adjectives and some prepositions (Gouskova Reference Gouskova2019).
Clitics such as [bɨ] and [ʐe] provide the most abundant examples of assimilation due to their wide distribution, but voicing assimilation also applies in a variety of other contexts, as shown in (21). Assimilation can occur morpheme-internally (with vowel deletion), at prefix and suffix boundaries, and in truncated compounds.Footnote 7
Assimilation can optionally apply at phonological word (PWd) boundaries, as well, as in /vraʧ bɨl/ [vraʤ bɨl] ‘doctor was’, shown in the spectrogram in Figure 1 (see §3.3.2).
The position for devoicing is word-final, not coda (as is sometimes erroneously claimed). Obstruents contrast in medial codas if a sonorant follows: witness [bm] in (22a), which must be heterosyllabic as it cannot start a word. Presonorant obstruents also contrast for voicing in CR codas, such as the [gr#] in (22b). By contrast, obstruents in PWd-final position devoice even before a sonorant-initial enclitic, as in (22c). Enclitics are external to the PWd unless apocope applies, in which case they undergo devoicing, as in (22d); see Gouskova (Reference Gouskova2019) for details and analysis.
Steriade (Reference Steriade1999) argues convincingly that these patterns motivate presonorant faithfulness rather than onset faithfulness/licensing (contra Lombardi Reference Lombardi1995). She suggests that presonorant faithfulness is grounded in perception, since following sonorants allow the maximum expression for voicing cues.
3.3.2. Phonetics
Before considering the individual consonants in more detail, it is important to establish that the assimilation is phonetically real – if assimilation is incomplete, the patterns in (20) could be dismissed as weak phonetic effects (see Padgett Reference Padgett, Borowsky, Kawahara, Sugahara and Shinya2012). It is well-known that word-final devoicing can be incomplete (Warner et al. Reference Warner, Good, Jongman and Sereno2006; Dmitrieva et al. Reference Dmitrieva, Jongman and Sereno2010; Roettger et al. Reference Roettger, Winter, Grawunder, Kirby and Grice2014). Burton & Robblee (Reference Burton and Robblee1997) report that assimilation, too, is sometimes incomplete in Russian, and that it depends on manner. They found that fricatives were less likely than stops to assimilate completely: ‘There was less voicing in /sd/ than in /zd/ and more voicing in /zt/ than in /st/’ (Burton & Robblee Reference Burton and Robblee1997: 109). But their study did not include affricates. It is therefore important to verify that /x, ʃʃ, ʦ, ʧ/ undergo the rule: for this pattern to merit the status of a theoretical problem, Russian speakers must encounter [ɣ, ʒʒ, ʣ, ʤ] in natural speech.
Studying voicing neutralisation in the lab is methodologically difficult: it is affected by proficiency in English, orthographic presentation of materials and pragmatics. The best source of evidence, therefore, is fluent speech produced outside the lab.Footnote 8 If voiced segments such as [ɣ, ʤ] occur in assimilation contexts in such speech, then Russian speakers must encounter them, and they must be able to derive them in their grammars. I looked for examples of assimilation-derived voiced allophones in the multimedia section of the Russian National Corpus (https://ruscorpora.ru/new/search-murco.html). Figure 1 shows [ʤb].Footnote 9 An example of the speaker’s voiceless [s] appears later in the same utterance; note the contrast between the amount of voicing in [ʤ] and its absence in [s].
Similar examples (from different speakers) in Figures 2–4 show [ʣ], [ɣ], and [ʒʒ].Footnote 10 Note the presence of voicing pulses in the labeled consonant portions, and compare them to voiceless segments in nearby words. These allophones are voiced, though in the affricates, the voicing is slightly less pronounced (Żygis et al. Reference Żygis, Fuchs and Koenig2012 discuss the reasons).
Recall that Halle’s original argument concerned not just /ʧ/ but also other non-contrastively voiceless consonants, /ʦ/ and /x/ (and it could have included /ʃʃ/ as well). In the next sections, I consider the voiced allophones of these consonants [ʣ, ɣ, ʒʒ], and the additional complications they present. The purpose of considering [ʣ, ɣ, ʒʒ] is twofold. First, the facts are historically important in phonology, but often mischaracterised. Second, some details of the individual cases make a unified account difficult, which presents a challenge for alternatives (§5).
3.3.3. The sequence [dz]
I start with [dz]. One would expect it to parallel [ʤ], but the distributions differ: [dz] does occur outside the assimilation context at morpheme boundaries, in loanwords, and in onomatopoeia. Crucially, [dz] is often ambiguous between an affricate and a CC cluster (see (23)). One telling piece of evidence that [dz] is a CC cluster is that its parts can disagree in palatalisation (see (23c-i)). By contrast, tautomorphemic [ʦ] is velarised throughout. Mismatched [tˠ-sʲ] sequences occur only at morpheme boundaries and are clearly CC clusters (e.g., [otˠ-sʲel] ‘sat away from’).
In my analysis, the affricate /ʣ/ violates the MSC in (16), just as /ʤ/ does. But there is little harm, or evidence either way, in assuming that /ʣ/ can just map faithfully – albeit in very few morphemes. Surface sequences much like [ʣ] occur in Russian, and the phonology provides few arguments for affricates: even the arguments for [ʧ, ʦ] are delicate, since Russian is phonotactically very permissive (Halle Reference Halle1959; Gouskova & Stanton Reference Gouskova and Stanton2021).Footnote 11 Since /ʣ/ has an ambiguous status, it offers few insights into MSCs.
3.3.4. The voiced velar fricative in religious exceptions
The status of [ɣ] is complicated for a different reason. Just like [ʤ], [ɣ] occurs as an allophone derived by regressive assimilation: /mox bɨ/ $\rightarrow $ [mo\uline{ɣ} bɨ] ‘moss irr.’. But in many dialects (e.g., southern ones), [ɣ] occurs where Moscow Russian has [g]. The fricative variant is stigmatised (Avanesov Reference Avanesov1984: 111–112). But some prescriptive Moscow speakers have [ɣ] in religious contexts such as those in (24), with [x] as its devoiced allophone (24c):
For many Moscow Russian speakers (including me), [ɣ] occurs only in voicing assimilation contexts. For such speakers, the distribution of [ɣ] presents the same challenge to the standard OT account of voicing neutralisation as that of [ʤ]. Under my analysis, these speakers’ grammars would include an MSC */ɣ/, following the same reasoning as for /ʤ/.
Speakers with religious [ɣ] are more challenging: is theirs a limited contrast, or limited lenition? Under the first possibility, /ɣ/ and /g/ contrast, but only religious words have /ɣ/. This analysis could be criticised, because the contrast is confounded with a stylistic difference. As long as the analyst is not bothered by missing the stylistic generalisation, there is a simple ROTB analysis: Ident[continuant] dominates surface markedness constraints against [g] and [ɣ], and velars shed no light on the MSC */ɣ/.
A more interesting (and baroque) possibility is that (24) results from variable lenition of /g/, confined to a religious sublexicon. This would capture the religious generalisation, but the analysis cannot be integrated with the ranking in (17). The problem is similar to the one presented by /ʤ/ in a rich-base account (recall (19)). Analysing regular neutralisation of /g, ɣ/ to [g] requires the ranking *[ɣ] $\gg $ Ident[cont]. But this ranking incorrectly allows /x/ to become [g] in voicing assimilation contexts, too. Let us unpack this.
One argument for the lenition analysis is that it simplifies the treatment of (24). Such an analysis does not have to decide whether those forms have lexically specific lenition of /g/,Footnote 12 or constitute a small club of morphemes that have underlying /ɣ/:
This is formalised in (26), using lexically indexed constraints (Pater Reference Pater and Parker2008, among others). Key here is *[g ], violated by instances of [g] in religious words. For these words, *[g ] triggers lenition to [ɣ]. Crucially, *[ɣ] must also outrank Ident[cont] to force hypothetical inputs like /roɣa/ to harden to [g], as in (26b-ii).
Perhaps it is apparent why this is incompatible with the analysis in §3.2. If avoidance of [ɣ] in the general lexicon can compel violations of Ident[cont], this should allow hardening in voicing assimilation contexts: /mox bɨ/ $\rightarrow $ *[mog bɨ]. The actual output [moɣ bɨ] ‘moss irr.’ violates *[ɣ], and it cannot win if the ranking of Ident[cont] and *[ɣ] is as shown in (26). We have arrived at a ranking contradiction: *[ɣ] and Ident[cont] cannot be ranked the same in (26) and (27e).
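The contradiction can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the violation profiles are simplified assumptions reconstructed from the prose (they are not the tableaux in (26) and (27)), and the names `TABLEAUX`, `optimal` and `consistent_rankings` are hypothetical. It enumerates every total ranking of four constraints and asks whether any of them selects both intended winners.

```python
from itertools import permutations

CONSTRAINTS = ["Agree[voice]", "*[ɣ]", "Ident[cont]", "Ident[voice]"]

# Assumed, simplified violation profiles (illustration only).
TABLEAUX = {
    "/roɣa/": {
        "winner": "[roga]",  # hardening in the general lexicon
        "candidates": {
            "[roɣa]": {"*[ɣ]": 1},
            "[roga]": {"Ident[cont]": 1},
        },
    },
    "/mox bɨ/": {
        "winner": "[moɣ bɨ]",  # voicing assimilation
        "candidates": {
            "[mox bɨ]": {"Agree[voice]": 1},
            "[moɣ bɨ]": {"*[ɣ]": 1, "Ident[voice]": 1},
            "[mog bɨ]": {"Ident[cont]": 1, "Ident[voice]": 1},
        },
    },
}

def optimal(candidates, ranking):
    """Return the candidate whose violation vector is lexicographically least."""
    def profile(cand):
        return tuple(candidates[cand].get(c, 0) for c in ranking)
    return min(candidates, key=profile)

def consistent_rankings(constraints, tableaux):
    """List every total ranking that picks the intended winner in all tableaux."""
    return [
        ranking
        for ranking in permutations(constraints)
        if all(optimal(t["candidates"], ranking) == t["winner"]
               for t in tableaux.values())
    ]

both = consistent_rankings(CONSTRAINTS, TABLEAUX)
print(len(both))  # no ranking satisfies both mappings at once
```

Under these assumed profiles, each mapping is derivable on its own, but hardening needs *[ɣ] above Ident[cont] and assimilation needs the reverse, so the joint search comes back empty.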
Thus, whether the MSC */ɣ/ is needed depends on the dialect under discussion, and on the assumptions about how [ɣ] is derived in non-assimilating contexts: if analysed as a contrast, there is no need for the MSC, but there is no explanation for why the contrast is so marginal. If we analyse it as lexically specific lenition, the rich-base analysis does not work.
By contrast, my account can accommodate religious exceptions in two ways. The first is admitting /ɣ/ URs as exceptions. This entails enriching the theory of MSCs to allow lists of diacritic exceptions:
This move might be independently necessary: languages sometimes violate existing MSCs by borrowing foreign segments (e.g., voiced stops in Quechua; see Gouskova Reference Gouskova, Ackema, Bendjaballah, Bonet and Fábregas2023 for a review). Another option is to treat */ɣ/ as inviolable, listing ‘God’ as /bog/ , and implement lenition using the ranking in (27). The MSC analysis does not encounter a ranking paradox because inputs such as /roɣ-a/ are not entertained, and those inputs are the only reason to rank *[ɣ] above Ident[cont].Footnote 13 Thus, the MSC theory offers multiple accounts for the full range of facts.
3.3.5. The voiced counterpart of [ʃʃ]
The last gap is [ʒʒ], the voiced counterpart of [ʃʃ] (which Halle Reference Halle1959 does not mention). Both sounds historically occurred in derived environments; [ʒʒ] has become marginal, while [ʃʃ] is now standard. There is some confusion in the Western literature on the status of [ʃʃ], due to an obsolete dialect split between St. Petersburg and Moscow (Comrie et al. Reference Comrie, Stone and Polinsky1996: 33–35). St. Petersburg retained the [ʂtʃ] or [stʃ] pronunciation for some time, and it was apparently common among post-1917 émigrés. This might be why certain Western sources, such as Garde (Reference Garde1998), do not include [ʃʃ] in the consonant inventory. Garde (Reference Garde1998, §85) characterises [ʃʃ] as a variant of [sʧ]. But Comrie et al. note that by the 1970s, the [ʃʃ] pronunciation was the only option acceptable to Russian speakers, as documented in contemporary studies by Russian linguists.
The examples in (29) show that [ʃʃ] occurs in a variety of environments. It still alternates with clusters, but it is also found in many etymologically opaque words such as (29c)–(29f). Moreover, [ʃʃ] occurs obligatorily in many morphemes in contemporary Russian, whereas [ʒʒ] is both rare and variable – there is always an option to pronounce [ʒʒ] as something else, as shown in (30).
By contrast, [ʒʒ] mainly occurs as an allophone in assimilation – but some Moscow speakers also have it in morpho-phonologically derived environments (see (30)). There are very few, if any, morphemes with instances of [ʒʒ] that could be argued to be underlying. The best candidate is [droʒʒ-i] $\sim \,$ [droʐʐ-ɨ] ‘yeast’, historically derived from /ʐd/ but synchronically opaque. The [ʒʒ] $\sim $ [ʐʐ] alternation is not well described. It seems to apply in intervocalic position and only stem-finally, conditioned by select suffixes (thus plural /-i/ conditions it, but nominaliser /-izm/ does not, as in [voʐd-izm] ‘leaderism’ *[voʒʒizm], and root-internal [ʐd] in [iʐdiv-en-eʦ] ‘dependent’ cannot become [ʒʒ]).
The eulogies for [ʒʒ] (Avanesov Reference Avanesov1984; Comrie et al. Reference Comrie, Stone and Polinsky1996: 35–36; Padgett & Żygis Reference Padgett and Żygis2007) appear to be premature, just like those for religious [ɣ]. I found several contemporary hits in the RNC.Footnote 14 And, just as many speakers lack religious [ɣ], many speakers also lack morphologically derived [ʒʒ]. For grammars that only allow [ʒʒ] in assimilation contexts, the argument for the MSC */ʒʒ/ is parallel to that for */ʤ/. For grammars that allow [ʒʒ] in morphologically derived environments, the analysis is complicated by the well-known issues raised by such phenomena (Kiparsky Reference Kiparsky1985; Łubowicz Reference Łubowicz2002; Wolf Reference Wolf2008). But banning /ʒʒ/ from the lexicon via an MSC removes certain challenges in analysing its surface distribution. It has long been known that Richness of the Base complicates the analysis of segments that only occur in derived environments (see, e.g., Wolf Reference Wolf2007, §6), so proscribing them in the input simplifies the account. For either group of speakers, the restricted distribution of [ʒʒ] follows from it being absent from lexical representations.
3.4. Local summary
Four Russian obstruents can be argued to lack a systematic voicing contrast: [ʧ, ʦ, x, ʃʃ]. All four have voiced allophones in regressive assimilation. But in other contexts, the consonants vary. On the surface, [dz] does occur outside the assimilation environment, although it might be analysed as a CC sequence there. As for [ɣ] and [ʒʒ], they definitely occur in ambient speech, but their variable presence and restricted stylistic/morpho-phonological distribution complicate their analysis. By contrast, [ʤ] alone is found only in regressive assimilation contexts; it is this allophone, therefore, that presents the clearest argument for an MSC. Its special status is also supported by loanword phonology, as I show next.
4. MSCs in loanword phonology
4.1. Borrowing data
The preceding discussion argued for MSCs on analytic grounds: an insightful analysis of Russian voicing neutralisation must rule out gaps at the UR level. MSCs allow for a simple, cross-linguistically valid analysis of voicing neutralisation in the grammar of input–output mappings. Unlike some OT alternatives in §5, the MSC account uses formally simple constraints that can be motivated substantively.
In this section, I consider the role of MSCs in loanword adaptation, with a somewhat narrow focus on Russian. The broader question is what Richness of the Base and MSCs predict for the handling of segments that a language lacks entirely. The usual OT explanation is that inventories are determined by markedness and faithfulness rankings; if a segment is missing, it violates an undominated markedness constraint. But which faithfulness constraint is violated in the mapping from a rich input to the output? This question rarely receives a clear answer.
In the case of Russian, we will see that foreign [ʤ] is adapted as though the phonology cannot even represent it as a single sound, and the handling of loan [ʤ] leaves few avenues for saving an analysis of voicing neutralisation that does not rely on MSCs.
A striking feature of Russian loanword adaptation is that some segments are borrowed as consonant clusters. For example, Russian lacks a velar nasal [ŋ], even in place assimilation contexts, where many languages require it. Thus, in borrowings from English and German, [ŋ] maps to [ng] or [nk], as in the examples in (31a). But Chinese and Korean loans follow a different convention: source [ŋ] maps to [n], as in (31b).
Borrowing [ŋ] as [nk] could be orthographically motivated: German and English lack a single letter to write [ŋ], so Russian speakers render the orthographic cluster as a sequence of two sounds. If they were basing their pronunciations on perceptual input, they might be expected to render [ŋ] as [nʲ]. I suggest an explanation for this differential adoption pattern in §4.3.
But orthographic borrowing cannot explain what happens to [ʤ].Footnote 15 Russian borrows [ʤ] from a variety of languages, including ones where it has no single consistent spelling (English) or where orthography is unlikely to have been the main mode of contact (Turkic, Arabic via Persian). Almost without exception, [ʤ] is borrowed as [dʐ]:
Word-finally, the sequence becomes [tʂ], not [ʧ]. This is systematic and does not depend on the source language:
By contrast, voiceless [ʧ] and [ʦ] are borrowed as affricates:
There are comparatively few borrowings with [dz], and the Russian string’s status is unclear (recall §3.3.3). One source is Polish, whose [ʣ, ʥ] are borrowed as [dz] and [dzʲ], respectively. Another source is Japanese, whose affricated /d/ before [i] is borrowed as [dzʲ], with palatalisation disagreement. It is unclear whether this pattern is guided by perceptual similarity or convention (see Kang Reference Kang, Oostendorp, Ewen, Hume and Rice2011 for more).
While [ʤ] is usually borrowed as [dʐ], there are exceptions (see (36a)). Some of these are very recent (‘gender’, ‘digitiser’). Some older borrowings from English have [ʐ]: [ʐokej] ‘jockey’, [ʐuri] ‘jury’, [piʐama] ‘pyjamas’, [sufraʐɨzm] ‘suffragism’.Footnote 16 Vasmer (Reference Vasmer1958) speculates that [ʐokej] was borrowed from English via French, which would explain the [ʐ], but it is not clear that this holds for the other examples. This kind of inconsistency is not unusual; recall [ŋ] in (31). Consider also [h], which is [g] in older borrowings but [x] in contemporary ones (see (36b)). The glide [w] is borrowed as [v] or [u] in contemporary Russian, sometimes in the same word (see (36c)). In places where English orthography is under-informative, for example, as to the voicing of [s, z], Russian sometimes borrows [s] as [z] (see (36d)). This would be surprising if borrowing happened via perception, since Russian has both [s] and [z], and speakers should be able to distinguish voicing contrasts. I think a better explanation is that borrowing is agrammatical, and I analyse it as such in the next section.
Another argument against a perceptual account is that English, the source of the vast majority of recent borrowings into Russian, does not have true voicing in [ʤ]. The English contrast is one of aspiration. If Russian speakers were using their perception, as opposed to orthographic and metalinguistic conventions, we would expect them to (occasionally) borrow [ʤ] as [ʧ], and they do not.
The main significance of the borrowing facts is that [ʤ] is not borrowed as an affricate, even in cases where it could be devoiced to a native sound (in words like image). Most of the time, it is decomposed into a stop and a fricative, and sometimes it is borrowed as other sounds. The most troubling aspect of this pattern for an OT account is that the ranking suggested by the native voicing alternations cannot be reconciled with loanword adaptation, as I explain in §4.3. I argue instead that [ʤ] is mapped to /dʐ/ at the point of lexicalisation, by a conventional mapping rule. This rule exists to enforce the MSC, but it is not part of the grammar of voicing neutralisation.
4.2. Evidence for analysing [dʐ] as a CC sequence
The argument that loanwords are a problem for Richness of the Base needs some evidence that [dʐ] is indeed a CC cluster. This is not a foregone conclusion: the only Western study that addresses [ʤ] loans into Russian, Benson (Reference Benson1959), characterises [dʐ] as an affricate. This is, in my opinion, incorrect.
First, consider the place of articulation change in borrowing [ʤ]. If viewed as a phonological mapping, the treatment of place of articulation is inconsistent and puzzling. Russian systematically maps [ʃ] to a retroflex [ʂ] when borrowing from English, German, French and other European languages. When borrowing from Japanese, the sound variably transcribed as [ʃ] and [ɕ] maps to Russian [sʲ]: [xirosʲima] ‘Hiroshima’, [xonsʲu] ‘Honshu’. When Russian borrows English [ʤ], it renders the affricate as a sequence of dental and retroflex articulations. By contrast, Russian borrows [ʧ] without major alteration (it is phonologically palatalised, but the minor place distinction is probably too subtle to detect in the acoustics; see Jongman et al. Reference Jongman, Wayland and Wong2000; Żygis Reference Żygis2003). It is certainly not [tʂ] or [ʈʂ] in loanwords. The simplest explanation for the fate of [ʤ] is that it is conventionally mapped to stand-alone sounds available in Russian, [d] and [ʐ]. The heterorganicity of this sequence is therefore one of the best arguments for its analysis as a CC cluster.
Next are some language-internal arguments for analysing [dʐ] as two segments, following Trubetzkoy (Reference Trubetzkoy1939). First is the diagnostic of phonotactics. Phonotactic arguments work well in languages like Fijian, in which words cannot start with CC sequences, but they may start with [mb] and [nr]; the analysis of the phonotactics is simpler if these sequences are prenasalised consonants rather than CC clusters. In Russian, by contrast, phonotactic patterns are rather permissive: words can start with many different sequences; even [ʧ] and [ʦ] cannot be distinguished from stop-fricative sequences on this basis (Gouskova & Stanton Reference Gouskova and Stanton2021). Thus, phonotactics is of little help.
But phonotactic permissiveness no doubt facilitates the borrowing of sequences that do not occur in native morphemes. This is relevant to another of Trubetzkoy’s (Reference Trubetzkoy1939) diagnostics: affricates are supposed to be freely distributed within morphemes, while CC clusters might be more likely to occur at morpheme boundaries only. In native Russian words, [dʐ] occurs across morpheme boundaries (e.g., [pod-ʐog] ‘arson’), but not morpheme-internally. The native portion of the Russian lexicon suggests that [dʐ] must be two consonants. Benson (Reference Benson1959), by contrast, speculates that occurrence at morpheme boundaries facilitates borrowing of [dʐ] as an affricate. If this were a legitimate pathway towards borrowing non-native sounds, there would be no debate in English phonology about the status of [ts], given words like ou\uline{t-s}ide and ca\uline{t-s}. As it is, most analyses of English treat [ts] as a CC sequence partly because it almost never occurs morpheme-internally (see Gouskova & Stanton Reference Gouskova and Stanton2021). Thus, morphological distribution diagnoses [dʐ] as a CC sequence.
The third Trubetzkoyan diagnostic is phonetic duration: CC sequences should be longer than single Cs (affricates). Brooks (Reference Brooks1964) shows that duration correlates with the trzy/czy distinction in Polish; the CC parts of [tʂɨ] ‘three’ are longer than the parts of the affricate in the question particle [ʈʂɨ] (though in Polish, the difference mostly affects the fricated portion, not closure). This diagnostic is problematic when applied cross-linguistically (Arvaniti Reference Arvaniti2007; Stanton Reference Stanton2017; Gouskova & Stanton Reference Gouskova and Stanton2021), but for what it is worth, it goes in the same direction in Russian as in Polish: the affricates are shorter than stop-fricative clusters. Figure 5 shows loanword [dʐ] and native heteromorphemic [d-ʐ], from recordings of the same speaker (the actor Oleg Tabakov).Footnote 17 The [d] in [dʐ] is fairly long.
Figure 6 shows this same speaker’s intervocalic [ʧ], which has the very short closure that appears to be typical in Russian (recall also Figure 1). One cannot draw conclusions about C vs. CC status from acoustics alone, especially when the sequences differ in voicing and constriction location, but this is still suggestive.Footnote 18
Fourth is a morpho-phonological diagnostic: in Russian, the allomorphy of the diminutive. The allomorph [-ok] tends to not attach to CC-final stems, while the allomorph [-ik] is found on disproportionately many CC-final stems (Gouskova et al. Reference Gouskova, Kasyanenko and Newlin-Łukowicz2015). The [-ik] allomorph occurs on one [dʐ]-final noun, [kotédʐ] ‘cottage’: RNC has 14 instances of [kotédʐ-ɨk] ‘cottage dim’, and none of [kotedʐ-ók]. This is consistent with [dʐ] being a CC sequence, although more systematic study is needed.
The last Trubetzkoyan diagnostic is inventory structure. In Russian, voice contrasts are mostly symmetrical: [b, p], [d, t], etc. Of course, many inventories have gaps – but the Russian system is less typologically odd if only [ʧ, ʦ] are affricates. Under the alternative analysis, the affricate [dʐ] has a relatively free distribution (albeit mostly in loanwords), while its voiceless counterpart, [tʂ], occurs only as a devoiced allophone (also in loanwords, such as [imitʂ] ‘image’), and at morpheme boundaries (in native words such as [ot-ʂelʲnik] ‘hermit’). That is an odd distribution. In contrast to this hypothetical, the analysis of [dʐ] as a cluster treats Russian as a typologically typical gapped system (per Żygis et al. Reference Żygis, Fuchs and Koenig2012).
To conclude, the evidence points to analysing [dʐ] as a CC cluster. When it devoices, the result is also a cluster, [tʂ].
4.3. Is this fission?
Superficially, the loanword pattern seems like a conspiracy: Russian avoids voiced [ʤ], devoicing it in native contexts and fissioning it in non-native morphemes. But this intuitive characterisation does not translate into a neat analysis under Richness of the Base. There are two problems:
Put in formal terms, Integrity (McCarthy & Prince’s Reference McCarthy and Prince1995a anti-fission constraint) has no good ranking position in the standard analysis. If Agree[voice] and *[ʤ] dominate Integrity, we expect fission in assimilation contexts (problem (37a)).Footnote 19 If underlying /ʤ/ devoices except in assimilation contexts, then we expect devoicing in loanwords as well, which is wrong (problem (37b)). Parallel OT encounters another problem, namely, fission in word-final position in loanwords such as [imitʂ] ‘image’ constitutes overkill. Regardless of the ranking of Integrity, devoicing should be enough if /imiʤ/ is the UR: *[imiʧ] changes just one feature, while [imitʂ] changes voicing, number of segments, place and [back]. Whatever is happening in loanwords is not a straightforward extension of the native pattern. This is not rare cross-linguistically, of course (Broselow Reference Broselow2004; Kang Reference Kang, Oostendorp, Ewen, Hume and Rice2011; Simonović Reference Simonović2015).
4.4. Conventional mappings
I argue that loanword [ʤ] must be decomposed before a morpheme enters the lexicon. I adopt Simonović’s (Reference Simonović2015, ch. 6) conventional mappings, which are agrammatical rules, established in the community by convention to map foreign structures to native ones. Community conventions can differ between dialects of the same language even if there are no relevant grammatical differences between the dialects. Simonović discusses Belgian and Netherlandic Dutch, which borrow English [æ] as [ɑ] and [ɛ], respectively (by using different conventional mappings). Russian motivates the following conventional mappings:
There is no direct connection between these mappings and MSCs. It so happens that all MSCs are satisfied, but this is not a necessary feature: a language may borrow a foreign segment and restructure its inventory, in which case MSCs might eventually change. Simonović has many arguments for this view of loanword adaptation, which I will not rehearse here. Conventional mappings do explain several intractable puzzles.
First, this view of loanword lexicalisation straightforwardly explains why loan [iməʤ] does not devoice to *[imiʧ]. In my account, the UR is /imidʐ/. Its mapping to *[imiʧ] is ruled out by Uniformity (the anti-fusion constraint) and Ident[back].
Second, conventional mappings can be mutually inconsistent, as in the differential adoption of [ŋ]. Since [ng] does not result from fission, we do not need to ponder why the segments appear in that order, or why [dorsal] is preserved in English/German borrowings but not Chinese ones. The agrammatical account allows for conventional mappings to arise because of different borrowing channels (with and without exposure to orthography, perhaps). They can also be influenced by extralinguistic factors such as prestige. There is compelling evidence for such influences: in Lev-Ari & Peperkamp’s (Reference Lev-Ari and Peperkamp2014) experiment, French listeners adopt the fake loanword [ʤenna] more faithfully when presented as the name of a prestigious item (Italian ice cream) than when it is a non-prestigious Italian beer.
Conventional mappings illuminate another mystery: the inconsistent handling and lack of fission in English interdentals (see (39)). Mapping [θ] to /t/ and [ð] to /z/ is mutually inconsistent. Zooming out, if [ŋ] and [ʤ] undergo fission, then we expect fission for interdentals – perhaps to /tx/ or /dv/, which preserve [coronal] and [continuant]. Instead, [z] adds [strident], and [t] removes [continuant]. My explanation is that these are conventionalised mappings, which are under no requirement to be consistent with anything. Supporting this, Greek [θ] is borrowed as [f] in religious vocabulary (e.g., [anafema] ‘anathema’). And some English loans are inexplicable exceptions, such as the mapping of [ð] in [golsuor\uline{s}i] ‘Galsworthy’.
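The division of labour argued for here can be sketched as a two-step pipeline: a table lookup at lexicalisation, followed by the ordinary grammar of final devoicing. This is an illustrative sketch only. The function names, the segment-by-segment string representation, and the choice of [ng] over [nk] for English and German are assumptions made for the example, and the table lists only mappings mentioned in the text.

```python
# Conventional mappings: arbitrary, community-specific lookups (assumed layout).
DEFAULT_MAP = {"ʤ": "dʐ"}
BY_SOURCE = {
    "English": {"ŋ": "ng", "θ": "t", "ð": "z"},
    "German": {"ŋ": "ng"},
    "Chinese": {"ŋ": "n"},
    "Korean": {"ŋ": "n"},
}

# Voiced obstruents and their voiceless counterparts (partial, for illustration).
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s", "ʐ": "ʂ", "v": "f"}
VOICELESS = set(DEVOICE.values())

def lexicalise(source_form, source_language):
    """Map foreign segments to native ones before the morpheme enters the lexicon."""
    table = {**DEFAULT_MAP, **BY_SOURCE.get(source_language, {})}
    return "".join(table.get(seg, seg) for seg in source_form)

def final_devoice(ur):
    """Devoice every obstruent in the word-final obstruent cluster."""
    segs = list(ur)
    i = len(segs) - 1
    while i >= 0 and (segs[i] in DEVOICE or segs[i] in VOICELESS):
        segs[i] = DEVOICE.get(segs[i], segs[i])
        i -= 1
    return "".join(segs)

ur = lexicalise("imiʤ", "English")   # /imidʐ/
print(ur, final_devoice(ur))          # the grammar only ever sees the cluster
```

Because the lookup happens before the grammar applies, nothing in the devoicing step needs to mention [ʤ], and inconsistent entries for the same source segment are unproblematic.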
Conventional mappings exist because the language lexicalises morpheme representations in forms that use native segments. Loanwords show that the grammar is not equipped to handle a rich base; there is just no evidence here for a grammar where the regularities are enforced by a consistent ranking of subordinated faithfulness constraints (Broselow Reference Broselow2004).
To be clear, I would not claim that all loanword adaptation works via conventional mappings, either in Russian or in other languages. Our current understanding of loanword adaptation suggests that the nature of contact influences the mechanisms of adaptation. Some loanword adaptation patterns have roots in perception, but it is controversial whether they interact with the grammar (Silverman Reference Silverman1992; Kang Reference Kang2003; Peperkamp et al. Reference Peperkamp, Vendelin and Nakamura2008, among others). There are also numerous examples of adaptation that do not lend themselves to a grammatical explanation (see Kang Reference Kang, Oostendorp, Ewen, Hume and Rice2011; Simonović Reference Simonović2015). Loanword adaptation is likely not one thing but many things.
5. Alternatives
5.1. More on the duplication problem
The duplication problem, as framed both by Halle (Reference Halle1959) and by Prince & Smolensky ([Reference Prince and Smolensky1993] 2004), is a problem of theoretical economy. A theory misses a simple generalisation about a pattern, requiring two separate devices. Halle’s critique rests on the intuition that there should be one treatment for all segments, contrastive or not. He criticises structuralism for needing two voicing-assimilation rules, but his own account also handles non-contrastively voiceless consonants in several places. First, a morpheme structure rule requires /ʦ, ʧ, x/ to lack a voicing specification. Then, a phonological rule gives these consonants redundant features. His voicing assimilation rule is general, but the phonological system is not simple. Of course, as we have just seen, the Russian system is more complex than Halle’s presentation suggests, so a simple analysis is unlikely.
The problem for Optimality-Theoretic analyses is more dire, I think. All of the constraints in the positional faithfulness analysis have been recruited in the analyses of languages other than Russian. They are well-motivated substantively and typologically. Voiced obstruents are aerodynamically difficult, and many languages avoid voicing in stops (Westbury & Keating Reference Westbury and Keating1986, among others). The prohibition on voiced affricates is similarly well-grounded, and the Russian-style gapped inventory, where affricates are voiceless, is typologically common (Żygis et al. Reference Żygis, Fuchs and Koenig2012). So this is an analysis worth saving. It is interesting, therefore, that most OT analyses either cannot handle these facts or encounter a duplication problem. Duplication is a problem for Disalign (§5.2), positional markedness (§5.4) and comparative markedness (§5.3). By contrast, Stratal OT (§5.5) fails to supply an internally consistent account of Russian and makes some odd typological predictions, depending on the specific internal assumptions.
5.2. An alternative: Hall’s Disalign
The shape of the nasal assimilation problem in §2.2 suggests a general solution: ban the gapped segment in the environment where positional faithfulness protects contrasts. This solution is obviously duplicative: assimilation is handled once for all segments, and then again just for the gapped segments. A non-hypothetical example of such an analysis is Hall (Reference Hall and Radišić2007), who identifies the problem presented by voiced affricates in Russian and Czech (the latter only involves /ʦ, ʧ/). Hall’s solution is to augment the analysis with Disalign (40):
Unlike *[ʤ], which is a simple paradigmatic feature co-occurrence constraint, Disalign penalises (i) affricates as sole sponsors of [+voice], or (ii) sequences assimilated for [+voice] and [del rel]: [ʣʤ, pʤ] are bad, [bʤ, ʤb, dʤ, ʤʧ] are good. Hall is himself sceptical, noting that Disalign cannot distinguish the legitimate Czech word [le:ʤba] ‘cure’ from banned *[le:bʤa]. His fix is a directional Disalign-R, which penalises *[bʤ] but not [ʤb].
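One way to make the non-directional version concrete is to treat it as a ban on [+voice] and [del rel] spans that coincide exactly within an obstruent cluster. The sketch below is a reconstruction from the prose description above, not Hall's formal definition; the feature sets and function names are hypothetical, and only the segments used in the examples are listed.

```python
# Assumed feature tables, restricted to the obstruents in the examples.
VOICED = set("bdgzʣʤ")
DEL_REL = set("ʦʧʣʤ")  # affricates

def spans(cluster, feature_set):
    """Return maximal runs of adjacent feature-bearing segments as (start, end)."""
    runs, start = [], None
    for i, seg in enumerate(cluster):
        if seg in feature_set:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(cluster)))
    return runs

def violates_disalign(cluster):
    """True if some [del rel] span has exactly the same edges as a [+voice] span."""
    voice_spans = set(spans(cluster, VOICED))
    return any(span in voice_spans for span in spans(cluster, DEL_REL))

for cluster in ["ʣʤ", "pʤ", "bʤ", "ʤb", "dʤ", "ʤʧ"]:
    print(cluster, violates_disalign(cluster))
```

On this reading, [ʤb] and [bʤ] both pass, which is the non-directional constraint's failure to separate the two Czech forms.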
This style of approach can be criticised for any number of reasons. First, it does not escape the duplication problem for the reasons already explained. Second, the constraint is complex and stipulative; it is unclear why [ʤb] should be preferred to [bʤ] at all, or why either is better than, say, intervocalic [ʤ]. Third, an OT account should be judged on its predicted typology. The basic typology of the constraint set {Agree[voice], Ident, Ident-Preson, *ObsVoice} generates seven attested patterns (recall Table 1). By contrast, adding {Disalign-R, Disalign} predicts 19 systems, including some rather intricate patterns that are, I think, unattested. Two such systems are illustrated in (41) and (42). The first language has regressive assimilation in obstruent clusters in general (e.g., [atpa], [adba]: Agree dominates Ident), and [ʤ] occurs as a singleton (Ident dominates Disalign). But in clusters containing /ʤ/, there is wholesale devoicing. This is the opposite of Russian: [ʤ] is allowed except in assimilation contexts, and the presonorant contrast is conditional on a nearby stop. The second language has word-final devoicing and limited regressive assimilation. It is not triggered by presonorant obstruents except for /ʤ/.Footnote 20
Another odd prediction of Disalign is spread of [del rel] in voice-unassimilated clusters as a way to avoid voicing assimilation. The mapping /paʤta/ $\rightarrow$ [paʤʧa] satisfies Disalign, since [+del rel] is linked to the entire cluster, while [+voice] is linked only to the first segment.
As was shown in §2.2, such solutions do not generalise. A specific Disalign constraint on affricate voicing in certain clusters might suffice for Czech, whose gaps *[ʤ, ʣ] form a natural class, but additional Disalign constraints would be needed for Russian, where the gaps are not a natural class. Russian [x, ʃʃ] lack robust voiced counterparts just like the affricates do – but the entire set of gaps cannot be isolated with one phonological feature, and the set does not make much sense phonetically either, as the gaps are non-contiguous in the articulatory tract (recall Table 2).
Thus, in addition to requiring a specific Disalign constraint for affricates (picked out by their [delayed release] feature), this analysis would have to recruit two additional, even more specific constraints: Disalign[+voice, dor, +cont] to govern the distribution of [ɣ], and Disalign[+voice, −ant, +distributed, +cont] for [ʒʒ]. This is a general problem for a surface-constraint account: no simple constraint will do.
5.3. Comparative markedness
Another alternative in the category of adding constraints was suggested to me by Andrew Lamont: McCarthy’s (Reference McCarthy2002b) Comparative Markedness. In this theory, every markedness constraint is split into two versions: old and new. In Russian, *[ʤ]-old would penalise faithful voiced affricates (e.g., /noʤ-am/ $\rightarrow $ [noʤam]). The new version, *[ʤ]-new, penalises a voiced affricate not present in the fully faithful candidate – such as one derived by voicing assimilation, /noʧ-bɨ/ $\rightarrow $ [noʤbɨ]. The Russian pattern could be analysed in terms of the ranking *[ʤ]-old, Agree[voice] $\gg $ *[ʤ]-new: old/underlying voiced affricates are not allowed, but new/derived ones are.
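The division of labour between the old and new versions of *[ʤ] can be illustrated with a minimal sketch. It assumes, purely for illustration, that a candidate and the fully faithful candidate have the same length and correspond segment by segment, so that a marked segment counts as old if the fully faithful candidate has the same marked segment in the same position. The function name `comparative_violations` is hypothetical.

```python
VOICED_AFFRICATE = "ʤ"

def comparative_violations(faithful, candidate, marked=VOICED_AFFRICATE):
    """Return (old, new) violation counts for a segmental markedness constraint.

    Simplifying assumption: `faithful` (the fully faithful candidate) and
    `candidate` are equal-length strings in position-by-position correspondence.
    """
    if len(faithful) != len(candidate):
        raise ValueError("this sketch only handles one-to-one correspondence")
    old = new = 0
    for ffc_seg, cand_seg in zip(faithful, candidate):
        if cand_seg != marked:
            continue  # no markedness violation at this position
        if ffc_seg == marked:
            old += 1  # violation shared with the fully faithful candidate
        else:
            new += 1  # violation introduced by an unfaithful mapping
    return old, new

if __name__ == "__main__":
    # Faithful realisation of a hypothetical underlying voiced affricate.
    print(comparative_violations("noʤam", "noʤam"))
    # Voiced affricate derived by assimilation from /noʧ-bɨ/.
    print(comparative_violations("noʧbɨ", "noʤbɨ"))
```

On this simplified view, the faithful [noʤam] incurs only an old violation, whereas the assimilated [noʤbɨ] incurs only a new one, which is what allows the two to be ranked separately.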
The question is what happens to these underlying affricates, and how the learner would ever figure this out. The ranking established so far suggests that Ident-[voice] is the crucially dominated faithfulness constraint: underlying /ʤ/ devoices. But, as McCarthy himself points out, rankings of the shape Mark-old $\gg $ Faith $\gg $ Mark-new are not learnable through basic recursive constraint demotion in phonotactic learning (see McCarthy Reference McCarthy2002b: §6.3). The only examples where old markedness transitively outranks new markedness in McCarthy’s catalogue involve counterfeeding opacity – evidence for which must come from morphophonemic alternations, not phonotactics. The nature of the Russian problem is simpler: there are several inventory gaps, but the pattern is transparent, treating all underlyingly voiceless consonants alike.
McCarthy himself anticipates the criticism that Comparative Markedness introduces duplication by splitting every markedness constraint in two. McCarthy counters that, unlike MSCs, old and new markedness constraints can compel and block alternations. But, as we saw in the discussion of loanword adaptation, in Russian, *[ʤ]-old – if it were to exist – does not compel the right kind of alternation, since borrowed affricates undergo fission rather than devoicing. Handling this would require multiple faithfulness constraints in addition to splitting markedness constraints into two, which goes well beyond duplication.
5.4. Positional Markedness
Positional neutralisation can often be analysed either in positional faithfulness or markedness terms. I took it for granted that the positional faithfulness account is right, so here, I show that even the positional markedness alternative does not escape the duplication problem (footnote 21). The obvious alternative to *ObsVoice $\gg $ Ident[voice] is to ban word-final voicing instead:
Voiced affricates could then be banned by a general constraint *[ʤ], which is outranked by Agree, as in (44):
As with nasal place assimilation in (8), we face problems in analysing the direction of assimilation. To get that right, Ident-Preson could be ranked anywhere, as it just breaks the tie between two assimilated cluster candidates (45c,d). But to get the correct results for clusters that include an affricate, Ident-Preson must dominate *[ʤ], and this leads to a ranking contradiction. If the presumed fate of /ʤop/ in (44) is to devoice, then the ranking cannot be as in (45). We would need a constraint against [ʤ] in presonorant position, and with it returns the duplication problem.
A reviewer challenges the assumption that devoicing rather than fission is the right outcome in this analysis (recall that any conclusions about these hypothetical inputs with voiced affricates are guesses, since we never see evidence for them from alternations; footnote 22). If we suppose that underlying /ʤ/ maps to [dʐ] in the grammar in (45), it would have to be because fission is less costly than devoicing – that is, Integrity is the bottom-ranked constraint. But if that were true, then /ʧ/ would split into [dʐ] in assimilation contexts, too (recall (27) and (37)).
To summarise, the analytic role of positional faithfulness is not only to protect contrasts in certain positions but to determine the direction of assimilation. This aspect of the constraint family makes it necessary for complete analyses of assimilation patterns. Whenever these assimilation patterns involve gapped inventories – which is not rare – positional faithfulness needs to be augmented with a constraint, or several constraints, to block the gappy segments from occurring in positions protected by positional faithfulness. I argue that this situation is unsatisfactory. A simpler analysis is to ban the non-contrastive segments from URs across the board, and to understand their absence in various environments as a consequence of their derived status. If theoretical economy is the goal, then MSCs achieve it better than surface-oriented constraints.
5.5. Stratal OT
The last alternatives I consider are cast in Stratal OT (Kiparsky Reference Kiparsky2000; Rubach Reference Rubach2000; Bermúdez-Otero Reference Bermúdez-Otero, Hannahs and Bosch2018), a constraint-based descendant of Lexical Phonology and Morphology (Kiparsky Reference Kiparsky1982; Mohanan Reference Mohanan1982; Kaisse & Shaw Reference Kaisse and Shaw1985). A Stratal OT grammar usually assumes a serial interaction between morphology and phonology. A stem is phonologised in the stem level/stratum. It is then concatenated with affixes, and the result passes through another, possibly different phonological grammar. Clitics and phrasal phonology apply in the postlexical stratum. Stratal OT is argued to be a theory of inventory restrictions: the stem level enforces them.
I discuss three stratal analyses. In the first one, affricates fission into stop–fricative sequences before either assimilation or devoicing applies. The next two options are suggested by Mackenzie (Reference Mackenzie2024): in the first, affricates become voiced fricatives, and in the second, they devoice. I will argue that fission and spirantisation fail to capture the facts I laid out earlier. The predictions of the third analysis are examined in the last subsection, where I evaluate Stratal OT as a general theory of morpheme shape. Depending on the specific assumptions, Stratal OT either makes the wrong predictions for Russian, or has nothing to say about well-documented root–affix asymmetries, which positional faithfulness to roots explains well. Worse still, various published Stratal OT accounts either explicitly or covertly assume MSCs, suggesting the stem level does not suffice as a theory of morpheme shape.
5.5.1. Fission again
Three reviewers suggest that the loanword facts point to the content of the stem stratum. Thus, /ʤ/ fissions to [dʐ] at the stem level (*[ʤ] $\gg $ Integrity), and then at some later level Integrity and Agree are promoted above *[ʤ]. The fission analysis of /ʤ/ cannot be maintained once we zoom out from the posterior affricate to the other gaps, and especially to other loanword facts.
An OT analysis of an unfaithful mapping requires two components. First, unfaithfulness must be driven by a markedness constraint. For /ʤ/ $\rightarrow $ [dʐ], this is *[ʤ] (unproblematically). Second, fission must be the ‘cheapest’ faithfulness violation: Integrity must be ranked below Ident[voice], Ident[cont], Max, etc. Given Freedom of Analysis, Gen will proffer fission candidates for every input, and it is in dealing with other gaps that this analysis is going to encounter difficulties. The intuition behind fission of /ʤ/ is simple enough: [dʐ] preserves the plosiveness of the first half of the affricate and the stridency of the second half, at the cost of being unfaithful to each half’s place of articulation (and backness, since [ʧ] and [ʤ] are palatalised, unlike [dʐ]). But what about the fission option for other segments that Russian lacks or restricts on the surface? Some of these are listed in (46):
The challenge for a fission analysis is finding a consistent ranking of various Ident constraints that would allow fission for /ʤ/ but rule it out for segments that show no evidence of splitting. Thus, the analysis must countenance the option of /ɣ/ mapping to [xg], a sequence that should be phonotactically licit provided it eventually assimilates in voicing. The same goes for /θ, ð/ splitting into various sequences of stops and fricatives, strident or not—again, there is no evidence for this whatsoever. Some internal logical contradictions arise when we compare loan handling of /w/ and /ŋ/: the dorsal component is preserved in /ŋ/ $\rightarrow $ [nk] (which looks like fission) but not in /w/ $\rightarrow $ [v]. Why not [xv], a perfectly good Russian onset cluster? It is indeed these internal logical inconsistencies that led me to reject a phonological account of loanword adaptation; no one ranking simultaneously favours /ð/ $\rightarrow $ [z], /θ/ $\rightarrow $ [t], /ŋ/ $\rightarrow $ [nk], etc.
Thus, the fission analysis encounters a dilemma: the ROTB theorist must either make guesses in the absence of evidence as to what illicit inputs map to, or pick and choose which patterns of loanword adaptation constitute phonology, as opposed to being dismissed as analytic residue. The MSC analysis does not attempt to make sense of the chaos.
5.5.2. Mackenzie: affricates become fricatives
According to Mackenzie’s (Reference Mackenzie2024) analysis, /ʤ/ is fricated to [ʒ] at the stem level (see (47a)). Then, in a later stratum, assimilation creates [ʤ] (see (47b)). Just as in my analysis, all instances of [ʤ] are derived from /ʧ/. But ROTB requires the analyst to identify a way to remove the offending structure, with no evidence of its fate. As I show below, this approach encounters a problem anticipated in §2.3: it is impossible to identify one consistent way to remove all the gaps.
This analysis predicts that /ʣ/ and /ɣ/ should map to [z] and [g], respectively. But the handling of /ʤ/ and /ʒʒ/ is a problem. Mackenzie explains, ‘nothing crucial hinges on the relative ranking of Ident[voice] and Ident[continuant]. Filtering the rich base to the Russian inventory requires input /dʒ/ to map to some output segment present in the language, whether [ʒ] […] or [ʧ], as would be expected if Ident[continuant] outranked Ident[voice]’ (Mackenzie Reference Mackenzie2024: 12). But, as I explained in §3.3, there is no freely distributed phone in Russian that matches /ʤ/ in everything but [continuant]. Russian has the [+back] [ʐ], and the [–back] [ʒʒ] – which is homorganic to [ʤ, ʧ], but which also must be banned at the stem level. Mackenzie does not discuss this, and neither do Stratal OT accounts of backness (Rubach Reference Rubach2000; Blumenfeld Reference Blumenfeld, Browne, Kim, Partee and Rothstein2003). Rubach does not mention [ʃʃ, ʒʒ], while Blumenfeld sneaks in MSCs: his account incorrectly predicts that /ʂ/ should palatalise to [ʃʃ] at the word level (e.g., [mɨʂɨ] ‘mice’ should be *[mɨʃʃi]), so he suggests that [ʃʃ] should be stored as /ʂtʃ/ (Blumenfeld Reference Blumenfeld, Browne, Kim, Partee and Rothstein2003: fn. 13). This requires the MSC */ʃʃ/, and presumably, similar logic extends to */ʒʒ/.
Fixing this account requires pinning down the stem-level phonology more explicitly than existing Stratal OT attempts have done. No Stratal OT analysis of Russian tackles even a partial set of inputs required by Richness of the Base, or attempts to correlate properties of supposed stem-level or word-level affixes with each other (e.g., backness alternations, First Velar Palatalisation, conditioning and undergoing yer deletion, and stress assignment). Critics note that these properties do not cluster together in a way that facilitates a phonological account (Iosad & Morén-Duolljá Reference Iosad and Morén-Duolljá2010; Padgett Reference Padgett, Browne, Cooper, Fisher, Kesici, Predolac and Zec2010; Jurgec Reference Jurgec2016; and beyond Russian, Benua Reference Benua1997, among others). For all the phenomena that were once the purview of Lexical Phonology, there are developed alternatives that allow for better empirical coverage, such as floating features or indexed constraints. Even Stratal OT proponents admit that floating features are needed (Blumenfeld Reference Blumenfeld, Browne, Kim, Partee and Rothstein2003: 8; Bermúdez-Otero Reference Bermúdez-Otero, Hannahs and Bosch2018: 123), so it is not clear what is left for the stem level to do.
5.5.3. Stratal OT as a theory of morpheme structure
The function of the stem level can be challenged from a different angle. Suppose the gapped segments /ʒʒ, ɣ, ʣ, ʤ/ devoice to their voiceless counterparts [ʃʃ, x, ʦ, ʧ], as Mackenzie (Reference Mackenzie2024) suggests in passing. Analytically, this is more viable than fission or a change in [continuant]. The way to distinguish this option from my account, I argue, is by considering the larger implications of using the stem level as a theory of morpheme shape.
Kiparsky has suggested in writing on both Lexical Phonology (Kiparsky Reference Kiparsky1982: 53–54) and Stratal OT that generalisations previously attributed to lexical redundancy rules are enforced at the stem level. Kiparsky (Reference Kiparsky2000: 361–362) writes:
[…T]he view that stems are domains of constraint evaluation is supported by phonological evidence independent of issues of opaque and cyclic constraint interaction. Indeed, the well-documented existence of well-formedness constraints that hold specifically for stems […] is a major problem for parallelism, and constitutes another telling body of evidence for the stratification of phonology that LPM-OT envisages.
A fundamental assumption of LPM is that acquiring the stem-level phonology is tantamount to learning the constraints on lexical (underlying) representations (Kiparsky Reference Kiparsky1982). Though this is conceptually akin to OT’s Lexicon Optimisation and Richness of the Base, it differs in relating the lexicon specifically to the stem level constraint system, which can crucially differ from the word-level and postlexical constraint systems.
What Kiparsky alludes to in the first quote is presumably minimal-size constraints and prosodic shape generalisations, which often hold of stems but rarely, if ever, of affixes (see Gouskova Reference Gouskova, Ackema, Bendjaballah, Bonet and Fábregas2023 for a recent review). To enact this, illicit stems are filtered out at the stem level, which affixes skip. Affixes are added after a stem-only pass of evaluation, and clitics are added later still, at the postlexical level. This is known as level ordering. While abandoned in some versions of the theory, it still figures in recent Stratal OT analyses (Jaker & Kiparsky Reference Jaker and Kiparsky2020). The problem is that level ordering cannot explain another well-documented class of asymmetries between stems and affixes: stems often license a superset of the segments allowed in clitics and affixes, but the reverse does not happen.
I am not the first to point out that Stratal OT predicts affixes and clitics to license segments banned from stems (Benua Reference Benua1997: 87 ff.; Fitzgerald Reference Fitzgerald2002: 267–268; McCarthy Reference McCarthy2007, §3.6.1). Filtering gaps at the stem level in a level-ordering theory predicts that Russian should allow [ɣ, ʤ, ʒʒ, ʣ] in affixes and clitics. This is clearly wrong: these morphemes allow [x, ʧ, ʃʃ, ʦ], but not their voiced counterparts except where they are derived by assimilation. The recognition of this prediction has led some versions of Stratal OT to abandon level ordering (Bermúdez-Otero Reference Bermúdez-Otero, Hannahs and Bosch2018; Staroverov Reference Staroverov2020). The problem is that the alternative does not succeed in capturing the existing typology of stem–affix asymmetries.
The two best-established typological asymmetries are (i) that roots (or stems) can be subject to a size minimum, while affixes are not size-restricted; and (ii) that roots license more segmental contrasts than affixes. The size minimum is analysed successfully in Prosodic Morphology without assuming strata (Selkirk Reference Selkirk1995; McCarthy & Prince Reference McCarthy, Prince and Goldsmith1995b, among others). The inventory asymmetries have been analysed in positional faithfulness terms (Parker & Weber Reference Parker and Weber1996; Beckman Reference Beckman1998; Baković Reference Baković2000; Urbanczyk Reference Urbanczyk2006). Root–affix asymmetries can be static or dynamic. Quechua is a famous example of a static distributional asymmetry: roots contrast ejective, aspirated and plain stops, while affixes have only plain stops (Parker & Weber Reference Parker and Weber1996; Gallagher Reference Gallagher2016, among others). Harar Oromo is an example of a dynamic pattern (Owens Reference Owens1985: 22; Lloret Reference Lloret1997): progressive laryngeal assimilation from root-final consonants to suffix ones (see (48a)–(48c)). Harar also has root-controlled manner assimilation of sonorants (see (48d)–(48i)). Owens’s (Reference Owens1985) grammar suggests a static asymmetry, too: roots contrast ejectives, plain and voiced stops, while affixes only have plain stops, unless derived by assimilation (exception: reduplicative prefixes).
A non-stratal classic OT account attributes both static and dynamic asymmetries to positional faithfulness to roots (McCarthy & Prince Reference McCarthy and Prince1994; Beckman Reference Beckman1997; Urbanczyk Reference Urbanczyk2006). The static asymmetry justifies the ranking Ident-Rt[lar] $\gg $ *[voice], *[cg] $\gg $ Ident[lar]: plain stops in affixes, a three-way contrast in roots. The dynamic asymmetry in direction of assimilation requires Agree[laryngeal], Ident-Rt[lar] $\gg $ Ident-PSon[lar]. This analysis must entertain hypothetical input affixes with ejective and voiced stops – a limited rich base. But, unlike a fully Richness of the Base-compliant OT account of absolute neutralisation between /n/ and /ŋ/ in Italian (§2.2), this analysis predicts rather than guesses the direction of neutralisation: affixes have plain stops. This analysis extends without embellishments to cases of static distributional restrictions, such as Quechua and Navajo (Alderete Reference Alderete2003).
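The static half of this analysis can be checked mechanically with a minimal sketch. It assumes a toy setting in which each input is a single stop specified for one of three laryngeal values and for whether it belongs to a root, and in which the candidate set is just the three laryngeal values. The function names are hypothetical, and the relative order of *[voice] and *[cg], which the ranking above leaves open, is fixed arbitrarily.

```python
LARYNGEAL = ("plain", "voiced", "ejective")  # candidate outputs for a single stop

def ident_rt(inp, out):
    """Ident-Rt[lar]: penalise a laryngeal change in a root segment."""
    lar, in_root = inp
    return int(in_root and out != lar)

def star_voice(inp, out):
    """*[voice]: penalise voiced stops."""
    return int(out == "voiced")

def star_cg(inp, out):
    """*[cg]: penalise ejective (constricted glottis) stops."""
    return int(out == "ejective")

def ident(inp, out):
    """Ident[lar]: penalise any laryngeal change, regardless of morpheme type."""
    return int(out != inp[0])

# Ident-Rt[lar] >> *[voice], *[cg] >> Ident[lar]; the middle two are
# mutually unranked in the text and ordered arbitrarily here.
RANKING = (ident_rt, star_voice, star_cg, ident)

def optimal(inp, ranking=RANKING):
    """Return the candidate whose violation profile is lexicographically
    smallest, which implements strict domination."""
    return min(LARYNGEAL, key=lambda out: tuple(c(inp, out) for c in ranking))

if __name__ == "__main__":
    for lar in LARYNGEAL:
        print(f"/{lar}/  root -> {optimal((lar, True))}   affix -> {optimal((lar, False))}")
```

Under these assumptions, every root input surfaces faithfully, while every affix input surfaces as a plain stop, so the direction of neutralisation follows from the ranking rather than from a stipulation about inputs.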
By contrast, Stratal OT offers no obvious account of static or dynamic root–affix asymmetries. If level ordering is assumed, then it is not clear why affixes ever show a less marked inventory than roots. Depending on the differences in ranking between the stem level and later strata, affixes are predicted to have more contrasts or the same contrasts, but not fewer contrasts. Bermúdez-Otero (Reference Bermúdez-Otero, Hannahs and Bosch2018: 111) suggests this problem is alleviated by requiring affixes to pass through the stem level as separate entities, just as stems do – and as evidence, he observes that in some languages, some affixes ‘behave like miniature stems’. But this cannot cause all affixes to neutralise contrasts that stems preserve. The theory also fails to explain why in languages like Quechua, affixes are demonstrably non-stem-like in their phonotactics (see (49)). The vast majority of roots are templatic (most are CV(C)CV) and respect the phonotactics of words (e.g., they cannot begin in CC). Quechua affixes, on the other hand, can be subminimal (-C) and begin in consonant clusters:
The explanation for the reduced inventory in affixes cannot be that they lose those segments while passing through the stem level – the segmental restrictions on stems are too liberal, and prosodic ones are too stringent. There are ways out, of course. Introducing positional faithfulness to roots would do it, and is presumably independently necessary to deal with Harar Oromo-style root-controlled assimilation. But then the stem level does no work in explaining segmental asymmetries.
Echoing Blumenfeld’s (Reference Blumenfeld, Browne, Kim, Partee and Rothstein2003) use of (covert) MSCs, there is a long tradition of non-OT approaches resorting to representational explanations for dynamic asymmetries in terms of underspecification (see Baković Reference Baković2000 for a critique). Underspecification allows directionality to follow from the need to build (but not change) structure. For example, one could claim that Harar Oromo stops are obligatorily unspecified for laryngeal features in affixes. This is not a straw man: in a Stratal OT account of Tetsǫ́t’ıné, Jaker & Kiparsky (Reference Jaker and Kiparsky2020) and Jaker (Reference Jaker2022) attribute various root–affix asymmetries to underspecification. Affixes are argued to lack moras, and Jaker formulates the underspecification requirement as an MSC. Thus, Stratal OT does not succeed in capturing generalisations about morpheme shapes by means of the stem level alone.
To summarise, I think there are several critical problems with the idea that the stem level can replace the function of MSCs in Stratal OT. MSCs are either overtly assumed in Stratal OT analyses or are hidden in the background, but Russian has a sufficiently rich phonology to falsify guesses about the fate of hypothetical rich inputs.
6. Conclusion
This article revisited an old phonological debate: should the input to the grammar be restricted on a language-specific basis, or is it enough for the grammar to refer only to surface representations? My argument was based on Russian facts, whose significance was originally pointed out by Morris Halle in a critique of structuralist phonemics. Halle (Reference Halle1959) noted a duplication problem in structuralist approaches to gapped inventories: some rules must be stated twice. I argued that the same problem arises in constraint-based grammars. Most constraint-based theories require constraints to refer to surface representations—the input is unrestricted. I suggested that restricting the input offers the best analysis of positional neutralisation with inventory gaps. This proposal requires abandoning the putatively simpler theory where markedness constraints refer only to outputs, and faithfulness constraints refer to input-output disparities. The addition of input-only constraints (MSCs), I suggested, offers a simpler analysis of segments that occur only in assimilation contexts. Moreover, I adduced evidence from loanword adaptation that gaps are enforced by constraints that do not interact with a faithfulness ranking; there is no consistency in how the illicit segments are handled. I argued instead that the loanword patterns involve conventional mappings, as in Simonović’s (Reference Simonović2015) theory of loanword integration. These mappings serve to enforce MSC restrictions but are agrammatical, which explains their internally inconsistent character. The argument is that MSCs are constraints without a specific recipe for ridding the language of the offending structures.
Duplication is a general problem in positional neutralisation of gapped contrasts. If the argument for Richness of the Base is that it avoids the duplication problem, then this class of cases constitutes a counterargument. Assuming unconstrained inputs requires the analysis to handle certain segments twice, just as in pre-generative structuralist phonemics. The way forward is to accept that there are, indeed, interesting generalisations to be made about the shapes of morphemes in languages, and some of these generalisations might be stated at a fairly abstract level. We cannot do all of phonology by referencing only surface phonological words; we need to worry about the lexicon.
Acknowledgements
This article owes its existence to the workshop on Internationalisms in Slavic as a Window into the Architecture of the Grammar at InterSlavic 2021 in Graz, Austria. I also received useful feedback from the audiences at New York University, the University of Massachusetts, Amherst, and UC Santa Cruz, especially Maggie Baird, John Kingston, Andrew Lamont, Jaye Padgett and Rachel Walker. Maddie Gilbert, Kate Mooney and Guy Tabachnick offered useful comments on early drafts of the manuscript.
Competing interests
The author declares no competing interests.