1 Introduction: coined words
The reader of children's books by Dr. Seuss (Theodor Seuss Geisel, 1904–91) cannot help but notice the great number of words Seuss coined himself. In (1) I give some examples, specifically the full set of coined words in If I Ran the Circus (1956).
(1) The Seuss coinages in If I Ran the Circus
Here is a passage employing two of these coined words:
(2) From a country called Frumm comes this Drum-Tummied Snumm
Who can drum any tune that you might care to hum.
Doesn't hurt him a bit, cause his Drum-Tummy's numb.
In coining words, Seuss was hardly alone among authors of fiction; more exalted figures of English literature such as Jonathan Swift or James Joyce did the same. However, words are also coined by ordinary people from time to time (Marchand 1960: 320–1; Malkiel 1990: 105). When adopted generally, these words end up in the dictionary, listed as words of obscure origin. Thus, for instance, Eisiminger (1981) compiled a list of English words that have no etymology, and it abounds in slangy, obviously new forms. The same is true for older forms that gradually lost their slangy tinge and settled into standard usage (see, e.g., the Oxford English Dictionary's entries for boy, girl, big and bad). So research questions about word coinage are not confined to literature but are part of the study of language in general. A prolific word-coiner like Seuss can help us to explore word coinage in one ‘idiolect’ for which ample data are attested.
The freedom available to word-coiners is in principle very broad: it seems that they need only find some string that conforms to the phonotactics of their language enough to be pronounceable by other speakers. But this greatly oversimplifies the question. Word-coiners create their words for a reason, and they make substantial use of the phonological resources of their language when they create a novel phonological form. This point is well established, I believe, by recent research on word coinage, notably the extensive current research program on the creation of Pokémon names (for overviews, see Shih et al. 2018 and Kawahara 2021a). However, we will see that the Seuss corpus has idiosyncrasies that justify a slightly different analytical approach.Footnote 2
Before plunging into the Seuss coinages, I should offer a couple of clarifications. First, when I say ‘coinage’ here, I obviously am using it in a restricted sense, namely ‘made up de novo’, since we can also say that word creation in the normal way – application of the language's word formation rules, as in Seuss-ian or un-mute – counts as coinage.Footnote 3 Second, I acknowledge that word coinage often relies on lexical as well as phonological resources: a coined word is sometimes perceived to be similar in its phonology and semantics to an existing word (MacDonald 1988: 67; Magnus 2001: 140). For simplicity, the discussion below will ignore this factor.
2 The Seuss coinages: an attempt at precise description
I adopt the strategy of Shih and Kawahara, employing a digital data corpus and statistical modeling in order to obtain objective testimony about issues that can become subjective very easily. The modeling described here employs the technique of logistic regression (on which see, e.g., Johnson 2008: 159–74). The purpose of my models is to predict for any given word, on the basis of just its phonological form, whether it is a Seuss coinage or a real word (for similar applications in other domains, see Hayes 2016). It is, of course, impossible to achieve perfect prediction, and what the model really does is to assign a ‘probability-of-Seussian’ value to every form, so its predictions are gradient. The deeper purpose of the modeling is that, once a model has been optimized, we can make useful inferences from its internal structure, specifically the degree to which the model attributes explanatory importance to principles hypothesized to underlie Seuss’s coinage practice.
I employed readily accessible data. The Seuss coinages, which number about 435 in his complete oeuvre, were carefully collected and described by Lathem (2000). I extracted the coinages from Lathem's work and rendered them in phonetic transcription by hand. I believe the latter task is not difficult or controversial, in light of Seuss’s clear use of orthography and the additional clues provided by rhyme. For English in general I employed my own version of the Carnegie-Mellon pronunciation dictionary (www.speech.cs.cmu.edu/cgi-bin/cmudict). My edited version, with 17,744 entries, includes only words that have a frequency of one or above in the English CELEX database (Baayen et al. 1995). This is meant to restrict it to words likely to be familiar to English speakers.Footnote 4 I also excluded words formed with highly productive suffixes such as inflectional [-z/-s/-əz] (plural, possessive, 3rd sg. pres.) or [-d/-t/-əd] (past tenses and participles). This is important because there are sequences that are very unusual in stems, but common in inflected forms. For instance, [ts] is rare in stems (e.g. Katz, Hertz) but is ordinary in inflected forms like cats or hurts. I argue below that Seuss indeed uses [ts] as a basis for coinages.
All the analytic work I did for this article (lexical databases with phonetic transcription, R scripts, spreadsheets) may be accessed in the Supplemental Materials at https://linguistics.ucla.edu/people/hayes/papers/SeussSupplementalMaterials.zip
2.1 What principles might be used to characterize the coined words?
Following a few pilot efforts, I settled on the following procedure to guide the work: I searched a fairly large preliminary set of predictive principles, then narrowed it down to a smaller set with just the most effective ones.
My pilot studies indicated that several strongly predictive traits consisted of word-initial syllable onsets, such as the [sn-] of Snumm. To be thorough, I searched the entire set of 73 attested word-initial onsets, irrespective of whether they occur in the real-word corpus or the Seuss corpus (some onsets occurred only in Seuss). I also found that particular vowels, occurring in the main-stressed position of a word, were sometimes highly predictive of Seuss, such as [ʌ]. Thus, in my more serious search, I included all 15 main-stressed vowels from the corpora as potential predictive factors.Footnote 5
I also incorporated into my search some ideas taken from the research literature on sound symbolism, mostly from work on phonesthemes. These are short segmental sequences that don't fully qualify as morphemes, but nonetheless often impart a (perhaps vague) meaning to words that they contain. Phonesthemes are discussed extensively below in section 3.4. For purposes of including multiple phonesthemes in the initial search, I relied on the lists in Marchand (1960) and Hutchins (1998).
2.2 The culling procedure
The final model was made by reducing the original set of factors, described in the previous section, to a smaller set, each of whose members demonstrably contributes to predicting Seussian status in a logistic regression model.
I offer a brief note on logistic regression. Every principle that might help predict Seussian status is here termed a feature.Footnote 6 Each feature is given a particular number, called its weight. In my own setup, if the weight of a feature is positive, it means that the feature favors Seuss-coinage status; if the weight is negative, it means that the feature militates against Seussian status; and if it is zero, the feature is indifferent. Greater magnitudes of weights (either positive or negative) have greater effect.
The output of the model, for any given phonetically transcribed word, is a value ranging from zero to one, expressing the estimated probability that a word is Seussian; a perfect model would assign one to all Seuss coinages and zero to all ordinary words. Where computation enters is in setting the weights: one's chosen logistic-regression software will calculate the weights that best separate out the Seuss coinages in the data from the real words.Footnote 7
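The scoring step just described can be illustrated with a minimal Python sketch. It is not taken from the article's R scripts; the feature names, weights and intercept below are hypothetical values chosen only to show how weights are summed and converted to a probability.

```python
import math

def seuss_probability(intercept, weights, active_features):
    """Sum the intercept and the weights of the features present in a word,
    then map the total onto the 0-1 scale with the logistic function."""
    total = intercept + sum(weights[f] for f in active_features)
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical weights, for illustration only (not the fitted values).
toy_weights = {"[sn": 2.0, "stressed ʌ": 1.0}
toy_intercept = -3.0

# A word with no favoring features keeps a low probability.
p_plain = seuss_probability(toy_intercept, toy_weights, [])

# A word with both features: the total is -3.0 + 2.0 + 1.0 = 0, i.e. probability 0.5.
p_both = seuss_probability(toy_intercept, toy_weights, ["[sn", "stressed ʌ"])

print(round(p_plain, 3), round(p_both, 3))
```

Positive weights push the total upward and negative weights push it downward, which is why larger magnitudes have a greater effect on the predicted probability.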
My chosen software was the bayesglm() function within the R statistics system (https://cran.r-project.org/web/packages/arm/arm.pdf). This version of logistic regression is somewhat conservative, assigning lower-magnitude weights than would obtain under the simplest forms of logistic regression.
I sought to trim my large candidate system of features into one that would be much smaller but perform almost as well. First, I removed all features that tested nonsignificant (at a p-value threshold of .1), then culled further with the stepAIC() function, which retains a feature only if it improves the model's score on the Akaike Information Criterion, a well-known measure that penalizes overly complex models.Footnote 8 Statistical evaluation of all models reported here is given below in appendix A.
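The logic of an AIC comparison can be sketched in a few lines of Python. This is only an illustration of the criterion in its standard form (twice the number of parameters minus twice the log-likelihood, lower being better), not a reimplementation of stepAIC(); the log-likelihood values are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better trade-off
    between fit and model size."""
    return 2 * n_params - 2 * log_likelihood

def keep_feature(loglik_with, loglik_without, n_params_with):
    """Keep a feature only if the model that includes it has the lower AIC.
    The model without the feature has one parameter fewer."""
    return aic(loglik_with, n_params_with) < aic(loglik_without, n_params_with - 1)

# Hypothetical log-likelihoods for a 20-parameter model and its 19-parameter rival.
small_gain = keep_feature(-1500.0, -1500.5, 20)  # fit improves by only 0.5
large_gain = keep_feature(-1500.0, -1503.0, 20)  # fit improves by 3.0

print(small_gain, large_gain)
```

Under this form of the criterion, a single added feature pays for itself only when it raises the log-likelihood by more than one unit.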
2.3 The features of the completed model
Tables 1 and 2 below give what I found. There is one non-specific feature in the system, the Intercept, which simply is a raw penalty against being Seussian – a sensible penalty, in light of the disparity in numbers. The weight of the Intercept is −4.80, which is quite large. Hence, for any form to receive a really strong Seussian probability, it must rack up substantial compensation from the positive-weighted features in order, as it were, to climb out of the hole.
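The size of ‘the hole’ can be made concrete with a short Python calculation. Only the Intercept value of −4.80 comes from the text; the function names are illustrative, and the calculation assumes the standard logistic-regression formula.

```python
import math

INTERCEPT = -4.80  # weight of the Intercept, as reported in the text

def logistic(x):
    """Convert a summed weight (log-odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Convert a probability back into log-odds."""
    return math.log(p / (1.0 - p))

def weight_needed(target_probability, intercept=INTERCEPT):
    """Total positive feature weight required to reach a target probability."""
    return logit(target_probability) - intercept

# Probability for a word that triggers no features at all: a little under 1 percent.
baseline = logistic(INTERCEPT)

print(round(baseline, 4))
print(round(weight_needed(0.5), 2))  # weight needed just to reach even odds
print(round(weight_needed(0.9), 2))  # weight needed for a strong Seussian score
```

A form therefore needs features totalling 4.80 in weight merely to reach a probability of one half, and more than that to be rated as strongly Seussian.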
For each feature in the tables, I give the following information:
• The form of the feature; usually a sequence of phonemes. ‘[’ means that a feature counts the relevant sequence only if it is initial in the word; ‘]’ analogously means ‘final’; no bracket means ‘anywhere’. A few features deviate from this format and are described in words.
• The weight of the feature. For intuitive interpretation of weights see footnote.Footnote 9
• The number of words, both Seussian and real, that come under the scope of the feature, with representative Seussian examples.
• Explanatory comments, where applicable; these serve as placeholders for the discussion to follow. ‘Marchand’ with page number means that a sequence has been identified as an English phonestheme by Marchand (1960), the reference source used for statistical testing in section 4.
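The bracket notation for features can be read as a simple matching rule, sketched here in Python. The sketch assumes that each phoneme is written with a single character, which is a simplification, and the transcriptions in the examples are rough illustrations rather than entries from the article's database.

```python
def feature_matches(feature, word):
    """Return True if a bracket-notation feature applies to a transcribed word.

    '[' at the start means word-initial, ']' at the end means word-final,
    and no bracket means the sequence may occur anywhere.
    Assumes one character per phoneme (a simplification).
    """
    initial = feature.startswith("[")
    final = feature.endswith("]")
    seq = feature.strip("[]")
    if initial and final:
        # This combination is not described in the text; assumed to mean 'whole word'.
        return word == seq
    if initial:
        return word.startswith(seq)
    if final:
        return word.endswith(seq)
    return seq in word

# Rough illustrative transcriptions of Snumm and Glotz.
print(feature_matches("[sn", "snʌm"))   # True: [sn] is word-initial
print(feature_matches("ts]", "glɑts"))  # True: [ts] is word-final
print(feature_matches("[ts", "glɑts"))  # False: [ts] is not word-initial
```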
2.4 How well does the model work?
We should not expect the model to make always-correct up-or-down decisions on whether a word is a Seuss coinage or a normal word; it would be remarkable if Seuss somehow managed to make every coinage fully distinguishable in this way. Rather, we should see if the model makes meaningful, useful distinctions. It emerges here that the model's average ‘probability is Seuss’ for Seussian coinages is 21.1 percent, whereas the average ‘probability is Seuss’ for normal words is only 1.9 percent.
We can get a more detailed picture by comparing histograms. In figures 1 and 2, I plot the probabilities assigned by the model to the 435 Seuss words, compared to the probabilities assigned to the 17,744 real words (in the latter, the scale is compressed to accommodate them in the same vertical space).
It should be clear that the model predicts a very different distribution for the Seussian coinages.
We can also examine the extremes of behavior. In table 3 are given the ten most ‘Seussian’ Seuss coinages. I include the particular phoneme sequences that are picked up by the features of tables 1 and 2 and converted, via the math of logistic regression, to high predicted probability.
In less detail, (3) gives the ten least Seussian Seuss coinages, as well as the ‘most Seussian’ and ‘least Seussian’ real words (the latter consists of ten words randomly chosen from the 484 real words that got a 0.000 score).
(3) Further examples of the model's behavior
(a) Seuss coinages rated by the model as minimally Seussian
lopulous (0.004), Lass-a-lack (0.004), Hippo-Heimer (0.004), Ronk (0.003), Antrum (0.003), Offt (0.003), Gee-Hossa-Flat (0.003), Solla (Sollew) (0.003), rippulous (0.002), Keck (0.001)
(b) Real words rated as most Seussian by the model
xerox (0.845), zinc (0.843), quartz (0.837), zigzag (0.837), waltz (0.757), snuggle (0.754), snuff (0.749), snoop (0.732), zip (0.722), snub (0.659)
(c) Sample of real words rated as minimally Seussian by the model (all at 0.000)
administration, appreciation, appreciative, chicanery, competitor, elaboration, electromagnetic, encyclopedic, immemorial, meteorological Footnote 11
These are meant mainly as a guide for the intuition, though the forms of (3b) evoke a further phenomenon: Seuss occasionally adapts a real word, often bearing Seussian phonological traits, to serve as a novel word; for example zip, respelled as Zipp, is used as a surname in Oh Say Can You Say?. These forms are discussed in appendix B.
One wonders whether the model could be improved by further work. I would judge that this is likely, since many of the Seussian coinages are assigned low scores but somehow sound Seussian to me, for example sporn, Jounce and tweetle, all with scores below 0.02 – something is still missing. However, I believe the model in its present form suffices for its intended purpose; namely, that we can inspect it, trying to find in its features some principles that will be informative about Seuss’s coinage practice in general terms.
3 The Seuss coinages: seeking general principles
I will put forth four proposed principles of Seussian coinage.
3.1 Meter
First, the Seussian words are skewed somewhat to make them fit easily into his hallmark meter, anapestic tetrameter – see features (k) and (p) in table 1. Since these metrical principles are so distinct from the main theme of this article, I have relegated discussion of them to appendix C below.
3.2 Phonotactic violations
As Nilsen (1977) observed, a noticeable minority of the coinages violate principles of English phonotactics, specifically word-level phonological well-formedness. For English phonotactics see e.g. Hammond (1999), Hayes & Wilson (2008) and Daland et al. (2011). The patterns noted below are probably not controversial.
First, a number of onsets found in the Seuss coinages are not permissible in the core English vocabulary (although they may occur in unassimilated borrowings).
(4) Illegal onsets occurring in Seuss coinages
Among other onsets tagged in the feature-selection process described above, [θw] and [gw] are very unusual in real English words and might be regarded as hovering on the fringes of ill-formedness. The impossible coda [bsk] is found in Obsk (a bird in Scrambled Eggs Super), along with Tobsk and Nobsk in the same location.
The letter name Nuh, from On Beyond Zebra, ends in a lax vowel ([nʌ]),Footnote 12 something impossible in ordinary words and limited to quasi-gestural forms like uh ([ʌ]; hesitation noise) and duh ([dʌ]; used to indicate one's interlocutor has missed something obvious).
Lastly, consider Snumm, quoted above in (2). Along with Snimm (a proper name from Too Many Daves) this coinage violates a phonotactic principle discussed in Davis (1991): English avoids the occurrence of similar or identical consonants in the C positions of the formula sCVC. Davis’ constraints include, for instance, bans on /spVp/ and /skVk/ (spip or skeck would be odd as English words). In the present context, the relevant ban, also noticed by Davis, is on /sNVN/, where N is any nasal consonant. As Davis points out, no such words exist in English and I personally find smem, smun, snam (and indeed Snumm and Snimm) to sound odd.
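The /sNVN/ ban lends itself to a mechanical check, sketched below in Python. The sketch is an illustration only: it tests whole words of exactly the shape s + nasal + vowel + nasal, it assumes one character per phoneme, and its small vowel inventory is a stand-in rather than a full description of English.

```python
NASALS = set("mnŋ")
VOWELS = set("iɪeɛæʌəɑɔoʊu")  # simplified inventory, assumed for illustration

def violates_sNVN(word):
    """True if the transcribed word has the banned shape s + nasal + vowel + nasal."""
    return (
        len(word) == 4
        and word[0] == "s"
        and word[1] in NASALS
        and word[2] in VOWELS
        and word[3] in NASALS
    )

# Rough transcriptions: Snumm, Snimm, the nonce form snam, and the real word snub.
for form in ["snʌm", "snɪm", "snæm", "snʌb"]:
    print(form, violates_sNVN(form))
```

The first three forms are flagged, while snub, whose final consonant is not a nasal, passes.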
Unsurprisingly, none of these phonotactic violations is extreme, like, say, the use of uvular consonants or grossly sonority-violating initial or final clusters. It seems that Seuss wanted his words to sound funny, but would hardly want to inflict an impossible phonetic challenge on his readers.
The specific examples given above most likely are only the most salient cases of a more general pattern: Westbury et al.'s (2016) experiments suggest that phonotactically improbable English nonce words are more likely than chance to be felt as funny, and their sample of Seuss coinages emerged in the aggregate as less phonotactically probable than ordinary words.
3.3 Words that sound German
Nilsen (1977) and Teuber (2018) suggest that a number of the Seuss coinages sound like German words. Some of these have already been mentioned in the previous section: words beginning in [ʃl] and [ʃn] are aberrant in English, but are normal in German.
German is of course closely related to English and has similar phonotactics. Yet the phonological history of the language (see e.g. Chambers & Wilkie 1970) has produced some points of departure. By the Second Consonant Shift, historical Germanic *t and *p (preserved intact in English) evolved in certain contexts into [ts] and [pf], sequences that are very rare in English. Thus, the clusters in Katze [ˈkatsə] ‘cat’ and Kropf ‘(bird's) crop’ are a point of phonological divergence between German and English that is attested in multiple words. The [ʃl] and [ʃn] clusters just mentioned were also created by sound change, from historical *sl and *sn. All four of these German-linked patterns are shown in table 1 above to be statistically unambiguous features for Seuss coinages.Footnote 13
That these coinages were actually intended by Seuss to sound German is made plausible by several factors. First, the orthography Seuss chose for them is largely German, as in Gitz, Glotz, Schlottz, Schnutz (that is, tz not ts, sch not sh). Second, the texts include a few overt German cultural references, notably the blue-footed mandolinist Gretchen von Schwinn, from Oh Say Can You Say, and the castle of Krupp, from Dr. Seuss's Sleep Book. Lastly, Seuss’s German-styled coinage practice can be related to his own life history (Morgan & Morgan 1995): he grew up in a German-speaking family (he was third-generation) in Springfield, Massachusetts, a city that during his youth included a vibrant German-American community.
3.3.1 The German coinages and the American audience
It is only natural that Seuss, a popular artist, would have attempted to create coinages that would make sense to his readers. In the present context this raises the question of whether Seuss’s audience (mostly mid-century Americans) would have been able to identify Germanness in nonce words. An intriguing research finding by Oh et al. (2020) bears on this question: they show by experiment that non-Māori residents of New Zealand, very few of whom can actually speak Māori, nonetheless have an accurate sense of the phonotactic principles of the language, obtained from second-hand exposure. This suggests that if Seuss’s audience had enough second-hand exposure to German they likewise could have internalized a sense of what German phonology is like. It seems reasonable to me to claim that mid-century Americans did indeed have considerable exposure to German; this was the period following World War II, and closer to the historical time when German-Americans were the nation's largest ethnic minority.Footnote 14 Of course, even now many American Seuss readers would surely recognize Schlottz as a German-like word.Footnote 15
3.4 Phonesthemes
For present purposes, I define a phonestheme as the following: (i) it is a segment or segment sequence that occurs in multiple words; (ii) it has some vague, often expressive meaning; (iii) its ‘residue’ in a word is not a morpheme; e.g. in words of the form [ Ph X ]word, where Ph is a phonestheme, X is not in general an identifiable morpheme of the language. To give an example, initial [sn-] is a well-known phonestheme of English. Its meaning is (vaguely, as always) ‘having something to do with the nose’, as in snoot, snot, sneeze, snout, snuff, snore, sniff, sniffle, snort; and by extension ‘looking down the nose at’, snob, snooty, sneer, snicker, snide, sniffy and snub. Footnote 16 We will examine other phonesthemes below.
Phonesthemes are the topic of a large research literature,Footnote 17 which I briefly discuss before going on to the Seuss coinages.
3.4.1 Theories about phonesthemes
I see three basic lines of thought.
The first is least relevant here, so let us dispose of it up front. Phonesthemes, or at least many of them, are often said to have a natural phonetic basis, as in the affiliation of [i] (a low-sonority vowel with high F2) with smallness (Jespersen 1933). For a careful overview of this topic see Kawahara (2021b). For present purposes I believe it will be safe to ignore whether a phonestheme is natural or arbitrary.
More pertinently, there are different points of view about where phonesthemes come from and their role in language. One prominent viewpoint is the word affinities approach, put forth by Bolinger (1965) and Magnus (2001). This sees phonesthemes as the result of word comparison: human language learners comb through their lexicons, seeking all conceivable correlations between phoneme sequences and meaning. Of course, when pursued to a successful conclusion, this learning behavior yields knowledge of the authentic morphology, enabling most words to be parsed into a sequence of clearly defined, plainly meaningful morphemes. Phonesthemes, in contrast, are the morpheme candidates left on the workbench when learning doesn't fully succeed – hence, they occur in words whose ‘residues’ (X in [ Ph X ]word) are meaningless, their meanings are elusive, and native speaker judgments about them are difficult and ambivalent.Footnote 18
A rather different view on phonesthemes is put forth by Bloomfield (1933: 156), Wales (1990) and Joseph (1994), who emphasize the stylistic function of phonesthemes: phonesthemic words are characteristically vernacular in tone and expressive in function. Joseph (1994: 222, 229) articulates this view clearly, describing phonesthemic words as ‘expressive, affective, connotative’; they ‘add color to the language’. An important component of this view, put forth by Joseph, is that a word can include a phonestheme which embodies style without bearing any trace of the phonestheme's meaning. This will turn out to be important when we later turn to Seuss.
The stylistic function of phonesthemes arises, I suspect, from their use in word coinage. Speakers obviously do not concoct phonologically novel words for the purpose of making their meaning clear; rather, these coinages are intended to make an impression, based on their imaginative phonological content. Earlier (section 1), I mentioned the apparent fact that many of our existing words originated as phonesthetic coinages, the work of creative speakers long forgotten. It is not unreasonable to regard these coinages, at least at the moment of origin, as the deployment of phonesthemes in the service of verbal folk art.Footnote 19 Here, I suggest that Seuss embraced this art form as part of his own distinctive vernacular style.
I have now given two accounts of phonesthemes, but how do we integrate them? Here again, word coinage provides the key: the anonymous verbal artists who coin new words draw on the set of word affinities to make their words more vivid as well as more intelligible. Although the phonesthemes originate with word affinities, the fact that they are repeatedly used to create new vernacular words over time means that the phonesthemes themselves are likely eventually to acquire the vernacular stylistic tinge. And the process may be self-feeding: the acquired stylistic tinge invites word-coiners to make use of the phonestheme more frequently, a virtuous cycle.
3.4.2 Phonesthemic words: a three-way distinction
With the above general discussion in mind, we now turn to a proposed taxonomy of the words in which the phonesthemes occur. The idea is that for any given phonestheme, we will normally find words that fit into each of the following categories.
(5) A three-way classification for phonesthemic vocabulary
(a) Words in the meaningful core of a phonestheme (‘core words’) both contain the phonestheme and bear the appropriate meaning.
(b) Words in the penumbra of a phonestheme contain the phonestheme and also convey the vivid, expressive character of phonesthetic style; but they do not bear the meaning of the phonestheme.
(c) Words in the neutral zone of a phonestheme contain the segments of the phonestheme but do not bear the meaning of the phonestheme and lack a vivid, expressive meaning; they are not phonesthetic.Footnote 20
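The taxonomy in (5) amounts to a two-question decision procedure, which the following Python sketch spells out. The function and its yes/no inputs are an illustrative simplification; the example judgments for snort, snazzy and snow follow those offered in this article for the [sn-] phonestheme, and such judgments are of course gradient rather than binary.

```python
from enum import Enum

class Zone(Enum):
    CORE = "core"
    PENUMBRA = "penumbra"
    NEUTRAL = "neutral zone"

def classify(bears_phonestheme_meaning, is_expressive):
    """Classify a word that contains the segments of a phonestheme.

    Core words bear the phonestheme's meaning; penumbral words lack the
    meaning but keep the vivid, expressive style; neutral words have neither.
    """
    if bears_phonestheme_meaning:
        return Zone.CORE
    if is_expressive:
        return Zone.PENUMBRA
    return Zone.NEUTRAL

# Illustrative judgments for three [sn-] words.
print(classify(True, True).value)    # snort
print(classify(False, True).value)   # snazzy
print(classify(False, False).value)  # snow
```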
I illustrate this taxonomy for the ‘nasal’ phonestheme [sn-] already mentioned. The core of this phonestheme would include the words with nasal meaning enumerated earlier: snoot, snot, sneeze, snout, snuff, snore, sniff, sniffle, snort, snob, snooty, sneer, snicker, snide, sniffy, snub. What of the penumbra? I suggest that it includes words like snazzy, snag, snap, snatch, sneak, snip, snitch, snoop and snug. These seem unnasal in their meaning, but they are nonetheless expressive, in the way that phonesthemes characteristically are. To defend this claim, I juxtapose some words that occupy the penumbra of the [sn-] phonestheme with their literal near-equivalents:
(6) Comparing penumbral phonestheme words with literal expression
Here are further comparisons, in this case involving words in the penumbrae of two phonesthemes to be discussed below, [z-] and [j-]:
(7) Further comparisons with [z-] and [j-]
I think it is clear that the words in (6) and (7) that include the phonestheme are more vivid and more colloquial. The implication is that a phonestheme does not require its core meaning to be present to render its stylistic effect.
Unsurprisingly, the element of vivid style that is the sole phonesthemic property of penumbral words is also found in the words of the core, as the comparisons of (8) suggest.
(8) The stylistic effect of phonesthemes in core vocabulary
Consider next the neutral zone. It is treated here as the set of words that accidentally contain the segments of a phonestheme, in the same way that, say, lens accidentally contains the [-z] of the plural suffix. This zone can be a source of frustration to anyone lecturing about phonesthemes, whose audience is naturally inclined to ask, ‘What about word X? Isn't that a counterexample?’ It seems best to acknowledge that most phonesthemes do have a neutral zone, but the existence of this zone should not be taken as counterevidence to the existence of the phonestheme – in pointing out a phonestheme, we are only pointing out a pattern that is too frequent to be coincidence, not an implicational law. Indeed, Hutchins’ (1998) experiments affirmed psychological reality for phonesthemes that possess a demonstrable neutral zone.
The neutral zone for [sn-], a potent phonestheme, is small; I suggest that two plausible candidate words are snow and snail.
3.4.3 ‘Gravitational attraction’ in phonesthemes
A number of scholars (e.g. Jespersen 1922: 407; Malkiel 1990; Magnus 2001: 8, 72; Pentangelo 2020) have suggested that phonesthemes exert a kind of gravitational attraction, drawing additional words into their membership by adjusting either their formFootnote 21 or their meaning. In present terms this claim can be elaborated a bit: I suggest that members of the penumbra may gradually assume semantic properties of the core, and members of the neutral zone may be drawn into the core or penumbra, becoming regarded as phonesthetic and expressive. Such drift is likely to be the result of language misacquisition; children are prone to mislearn either the style level or the meaning of phonesthetic words.
Here is an example of drift into the core: Malkiel (1990: 99–110) documents an Italian phonestheme of the form CVCiCiV (CiCi a geminate) with meaning ‘negative, or ridiculous, or both’, which has pulled in words that were formerly neutral such as nullo ‘nothing’ and secco ‘dry’, giving them novel secondary usages that fit the core meaning. Another example is the extraordinary semantic drift of English snob (roughly, from ‘lowlife’ to ‘one who looks down on others’), documented in the OED. For drift into the penumbra, I am on more speculative ground, but the reader may wish to ponder the words snooker and snipe. I feel that they belong in the penumbra, not the neutral zone, of [sn-]: as words they seem absurdly jokey and vivid for the purpose of denoting an ordinary indoor sport and bird species.Footnote 22 The Broadway composer Irving Berlin evidently felt a sense of pull for the phonestheme [j-] when he wrote the musical Yip Yip Yaphank, attracting the name of the Long Island town where he did his Army service ([ˈjæpæŋk]) into the penumbra of the [j-] phonestheme.
What enables a neutral-zone word to resist the inward pull of its component phonestheme? I suspect frequency matters: in my lexical database, the most frequent words (per CELEX) beginning with the phonesthemes discussed here have at most a modest penumbral tinge: snow,Footnote 23 Z, zone, year, use, young. The other cause of phonestheme resistance is speech register: formal or technical words are incompatible with the stylistic character of phonesthesia, and so they can contain the phonesthemic sequence without being pulled in: Snell's Law, zoning, zinc, zinnia, yarrow, ubiquity. For further discussion see Magnus (2001: 5, 10, 34, 72).
An important implication of the above for present purposes is this: a word coined for purposes of writing a children's book would be unlikely to occupy the neutral zone of any phonestheme it contains. If an author uses the segments of a phonestheme, it will probably be perceived by readers as being intended as a phonestheme. The word frequency of a coinage is very low (i.e. zero); and technical or formal vocabulary would hardly be expected in a children's book.
3.4.4 Phonesthemes in general: summary
Summing up, in the discussion below I will approach Seuss’s coinages from the viewpoint of the three-way taxonomy of (5), which emphasizes (a) the stylistic role of phonesthemes; (b) the possibility of phonesthemes that convey style but not the relevant meaning; (c) gravitational attraction, under specified conditions, from neutral zone to penumbra to core. These ideas can be connected to the rough theories of phonesthemes discussed above. The core words acquire their phonesthetic meaning via the word-comparison process, during language acquisition. Core words tend to be felt as vernacular for the reason given earlier, namely that use in coinages over time gradually lends the phonestheme a vernacular tone. Words of the penumbra have meanings that cannot be accommodated within the phonestheme's semantic territory, but speakers nevertheless apprehend their vernacular character, either from context, or simply by adopting the reasonable hypothesis that whatever is phonesthemic is also vernacular. Lastly, neutral zone words are the words that can escape the gravitational-attraction mechanism: either they are so frequent that they can maintain their style and meaning on their own, or they fall into a dry, technical lexical domain, so that no one would think of using them in vernacular style.
At this point we can turn to some of the particular phonesthemes used in Seuss’s coinages. I will argue that a minority of the phonesthemic usages in Seuss are core, the rest are penumbral and none are neutral.
3.4.5 [sn-]
The 21 Seuss coinages that begin with the ‘nasal’ phonestheme [sn-] are given in (9):
(9) Coinages with [sn-]
snop, snarggled, Snarp, snaff, Snux, Snumm, snuv, Snegg, Sneth, Snick, Snimm, Snee, Sneetch, Sneetcher, Sneedle, Sneeden, Sneelock, Sneepy, Snooker(s), Snoo, Snoor
Of these, I have identified four as belonging to the core of [sn-]. Snaff, from The Big Brag, inherits the phonesthemic status of sniff, of which it is a jocular past tense. Snarggled appears in a sequence of verbs with sneezed, snuffled and sniffed, describing inhalation of polluted air, in The Lorax. The snobbish Sneetches plainly qualify, per Seuss’s description:
(10) With their snoots in the air, they would sniff and they'd snort
A more subtle case is the Sneedle, from On Beyond Zebra: this is an insect whose nose takes the form of a large and frightening stinger:
(11) Then we go on to SNEE. And the SNEE is for Sneedle
A terrible kind of ferocious mos-keedle
Whose hum-dinger stinger is sharp as a needle.
However, this seems to exhaust the core [sn-] words in Seuss, as the remaining 17 [sn-] coinages are slim pickings for anyone seeking out nasal meaning. For instance, the Drum-Tummied Snumm, from (2) above, has a spectacular tummy, but a very ordinary nose. Elsewhere in If I Ran the Circus, neither the Harp-Twanging Snarp nor Mr. Sneelock seems nasal in any way, and the same goes for the remaining words in (9). I would suggest that these forms are indeed penumbral; i.e. expressive but not nasally meaningful.
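The core/penumbra tally for [sn-] can be checked mechanically. The following is a minimal Python sketch that takes the list in (9) and the four items judged core in the text; the variable and function names are illustrative assumptions, not part of the original analysis.

```python
# Coinages with [sn-], copied from (9).
SN_COINAGES = [
    "snop", "snarggled", "Snarp", "snaff", "Snux", "Snumm", "snuv",
    "Snegg", "Sneth", "Snick", "Snimm", "Snee", "Sneetch", "Sneetcher",
    "Sneedle", "Sneeden", "Sneelock", "Sneepy", "Snooker(s)", "Snoo", "Snoor",
]

# Items the text identifies as carrying the nasal meaning (core).
SN_CORE = {"snaff", "snarggled", "Sneetch", "Sneedle"}


def split_core_penumbra(coinages, core):
    """Partition coinages into core items and the penumbral remainder."""
    core_items = [w for w in coinages if w in core]
    penumbral_items = [w for w in coinages if w not in core]
    return core_items, penumbral_items


core_items, penumbral_items = split_core_penumbra(SN_COINAGES, SN_CORE)
print(len(SN_COINAGES), len(core_items), len(penumbral_items))
```

On the list in (9) this yields 21 coinages in all, of which four are core and seventeen penumbral; no item is assigned to the neutral zone.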
3.4.6 [z-]
[z-] is given short shrift by my primary reference, Marchand (Reference Marchand1960) (‘an infrequent initial’) but is taken more seriously by Wescott (Reference Wescott1980), who demonstrates considerable productivity for it. Let us consider the cases from my own data corpus. The 47 real [z]-initial words in my dictionary include eight that seem fairly clearly phonesthetic: zest, zigzag, zing, zip, zoom, zot, zany and zap. Were I to try to define the core meaning of [z-], I would guess something like ‘with great liveliness’. Thus, a person who is zany is not just somewhat crazy, but crazy in a lively way; for a lizard to zap an insect it must make a very abrupt movement of its tongue.
The [z-] phonestheme also appears to have a penumbra. For example, zit is a very expressive way to denote a pimple, but pimples are not lively. Zilch means ‘nothing’, but is used to express the idea with feeling and humor. Zonked is plainly expressive but denotes stupor rather than liveliness. There is a neutral zone, composed of technical expressions like zinc and zinnia. A possible example of a neutral-zone word drawn toward the penumbra is Zenith, a brand name that did well for selling television sets in Seuss’s day.
[z-] has been noticed before by scholars of the Seuss coinages (Teuber Reference Teuber2018; Keyes Reference Keyes2021) and is indeed the most frequent phonestheme in his work, with 40 occurrences.
(12) Coinages with [z-]
Zomba-ma-tant, Zozzfozzel, Zax, Zans, zang, Zatz, Zatz-it, zazz, Zuff, Zuk, zum, zummer, Zummz, Zummzian, Zutt, zuzz, Zall, Zong, Zorn, Zower, Zike, Zed, Zellar, Zelf, Zable, Zayt, Zidd, Ziff, Zillow, Zinn-a-Zu, Zind, Zinzibar, zizz, Zizzer-Zoof, Zizzy, Zizzer-Zazzer-Zuzz, Zeep, Zook, Zooie, zoop
Searching through their meanings, we again find just a few cases in which the [z]-word occupies the phonesthemic core. Zoop is part of Zoom-a-zoop, describing a virtuosic trapeze act in If I Ran the Circus. When the bird character Gertrude McFuzz suddenly sprouts a spectacular tail to satisfy her vanity, she does it ‘With a zang! With a zing!’. With an extension to not-quite-initial position, we may include G-r-r-zapp, G-r-r-zibb, G-r-r-zopp, the sounds of the arrows shot by the Yeoman of the Bowmen in The 500 Hats of Bartholomew Cubbins. However, most instances of [z-] in Seuss appear to be only penumbral. Notably, several [z]-initial Seussian animals are placid and serene: the Ziffs and Zuffs of Scrambled Eggs Super, the Zizzer-Zazzer-Zuzz of Dr. Seuss's ABC, the Zatz-It of On Beyond Zebra, and the Zans and the Zeep of One Fish Two Fish Red Fish Blue Fish. Footnote 24
3.4.7 [j-]
For Marchand (Reference Marchand1960: 334) this phonestheme is for ‘words expressive of vocal sounds’; my own preference would be to characterize it as ‘vigorous or uncontrolled vocalization’. Core examples are: yahoo, yammer, yatter, yap, yawp, yell, yelp, yip, yipe, yippee, yo, yowl and yodel. Some penumbral words are yo-yo, yank and Yankee. [j-] is a ‘weaker’ phonestheme than the other two and it has a large neutral zone including words like yellow,Footnote 25 yoke, yarn, yolk and eucharist. Neutral zone words that (for me at least) risk falling into the penumbra are Yonkers, yak and yam, which seem a bit silly for purposes of denoting a city, an animal and a vegetable; see also Yaphank, above.
As a phonestheme in Seuss [j-] includes the following core items:
(13) Core [j-]-initial coinages in Seuss
(a) YOPP, the cry of help made by a small Who that saves the Whos from destruction (Horton Hears a Who)Footnote 26
(b) Yekko, a beast who ‘howls in an underground grotto in Gekko’ (On Beyond Zebra)
(c) Ying, a creature with whom it is fun to sing (One Fish Two Fish)
But as before, the penumbral examples outnumber them: these include Yop (this time a name of a creature, in One Fish Two Fish); Yink, another creature in One Fish Two Fish; and Yupster, a place name in On Beyond Zebra. There are about ten other cases.
To sum up this section: the patterning of phonesthemes in Seuss’s coinages matches their behavior in real language: we find full-blown core coinages like Sneedle, bearing the appropriate meaning; as well as penumbral coinages like Snumm, in which the phonestheme provides only expressiveness and style. The third case, namely appearance of the phonesthemic segments without any phonesthetic effect at all, appears to be impossible, since in real life these cases exist only among words that are frequent or learned, neither of which could plausibly be used in a Seuss coinage.
4 Are these speculations on the right track? A statistical test
To return to the main thread, we sought to explain in general terms Seuss’s coinage practice, and came up with four hypotheses:
• Words that match Seuss’s meter are likely to be Seuss coinages.
• Words that are phonotactically aberrant are likely to be Seuss coinages.
• Words that sound German are likely to be Seuss coinages.
• Words that contain phonesthemes are likely to be Seuss coinages.
The statistical model described in section 2 was meant to provide the raw material for evaluating these hypotheses in detail. However, that model only tests phoneme sequences as such, and we have not yet tested whether it is really true that it is the phonesthemic status of these sequences, as I have claimed, that is essential. Perhaps Seuss’s practice is systematic, but has nothing to do with phonesthemes. Hypothesis-testing in this domain is not straightforward given the notorious subjectivity of phonesthemic analysis.
Hoping to find objectivity, I constructed a second logistic regression model on a different basis. Whereas the previous model was an attempt to scrutinize a great number of potential features, hoping to find the best ones, my second model implements only the proposed phonesthemes found in one single reference source, Marchand (Reference Marchand1960); I will call it the Marchand Model.Footnote 27 The model is less complete and accurate than the Full Model given in tables 1–2, but it is arguably objective. Marchand had no ax to grind concerning Seuss, but simply compiled a long list, offering his considered and informed judgment (based on examination of numerous examples) of whether a particular sequence was phonesthemic.
In compiling his list it is clear that Marchand examined all English vowels, all possible onsets and a great many syllable rhymes.Footnote 28 In these domains, if Marchand makes no mention of a sequence, it is reasonable to infer that he saw no reason to call it a phonestheme. Unsurprisingly, there is much overlap in the features of the Marchand Model with my Full Model, and to show this, I included the information (page number) of Marchand's discussion of these various sequences in tables 1 and 2 above. As before, the complete Marchand Model may be inspected in the Supplemental Materials.
I made two versions of the Marchand Model, coarse-grained and fine-grained. The coarse-grained version implements the intended statistical test. It has just five features, shown in table 4.
I fitted the coarse-grained Marchand Model to the same data as before, again using the BayesGLM() package in R. The constraint weights and significance values that were calculated are given in (14).
(14) Result of fitting the coarse-grained Marchand Model
The model, being so coarse, is far less effective than the model of tables 1 and 2 in predicting Seussian status; see appendix A for details. The key point of the model is that it permits Marchand's independent testimony to bear on the question of whether Seuss’s practice is indeed phonesthemic. The results of the model are that Germanness, phonotactic ill-formedness and metrical appropriateness all test significant as factors for predicting Seussian status. In addition, phonesthemic status, for sequences identified as such by Marchand, also matters; although the constraint weights may be lower than those for Germanness and phonotactics, the number of words covered is considerably larger.Footnote 30, Footnote 31
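To make concrete how constraint weights of this kind combine into a predicted probability of Seussian status, here is a minimal Python sketch of a logistic regression prediction. The feature names are stand-ins for the four predictors named in the text (the five features of table 4 are not reproduced here), and the weights are hypothetical placeholders chosen for illustration only; they are NOT the values reported in (14).

```python
import math

# Hypothetical weights for illustration only; NOT the fitted values in (14).
ILLUSTRATIVE_WEIGHTS = {
    "intercept": -3.0,
    "german": 2.0,
    "phonotactically_ill_formed": 2.0,
    "metrically_apt": 1.0,
    "marchand_phonestheme": 1.0,
}


def seussian_probability(features, weights=ILLUSTRATIVE_WEIGHTS):
    """Logistic prediction: sigmoid of the intercept plus weighted features.

    `features` maps a feature name to 0 or 1 (absent names count as 0).
    """
    score = weights["intercept"]
    for name, value in features.items():
        score += weights[name] * value
    return 1.0 / (1.0 + math.exp(-score))


# A word with no relevant properties versus one with a Marchand phonestheme.
p_plain = seussian_probability({})
p_phonesthemic = seussian_probability({"marchand_phonestheme": 1})
print(p_plain < p_phonesthemic)
```

The sketch illustrates why a feature with a modest weight can still matter: it raises the predicted probability for every word that bears it, so a lower-weighted feature covering many words can account for a large share of the coinages.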
To obtain a more detailed look, I also ran a fine-grained version of the Marchand Model that separates out all the Marchand-mentioned features (there are 44 for onsets and 116 for rhymes). The result, available in the Supplemental Materials, demonstrates that most of the work of predicting Seussian status is being done by a fairly small subset of Marchand's features; only 21 of 160 meet the criterion of bearing a weight of at least 1 and receiving a p-value < .001.Footnote 32
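The selection criterion just stated (weight of at least 1 and p-value below .001) can be sketched as a simple filter. In the Python sketch below, the feature counts come from the text, but the three rows of fitted values are hypothetical placeholders, as are all names; the actual fitted values are in the Supplemental Materials.

```python
# Feature counts stated in the text for the fine-grained Marchand Model.
N_ONSET_FEATURES = 44
N_RHYME_FEATURES = 116
N_FEATURES = N_ONSET_FEATURES + N_RHYME_FEATURES


def meets_criterion(weight, p_value, min_weight=1.0, alpha=0.001):
    """True if a feature has weight >= min_weight and p-value < alpha."""
    return weight >= min_weight and p_value < alpha


# Hypothetical (name, weight, p-value) rows, for illustration only.
toy_fit = [
    ("feature_a", 1.8, 0.0002),  # passes both tests
    ("feature_b", 0.4, 0.0001),  # significant, but weight too low
    ("feature_c", 2.1, 0.03),    # high weight, but not significant
]

selected = [name for name, w, p in toy_fit if meets_criterion(w, p)]
print(N_FEATURES, selected)
```

Requiring both conditions at once excludes features that are statistically reliable but weak, as well as features with large but poorly supported weights.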
The upshot of these studies, I believe, is as follows. If we agree to take Marchand as an impartial witness for phonesthemic status, then it seems almost certain that Seuss is using phonesthemes when he coins words. Further, Seuss is making use of only a modest subset of Marchand's phonesthemes. There are at least two possible reasons for this. First, Marchand may have been overenthusiastic in positing phonesthemes (I tend to think so, particularly among the onsets). Second, Seuss was perhaps making an unconscious artistic decision, choosing his favorites from a larger available inventory.
5 Conclusions
Verbal artists, particularly popular artists, must rely on phonological resources they share with their reading community. This dictum is confirmed by Seuss’s coinage practice. First, native speakers of English internalize a detailed phonotactics of their language, which leads them to be amused by novel words like Thneed. Speakers also have some ability to internalize phonotactic principles of languages they don't speak but find accessible, and hence can be entertained by novel pseudo-German words like Schlottz. Lastly, they have internalized a system of phonesthemes, which gives them the ability to appreciate novel phonesthemic words. Just like real-life phonesthetic words, coined ones may either include the semantic component of the phonestheme, as in Sneedle, or exclude it, with the phonestheme offering only a sense of style, as in Snumm.
It goes without saying that the rigor of the research reported in this article would be increased by extensive experimentation, in the research tradition of, e.g., Fordyce (Reference Fordyce1988), Hutchins (Reference Hutchins1998) and Bergen (Reference Bergen2004). We would like to know more about which proposed phonesthemes are actually internalized by native speakers, what meanings they are assigned, whether my proposed ‘penumbra’ (section 3.4.2) is psychologically real, and (on a different topic) the extent to which older American English speakers (the original Seuss audience) have internalized the phonotactics of German (section 3.3.1). Since my account depends on the ability of people to learn the stylistic affiliation of particular linguistic entities, we are also in need of a theory of how this is done.
Lastly, it might also be useful to carry out studies comparing Seuss’s use of phonesthemes with that of other word-coiners – in literature, in ordinary life and in industry (see Wong Reference Wong2014, and the Pokémon research cited in section 1). I imagine that such study would find considerable variation. While Seuss’s choices were principled, they probably access only a subset of the possibilities offered by the resources the language offers – this is what the fine-grained Marchand Model (section 4) suggests. My conjecture reflects the view (section 3.4) that word coinage is a folk art: within the limits of what their language makes available, verbal artists can make choices.
Appendix A: Performance of the models compared
The metrics given here are explicated in, e.g., Johnson (Reference Johnson2008).
Appendix B: Coinages homophonous with real words
I excluded from the regression analyses the 66 Seuss ‘coinages’ that exist as real words, but are used in context as novel. For instance, Flummox is used in Seuss not as a verb but as a noun, the name of an imaginary creature in If I Ran the Circus. From the viewpoint of the Full Model developed in section 2, these ‘semi-coinages’ appear to have an intermediate status, as the figures in (15) indicate.
The ten real words with highest Seussian model probability are Zipp (0.722), Flummox (0.489), Krupp (0.338), guff (0.329), Snide (0.264), Fuddle (0.227), Quibble (0.222), duff (0.179), Snell (0.167) and Gusset (0.15).
Appendix C: The coinage principles motivated by meter
I discuss in this appendix two principles of Seuss coinage that are based on the fact that he would naturally favor coinages that fit well with his favorite meter, which is anapestic tetrameter, base form /x x X x x X x x X x x X/. Seuss took care to write this meter with strict adherence to syllable count; i.e. he sought never to invoke the license (common among his parodists) of substituting a binary for a ternary foot. Hence, words like Bippo-no-Bungus or Motta-fa-Potta-fa-Pell,Footnote 33 which have the sequence ˈσ σ̆ σ̆ ˈσ, serve a useful metrical purpose.Footnote 34 They are fairly common in the Seuss oeuvre, despite his general dispreference for long coinages (table 2, (a)).
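The metrical usefulness of such words can be illustrated with a minimal Python sketch that slides a word's stress pattern along the base form of the meter and reports where it aligns. The stress transcriptions used below (X = stressed, x = stressless) are my assumptions for illustration, and the sketch simplifies by treating the meter as strictly ternary and confining the word to a single line.

```python
# Base form of anapestic tetrameter from the text: x = weak, X = strong.
ANAPESTIC_TETRAMETER = "xxXxxXxxXxxX"


def fits_meter(stress_pattern, template=ANAPESTIC_TETRAMETER):
    """Return the template offsets at which the stress pattern aligns exactly."""
    n = len(stress_pattern)
    return [
        i
        for i in range(len(template) - n + 1)
        if template[i:i + n] == stress_pattern
    ]


# Assumed transcriptions: Bippo-no-Bungus = X x x X x,
# Motta-fa-Potta-fa-Pell = X x x X x x X.
print(fits_meter("XxxXx"))
print(fits_meter("XxxXxxX"))
# A word with only one stressless syllable between stresses finds no slot.
print(fits_meter("XxX"))
```

Under these assumptions, both long coinages align with the meter at more than one position, whereas a pattern with a binary interval between stresses cannot be placed without substituting a binary for a ternary foot.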
A special case is found in words that include the medial sequence [əmə], as in Katta-ma-side, Yuzz-a-ma-Tuzz and six similar instances. This sequence appears in real English words, usually vernacular; the cases known to me are razzamatazz, rigamarole, tacamahac, Fishamajig, whatcha-ma-callit, thingamajig(ger), thingamabob and Kalamazoo (the last three appear in Seuss). The morphological status of [-əmə-] is obscure to me, but it does seem to be productive (e.g. rigamarole evolved from earlier rigmarole; OED); perhaps it is a phonestheme. In any event, [-əmə-] provided Seuss with a source for words that are both colloquial and metrically facilitating.