Neuroethological investigations of mammalian and avian
auditory systems have documented species-specific specializations
for processing complex acoustic signals that, viewed in abstract
terms, may be strikingly relevant to human speech sound
categorization and representation. Each species forms
biologically relevant categories based on combinatorial analysis of
information-bearing parameters within the complex input signal. This
target article uses known neural models from the mustached bat and
barn owl to develop, by analogy, a conceptualization of human
processing of consonant plus vowel sequences that offers a partial
solution to the noninvariance dilemma – the nontransparent
relationship between the acoustic waveform and the phonetic segment.
Critical input sound parameters used to establish species-specific
categories in the mustached bat and barn owl exhibit high correlation
and linearity due to physical laws. A cue long known to be relevant to
the perception of stop place of articulation is the second formant
(F2) transition. This article presents an empirical phenomenon
– the locus equations – that describes the relationship
between the F2 of a vowel and the F2 measured at the onset of a
consonant-vowel (CV) transition. Within a given place category, these
variables, F2 onset and F2 vowel, are consistently and robustly
linearly correlated across diverse speakers and languages, even under
perturbation conditions such as those imposed by bite blocks. A functional
role for this extreme category-level correlation and linearity (the
“orderly output constraint”) is hypothesized, based on the
notion of an evolutionarily conserved auditory-processing strategy.
High correlation and linearity between critical parameters in the
speech signal that help to cue place of articulation categories might
have evolved to satisfy a preadaptation of mammalian auditory systems
for representing tightly correlated, linearly related components of
acoustic signals.
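
As a minimal sketch of the linear relationship described above, a locus
equation can be read as a straight-line fit of F2 onset against F2 vowel
within one place category. The F2 values below are hypothetical numbers
chosen for illustration, not measurements from this article, and the
function name fit_locus_equation is likewise an assumption of this sketch.

```python
# Hypothetical F2 values in Hz for one stop-place category across several
# vowel contexts. Illustrative only; these are not data from the article.
f2_vowel = [2300.0, 2000.0, 1700.0, 1400.0, 1100.0, 900.0]
f2_onset = [2050.0, 1900.0, 1720.0, 1560.0, 1400.0, 1290.0]


def fit_locus_equation(vowel, onset):
    """Ordinary least-squares fit of onset = slope * vowel + intercept.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(vowel)
    mean_v = sum(vowel) / n
    mean_o = sum(onset) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((v - mean_v) ** 2 for v in vowel)
    syy = sum((o - mean_o) ** 2 for o in onset)
    sxy = sum((v - mean_v) * (o - mean_o) for v, o in zip(vowel, onset))
    slope = sxy / sxx
    intercept = mean_o - slope * mean_v
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


slope, intercept, r = fit_locus_equation(f2_vowel, f2_onset)
print(f"F2 onset = {slope:.3f} * F2 vowel + {intercept:.1f} Hz (r = {r:.3f})")
```

Under this reading, the slope and intercept summarize one place category,
and r indicates how tightly the two F2 measures covary across vowel
contexts.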