Speaker-specific processing and local context information: The case of speaking rate

Published online by Cambridge University Press: 29 December 2015

EVA REINISCH*
Affiliation: Ludwig Maximilian University Munich
* Address for correspondence: Eva Reinisch, Institute of Phonetics and Speech Processing, Ludwig Maximilian University Munich, Schellingstraße 3, Munich 80799, Germany. E-mail: [email protected]

Abstract

To deal with variation in the speech signal, listeners rely on local context, such as speaking rate in a carrier sentence directly preceding a target, as well as more global properties of the speech signal, such as speaker-specific pronunciation variants. The present study addressed whether, despite its variability even within one speaker, habitual speaking rate can be tracked as a speaker-specific property and how such speaker-specific tracking of habitual rate would interact with effects of local-rate normalization. In two experiments, listeners were exposed to a 2-min dialogue between a fast and a slow speaker. At test, listeners categorized minimal word pair continua differing in the German /a/–/a:/ duration contrast spoken by the same two speakers. The results showed that listeners responded with /a:/ more often for the fast speaker but only when words were presented in isolation and not when presented with additional local-rate information. That is, despite the general assumption that duration cues and speaking rate are too variable to be used in a speaker-specific fashion, tracking habitual speaking rate may help speech perception. The results are discussed in relation to a belief-updating model of perceptual adaptation and exemplar models.

Type
Articles
Copyright
Copyright © Cambridge University Press 2015 
