9 - Prosody, intonation, and speech technology
Published online by Cambridge University Press: 05 March 2010
Summary
Introduction
The purpose of this chapter is to explore the implications of some facts about prosody and intonation for efforts to create more general and higher quality speech technology. It will emphasize parallels between speech synthesis and speech recognition, because I believe that the challenges presented in these two areas exhibit strong similarities and that the best progress will be made by working on both together.
In the area of synthesis, there are now text-to-speech systems that are useful in many practical applications, especially ones in which the users are experienced and motivated. In order to have more general and higher quality synthesis technology it will be desirable (1) to improve the phonetic quality of synthetic speech to the point where it is as easily comprehended as natural speech and where it is fully acceptable to naive or unmotivated listeners, (2) to use expressive variation appropriately to convey the structure and relative importance of information in complex materials, and (3) to model the speech of people of different ages, sexes, and dialects in order to support applications requiring use of multiple voices.
Engineers working on recognition have a long-standing goal of building systems that can handle large-vocabulary continuous speech. To be useful, such systems must be either speaker-independent or, if speaker-dependent, trainable from a sample of speech that can feasibly be collected and analyzed. Present systems exhibit a strong trade-off between degree of speaker independence on the one hand and the size of the vocabulary and branching factor in the grammar on the other.
- Type: Chapter
- Challenges in Natural Language Processing, pp. 257-280
- Publisher: Cambridge University Press
- Print publication year: 1993