Book contents
- Frontmatter
- Contents
- Preface
- Acknowledgements
- List of abbreviations
- 1 Introduction to multimedia networking
- 2 Digital speech coding
- 3 Digital audio coding
- 4 Digital image coding
- 5 Digital video coding
- 6 Digital multimedia broadcasting
- 7 Multimedia quality of service of IP networks
- 8 Quality of service issues in streaming architectures
- 9 Wireless broadband and quality of service
- 10 Multimedia over wireless broadband
- 11 Digital rights management of multimedia
- 12 Implementations of multimedia networking
- Index
2 - Digital speech coding
Published online by Cambridge University Press: 26 January 2010
Summary
The human vocal and auditory organs form one of the most useful and complex communication systems in the animal kingdom. All speech (voice) sounds are formed by blowing air from the lungs through the vocal cords (also called the vocal folds), which act like a valve between the lungs and the vocal tract. After leaving the vocal cords, the air is expelled through the vocal tract towards the oral cavity and eventually radiates out from the lips (see Figure 2.1). The vocal tract changes its shape relatively slowly (over periods of 10 ms to 100 ms) in order to produce different sounds [1] [2].
Depending on whether the vocal cords vibrate open and closed as air blows over them, speech signals can be roughly categorized into two types: voiced speech and unvoiced speech. On the one hand, voiced speech, such as vowels, exhibits a semi-periodic waveform (with time-varying periods related to the pitch); this semi-periodic behavior is caused by the up–down valve movement of the vocal folds (see Figure 2.2(a)). As a voiced speech wave travels through it, the vocal tract acts as a resonant cavity, whose resonances produce large peaks in the resulting speech spectrum. These peaks are known as formants (see Figure 2.2(b)).
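The semi-periodic nature of voiced speech means that the pitch period can, in principle, be read off from the waveform. The following is a minimal sketch, not the book's method: it assumes a simple autocorrelation peak search, a synthetic harmonic signal standing in for a vowel, and an 8 kHz sampling rate. The function names and the 60 Hz to 400 Hz search range are illustrative choices.

```python
import math


def estimate_pitch_hz(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate pitch by picking the lag with the largest autocorrelation.

    The search is restricted to lags corresponding to f_min..f_max
    (an assumed range for speech pitch, not taken from the text).
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


def synthetic_voiced_frame(pitch_hz, sample_rate, duration_s=0.04):
    """Crude stand-in for a vowel: five harmonics of the pitch frequency."""
    n_samples = int(sample_rate * duration_s)
    return [
        sum(
            math.sin(2 * math.pi * k * pitch_hz * n / sample_rate) / k
            for k in range(1, 6)
        )
        for n in range(n_samples)
    ]


SAMPLE_RATE = 8000  # assumed telephone-band sampling rate
frame = synthetic_voiced_frame(125.0, SAMPLE_RATE)
pitch = estimate_pitch_hz(frame, SAMPLE_RATE)
print(f"estimated pitch: {pitch:.1f} Hz")
```

Real speech is only approximately periodic, so practical pitch trackers usually add normalization, windowing, and smoothing across frames; this sketch omits all of that.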
On the other hand, hiss-like fricative or explosive unvoiced speech, e.g., sounds such as s, f, and sh, is generated by constricting the vocal tract close to the lips (see Figure 2.3(a)).
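One common heuristic for telling the two signal types apart is the zero-crossing rate: a semi-periodic voiced frame changes sign relatively rarely, whereas a noise-like unvoiced frame changes sign often. The sketch below is an illustration under stated assumptions, not a technique taken from the text: a pure sine stands in for voiced speech, seeded Gaussian noise stands in for a fricative, and all names are hypothetical.

```python
import math
import random


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


SAMPLE_RATE = 8000  # assumed sampling rate
FRAME_LEN = 320     # 40 ms at 8 kHz

# Stand-in for a voiced frame: a 125 Hz sinusoid.
voiced_like = [
    math.sin(2 * math.pi * 125.0 * n / SAMPLE_RATE) for n in range(FRAME_LEN)
]

# Stand-in for an unvoiced frame: white Gaussian noise (seeded for repeatability).
rng = random.Random(0)
unvoiced_like = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]

zcr_voiced = zero_crossing_rate(voiced_like)
zcr_unvoiced = zero_crossing_rate(unvoiced_like)
print(f"voiced-like ZCR:   {zcr_voiced:.3f}")
print(f"unvoiced-like ZCR: {zcr_unvoiced:.3f}")
```

In practice the zero-crossing rate is usually combined with other cues, such as short-time energy, because real speech frames are rarely as clean as these synthetic ones.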
- Type: Chapter
- Information: Multimedia Networking: From Theory to Practice, pp. 11-25. Publisher: Cambridge University Press. Print publication year: 2009