Book contents
- Frontmatter
- Contents
- List of contributors
- 1 Multimodal signal processing for meetings: an introduction
- 2 Data collection
- 3 Microphone arrays and beamforming
- 4 Speaker diarization
- 5 Speech recognition
- 6 Sampling techniques for audio-visual tracking and head pose estimation
- 7 Video processing and recognition
- 8 Language structure
- 9 Multimodal analysis of small-group conversational dynamics
- 10 Summarization
- 11 User requirements for meeting support technology
- 12 Meeting browsers and meeting assistants
- 13 Evaluation of meeting support technology
- 14 Conclusion and perspectives
- References
- Index
8 - Language structure
Published online by Cambridge University Press: 05 July 2012
Summary
Introduction
While the meeting setting creates many challenges simply in terms of recognizing the words and who is speaking them, once we have the words there is still much to be done if the goal is to understand the conversation. To do this, we need to understand the language being used and its structure.
The structure of language is multilayered. At a fine-grained level, we can look at the structure of the spoken utterances themselves. Dialogue acts, which segment and label utterances into units carrying a single core intention, are one type of structure at this level. Another way of understanding language at this level is to focus on the subjective language used to express internal mental states such as opinions, (dis-)agreement, sentiments, and uncertainty.
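To make the notion of dialogue act labeling concrete, the toy sketch below assigns one coarse label per utterance from a handful of cue phrases. It is purely illustrative: the label set and cues are hypothetical, and practical systems typically rely on statistical classifiers trained on annotated meeting corpora rather than hand-written rules.

```python
# Toy dialogue-act labeler: assign one coarse label per utterance using
# simple cue phrases. Purely illustrative; the labels and cues below are
# hypothetical examples, not the tag set used in this chapter.

CUES = {
    "question": ("who", "what", "when", "where", "why", "how", "?"),
    "agreement": ("yes", "yeah", "right", "exactly", "i agree"),
    "disagreement": ("i disagree", "not really", "i don't think"),
}

def label_dialogue_act(utterance: str) -> str:
    text = utterance.lower().strip()
    for act, cues in CUES.items():
        if any(cue in text for cue in cues):
            return act
    return "statement"  # default label when no cue matches

if __name__ == "__main__":
    for utt in ["Why don't we move the deadline?",
                "Yeah, exactly.",
                "The prototype ships next week."]:
        print(f"{label_dialogue_act(utt):>12}  {utt}")
```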
At a coarser level, language can be structured by the topic of conversation, and within a given topic there is a structure to the language used to make decisions. For specific phenomena such as decisions, language understanding based on elaborate domain models is advanced enough to capture the content of the conversation, which allows meetings to be indexed and summarized with a high degree of understanding.
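As an illustration of topic structure, the sketch below hypothesizes a topic boundary wherever the vocabulary of adjacent blocks of utterances stops overlapping, in the spirit of lexical-cohesion segmenters such as TextTiling. The window size and threshold are arbitrary placeholder values; this is not the algorithm presented in the chapter.

```python
# Minimal lexical-cohesion sketch of topic segmentation: compare the word
# counts of adjacent blocks of utterances and place a boundary where their
# cosine similarity drops below a threshold. Illustrative only.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def topic_boundaries(utterances, window=2, threshold=0.1):
    """Return indices i where a topic boundary is hypothesized before utterance i."""
    tokens = [Counter(u.lower().split()) for u in utterances]
    boundaries = []
    for i in range(window, len(tokens) - window + 1):
        left = sum(tokens[i - window:i], Counter())
        right = sum(tokens[i:i + window], Counter())
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries
```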
Finally, the language of spoken conversation differs significantly from written language. Frequent speech disfluencies can be detected and removed using techniques similar to those applied to the language structures described above.
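As a minimal illustration of the surface phenomena involved, the sketch below strips filled pauses and collapses immediate word repetitions with simple rules. The detection techniques referred to above are statistical, so this rule-based filter is only a toy example; the list of filled pauses is an assumption.

```python
# Rule-based sketch of disfluency cleanup: drop filled pauses and collapse
# immediate word repetitions ("I I think ... um ... we we should").
# Illustrative only; real detectors are statistical.

FILLED_PAUSES = {"uh", "um", "erm", "uhm", "mm"}

def remove_disfluencies(utterance: str) -> str:
    words = [w for w in utterance.split()
             if w.lower().strip(",.") not in FILLED_PAUSES]
    cleaned = []
    for w in words:
        # skip an immediate repetition of the previous word (a simple repair pattern)
        if cleaned and w.lower() == cleaned[-1].lower():
            continue
        cleaned.append(w)
    return " ".join(cleaned)

print(remove_disfluencies("I I think, um, we we should uh start over"))
# -> "I think, we should start over"
```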
- Type: Chapter
- Information: Multimodal Signal Processing: Human Interactions in Meetings, pp. 125-154
- Publisher: Cambridge University Press
- Print publication year: 2012