Book contents
- The Cambridge Handbook of Language in Context
- Cambridge Handbooks in Language and Linguistics
- The Cambridge Handbook of Language in Context
- Copyright page
- Contents
- Figures
- Tables
- Contributors
- Acknowledgments
- Language in Context Studies
- Part I Language in Context: A Sociohistorical Perspective
- Part II Philosophical, Semantic, and Grammatical Approaches to Context
- Part III Pragmatic Approaches to Context
- Part IV Applications of Context Studies
- Part V Advances in Multimodal and Technological Context-Based Research
- 19 Nonverbal Communication and Context: Multimodality in Interaction
- 20 AI, Human–Robot Interaction, and Natural Language Processing
- 21 Social Media and Computer-Mediated Communication
- Index
- References
20 - AI, Human–Robot Interaction, and Natural Language Processing
from Part V - Advances in Multimodal and Technological Context-Based Research
Published online by Cambridge University Press: 30 November 2023
Summary
From an engineering perspective, an AI-driven (or AI-assisted) speech or dialogue system can be decomposed into a pipeline comprising some subset of three distinct processing activities: (1) speech processing, which turns sampled acoustic sound waves into enriched phonetic information through automatic speech recognition (ASR), and vice versa via text-to-speech (TTS); (2) natural language processing (NLP), which operates at both syntactic and semantic levels to get at the meanings of words as well as of the enriched phonetic information; and (3) dialogue processing, which ties both together so that the system can function within the specified latency and semantic constraints. This perspective allows for at least three levels of context. The lowest level is phonetic, where the fundamental components of speech are built from a time-sequence string of acoustic symbols (analyzed in ASR or generated in TTS). The next higher level of context is word- or character-level, normally framed as sequence-to-sequence modeling. The highest level of context typically used today keeps track of a conversation or topic. An even higher level of context, generally missing today but essential in the future, is that of our beliefs, desires, and intentions.
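The decomposition described in the summary can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the toy lexicon, the keyword-based intent rules, and every name in it (`TOY_LEXICON`, `recognize`, `interpret`, `respond`, `run_turn`, `DialogueContext`) are hypothetical stand-ins chosen only to show how the three stages and the three levels of context might fit together. A real system would use trained ASR and NLP models, and the TTS stage is omitted here.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

# Hypothetical toy lexicon: phoneme sequences -> words (stand-in for ASR).
TOY_LEXICON = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "EH", "DH", "ER"): "weather",
}


@dataclass
class DialogueContext:
    """Conversation-level context: past intents and the current topic."""
    history: List[str] = field(default_factory=list)
    topic: str = ""


def recognize(phoneme_groups: Sequence[Tuple[str, ...]]) -> List[str]:
    """Speech processing (phonetic-level context): symbols -> words."""
    return [TOY_LEXICON.get(tuple(group), "<unk>") for group in phoneme_groups]


def interpret(words: List[str]) -> str:
    """NLP (word-level context): words -> a coarse intent label."""
    if "weather" in words:
        return "ask_weather"
    if "hello" in words:
        return "greet"
    return "unknown"


def respond(intent: str, ctx: DialogueContext) -> str:
    """Dialogue processing: pick a reply using conversation-level context."""
    ctx.history.append(intent)
    if intent == "greet":
        return "Hello!"
    if intent == "ask_weather":
        ctx.topic = "weather"
        return "I cannot check the weather in this sketch."
    if ctx.topic:
        # Fall back on the remembered topic when the turn is not understood.
        return f"Could you say more about {ctx.topic}?"
    return "Sorry, I did not understand."


def run_turn(phoneme_groups: Sequence[Tuple[str, ...]], ctx: DialogueContext) -> str:
    """Run one turn through the pipeline: ASR -> NLP -> dialogue."""
    return respond(interpret(recognize(phoneme_groups)), ctx)
```

Calling `run_turn` repeatedly with the same `DialogueContext` shows the highest level of context currently in common use: an unrecognized utterance is answered with reference to the topic remembered from an earlier turn. Nothing in the sketch models beliefs, desires, or intentions, which is the level the summary identifies as generally missing today.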
- Type: Chapter
- Information: The Cambridge Handbook of Language in Context, pp. 436-454
- Publisher: Cambridge University Press
- Print publication year: 2023