From data to speech: a general approach

M. THEUNE; E. KLABBERS; J. R. DE PIJPER; E. KRAHMER; J. ODIJK

doi:10.1017/S1351324901002625

Published online by Cambridge University Press: 26 April 2001

M. THEUNE, E. KLABBERS, J. R. DE PIJPER, E. KRAHMER: Affiliation:
IPO, Center for User-System Interaction, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
J. ODIJK: Affiliation:
Lernout & Hauspie Speech Products, Flanders Language Valley 50, 8900 Ieper, Belgium

Abstract

We present a data-to-speech system called D2S, which can be used for the creation of data-to-speech systems in different languages and domains. The most important characteristic of a data-to-speech system is that it combines language and speech generation: language generation is used to produce a natural language text expressing the system's input data, and speech generation is used to make this text audible. In D2S, this combination is exploited by using linguistic information available in the language generation module for the computation of prosody. This yields better prosodic output quality than a plain text-to-speech system can achieve. For language generation in D2S, the use of syntactically enriched templates is guided by knowledge of the discourse context, while for speech generation, pre-recorded phrases are combined in a prosodically sophisticated manner. This combination of techniques makes it possible to create linguistically sound but efficient systems with high-quality language and speech output.

Type: Research Article
Information: Natural Language Engineering, Volume 7, Issue 1, March 2001, pp. 47-86
