This special issue brings together representative views on what has come to be
known as "best practice" in the development and evaluation of spoken language
dialogue systems (SLDSs). The issue was initiated in the context of the European
Esprit project DISC, which ran from June 1997 until February 2000. DISC's main
goal was to identify current practice in both the development and the evaluation of
SLDSs, in order to arrive at a useful definition and description of best practice. The
project has resulted in a collection of guidelines which are intended for different
target groups, in particular developers, deployers and customers.

DISC partners were: Natural Interactive Systems Laboratory, Odense University, Denmark
(coordination); Department of Speech, Music and Hearing (KTH), Stockholm, Sweden;
Human-Machine Communication Department, CNRS-LIMSI, Orsay, France; Institute
for Natural Language Processing (IMS), University of Stuttgart, Germany; Vocalis Ltd,
Cambridge, United Kingdom; DaimlerChrysler Research Center Ulm, Germany; and the
ELSNET foundation, Utrecht, The Netherlands.

Over the last few years, interest in SLDSs has increased enormously. At present
there is a large number of systems available, many of them for commercial use.
Their number is growing rapidly, and so are the variety of their functionalities and
the diversity of their application domains. The tasks that advanced systems are able
to perform are often more complex and less stereotypical, and are often carried out in
the context of several interconnected domains of application. With these advances
have come higher expectations of the naturalness and intelligence with which SLDSs
fulfill their assignments, and as a consequence interest in such systems has grown
even further, in both academic and commercial circles. As far as natural human-
system interaction is concerned, one significant change in SLDS design involves
the interaction between natural language understanding and dialogue management.
Here we see a clear tendency towards models that incorporate a substantial amount
of discourse semantics and make use of some conception of context-change. This
allows for more natural interactions between the system and its human users, due
on the one hand to the system's improved ability to compute the intended meaning
of the user's input and on the other to the increased sophistication of the strategies
it uses for planning its own responses. Such improved capacities are crucial when the
system is to leave more of the initiative to the user, instead of keeping the dialogue
on a narrowly circumscribed path of largely predictable exchanges. Further, there
is a tendency to combine spoken language human-system interaction with other
modalities of information exchange and representation (e.g., images and gestures),
which calls for both modality-specific and modality-integrating syntactic and semantic
processing capabilities. All these developments have led to a situation in which there
is a great need, shared by developers, deployers and customers alike, for effective
guidelines, which will enable them to make accurate and successful design and
implementation decisions, in accordance with a broad consensus on what constitutes best
practice in this particular engineering domain.