
Robust discourse parsing via discourse markers, topicality and position

Published online by Cambridge University Press:  21 August 2002

FRANK SCHILDER
Affiliation:
Department for Informatics, University of Hamburg, Vogt-Kölln-Str. 30, 22527 Hamburg, Germany

Abstract

This paper describes a simple discourse parsing and analysis algorithm that combines a formal underspecification utilising discourse grammar with Information Retrieval (IR) techniques. First, linguistic knowledge based on discourse markers is used to constrain a totally underspecified discourse representation. Then, the remaining underspecification is further specified by the computation of a topicality score for every discourse unit. This computation is done via the vector space model. Finally, the sentences in a prominent position (e.g. the first sentence of a paragraph) are given an adjusted topicality score. The proposed algorithm was evaluated by applying it to a text summarisation task. Results from a psycholinguistic experiment, indicating the most salient sentences for a given text as the ‘gold standard’, show that the algorithm performs better than commonly used machine learning and statistical approaches to summarisation.
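The topicality and position steps can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the vectors are weighted, what each discourse unit is compared against, or how the positional adjustment is computed. The sketch assumes raw term-frequency vectors, cosine similarity between each sentence and the whole document, and a multiplicative boost for the first sentence of each paragraph. The names `tokenize`, `cosine`, `topicality_scores` and the parameter `position_boost` are hypothetical, and the discourse-marker step is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topicality_scores(paragraphs, position_boost=1.5):
    """Score every sentence against the whole document.

    paragraphs: list of paragraphs, each a list of sentence strings.
    position_boost: assumed multiplier for paragraph-initial sentences;
        the value 1.5 is an illustration, not taken from the paper.
    Returns a list of (sentence, score) pairs in document order.
    """
    # Document vector: term frequencies over all sentences.
    doc_vec = Counter(
        tok for para in paragraphs for sent in para for tok in tokenize(sent)
    )
    scores = []
    for para in paragraphs:
        for i, sent in enumerate(para):
            score = cosine(Counter(tokenize(sent)), doc_vec)
            if i == 0:  # prominent position: first sentence of a paragraph
                score *= position_boost
            scores.append((sent, score))
    return scores


if __name__ == "__main__":
    text = [
        ["Discourse markers constrain the structure.", "The weather was nice."],
        ["Markers signal discourse relations."],
    ]
    for sentence, score in topicality_scores(text):
        print(f"{score:.3f}  {sentence}")
```

Sorting the returned pairs by score and keeping the top few would give a crude extractive summary, which is the kind of task the abstract says was used for evaluation.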

Type
Research Article
Copyright
2002 Cambridge University Press
