Finite state segmentation of discourse into clauses

EVA EJERHED

doi:10.1017/S1351324997001629

Published online by Cambridge University Press: 01 December 1996

Affiliation: Department of Linguistics, University of Umeå, S-90187 Umeå, Sweden.

Abstract

The paper presents the background and motivation for a processing model that segments discourse into simple, non-nested clauses before clause-internal phrasal constituents are recognized, together with experimental results that support the model. One set of results comes from a statistical reanalysis of the Swedish empirical data in Strangert, Ejerhed and Huber (1993) on the linguistic structure of major prosodic units. The other comes from experiments in segmenting part-of-speech annotated Swedish text corpora into clauses with a new clause segmentation algorithm. The clause-segmented corpus data is taken from two sources: the Stockholm Umeå Corpus (SUC), 1 million words of Swedish text from different genres, part-of-speech annotated by hand; and the Umeå corpus DAGENS INDUSTRI 1993 (DI93), 5 million words of Swedish financial newspaper text, processed fully automatically by tokenizing, lexical analysis, and probabilistic POS tagging. The two experiments show that the proposed clause segmentation algorithm is 96% correct when applied to manually tagged text and 91% correct when applied to probabilistically tagged text.

Type: Research Article
Information: Natural Language Engineering, Volume 2, Issue 4, December 1996, pp. 355-364
