Quantifying Dialect Similarity by Comparison of the Lexical Distribution of Phonemes

15 - Quantifying Dialect Similarity by Comparison of the Lexical Distribution of Phonemes

Published online by Cambridge University Press: 12 September 2012

Warren Maguire

Edited by John Nerbonne, Charlotte Gooskens, Sebastian Kürschner and Renée van Bezooijen

Affiliations:
Warren Maguire: University of Edinburgh
John Nerbonne: University of Groningen
Charlotte Gooskens: University of Groningen
Sebastian Kürschner: Friedrich-Alexander-Universität Erlangen-Nürnberg
Renée van Bezooijen: University of Groningen

Summary

Abstract: This paper describes a new method for quantifying the similarity of the lexical distribution of phonemes in different varieties of a language (in this case English). In addition to introducing the method, it discusses phonological problems which must be addressed if any comparison of this sort is to be attempted, and applies the method to a limited data set of varieties of English. Since the method assesses their structural similarity, it will be useful for analysing the historical development of varieties of English and the relationships (either as a result of common origin or of contact) that hold between them.

INTRODUCTION

In recent years considerable progress has been made in assessing the relationships between linguistic varieties by measuring the similarity between strictly comparable sets of phonetic data. In particular, measurement of Levenshtein Distance (see, for example, Nerbonne, Heeringa, and Kleiweg, 1999; Nerbonne and Heeringa, 2001; Heeringa, 2004) has proved useful for determining the relationships between closely related varieties. The ‘Sound Comparisons’ method for assessing the distance between varieties provides a very promising alternative technique for looking into the changing relationships between closely related and not so closely related varieties (Heggarty, McMahon and McMahon, 2005; McMahon, Heggarty, McMahon and Maguire, 2007).

Phonetic comparison algorithms of this sort are not, however, without their problems. Firstly, they often depend upon auditory phonetic transcriptions of one degree of fineness or another, with all the associated issues of transcriber isoglosses, inaccuracies and realism that this method brings (see Milroy and Gordon, 2003: 144–152 for a discussion of the issues).
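
For readers unfamiliar with the Levenshtein Distance mentioned above, the following is a minimal sketch of the basic idea: counting the insertions, deletions and substitutions needed to turn one transcription into another. It is not the implementation used in the cited studies, which refine the measure in various ways. The function names, the unit costs, the normalisation by the longer transcription, and the example transcriptions are all assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two sequences of segments, with unit costs
    for insertion, deletion and substitution (an assumed simplification)."""
    # prev[j] holds the distance between the segments of `a` seen so far
    # and the first j segments of `b`.
    prev = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        curr = [i]
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # delete seg_a
                            curr[j - 1] + 1,      # insert seg_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def normalised_distance(a, b):
    """Distance divided by the longer length: one simple normalisation
    among several possible ones, assumed here for illustration."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Illustrative transcriptions only, not data from the chapter.
# Each transcription is a list of segments, so a long vowel counts as one unit.
bath_a = ["b", "a", "θ"]
bath_b = ["b", "ɑː", "θ"]
car_a = ["k", "a", "r"]
car_b = ["k", "ɑː"]

print(levenshtein(bath_a, bath_b))  # one substitution
print(levenshtein(car_a, car_b))    # one substitution plus one deletion
```

Treating each transcription as a list of segments rather than a plain string keeps multi-character symbols such as a long vowel from being counted as two separate units.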

Type: Chapter
Information: Computing and Language Variation: International Journal of Humanities and Arts Computing Volume 2, pp. 261–278

Publisher: Edinburgh University Press

Print publication year: 2009
