Book contents
- Frontmatter
- Contents
- From the Editors
- Notes on Contributors
- 1 Introduction: Language Variation Studies and Computational Humanities
- 2 Panel Discussion on Computing and the Humanities
- 3 Making Sense of Strange Sounds: (Mutual) Intelligibility of Related Language Varieties. A Review
- 4 Phonetic and Lexical Predictors of Intelligibility
- 5 Linguistic Determinants of the Intelligibility of Swedish Words among Danes
- 6 Mutual Intelligibility of Standard and Regional Dutch Language Varieties
- 7 The Dutch-German Border: Relating Linguistic, Geographic and Social Distances
- 8 The Space of Tuscan Dialectal Variation: A Correlation Study
- 9 Recognising Groups among Dialects
- 10 Comparison of Component Models in Analysing the Distribution of Dialectal Features
- 11 Factor Analysis of Vowel Pronunciation in Swedish Dialects
- 12 Representing Tone in Levenshtein Distance
- 13 The Role of Concept Characteristics in Lexical Dialectometry
- 14 What Role does Dialect Knowledge Play in the Perception of Linguistic Distances?
- 15 Quantifying Dialect Similarity by Comparison of the Lexical Distribution of Phonemes
- 16 Corpus-based Dialectometry: Aggregate Morphosyntactic Variability in British English Dialects
15 - Quantifying Dialect Similarity by Comparison of the Lexical Distribution of Phonemes
Published online by Cambridge University Press: 12 September 2012
Summary
This paper describes a new method for quantifying the similarity of the lexical distribution of phonemes in different varieties of a language (in this case English). In addition to introducing the method, it discusses phonological problems which must be addressed if any comparison of this sort is to be attempted, and applies the method to a limited data set of varieties of English. Since the method assesses their structural similarity, it will be useful for analysing the historical development of varieties of English and the relationships (either as a result of common origin or of contact) that hold between them.
INTRODUCTION
In recent years considerable progress has been made in assessing the relationships between linguistic varieties by measuring the similarity between strictly comparable sets of phonetic data. In particular, measurement of Levenshtein Distance (see, for example, Nerbonne, Heeringa, and Kleiweg, 1999; Nerbonne and Heeringa, 2001; Heeringa, 2004) has proved useful for determining the relationships between closely related varieties. The ‘Sound Comparisons’ method for assessing the distance between varieties provides a very promising alternative technique for investigating the changing relationships between closely related and less closely related varieties (Heggarty, McMahon and McMahon, 2005; McMahon, Heggarty, McMahon and Maguire, 2007).
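As a rough illustration of the Levenshtein Distance mentioned above, the following minimal sketch counts the insertions, deletions and substitutions needed to turn one sequence of phonetic segments into another. It assumes unit costs for every operation and no length normalisation; the studies cited may well use more refined segment costs. The function name `levenshtein` and the two transcriptions are hypothetical, chosen only for illustration, and are not data from this chapter.

```python
def levenshtein(a, b):
    """Return the unit-cost edit distance between two sequences of segments."""
    # prev[j] holds the distance between the first i-1 segments of a
    # and the first j segments of b.
    prev = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        curr = [i]  # turning i segments of a into an empty sequence takes i deletions
        for j, seg_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if seg_a == seg_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# Hypothetical transcriptions of one word in two varieties (illustrative only).
variety_a = ["n", "a", "ɪ", "t"]
variety_b = ["n", "i", "x", "t"]
distance = levenshtein(variety_a, variety_b)
```

Working over lists of segments rather than raw strings keeps multi-character symbols, such as a vowel with a diacritic, together as single units of comparison. In dialectometric use, distances of this kind would typically be aggregated over many words per pair of varieties.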
Phonetic comparison algorithms of this sort are not, however, without their problems. Firstly, they often depend upon auditory phonetic transcriptions of one degree of fineness or another, with all the associated issues of transcriber isoglosses, inaccuracies and realism that this method brings (see Milroy and Gordon, 2003: 144–152 for a discussion of the issues).
- Type: Chapter
- Information: Computing and Language Variation: International Journal of Humanities and Arts Computing Volume 2, pp. 261-278
- Publisher: Edinburgh University Press
- Print publication year: 2009