Published online by Cambridge University Press: 29 November 2017
The article contributes new data and findings to the growing field of corpus-based dialect syntax research. The paper focuses on variation in ‘need’-constructions (tarvis/vaja olema + nominal complement/infinitive ‘need to’), using data from the Corpus of Estonian Dialects. Our purpose is to demonstrate the complex nature of syntactic variation, which is constrained geographically, individually, or by language-internal factors. The study takes a corpus-based quantitative approach to observing the geographical spread of linguistic units. We apply conditional inference tree and random forest models to capture the (co)varying parts of the construction studied. Our results show that variation in different parts of the construction is influenced by different factors, both geographical and language-internal. Lexical variation (adverb tarvis ‘need’ or vaja ‘need’) and omission of the copula are clearly geographically distributed, while omission of the experiencer is determined mainly by language-internal factors. However, the study also finds extensive inter-individual differences.