Comparing Bayesian and Frequentist Models of Language Variation

doi:10.1017/9781108589314.009

8 - Comparing Bayesian and Frequentist Models of Language Variation

The Case of Help + (to-)Infinitive

from Part III - Perspectives on Multifactorial Methods

Published online by Cambridge University Press: 06 May 2022

Natalia Levshina

Edited by Ole Schützler (Universität Leipzig) and Julia Schlüter (Universität Bamberg)

Summary

This chapter compares standard frequentist and more recent Bayesian approaches to logistic regression analyses. Starting out from a multifactorial case study of the verb help complemented by either the bare infinitive or the to-infinitive, the key components and the main conceptual differences of frequentist and Bayesian inference are discussed. Conceptually, the Bayesian rationale of directly testing hypotheses on the effects of multiple factors on an outcome variable is argued to be preferable and more sensitive than the conventional approach of testing null hypotheses. On the practical side, Bayesian statistics enables the researcher to recycle and integrate the results of previous analyses based on different datasets as informative priors, which can help improve and stabilize statistical modelling. Recourse to prior research can thus produce synergies and reduce data preparation expense. In cases of data sparsity, it can by the same token enable researchers to analyse small samples. Bayesian methods are thus put forward as powerful tools for overcoming the limitations of isolated corpus studies and for promoting synergies between data collected by individual researchers.

Keywords

Bayesian inference; sample size; posteriors; MCMC algorithm; credible interval; confidence interval; horror aequi

Type: Chapter
Information: Data and Methods in Corpus Linguistics: Comparative Approaches, pp. 224–258

DOI: https://doi.org/10.1017/9781108589314.009

Publisher: Cambridge University Press

Print publication year: 2022
