Testing Theories with Bayes Factors

doi:10.1017/9781009010054.024

23 - Testing Theories with Bayes Factors

from Part IV - Statistical Approaches

Published online by Cambridge University Press: 25 May 2023

Zoltan Dienes

Edited by Austin Lee Nichols (Central European University, Vienna) and John Edlund (Rochester Institute of Technology, New York)


Summary

Bayes factors – evidence for one model versus another – are a useful tool in the social and behavioral sciences, partly because they can provide evidence for no effect relative to the sort of effect expected. By contrast, a non-significant result does not provide evidence for the null hypothesis tested. If non-significance does not in itself count against any theory predicting an effect, how could a theory fail a test? Bayes factors provide a measure of evidence from first principles. A severe test is one that is likely to obtain evidence against a theory if it were false – to obtain an extreme Bayes factor against the theory. Bayes factors show why cherry picking degrades evidence, how to deal with multiple testing, and how optional stopping is consistent with severe testing. Further, informed Bayes factors can be used to link theory tightly to how that theory is tested, so that the measured evidence does relate to the theory.
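As a rough illustration of how a Bayes factor can yield evidence for no effect relative to the sort of effect expected, the sketch below compares a point null hypothesis with an H1 whose plausible effect sizes follow a half-normal distribution scaled by the expected effect. The function names, the normal likelihood, the half-normal model of H1, and the numerical integration scheme are all assumptions made for this example; the summary does not specify them.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes_factor_half_normal(obs_mean, obs_se, h1_scale, steps=20000):
    """Bayes factor for H1 over H0 (illustrative helper, not from the chapter).

    H0: the true effect is exactly 0.
    H1: the true effect follows a half-normal prior with mode 0 and
        standard deviation h1_scale (roughly the effect size expected).
    The data are summarised by obs_mean with standard error obs_se,
    and the likelihood is assumed to be normal.
    """
    # Likelihood of the observed mean if the true effect is exactly 0.
    like_h0 = normal_pdf(obs_mean, 0.0, obs_se)

    # Likelihood under H1: average the likelihood over the prior,
    # using the midpoint rule on [0, 6 * h1_scale].
    upper = 6.0 * h1_scale
    width = upper / steps
    like_h1 = 0.0
    for i in range(steps):
        theta = (i + 0.5) * width
        prior = 2.0 * normal_pdf(theta, 0.0, h1_scale)  # half-normal density
        like_h1 += normal_pdf(obs_mean, theta, obs_se) * prior * width

    return like_h1 / like_h0


# Hypothetical data: an expected effect of about 5 units.
bf_null_like = bayes_factor_half_normal(obs_mean=0.0, obs_se=1.0, h1_scale=5.0)
bf_effect_like = bayes_factor_half_normal(obs_mean=5.0, obs_se=2.0, h1_scale=5.0)
print(f"Observed mean 0 (SE 1): BF = {bf_null_like:.2f}")
print(f"Observed mean 5 (SE 2): BF = {bf_effect_like:.2f}")
```

Under these assumptions, an observed mean near zero with a small standard error gives a Bayes factor well below 1, which counts as evidence for H0 over H1. A non-significant p-value alone cannot support that conclusion. The commonly used cut-offs of 3 and 1/3 are conventions, not part of the calculation.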

Keywords

Bayes Factor, Severe Test, Evidence, Multiple Testing, Optional Stopping, Cherry Picking, Priors

Type: Chapter
Information: The Cambridge Handbook of Research Methods and Statistics for the Social and Behavioral Sciences, Volume 1: Building a Program of Research, pp. 494–512

DOI: https://doi.org/10.1017/9781009010054.024

Publisher: Cambridge University Press

Print publication year: 2023

