Building the Study

doi:10.1017/9781009010054.007

6 - Building the Study

from Part I - From Idea to Reality: The Basics of Research

Published online by Cambridge University Press: 25 May 2023

Martin Schnuerch and Edgar Erdfelder

Edited by Austin Lee Nichols and John Edlund

Austin Lee Nichols, Affiliation: Central European University, Vienna
John Edlund, Affiliation: Rochester Institute of Technology, New York

Summary

This chapter discusses the key elements involved when building a study. Planning empirical studies presupposes a decision about whether the major goal of the study is confirmatory (i.e., tests of hypotheses) or exploratory in nature (i.e., development of hypotheses or estimation of effects). Focusing on confirmatory studies, we discuss problems involved in obtaining an appropriate sample, controlling internal and external validity when designing the study, and selecting statistical hypotheses that mirror the substantive hypotheses of interest. Building a study additionally involves decisions about the to-be-employed statistical test strategy, the sample size required by this strategy to render the study informative, and the most efficient way to achieve this so that study costs are minimized without compromising the validity of inferences. Finally, we point to the many advantages of study preregistration before data collection begins.

Keywords

Validity of Studies; Hypothesis Testing; Estimation; Sampling Strategies; Power Analysis; Preregistration

Type: Chapter
Information: The Cambridge Handbook of Research Methods and Statistics for the Social and Behavioral Sciences, Volume 1: Building a Program of Research, pp. 103–124

DOI: https://doi.org/10.1017/9781009010054.007

Publisher: Cambridge University Press

Print publication year: 2023
