Active Maintenance: A Proposal for the Long-Term Computational Reproducibility of Scientific Results

Published online by Cambridge University Press: 23 April 2021

Limor Peer
Affiliation:
Yale University
Lilla V. Orr
Affiliation:
Yale University
Alexander Coppock
Affiliation:
Yale University

Abstract

Computational reproducibility, or the ability to reproduce analytic results of a scientific study on the basis of publicly available code and data, is a shared goal of many researchers, journals, and scientific communities. Researchers in many disciplines including political science have made strides toward realizing that goal. A new challenge, however, has arisen. Code too often becomes obsolete within only a few years. We document this problem with a random sample of studies posted to the Institution for Social and Policy Studies (ISPS) Data Archive; we encountered nontrivial errors in seven of 20 studies. In line with similar proposals for the long-term maintenance of data and commercial software, we propose that researchers dedicated to computational reproducibility should have a plan in place for “active maintenance” of their analysis code. We offer concrete suggestions for how data archives, journals, and research communities could encourage and reward the active maintenance of scientific code and data.

Type
Article
Copyright
© The Author(s), 2021. Published by Cambridge University Press on behalf of the American Political Science Association
