Introduction
Retraction of inaccurate scientific articles is critical to ensuring the integrity of research and published literature. Incorrect findings, especially those that go unnoticed for months or years, may have broad repercussions in areas affecting human health, including clinical practice, drug discovery, and public policy. Reasons for retraction may be multifaceted, ranging from honest mistakes to egregious ethical and scientific misconduct [Reference Chen, Hu, Milbank and Schultz1].
The emergence of digitized publication databases in the 1980s and 1990s, such as PubMed, MEDLINE, and Embase, made possible the formal study of retractions. Chen and colleagues (2013) found that retractions in PubMed increased 8-fold from 2001 to 2011 [Reference Chen, Hu, Milbank and Schultz1]. An analysis of MEDLINE similarly found that retractions increased substantially, from 0.002% in the early 1980s to approximately 0.02% in 2005–2009 [Reference Wager and Williams2]. Recent studies across a range of disciplines have found that retractions may be increasing disproportionately to the number of articles published [Reference Al-Ghareeb, Hillel and McKenna3–Reference Wang, Ku, Alotaibi and Rutka5]. Growing interest in studying retractions and increased retraction volume has led to the creation of databases of retracted articles, such as the Retraction Watch Database (RWDB) [6].
Numerous articles have studied retractions within clinical areas, including cardiovascular medicine, radiology, surgery, nursing, cancer research, obstetrics, oncology, dentistry, and most recently, COVID-19 [Reference Al-Ghareeb, Hillel and McKenna3–Reference Wang, Ku, Alotaibi and Rutka5,Reference Audisio, Robinson and Soletti7–Reference Shimray13]. Many of these studies have analyzed the metadata from the RWDB or citation indices. More general investigations of retractions across biomedical sciences are often limited to article characteristics such as time to retraction, number of authors, country of origin, or reason for retraction as coded by the Retraction Watch (RW) team [Reference Fang, Steen and Casadevall14,Reference Peng, Romero and Horvat15]. More granular investigation of the full text of the retracted articles may be prohibitively labor intensive; to date, most studies examining the full text tend to include fewer articles and, therefore, have limited generalizability.
Across fields and inquiries, research misconduct consistently emerges as a common reason for retraction [Reference Bozzo, Bali, Evaniew and Ghert4,Reference Audisio, Robinson and Soletti7,Reference Fang, Steen and Casadevall14,Reference Budd, Sievert and Schultz16–Reference Zhao, Dai, Lun and Gao23]. The identification of research misconduct, its prevalence, and its prevention have received substantial attention [Reference Fang, Steen and Casadevall14,Reference Nath, Marcus and Druss20,Reference Stretton, Bramich and Keys22,Reference Van Noorden24]. In contrast, an estimated 21%–62% of retractions are related to unintentional errors [Reference Wager and Williams2,Reference Bozzo, Bali, Evaniew and Ghert4,Reference Fang, Steen and Casadevall14,Reference Damineni, Sardiwal, Waghle and Dakshyani17,Reference Gaudino, Robinson and Audisio18,Reference Nath, Marcus and Druss20,Reference Hosseini, Hilhorst, de Beaufort and Fanelli25]; the large range may be due in part to the difficulty of inferring authors’ intentions from retraction notices.
With the growth of team science and big data, the increased complexity of research may make preventable errors, such as those involving analytic methods, more likely to occur. To our knowledge, there has been no comprehensive study of articles in the biomedical literature that were retracted owing to mistakes in data capture, management, and/or analysis. This omission is critical: characterizing methodological and analytic mistakes is an essential step in improving their detection and prevention, thereby benefiting authors, reviewers, editors, and, ultimately, patient care, public policy, and human health.
We therefore conducted a first-of-its-kind scoping review of articles published from January 1, 2011 to January 31, 2020 that were subsequently retracted for reasons related to data capture, management, and/or analysis but not gross misconduct. Our scoping review builds on the existing literature in three important ways. First, we considered articles published in clinical and translational research, which includes a broad collection of articles across basic science, clinical medicine, and public health. Second, we extracted detailed information about methods, such as study design, how data were obtained, and statistical software, from the articles’ full text. Third, we reviewed retraction notices to categorize who initiated the retraction, author involvement, and high-level categories of the types of errors that occurred. Our review summarizes problems in the research pipeline related to the capture, management, and/or analysis of data so that authors, reviewers, editors, and publishers may consider steps to better detect and avoid these preventable errors.
Methods
Study design
The scoping review complied with Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines and followed a preregistered protocol [Reference Baldridge, Welty, Rasmussen and Whitley26,Reference Page, McKenzie and Bossuyt27]. The corresponding PRISMA-ScR checklist is included in the supplemental materials (Online Supplement).
Searches
The website RW was launched in 2010 as an initiative to assist the scientific community, and in 2014 became part of The Center for Scientific Integrity [28]. The RWDB, launched in 2018, is an index of retractions and at the time of data request was publicly available subject to a data use agreement [6]. The RWDB comprises a systematic and comprehensive compendium of retracted articles, including a detailed ontology to classify and describe the retracted articles. At the time of this writing, the RWDB included over 43,000 records and has been cited in over 140 research articles that aim to evaluate and understand trends, practices, and behaviors around retractions.
A total of 21,252 records were retrieved from the RWDB, current to March 12, 2020. Each record includes publication information (article title, journal, authors, publication date, DOI [digital object identifier], URL, PubMed ID), retraction information (date, retraction DOI, coded reasons), and coded subject lists (e.g., Business – Accounting, Neuroscience, History – Asia, Geology). Coded subjects and retraction reasons are applied to each retracted article by RW staff from prespecified banks of possible codes.
Inclusion and exclusion criteria
Identification of the articles and retraction notices for inclusion was a four-step process: (1) data from the RWDB were reviewed to identify abstracts eligible for review; (2) abstracts were reviewed in duplicate to identify articles eligible for review; (3) we reviewed the retraction notices for all eligible articles; and (4) we reviewed articles if the full text could be located. Inclusion and exclusion criteria for each step are described below:
RWDB
-
Year of Publication: RWDB records were subset to articles published on or after January 1, 2011. We chose this range to balance having a large sample of retracted articles with relatively recent articles.
-
Subject Lists: We tabulated the frequency of each subject and reviewed the list for applicability (Supplemental Table 1). We first subset retraction records to include records associated with a subject of interest (e.g., Biology – Cancer, Neuroscience). Second, from this selection, we excluded records for which the subject list contained terms unrelated to human subjects, medical or clinical research, or the practice of human subjects research (e.g., Foreign Aid, Astrophysics).
-
Retraction Reasons: We tabulated the frequency of each retraction reason and reviewed the list for applicability (Supplemental Table 2). Retraction records were first subset to include records associated with a reason of interest (e.g., Concerns/Issues About Data, Error in Analyses). Second, we excluded records for which the retraction reasons contained terms indicating gross misconduct (e.g., Falsification/Fabrication of Data, Ethical Violations by Author).
Abstracts
-
English Language: Abstracts published in English were eligible.
-
Human Subjects Research: Abstracts reporting research on human subjects were eligible.
-
Clinical and Translational Research: For abstracts that did not explicitly report human subjects research, we determined if they were reporting clinical and translational research following guidelines published by the National Institutes of Health. We excluded abstracts that reported “basic research,” but we did include abstracts that reported “preclinical research,” defined as connecting “basic science of disease with human medicine” [29].
Retraction notices
-
Retraction notices matching the DOI in the RWDB were reviewed for all eligible articles.
Full-text articles
-
Eligible articles were reviewed if the full-text article matching the DOI in the retraction notice could be accessed by the study team using journal subscriptions available through Northwestern University’s library system.
Data collection and management
The Research Electronic Data Capture (REDCap) tool hosted at Northwestern University was used throughout the review processes for data entry and importing article information from the RWDB (Online Supplement) [Reference Harris, Taylor, Thielke, Payne, Gonzalez and Conde30].
Abstracts
Each abstract was located by searching for the DOI or article title as recorded in the RWDB. Abstracts were located and assessed for inclusion in duplicate by two independent reviewers (ASB, LVR, EWW, or LJW; Online Supplement). Conflicting decisions were resolved through review by a third team member, followed by discussion with all four reviewers.
Retraction notices
The Committee on Publication Ethics (COPE) guidelines were used to inform a framework for qualitative review of the retraction notices (Online Supplement) [31]. Information on the involvement of authors, editors, journals, and publishers in the retraction process was extracted by a single reviewer (ASB, GCB, OMF, LVR, or LJW). The retraction notices were examined in duplicate by two independent reviewers (ASB, GCB, OMF, LVR, or LJW) to qualitatively code if the underlying reasons for retraction were related to either (1) generating or acquiring data and/or (2) preparing or analyzing data, defined below.
-
Generating or acquiring data: Examples include laboratory error, sample contamination, incorrect articles included in a meta-analysis, wrong cell types, incorrect patient identification for a case, incorrect data pulled from an electronic health record or other system, misinterpretation of diagnoses or tests in the data pull, unreliable data or concerns about data, error in data, or loss of data. This category also includes instances for which investigators regenerated data that were inconsistent with the original data, or if there was a problem with data storage (i.e., acquiring, saving, and retaining).
-
Preparing or analyzing data: Examples include data preparation, data cleaning, data normalization, unit conversion, incorrect data merge, variable coding, statistical analysis, or incorrect standard errors. This category also includes instances for which concerns were noted about results, as long as the wording suggested that concerns were related to data analysis rather than benchtop work or data generation.
When retraction notices did not map to either of the coded categories, any other reasons for retraction were excerpted into an “Other” category.
-
“Other” reasons: Examples include general statements about “could not be replicated” that do not refer specifically to data or results, questions about the integrity of the data (not about the process generating the data), and duplicate figures or articles published in another context.
Conflicting decisions were resolved through consensus review.
Full-text articles
A single team member (ASB, GCB, OMF, LVR, or LJW) searched by DOI or article title, reviewed the retrieved article to ensure that it was the version that was retracted, and uploaded the article to the REDCap database. The team member then extracted the following information (Online Supplement):
-
Authorship: The number of authors and their contributions.
-
Data: Whether data were collected de novo or previously collected, how data were captured and stored, and if data were available publicly or available upon request.
-
Study Design: Whether the study was a systematic review or meta-analysis, animal study, and/or human subjects research; the specific study design for human subjects research.
-
Methods and Analysis: If a statistical analysis plan was prespecified; what software was used for data analysis; and any other information about reproducible research (e.g., availability of code or software).
Data extraction was restricted to information contained within the full text or the supplemental materials available with the original publication. A random sample (n = 44, 6%) of full-text articles was reviewed and data were extracted in duplicate (LVR and LJW). Discordant data from this verification process were evaluated by all reviewers.
Statistical analyses
Initial data management and application of inclusion and exclusion criteria on raw data from the RWDB were performed using SAS v9.4 (SAS Institute Inc., Cary, NC). All other analyses were performed using Stata v17 (StataCorp LLC, College Station, TX) or R v4.0.1 (https://www.R-project.org/). The manuscript was prepared using StatTag [Reference Welty, Rasmussen, Baldridge and Whitley32]. All continuous variables are summarized with medians and interquartile ranges. Categorical variables such as article characteristics and retraction reasons are summarized with frequencies and percentages. We used Kappa statistics to summarize concordance for data collected in duplicate: whether or not abstracts described clinical and translational research; whether retraction reasons were related to generating or acquiring data or to preparing or analyzing data; and data extracted from a random subset of n = 44 full-text articles. We reviewed the Kappa statistics for patterns based on reviewer dyads or article characteristics.
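To make the concordance measure concrete, the following minimal sketch computes Cohen's kappa for two reviewers' binary codes. The reviewer data here are hypothetical, not drawn from the review itself; the statistic compares observed agreement with the agreement expected by chance from each reviewer's marginal frequencies.

```python
# Illustrative sketch: Cohen's kappa for two reviewers coding the same items.
# The example codes below are hypothetical, not data from this review.
from collections import Counter

def cohens_kappa(r1, r2):
    """Cohen's kappa for two equal-length sequences of labels."""
    assert len(r1) == len(r2) and len(r1) > 0
    n = len(r1)
    # Observed proportion of items on which the reviewers agree
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement from each reviewer's marginal label frequencies
    c1, c2 = Counter(r1), Counter(r2)
    expected = sum(c1[lab] * c2[lab] for lab in set(r1) | set(r2)) / (n * n)
    return (observed - expected) / (1 - expected)

# Two hypothetical reviewers coding 10 retraction notices (1 = error present)
reviewer1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
reviewer2 = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]
print(round(cohens_kappa(reviewer1, reviewer2), 2))  # moderate agreement
```

Because kappa discounts chance agreement, two reviewers who agree 80% of the time can still yield only moderate kappa when one code dominates, which is one reason the review examined kappa by reviewer dyad and article characteristics.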
Results
Eligible publications and reviewer agreement
Of 21,252 records retrieved from the RWDB, 1,266 (6%) were eligible for abstract review, and 884 (70%) of these abstracts were in English and related to clinical and translational research (Fig. 1). All 884 retraction notices were reviewed in duplicate for retraction reason. Of the 884 eligible abstracts, 786 (89%) had full article text available online through Northwestern University’s library system.
Agreement during the entire review process ranged from moderate to almost perfect. During abstract review, there was substantial agreement about what constituted “clinical and translational research” (κ = 0.66). For the 170 abstracts (13% of 1,266) for which the two initial reviewers disagreed about “clinical and translational research,” the team reviewed and resolved differences by consensus. When coding retraction reasons from retraction notices, there was moderate agreement regarding whether the retraction was related to generating or acquiring data (κ = 0.50) and substantial agreement regarding whether the retraction was related to preparing or analyzing the data (κ = 0.67). The team reviewed the 281 (32% of 884) retraction notices for which there were disagreements regarding retraction reason and resolved differences by consensus. Among the random sample (n = 44 [6%]) of full-text articles for which data were extracted in duplicate, agreement between the blinded reviewers ranged from substantial to almost perfect (κ = 0.73 to 0.97; Supplemental Table 3). There was some variability in the ability to locate articles based on DOI; 3 of the 44 articles (7%) were found by one reviewer but not by the other.
Characteristics of retraction notices
Among 884 retractions, the median (interquartile range) time from publication to retraction was 1.0 years (0.4–2.4). The median (interquartile range) word count of the retraction notices was 109 (70–169) (Table 1).
1 Response categories were not mutually exclusive.
2 e.g., “We retract this article…”
3 e.g., institutions, medical centers, professional organizations.
Entities involved in the retraction
Among 884 notices, author involvement in the retraction process was common. For most retractions, authors were named as the entity retracting the article (n = 512, 58%). Authors were also the most common entity to initiate the retraction process (n = 358, 40%). Authors were usually involved in the retraction and did not explicitly disagree with retracting the article (n = 697, 79%), but in 7% of retractions (n = 65), authors were involved and at least one disagreed. Of note, 12% of retraction notices (n = 105) did not address author involvement.
The next most common entities named in retracting the article were editors (n = 381, 43%) and publishers (n = 139, 16%). Several journals used a standard format for their retraction notices that consistently listed all three – authors, editors, and publishers – as formally retracting the original article. Nearly one-third of retraction notices did not state who initiated the retraction (n = 277, 31%).
Types of errors
Among 884 retraction notices, 42% (n = 367) described problems with generating or acquiring data, and 28% (n = 248) described problems with preparing or analyzing data. Both types of errors were described in 31 notices (4%) (included in the percentages above). Only 6 retraction notices contained insufficient information to make any determination about the retraction reason (e.g., “this article has been withdrawn”) [Reference Kubo, Nagao and Mori33]; another 294 described reasons that were unrelated to our categories, such as problems with original study design or conduct, or that were too vague to classify (“wrong content with serious consequences;” “serious scientific errors”) [34,35]. Table 2 provides illustrative excerpts of types of errors.
Among the notices that identified problems with generating or acquiring data, reasons ranged from problems with instrumentation or measurement (“technical error in the measurement”) [36] to misidentified study subjects (“one of the cell lines […] had been unintentionally misidentified;” “transgenic mice reported […] were misidentified;” “incorrect cohort identification (ie, we missed many patients […])”) [Reference He, Chen, Cai and Chen37–Reference Schopfer, Takemoto and Allsup39]. Problems with incorrect data entry (“incorrectly entered data for six subjects;” “some data points that should have been entered as a positive result were instead entered as having a negative result”) [Reference Stein, Ramirez and Heinrich40,41] might have been prevented by more robust data capture and data quality checking procedures. In some instances, notices stated that data were no longer available or lost.
Among the retraction notices that identified problems with preparing or analyzing data, errors ranged from simple to complex. Some retraction notices described misclassification errors that are easy to make but may reverse a finding, such as miscoding a binary variable (“the responses for ‘attitude’ and ‘intention’ measures were switched;” “the experimental and control groups were inadvertently switched;” “the assignment was made incorrectly and resulted in a reversed coding of the study groups”) [42–Reference Aboumatar, Naqibuddin and Chung44]. Other errors included inappropriate selection of statistical methods (“the model did not include random slopes;” “immortality bias within the findings;” “did not adequately limit the impact of outlier data points”) [45–Reference Manning, Dixit, Satthenapalli, Katare and Sutherland47]. Although some retraction notices explained the specific statistical error, other descriptions were nonspecific, providing little insight into the root cause (“the authors discovered statistical errors”) [48].
Characteristics of retracted articles
Authorship
Among 786 full-text articles that were reviewed, most had multiple authors, but only 302 (38%) included a statement of authors’ contributions. The median (interquartile range) number of individual authors was 6 (4–8) (Table 3). Very few retracted articles included a consortium in the author list (n = 11, 1%).
1 Response categories for these questions were not mutually exclusive.
2 e.g., Comprehensive Meta-Analysis, Origin, SPM (Statistical Parametric Mapping), StatView.
Data
Although most articles collected primary data, there were few details about methods for data capture or data availability. More than three-quarters of articles reported collecting data de novo (n = 608, 77%). The remaining articles included data that were either previously collected for research purposes (n = 91, 12%), such as publicly available datasets (e.g., NHANES, BRFSS) and meta-analyses, or data that were previously collected but not necessarily for research purposes (n = 87, 11%), such as electronic health records and insurance claim data.
Of the 608 articles that collected data de novo, most did not report the methods used for data collection and storage (n = 580, 95%). A few articles reported using Microsoft Excel spreadsheets or other editable files (e.g., CSV or tab delimited) for data collection (n = 18, 3%); less than 1% reported using data collection tools such as REDCap, clinical trial management software, or database programs such as Microsoft Access or SQL.
Most articles (n = 698, 89%) did not include any statements about data availability. Fewer than 1 in 10 articles included data that were publicly available (i.e., either the original data source was public or the authors made their de novo data available). Less than 3% of articles (n = 21) explicitly stated that the data were available upon request.
Study design
Approximately three-quarters of articles involved human research (n = 571, 77%), more than a quarter involved animal research (n = 213, 29%), and 47 (6%) were systematic reviews or meta-analyses (Table 3). Within the retracted articles coded as human research, 316 (55%) included an observational study, 213 (37%) described benchtop data, and 80 (14%) involved a clinical trial based on the current National Institutes of Health definition [49].
Methods and analysis
Details that support replication and reproduction were infrequently reported. More than one-third of the retracted articles did not specify the statistical analysis software used (n = 300, 38%). The most common programs were SPSS/PASW (n = 229, 29%) and GraphPad PRISM (n = 86, 11%). Statistical software such as Stata (n = 43, 5%), SAS (n = 39, 5%), and R or RStudio (n = 29, 4%) was infrequently reported. A total of 12 articles (2%) mentioned a prespecified statistical analysis plan, and only five stated it was publicly available (Table 3). Only nine articles (1%) mentioned additional tools to support reproducible analyses and transparency; four of these specifically referenced the Open Science Framework, and the others reported sharing code or scripts [Reference Foster and Deardorff50].
Discussion
To our knowledge, this is the first comprehensive scoping review of articles in clinical and translational research that were retracted for errors in data capture, management, and/or analysis. Among the retracted articles, we observed a pervasive lack of reporting on data capture, management, storage, and statistical software. While some retraction notices provided detailed information about the discovery and provenance of errors, others provided limited or no actionable information for other investigators to learn from.
Reasons for retracting articles
Our findings highlight the need for greater attention to data acquisition. More than two-fifths of retraction notices (42%) described issues with generating or acquiring data. Similar to the 87% reported in an analysis of MEDLINE retraction notices [Reference Wager and Williams2], the majority of retracted articles in our review (77%) involved de novo data collection. However, only a small fraction of these articles (5%) described how de novo data were captured and stored. Among those few articles that did, more named all-purpose business spreadsheet tools, such as Microsoft Excel, than programs specifically designed for robust research data capture, such as REDCap or clinical trial management systems. Research teams should consider the capture and storage of their (hard won) research data a critical, if not especially exciting, step in the research process, and one that will require even more attention as data sets become larger and data elements more complex.
Data sharing was the exception rather than the norm – only one in ten articles made statements about data availability; even fewer stated data were publicly available. We expect these percentages will increase as many entities within the scientific community, including journals and funding agencies, require that data be made available through supplemental files or data repositories. For example, the National Institutes of Health recently issued a policy requiring data management and sharing plans in all grant applications submitted on or after January 25, 2023 [51]. However, in order for shared data to facilitate secondary analyses, they must be accompanied by accurate codebooks and detailed documentation. Data sharing makes it possible to uncover errors in previous analyses, but it also allows errors from data collection to propagate beyond the original investigation. Data with protected health information and personally identifying information – common in clinical and translational research – require special considerations when sharing [Reference Rodriguez, Tuck and Dozier52,53].
Whether or not a research group intends to share its data, collecting and documenting it as if it will be shared will only benefit the work. Data management software, such as REDCap, that includes data validation, built-in data quality checks, and options for double data entry, may help avoid errors in data entry and improve data quality. In addition, many of these programs can automatically generate codebooks and well-documented datasets. The documentation may reduce data preparation errors such as miscoding a binary variable (e.g., reversing negative/positive, present/absent).
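The field-level validation that such data management software applies at entry time can be sketched as follows. This is a hypothetical illustration of the general idea (range checks on numeric fields, restricted categories that guard against miscoded study arms); the field names, types, and ranges are invented, not taken from REDCap or from any article in the review.

```python
# Hypothetical sketch of entry-time validation rules of the kind that data
# capture systems such as REDCap enforce. Field names and ranges are
# illustrative only.
def validate_record(record, rules):
    """Return a list of human-readable problems found in one data record."""
    problems = []
    for field, (ftype, allowed) in rules.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not isinstance(value, ftype):
            problems.append(f"{field}: expected {ftype.__name__}")
        elif ftype in (int, float) and not (allowed[0] <= value <= allowed[1]):
            problems.append(f"{field}: {value} outside range {allowed}")
        elif ftype is str and value not in allowed:
            problems.append(f"{field}: {value!r} not an allowed category")
    return problems

rules = {
    "age":   (int, (18, 110)),                 # plausibility range check
    "group": (str, {"control", "treatment"}),  # guards against miscoded arms
}

# An implausible age is flagged at entry rather than discovered after analysis
print(validate_record({"age": 203, "group": "control"}, rules))
```

Catching an out-of-range value or a mistyped group label at the moment of entry is far cheaper than discovering, post-publication, that study arms were reversed.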
Software is an integral part of the research process, yet over one-third (38%) of the retracted articles did not specify the statistical analysis software used, and only a few articles (1%) reported sharing code or using specific tools that support reproducing analyses [Reference Katz, Chue Hong and Clark54]. Our findings are consistent with a recent study that found open source software such as Python and R is cited far less frequently in retracted articles than software such as SPSS and GraphPad PRISM [Reference Schindler, Yan, Spors and Krüger55].
At the time of this writing, a vast array of software tools are available that support reproducible research [Reference Alston and Rick56]. For example, tools such as R Markdown, Jupyter Notebooks, SAS ODS, and StatTag can all be used to connect manuscript text to analytic results and output, and therefore avoid situations in which an article reports results that are not supported by statistical output [Reference Welty, Rasmussen, Baldridge and Whitley32,Reference Alston and Rick56–59]. Tools such as Open Science Framework, mentioned in four retracted articles, provide online and open project documentation and management [Reference Foster and Deardorff50]; GitHub and Code Ocean are online platforms for sharing and running analytic code [60,Reference Clyburne-Sherin, Fei and Green61]. Although it is not standard practice to cite software tools that support reproducibility and transparency, our results suggest they should be used more widely and routinely. Citing their use may also encourage others to adopt similar tools.
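The principle shared by these tools can be shown in a few lines: a reported statistic is computed by code and injected into the manuscript text, never transcribed by hand, so the prose cannot drift out of sync with the analysis. The data and template below are invented for illustration; they are not how StatTag or R Markdown are implemented.

```python
# Illustrative sketch of the literate-reporting idea behind tools such as
# StatTag and R Markdown: computed results flow into manuscript text.
# The data values and sentence template below are hypothetical.
from statistics import median

# Hypothetical publication-to-retraction lags (years) for five articles
retraction_lag_years = [0.4, 0.7, 1.0, 1.8, 2.4]

# The manuscript sentence holds a placeholder, filled by the analysis itself
template = "The median time from publication to retraction was {lag} years."
report = template.format(lag=median(retraction_lag_years))
print(report)
```

If the underlying data change, rerunning the pipeline regenerates the sentence, eliminating the class of transcription errors in which reported numbers disagree with the statistical output.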
We also recommend that investigators consider using reproducibility checklists or developing their own, both to guard against errors leading to retraction and to streamline their research workflow. Checklists vary across disciplines, but typically include reporting prompts to ensure sufficient methodological detail is provided (e.g., are the methods for imputation described?) [Reference Considine and Salek62]. Other checklists describe concrete, actionable items related to data capture, file organization, data documentation, computing environments, software used, and data analysis [63,Reference Du, Aristizabal-Henao, Garrett, Brochhausen, Hogan and Lemas64]. It may help investigators to take a “pragmatic” approach to reproducibility and broadly consider how to account for and document variation and change across the research project [Reference Rasmussen, Whitley and Welty65,Reference Rasmussen, Whitley, Baldridge and Welty66]. For example, documenting how data files are stored and versioned protects against lost data and using the wrong file. Data preparation workflows and analysis methods are increasingly complex – it is not surprising that errors seep into this process. The critical step for research teams is to both prevent and discover errors prior to publication. Checklists may both reduce errors and help identify any that persist.
To avoid errors related to inappropriate statistical methods, we recommend that research teams include statisticians or similarly trained individuals with specific expertise and experience in design and conduct of data analysis. Once the team includes appropriate expertise, well-documented analytic code is essential to verify results.
The retraction process
We found that authors played a substantial role in the retraction process, either by initiating (40%) or formally retracting (58%) the article, which aligns with similar studies of MEDLINE retractions [Reference Wager and Williams2,Reference Nath, Marcus and Druss20]. We observed substantially fewer retraction notices that mentioned external investigations compared with prior studies [Reference Vuong, La, Ho, Vuong and Ho67]. However, the difference may be owing to our exclusion of articles explicitly identified as scientific misconduct. About one-third of the retraction notices (31%) that we reviewed did not indicate who initiated the retraction, markedly lower than the 55% observed in library and information sciences for published errata [Reference Ajiferuke and Adekannbi68]. We also note that author involvement in the retraction process may take many forms; we report only on the description in the notice itself, which may not tell the full story. For example, authors may be required by institutional policy to request a retraction, or they may merely have contacted the journal with notice of a correction that then turned into a retraction.
COPE and the International Committee of Medical Journal Editors have published practice guidelines for retracting articles; both recommend that retraction notices include the reasons for retraction [31,Reference Kleinert69]. The COPE guidelines, originally published in 2009 and updated in 2019, are widely endorsed by publishing bodies in order to create uniformity in the retraction process [Reference Balhara and Mishra70]. Despite these recommendations and endorsements, we found wide variation in the content of retraction notices and inconsistent adherence to guidelines.
Retractions with ambiguous language were not only difficult to classify but also led us to speculate that misconduct might have occurred. For example, authors could report that supporting data were “lost” or “no longer available” to cover up a fabricated figure. “Problems” with data collection or “unreliable data” could be euphemisms for falsification of data. Similarly, authors could report “serious statistical errors” to cover up data or models that were manipulated to produce a desired p-value. In the past decade, there have been calls to update retraction policies so that retraction notices resulting from honest mistakes may be distinguished from instances of research misconduct [Reference Garfinkel, Alam and Baskin71–Reference Pulverer73].
In addition to the findings supported by the data we extracted from full-text articles and retraction notices, we also gathered anecdotal information about the visibility of retracted articles through the scoping review process. The PubMed database clearly indicates that an article has been retracted with a red and pink banner and a link to the retraction notice. Some journals include additional, obvious, and helpful visual indicators, such as watermarking retracted articles with “RETRACTED” in large red letters across every page. In contrast, other journals included only a small retraction box at the end of the article or failed to indicate the retraction altogether. Some articles were formally retracted and subsequently replaced with a corrected version that confusingly used the same DOI as the original article. This approach made it difficult to locate the retracted version and to determine whether an article had been retracted at all. Even when the formal retraction process follows COPE guidelines and the article is clearly marked as retracted, it is impossible to inform every person who may have included the retracted article in their citation library. Unfortunately, many articles continue to be mistakenly cited long after they are retracted, sometimes for decades [Reference Chen, Hu, Milbank and Schultz1,Reference Greitemeyer74]. The scientific community would benefit from journals adopting practices that clearly indicate when articles are retracted, including issuing a new DOI for a replacement article and, where possible, automating reference checks to ensure that retracted articles are not cited.
The absence of transparency in retraction notices limits our ability to learn from them. Although some journals provided a robust addendum to the retraction notice, breaking down the rationale in great detail, others provided only vague descriptions (“wrong content with serious consequences”; “serious scientific errors”) [34,35]. Detailed analysis and reporting of errors align with root cause analysis, an important component of process improvement frameworks that may aid efforts to improve the research process and reduce the volume of future retractions [Reference Heuvel and Rooney75,Reference Alexander, Antony and Rodgers76].
Beyond the purposes of this scoping review, there is value in having retracted articles available online, with the caveat that they are clearly indicated as being retracted. In retract-and-replace scenarios, the JAMA Network has a method by which the changes are highlighted in the retracted version [Reference Christiansen and Flanagin77]. This approach makes it very clear to readers where errors occurred and which content is inaccurate or unreliable; we found these explanations especially informative for our qualitative review.
We did not specifically capture information on instances of replacement, nor is this information transparent in many cases. Although some retracted articles may eventually be republished with the errors corrected, not all articles are salvageable. We speculate that in general, errors in data preparation or analysis may be corrected more readily than issues in data acquisition, so long as the original data still exist. For example, errors that involve contamination, measurement, or loss of data may not be easily remedied (e.g., “…we discovered a technical error in the measurement…,” “…underlying data …are unavailable,” Table 2), suggesting that associated articles cannot be corrected without collecting new data. Less pervasive errors in data collection may be fixed (e.g., “incorrectly entered data for six subjects on two variables,” “some data points that should have been entered as a positive result were instead entered as having a negative result,” Table 2) and the article resubmitted. Similarly, errors that involve data preparation or incorrect statistical analyses (e.g., “identified a mistake in the way the original data were merged,” “the models did not include random slopes,” Table 2) can be corrected by appropriately analyzing the original data, if available. In instances where data can be salvaged and/or analyses fixed after retraction, we note that articles may need substantial revision prior to submission if findings change in meaningful ways.
Limitations
Our findings generalize to retracted articles, which constitute a biased subset of articles with errors. For example, retracted articles may be more likely to contain noticeable errors, especially errors in figures, than articles that have not been retracted. Post hoc analyses indicated that 246 retraction notices (28%) referenced “image” or “figure.” We found that passive voice and euphemistic language are common in retraction notices; this wording reflects a political and legal process that limits transparent documentation. Due to the ambiguity of retraction notices, we could not reliably classify reasons for retraction beyond our two categories. The purpose of our scoping review was descriptive and did not include a control group of articles that had not been retracted. Because the typical time to retraction is longer than 1 year and we included articles published up to the date of our data pull, our results may not be representative of more recent retractions across clinical and translational research, especially for articles published during the COVID-19 pandemic. In fact, there was a dip in time to retraction during the COVID-19 pandemic that we did not capture in our findings because our data pull preceded the pandemic [Reference Shi, Abritis and Patel78].
Our inclusion/exclusion criteria for abstract review relied on the coded subject lists in the RWDB; it was infeasible to review the titles or abstracts themselves to determine whether the coded subjects formed a substantive component of the reported research. These criteria also relied on the initial categorizations of retraction reasons in the RWDB, definitions that are subjective and may shift over time. Although we obtained 89% of the articles eligible for full-text review (786 out of 884), our sample may have been biased by the availability of articles within our institution and in the public domain. We did not assess how many retracted articles were eventually republished. Despite these limitations, our findings have implications for the scientific community.
Conclusions
This scoping review identified more than 800 clinical and translational research articles retracted over 10 years for concerns related to data capture, management, and/or analysis. Authors can improve the rigor of scientific research by reporting their methods for data capture and management, the statistical software used, and other tools or code-sharing practices that support transparency and reproducibility. Journals, editors, and peer reviewers can contribute to these improvements by advocating for widespread adoption of this documentation. In addition, journals can help the scientific record self-correct by requiring detailed, transparent retraction notices.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/cts.2024.533.
Acknowledgments
The authors would like to thank the Retraction Watch team, especially Ivan Oransky, for maintaining the Retraction Watch Database and sharing data on retracted articles. We also wish to thank Q. Eileen Wafford of Galter Health Sciences Library, Northwestern University for guidance on the scoping review.
Author contributions
Securing Funding: LJW. Conception and Design: ASB, LVR, EWW, LJW. Scoping Review of Abstracts, Article Text, and Retraction Notices: ASB, GCB, OMF, LVR, EWW, LJW. Data Management and Statistical Analysis: ASB, LVR, LJW. Drafting and Revising Manuscript: ASB, GCB, OMF, LVR, LJW. Critical Feedback to the Manuscript: EWW.
Funding statement
This work was supported in part by the National Institutes of Health’s National Center for Advancing Translational Sciences (UL1TR001422, All Authors).
Competing interests
None.