In an analysis of data from a cohort study, Bornehag et al. (2018) reported that in utero exposure to acetaminophen (APAP) during 0–13 weeks of gestation was associated with language delay (LD) at 30 months in female offspring (adjusted odds ratio [aOR], 4.64; 95% confidence interval [CI], 1.02–21.05) [Reference Bornehag, Reichenberg, Hallerback, Wikstrom, Koch and Jonsson1]. We have concerns about the approach to the analysis of the data:
(a) Logically, the primary objective of this study would have been to determine whether APAP exposure was associated with LD. The authors found that, in a multivariable model that included gender, mother’s education, mother’s weight, mother’s smoking status, and week of enrollment as covariates, there was no significant association between APAP exposure and LD (aOR, 1.26; 95% CI, 0.72–2.19). The analysis should have stopped here with the conclusion that APAP exposure does not influence the LD outcome.
(b) However, in a gender-stratified analysis that included covariates, APAP exposure was associated with increased risk of LD in female offspring (aOR, 4.64; 95% CI, 1.02–21.05) but not in male offspring (aOR, 0.89; 95% CI, 0.47–1.66). This analysis seems suspiciously like a post hoc exploratory analysis because neither the authors’ introduction nor their stated primary outcome indicated an intention to address gender-specific effects. This approach increases the false positive rate [Reference Simmons, Nelson and Simonsohn2], something that the authors did not correct for. Considering the small effect (absolute risk difference, 5.3%) and low precision (95% CI, 1.0–9.7%), it is certain that after correction for multiple hypothesis testing, the finding in female offspring would no longer be statistically significant.
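As a rough illustration of the multiple-testing argument, the following Python sketch back-calculates an approximate p-value from the reported aOR and 95% CI for female offspring and then applies a correction. It rests on assumptions that are ours and not the original authors': that the reported interval is a Wald-type interval symmetric on the log-odds scale, that a Bonferroni correction is used, and that only two tests (one per gender stratum) are counted, which is the smallest plausible number. The function names are hypothetical.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided p-value from an odds ratio and its 95% CI.

    Assumes a Wald-type interval that is symmetric on the log scale.
    """
    # Standard error of log(OR), recovered from the width of the log-scale CI.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Values reported for female offspring: aOR 4.64, 95% CI 1.02-21.05.
p_female = p_from_or_ci(4.64, 1.02, 21.05)

# Assumption: two tests, one per gender stratum.
p_adjusted = bonferroni(p_female, n_tests=2)

print(f"unadjusted p = {p_female:.3f}")
print(f"Bonferroni-adjusted p (2 tests) = {p_adjusted:.3f}")
```

Under these assumptions the unadjusted p-value falls just below 0.05 and the adjusted value rises above it. Counting more than two tests would only widen that gap.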
Furthermore, the authors binned continuous variables into categories. The number of APAP tablets consumed was binned into three categories. LD was classified into <25, 25–50 and >50 words, then reclassified into >50 words and <50 words. This approach involves arbitrary decisions that make comparison across studies difficult, and it creates a biologically implausible model in which the risk jumps suddenly at arbitrary cut-offs but remains constant within a category [Reference Altman3]. Ideally, a sensitivity analysis for these arbitrary decisions should have been reported.
To summarise, we argue that declared and undeclared flexibility in the statistical analysis has increased the chances of a false positive finding in this study [Reference Simmons, Nelson and Simonsohn2].