Introduction
Whilst several non-pharmacological strategies have been proposed for attention-deficit/hyperactivity disorder (ADHD) (Sonuga-Barke et al. 2013; Stevenson et al. 2014; Cortese et al. 2015, 2016), pharmacological treatment is an important component of the multimodal treatment recommended for this disorder (Cortese et al. 2017). Medications for ADHD comprise psychostimulant [e.g. methylphenidate (MPH) and amphetamine derivatives] and non-psychostimulant drugs (e.g. atomoxetine, clonidine and guanfacine) (Cortese & Rosello-Miranda, 2017). MPH is the most commonly used psychostimulant for ADHD in many countries, where it has been used for several decades (Maia et al. 2014).
Although previous systematic reviews and meta-analyses have pointed to high effect sizes for the efficacy of MPH in reducing ADHD symptoms in the short term (e.g. Schachter et al. 2001; Van der Oord et al. 2008; Koesters et al. 2009; Castells et al. 2011), a recent systematic review and meta-analysis by a Cochrane group led by Storebø (Storebø et al. 2015) questioned the evidence base for the efficacy and tolerability of MPH for ADHD in children and adolescents. This generated a strong and passionate reaction from the ADHD scientific community (e.g. Banaschewski et al. 2016; Romanos et al. 2016; Hoekstra & Buitelaar, 2016).
The aim of this paper is to summarise the findings and conclusions of the Cochrane meta-analysis, to present the key critiques of it, and to consider it in the broader context of the evidence base for the efficacy and tolerability of MPH.
The Cochrane meta-analysis
The aim of the work by the Cochrane group led by Storebø was to systematically review and meta-analyse randomised controlled trials (RCTs) reporting outcomes related to the efficacy and/or tolerability of MPH in children and/or adolescents with ADHD. Storebø et al. included RCTs of MPH for children and adolescents with ADHD (defined based on DSM-III, III-R, IV, IV-TR, 5 or ICD-9 or 10), with or without psychiatric comorbidities, irrespective of language, publication year, publication type or publication status. Furthermore, it was required that at least 75% of participants in each trial had an IQ >70. The primary efficacy outcome was ADHD symptoms, assessed by teachers. The authors also recorded serious adverse events reported in the studies as a primary outcome, with less severe adverse events being considered as a secondary outcome measure. Additional secondary outcomes were general behaviour in school and at home, as rated by psychometric instruments such as the Child Behaviour Checklist (CBCL), and quality of life, as measured by psychometric instruments such as the Child Health Questionnaire (CHQ).
In line with state-of-the-art recommendations for rating study risk of bias (RoB) and overall evidence quality, Storebø et al. used the Cochrane RoB tool to rate the RoB of individual RCTs included in their systematic review, and the GRADE system to assess the overall quality of the evidence. The standard RoB tool includes the following six items, each rated as at low, unclear or high risk for each study: selection bias (random sequence generation; allocation concealment); performance bias (blinding participants/personnel); detection bias (blinding assessor); attrition bias (incomplete outcome data); reporting bias (selective reporting); other bias. Of note, Storebø et al. added a seventh item that is not formally included in the RoB tool, namely vested interest, related to industry funding of the study and authors’ conflicts of interest, in particular due to relationships with drug companies. The authors considered that a study was at overall high RoB if any one of the seven items received a rating of either ‘high’ or ‘unclear risk’ of bias.
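The overall rating rule described above can be sketched in a few lines of Python. This is only an illustration of the logic as stated in the text, not code used by the review authors; the item identifiers, the function name and the `unclear_counts_as_high` switch are assumptions introduced here so that a less strict variant (only ‘high’ ratings count) can be expressed with the same function.

```python
from typing import Dict

# Six standard RoB items plus the added vested-interest item.
# The identifiers are illustrative labels, not official tool field names.
ROB_ITEMS = (
    "selection_bias",
    "performance_bias",
    "detection_bias",
    "attrition_bias",
    "reporting_bias",
    "other_bias",
    "vested_interest",
)

VALID_RATINGS = {"low", "unclear", "high"}


def is_overall_high_rob(ratings: Dict[str, str],
                        unclear_counts_as_high: bool = True) -> bool:
    """Return True if the study is classed as at overall high RoB.

    With unclear_counts_as_high=True, a single 'high' or 'unclear' item
    is enough (the rule described in the text). With False, only a
    'high' item triggers the overall high rating (hypothetical variant).
    """
    for item in ROB_ITEMS:
        if item not in ratings:
            raise ValueError(f"missing rating for item: {item}")
        if ratings[item] not in VALID_RATINGS:
            raise ValueError(f"invalid rating for {item}: {ratings[item]}")
    triggers = {"high", "unclear"} if unclear_counts_as_high else {"high"}
    return any(ratings[item] in triggers for item in ROB_ITEMS)


# Hypothetical study: every item rated low except one unclear item.
example_study = {item: "low" for item in ROB_ITEMS}
example_study["selection_bias"] = "unclear"

print(is_overall_high_rob(example_study))
print(is_overall_high_rob(example_study, unclear_counts_as_high=False))
```

Under the rule described in the text, the hypothetical study is classed as at high RoB on the strength of a single unclear item; with the switch turned off, it is not.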
The GRADE system is based on the assessment of the within-trial RoB, directness of the evidence, heterogeneity of the data, precision of effect estimates and risk of publication bias.
Storebø et al. found 38 parallel-group trials (including a total of 5111 participants) and 147 cross-over trials (comprising a total of 7134 participants) pertinent for their systematic review. The average duration of the included RCTs was 75 days.
The authors found that the effect size [standardised mean difference (SMD)] for the efficacy of MPH on the primary outcome (ADHD symptoms rated by teachers) was 0.77 (95% CI 0.64–0.90), which corresponds to a mean difference (MD) of −9.6 points [95% confidence interval (CI) −13.75 to −6.38] on the ADHD Rating Scale (ADHD-RS). Of note, Storebø et al. point out that a change of 6.6 points on the ADHD-RS is considered to represent the minimal clinically relevant difference. The effect size for the primary efficacy measure is indeed one of the highest effect sizes found in psychiatry, and more generally across medical disciplines (Leucht et al. 2012).
The authors also found no evidence that MPH was associated with an increase in serious adverse events [risk ratio (RR) 0.98, 95% CI 0.44 to 2.22]. As for the secondary outcomes, teacher-rated general behaviour (SMD −0.87, 95% CI −1.04 to −0.71) and quality of life (SMD 0.61, 95% CI 0.42 to 0.80) were improved with MPH. Regarding secondary outcomes related to tolerability, the authors found a 29% increase in the overall risk of any non-serious adverse events (RR 1.29, 95% CI 1.10 to 1.51). The most frequent adverse events were sleep disturbance and appetite decrease. More specifically, children in the MPH group were at 60% greater risk for trouble sleeping/sleep problems (RR 1.60, 95% CI 1.15 to 2.23; 13 trials, 2416 participants), and 266% greater risk for decreased appetite (RR 3.66, 95% CI 2.56 to 5.23; 16 trials, 2962 participants) than children in the control group.
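The percentages quoted above follow directly from the risk ratios: the relative change in risk is (RR − 1) × 100, and a 95% CI that includes 1 is compatible with no difference between groups. A minimal Python sketch of this arithmetic is given below; the function names are illustrative, and the figures are those reported in the text for serious adverse events, sleep problems and decreased appetite.

```python
def percent_change_in_risk(rr: float) -> float:
    """Convert a risk ratio into a percentage change in risk vs control."""
    if rr <= 0:
        raise ValueError("a risk ratio must be positive")
    return (rr - 1.0) * 100.0


def ci_excludes_no_effect(lower: float, upper: float) -> bool:
    """True if the confidence interval for the RR does not contain 1."""
    return lower > 1.0 or upper < 1.0


# (label, RR, CI lower, CI upper) as reported in the text.
reported = [
    ("serious adverse events", 0.98, 0.44, 2.22),
    ("sleep problems", 1.60, 1.15, 2.23),
    ("decreased appetite", 3.66, 2.56, 5.23),
]

for label, rr, lower, upper in reported:
    change = percent_change_in_risk(rr)
    excludes = ci_excludes_no_effect(lower, upper)
    print(f"{label}: {change:+.0f}% change in risk, "
          f"CI excludes RR = 1: {excludes}")
```

An RR of 1.60 therefore corresponds to a 60% greater risk and an RR of 3.66 to a 266% greater risk, whereas the interval for serious adverse events spans 1, which is why no evidence of an increase is reported for that outcome.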
Based on their summative analysis, Storebø et al. reported that ‘all 185 trials were assessed to be at high risk of bias’ and that ‘the quality of the evidence was very low for all outcomes’.
Therefore, the Cochrane group concluded that ‘the low quality of the underpinning evidence means that we cannot be certain of the magnitude of the effects’ and that ‘If MPH treatment is considered, clinicians might need to use it for short periods, with careful monitoring of both benefits and harms, and cease its use if no evidence of clear improvement of symptoms is noted, or if harmful effects appear’. Finally, Storebø et al. recommended the use of nocebo in future studies, to reduce the risk of unblinding.
Critiques of the Cochrane meta-analysis
As mentioned, the Cochrane meta-analysis generated a series of critical reactions from several ADHD experts across the world, both in scientific journals and in blogs, to which Storebø et al. have systematically replied.
The main critiques have focused on:
(1) An idiosyncratic and overly stringent approach to rating the RoB of individual studies. In particular, it has been highlighted that the RoB assessment in the meta-analysis by Storebø et al. included the vested interest item, which is not part of the standard Cochrane RoB tool (Banaschewski et al. 2016). Storebø et al. replied that there is evidence, based on work by Andreas Lundh et al. (cited in Storebø et al. 2015), that ‘there are many subtle mechanisms through which sponsorship and conflict of interest may influence intervention effects on outcomes.’ However, it has been pointed out (Banaschewski et al. 2016) that there is evidence showing that vested interests do not impact the overall RoB of a study. It is fair to conclude that evidence on this issue is far from conclusive. In addition, it has been highlighted that considering the overall quality of a study as low (i.e. the study as being at high RoB) just because at least one item of the RoB tool was rated as unclear may be too stringent. Whilst Storebø et al. cited evidence supporting this approach, other meta-analyses (e.g. Catalá-López et al. 2017) rated the RoB as high only if at least one item was rated as high risk; an item rated as ‘unclear’ did not result in an overall high study RoB. This is important to consider since items in the RoB tool are often rated as unclear just because of poor reporting, when the risk could in fact be lower if full information were available. Finally, in terms of GRADE, Storebø et al. downgraded the quality of evidence by one point for inconsistency of effects (heterogeneity) and by two points for high RoB. Both these decisions are questionable. As for heterogeneity, I² for the meta-analysis of the primary outcome measure was 37%. The Cochrane Handbook suggests that heterogeneity up to 40% may not be important. Clearly, there is a certain level of subjectivity and uncertainty in the use of the threshold, which may lead to discrepant views.
(2) Inclusion of studies, such as the Multimodal Treatment of ADHD study, with no placebo/no treatment arm, or studies in pre-schoolers (for which the effects of MPH are notoriously less evident), which is likely to under-estimate the effect of MPH. Although Storebø et al. pointed out that this was done according to their pre-published protocol, it goes without saying that issues in the protocol are no less concerning than issues in the meta-analysis per se. More importantly, even after removing these studies, the assessment of study bias and evidence quality (see previous point) is still problematic.
(3) An emphasis on non-serious adverse events. Indeed, underestimating the adverse events associated with a medication may result in individuals with ADHD being exposed to harm. However, overestimating the potential adverse events may lead to patients not benefitting from effective medications, limiting children’s access to effective treatment for ADHD, which has serious implications given the substantial risks of not treating ADHD. Although, as found by Storebø et al., sleep problems and decreased appetite are more frequent with MPH than with placebo, they tend to be transitory in most cases and can be clinically managed (Cortese et al. 2013); this was not highlighted in the Cochrane review.
(4) Errors in computation of effect sizes. After the European ADHD Guidelines Group highlighted them, Storebø et al. acknowledged these mistakes, stating that they will be corrected in further revisions of the meta-analysis. Overall, these were minor mistakes.
(5) It has been pointed out that the use of a nocebo would be highly unethical in children. Whilst Storebø et al. suggested that it should be used initially for adults, the issue of its use in children is still problematic.
Ultimately, it appears that the controversy around the level of the evidence base for MPH is, at least in part, linked to the lack of consensus on how to rate important aspects related to the possible RoB of studies and, more generally, on the cut-offs to adopt when using GRADE to appraise the evidence.
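The point about cut-offs can be made concrete with the I² statistic raised under critique (1). The sketch below, in Python, computes I² from Cochran's Q and its degrees of freedom using the usual definition, I² = max(0, (Q − df)/Q) × 100, and looks up the rough interpretation bands given in the Cochrane Handbook. The Q and df values are hypothetical, chosen only so that I² lands near the 37% mentioned in the text; they are not taken from the Cochrane review, and the function names are illustrative.

```python
from typing import List

# Rough interpretation bands from the Cochrane Handbook.
# The bands overlap by design, so a value can carry more than one label.
HANDBOOK_BANDS = (
    (0.0, 40.0, "might not be important"),
    (30.0, 60.0, "may represent moderate heterogeneity"),
    (50.0, 90.0, "may represent substantial heterogeneity"),
    (75.0, 100.0, "considerable heterogeneity"),
)


def i_squared(q: float, df: int) -> float:
    """I-squared (%) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def handbook_labels(i2: float) -> List[str]:
    """All Handbook labels whose band contains the given I-squared value."""
    return [label for low, high, label in HANDBOOK_BANDS if low <= i2 <= high]


# Hypothetical values for illustration only (20 studies, so df = 19).
q_hypothetical, df_hypothetical = 30.0, 19
i2 = i_squared(q_hypothetical, df_hypothetical)
print(f"I-squared = {i2:.1f}%")
print(handbook_labels(i2))
print(handbook_labels(37.0))
```

Because the Handbook bands overlap, a value of 37% falls both in the range that ‘might not be important’ and in the range that ‘may represent moderate heterogeneity’, which illustrates why reviewers applying the same guidance can reach different decisions on downgrading.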
Evidence base for ADHD: the broader context
It should be considered that the duration of the RCTs included in the Cochrane review was overall short (average 75 days), which clearly is not informative for clinicians, who usually see patients for many years given the chronic nature of ADHD in the majority of patients. Overall, readers should consider not only evidence from RCTs, but also from other types of designs and studies. Whilst it is unethical to run RCTs for long periods, it is useful to consider evidence from withdrawal design RCTs (which are still few in the field) and from epidemiological studies. Indeed, large epidemiological studies, published in very high-profile journals, show the long-term benefits of MPH. For instance, a study published in the New England Journal of Medicine (Lichtenstein et al. 2012) in 25 656 patients with a diagnosis of ADHD found that, compared with non-medication periods, there was a significant reduction of 32% in the criminality rate for men (adjusted hazard ratio, 0.68; 95% CI 0.63 to 0.73) and 41% for women (hazard ratio, 0.59; 95% CI 0.50 to 0.70) when they were treated with MPH. Furthermore, large epidemiological studies have found no evidence for an association between stimulants (including MPH) and severe cardiovascular effects. A large study (Cooper et al. 2011) of 1 200 438 children and young adults between the ages of 2 and 24 years found no evidence that current use of a medication for ADHD was associated with an increased risk of severe cardiovascular events (sudden cardiac death, acute myocardial infarction and stroke), although the upper limit of the 95% CI (0.31 to 1.85) indicated that a doubling of the risk could not be ruled out. Another large study (Habel et al. 2011) in 443 198 adults and an additional one (Schelleman et al. 2011) in 241 417 children (3–17 years) concur with the previous one, confirming that ADHD drug use is not associated with an increased risk of severe cardiovascular events. These large studies are reassuring; a more recent study did find an increased risk of severe cardiovascular events in the first 2 weeks of treatment (Shin et al. 2016), although the methodology of this study has been criticised (BMJ, 2016).
Finally, when it comes to the evidence on the use of MPH, one should also consider evidence on the neurobiological underpinnings for the action of MPH. Of note, a meta-analysis of functional magnetic resonance imaging studies suggested that MPH normalises brain activity in key brain regions (bilateral inferior frontal cortex/insula) affected in the disorder (Rubia et al. 2014).
Conclusions
The meta-analysis by Storebø et al. generated a strong controversy. It appears that some of the issues might be attributed to clinicians’ lack of consensus as to the methodology used by the Storebø group and the potential for subjective choices on how each study was rated for both quality and potential bias. It is hoped that the very visible and immediate response to this review will be an opportunity for the field to think about how to design and conduct better, high-quality studies and how to improve the methods used to appraise the level of evidence.
Financial Support
No financial support was provided for this work.
Conflicts of Interest
The author has no conflicts of interest to disclose.
Ethical Standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.