Background
Major depressive disorder is a common psychiatric condition and a leading cause of disability worldwide (Bridges, 2014; World Health Organization, 2018). Globally, depression affects more than 300 million people and is associated with marked personal, social and economic morbidity, loss of functioning and productivity, and high levels of health-care service use (National Institute for Clinical Excellence, 2009; Thapar et al., 2012).
The effects of psychological interventions for treatment of adult depression have been shown to be comparable to those achieved with pharmacological intervention, and are probably longer-lasting (Cuijpers and Gentili, 2017). In particular, cognitive-behavioural therapy (CBT) is a common and effective psychological intervention for the treatment of depression (Churchill et al., 2002; Butler et al., 2006; National Institute for Clinical Excellence, 2009; Shafran et al., 2009; Lepping et al., 2017). CBT is considered a ‘family’ of related therapies (Beck, 2005; Mansell, 2008), and intervention protocols variously incorporate a range of components such as psychoeducation, homework, behavioural activation, and problem solving. These can be used alone or in multiple combinations.
CBT interventions are complex, as they often include multiple therapeutic components and can be delivered in a number of ways. Whilst more usually administered in a traditional face-to-face setting (either individually or in groups), CBT is increasingly conducted via multimedia platforms (Button et al., 2012). Multimedia CBT interventions can be provided with varying amounts of therapist interaction, with ‘self-help’ or ‘self-directed’ approaches characterised by a standardised treatment protocol that is followed by the patient without face-to-face contact with the therapist (Cuijpers and Kleiboer, 2017). Hybrid CBT interventions, including both face-to-face sessions and multimedia features, are also possible. Moreover, in a context where CBT is often not accessible for patients who could benefit from it (Shafran et al., 2009; Wiles et al., 2012), multimedia and hybrid interventions constitute promising alternatives to improve coverage for depressed adults (Cuijpers and Kleiboer, 2017).
The proliferation of CBT interventions raises questions about the relative effectiveness of different components and combinations of components of CBT interventions, as well as different delivery formats. It is important to understand these effects both for clinical practice and for the development of novel interventions, such as hybrid interventions. Attempts to assess the effectiveness of different components/combinations of components of CBT for adult depression have largely taken the form of dismantling studies where two combinations of components are directly compared (Jacobson et al., 1996; Vázquez et al., 2015). Likewise, pairwise meta-analyses of studies comparing two intervention types (e.g. face-to-face v. multimedia CBT) have been conducted (Cuijpers et al., 2010). However, the number of components and therapy types that can be examined using these approaches is limited. Furthermore, these approaches do not allow the different process and content components of CBT to be assessed simultaneously. In this paper we explore the use of network meta-analysis to estimate the relative effectiveness of different content components and delivery formats reported in randomised controlled trials (RCTs) of CBT for depression, allowing the synthesis of a broader set of studies than dismantling studies and pairwise meta-analyses of a single intervention feature.
Network meta-analysis (NMA) (Higgins and Whitehead, 1996; Caldwell et al., 2005; Dias et al., 2013) allows evidence on multiple interventions to be pooled from a set of RCTs, each of which compares two or more of the interventions of interest. This provides a more inclusive approach than pairwise meta-analysis, since all pairwise comparisons of interventions can be examined (Melendez-Torres et al., 2015). If each combination of components and delivery method is considered a separate intervention, then NMA could be used to simultaneously compare the different interventions. However, NMA requires that the comparisons made by the RCTs form a connected network, in other words that there is a path of comparisons between any two included interventions. This is unlikely to be the case with complex interventions, such as CBT, because there are a large number of component and delivery combinations. Even if such a network is connected, the resulting analysis may lead to imprecise estimates.
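The connectivity requirement can be illustrated with a minimal sketch. The intervention labels, the input format and the two example lists of comparisons below are hypothetical and are not taken from the included studies; the function simply checks whether every intervention can be reached from every other through a path of within-trial comparisons.

```python
from collections import defaultdict, deque


def is_connected(comparisons):
    """Return True if the comparisons form a single connected network.

    `comparisons` is an iterable of (intervention_a, intervention_b) pairs,
    one pair per direct within-trial comparison (hypothetical input format).
    """
    graph = defaultdict(set)
    for a, b in comparisons:
        graph[a].add(b)
        graph[b].add(a)
    if not graph:
        return True  # an empty network is trivially connected
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search from an arbitrary intervention
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(graph)


# Hypothetical example: three comparisons linking four interventions.
connected = [("TAU", "face-to-face CBT"), ("TAU", "wait list"),
             ("wait list", "multimedia CBT")]
# Hypothetical example: two comparisons that share no intervention.
disconnected = [("TAU", "face-to-face CBT"), ("wait list", "multimedia CBT")]
```

In the second example no trial links the two pairs of interventions, so a standard NMA could not estimate, for instance, the effect of multimedia CBT relative to TAU.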
Recently, component-level NMA regression methods have been developed to allow estimation of the additive contribution of components and/or combinations of components of complex interventions – such as CBT interventions – while fully respecting the randomised structure of the evidence (Welton et al., 2009). This approach allows meaningful conclusions on effectiveness of components of complex interventions, whilst overcoming the problems of disconnected networks and low precision that affect standard NMA. The use of component-level NMA in psychological interventions has been previously illustrated (Welton et al., 2009; Cooper et al., 2012; Caldwell and Welton, 2016) and has the potential to address specific research questions as to what aspects of complex interventions are effective.
In this study, we performed a comprehensive systematic review of RCTs of adult outpatients with depression where the effectiveness of one or more CBT interventions was examined. We aimed to compare the effectiveness of different types of therapy, different components and combinations of components and aspects of delivery used in CBT interventions. We pooled study results using NMA and component-level NMA.
Methods
Study eligibility and selection
An extended description of the methods of this review is available in the published protocol (Davies et al., 2018). Eligible studies were randomised controlled trials (RCTs) including adults (⩾18 years) with a primary diagnosis of depression, in which the effectiveness of a CBT intervention during an acute phase of depression had been compared to treatment-as-usual (TAU), no treatment, wait list, psychological/attention placebo, and/or another CBT intervention. We considered CBT in its broadest sense as a family of related therapies. TAU definitions showed substantial variation across studies (a table with the verbatim descriptions of TAU across studies is provided in Web Appendix 2). In order to fully represent the broad spectrum of severity of depressive symptoms encountered in outpatient settings, we included both studies using standardised diagnostic criteria (DSM-III, DSM-III-R, DSM-IV-TR, DSM-5 or ICD-10) and those using a validated depression symptom questionnaire to identify depression based on a recognised threshold. We excluded studies focused on other disorders, studies involving inpatients, and articles written in languages other than English.
The primary outcome was depression score measured on any scale, with higher scores indicating more depressive symptoms. Our primary analyses focused on short-term effects, using the earliest measure after end of the intervention. We were also interested in mid-term effects (follow-up between 3 and 12 months after end of intervention) and long-term effects (beyond 12 months).
Secondary outcomes included outcomes designed to measure quality of life (with higher scores indicating better quality of life), remission, response, and attrition. For remission the definitions articulated in the primary studies were included. Response was defined as a decrease in depression scores of at least 50% from baseline to follow-up. Attrition related to the intervention phase and was included as an indicator of (inter alia) intervention acceptability.
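As a minimal sketch of the response definition above, the following function flags a participant as a responder when the depression score falls by at least 50% from baseline. The function and argument names are hypothetical, and the sketch assumes a scale on which higher scores indicate more depressive symptoms and the baseline score is positive.

```python
def is_responder(baseline_score, follow_up_score, threshold=0.5):
    """Flag response: a relative decrease of at least `threshold` from baseline."""
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    relative_decrease = (baseline_score - follow_up_score) / baseline_score
    return relative_decrease >= threshold
```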
Information sources
We searched MEDLINE (1950-), EMBASE (1974-), PsycINFO (1967-) and the Cochrane Central Register of Controlled Trials (CENTRAL) via the specialised register of the Cochrane Common Mental Disorders Group (CCMD-CTR) to 10 June 2016. This register contains over 40 000 reports of RCTs for common mental disorders (see Web Appendix 1 for further details). For the purpose of this network meta-analysis, we included studies identified from four separate searches of the CCMD-CTR, for a suite of Cochrane reviews (Hunot et al., 2010, 2013; Churchill et al., 2013). Each search used a sensitive list of terms for intervention. We did not apply any restrictions on date, language or publication status to the searches. Supplementary searches were conducted for multimedia v. face-to-face CBT, and these were complemented with additional searches to ensure that all relevant psychological and control interventions had been included. Details of the CCMD-CTR search strategy are reported in Web Appendix 1.
Data collection and assessment of risk of bias
Two reviewers independently screened titles and abstracts and extracted data from the included studies. Authors of published studies, protocols, and trial register entries were contacted for additional information when necessary. Risk of bias was assessed by two reviewers independently using the Cochrane risk of bias tool (Higgins et al., 2011).
We included CBT interventions implemented with any combination of content components and delivered in any format. CBT interventions were classified by mode of delivery as either face-to-face CBT, multimedia CBT, or hybrid CBT (defined as multimedia CBT with one or more face-to-face sessions). We also recorded the number and average length of sessions. We defined multimedia CBT as any standardised CBT approach delivered using one, or a combination of, the following: self-help books, audio/video recordings, telephone, computer programmes (both online and desktop), apps, e-mail, or text messages. Web Appendix 2 gives the coding of intervention delivery components that we extracted, based on components commonly found in CBT interventions (Cuijpers and Kleiboer, 2017). The content components were derived using a method adapted from other reviews involving qualitative assessment of the intervention information provided in the trial reports (Faggiano et al., 2008; Hetrick et al., 2015), with reference to the UCL competences framework (University College London, 2018) and the Cognitive Therapy Rating Scale (Blackburn et al., 2001). Potential components were first discussed with the steering group of this project – this group included psychologists, psychiatrists and academic researchers with expertise in CBT. These components were then piloted on the published literature and refined iteratively, in discussion with the author team.
Data synthesis and statistical analysis
We constructed network plots to illustrate which interventions had been compared within RCTs for each of the outcomes and time periods of interest, with node size and line thickness proportional to the number of patients contributing to each intervention and intervention comparison, respectively.
For continuous outcomes (depression and quality of life scores), we adopted a complete case analysis approach for the change from baseline to follow-up, assumed to follow a Normal likelihood. Due to the different measurement scales used across studies, we pooled results on a standardised scale, summarised as a standardised difference in mean change (sDIMC). This index compares the mean change from baseline to follow-up between two groups in standardised units (Rubio-Aparicio et al., 2018), and is interpreted in the same way as a standardised mean difference. For studies that did not report mean change from baseline, we derived it from the baseline and follow-up scores, estimating the standard error by assuming a correlation between baseline and follow-up scores. We estimated this correlation from studies that reported baseline, follow-up and change from baseline summaries, which gave an average value of 0.7, and used this value for studies that only reported baseline and follow-up scores.
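The derivation of change scores can be sketched as follows. The sketch assumes the standard variance formula for a difference of two correlated measurements; the exact computations used in the review are not reproduced here, and all function and argument names are hypothetical. Only the default correlation of 0.7 comes from the text above.

```python
import math


def implied_correlation(sd_baseline, sd_follow_up, sd_change):
    """Baseline/follow-up correlation implied by a study reporting all three SDs."""
    return (sd_baseline**2 + sd_follow_up**2 - sd_change**2) / (
        2 * sd_baseline * sd_follow_up
    )


def change_score_summary(mean_baseline, sd_baseline, mean_follow_up,
                         sd_follow_up, n, correlation=0.7):
    """Mean change from baseline and its standard error for a single arm.

    Assumes SD_change^2 = SD_b^2 + SD_f^2 - 2 * r * SD_b * SD_f, with r the
    assumed correlation between baseline and follow-up scores.
    """
    mean_change = mean_follow_up - mean_baseline
    sd_change = math.sqrt(
        sd_baseline**2 + sd_follow_up**2
        - 2 * correlation * sd_baseline * sd_follow_up
    )
    return mean_change, sd_change / math.sqrt(n)
```

A higher assumed correlation gives a smaller standard deviation of change and hence a more precise estimate of the mean change, which is why the choice of this value matters for studies that report only baseline and follow-up summaries.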
For binary outcomes, we used intention-to-treat results (where available). We assumed a binomial likelihood, where the probability of the outcome is modelled on the log-odds scale, giving pooled intervention effects as log odds ratios, which we present as odds ratios (ORs) to facilitate interpretation of our findings. Although we assumed that participants lost to follow-up did not experience the event of interest in our main analyses, we also ran sensitivity analyses assuming the ‘best case scenario’ where all participants lost to follow-up experienced remission/response.
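The two assumptions about participants lost to follow-up can be illustrated for a single two-arm trial. This is only a sketch with hypothetical counts and hypothetical function names; it computes a crude odds ratio for one study rather than fitting the binomial NMA model described above.

```python
def odds_ratio(events_a, n_a, events_b, n_b):
    """Odds ratio of arm A versus arm B from event counts and arm sizes."""
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return odds_a / odds_b


def impute_events(observed_events, lost_to_follow_up, scenario):
    """Event count after imputing outcomes for participants lost to follow-up."""
    if scenario == "no_event":   # main analysis: dropouts did not remit/respond
        return observed_events
    if scenario == "best_case":  # sensitivity analysis: all dropouts did
        return observed_events + lost_to_follow_up
    raise ValueError(f"unknown scenario: {scenario}")


# Hypothetical trial with 50 participants randomised to each arm.
cbt = {"n": 50, "events": 20, "lost": 10}
tau = {"n": 50, "events": 10, "lost": 5}

main_or = odds_ratio(
    impute_events(cbt["events"], cbt["lost"], "no_event"), cbt["n"],
    impute_events(tau["events"], tau["lost"], "no_event"), tau["n"],
)
best_or = odds_ratio(
    impute_events(cbt["events"], cbt["lost"], "best_case"), cbt["n"],
    impute_events(tau["events"], tau["lost"], "best_case"), tau["n"],
)
```

In both scenarios the denominator is the number randomised, so the only quantity that changes is how many of the participants lost to follow-up are counted as events.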
For each outcome we conducted NMAs and component-level NMAs to pool all evidence in the network. We considered models with increasing levels of detail to define interventions in the NMAs, namely: (1) Therapy Effects Model, where comparisons were made between TAU, no treatment, wait list, psychological/attention placebo, face-to-face CBT, hybrid CBT, and multimedia CBT; (2) Main Effects Model, in which the previous approach was extended to examine the effect of each individual component of CBT interventions, assumed to be additive, in a network meta-regression model; and (3) Full Interaction Model, in which each delivery format and combination of components was considered as a separate intervention (Welton et al., 2009). All results are reported relative to TAU as the reference intervention but estimates between any pair of interventions can be obtained. Additional details about the NMA models are provided in Web Appendix 3.
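The additivity assumption of the Main Effects Model can be sketched in a few lines. The numerical values, component names and function name below are hypothetical and are not estimates from this review; the exact parameterisation of the fitted models is given in Web Appendix 3 and may differ from this simplified form.

```python
def additive_effect(average_cbt_effect, component_effects, components_present):
    """Relative effect v. TAU of a CBT intervention under additivity.

    The effect is the 'average' CBT effect plus the sum of the effect
    modifications for the components the intervention includes.
    """
    return average_cbt_effect + sum(
        component_effects[name] for name in components_present
    )


# Hypothetical values on the standardised scale (negative favours CBT).
average_cbt_effect = -1.0
component_effects = {
    "multimedia": 0.3,
    "behavioural activation": 0.2,
    "psychoeducation": -0.1,
}

effect = additive_effect(
    average_cbt_effect, component_effects, ["multimedia", "psychoeducation"]
)
```

Under the Full Interaction Model, by contrast, each distinct combination of components and delivery format would have its own parameter, so no effect could be borrowed from studies of other combinations.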
We pre-specified (Davies et al., 2018) characteristics of intervention delivery, including delivery format (group v. individual), intensity (defined as the product of the number of sessions and the average length of each session divided by 100), patient-therapist interaction (one-way v. two-way), tailored v. untailored CBT interventions, and format of multimedia CBT interventions. Where there was sufficient evidence to fit the models, we explored the influence of these potential effect modifiers using network meta-regression. We also performed sensitivity analyses excluding studies at high risk of bias on domains other than blinding and explored small-sample bias by including the inverse standard errors as a covariate using network meta-regression.
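The intensity covariate can be computed directly from the definition above. In this sketch the unit of session length (minutes) is an assumption, as are the function names and the linear, uncentred form of the effect modification; the regression coefficient in the usage example is hypothetical.

```python
def intensity(number_of_sessions, average_session_length):
    """Intensity: sessions x average session length (assumed minutes) / 100."""
    return number_of_sessions * average_session_length / 100


def modified_effect(base_effect, beta, covariate_value):
    """Intervention effect adjusted linearly for a study-level covariate."""
    return base_effect + beta * covariate_value


# Hypothetical intervention: 12 sessions of 50 minutes each.
example_intensity = intensity(12, 50)
# Hypothetical base effect and regression coefficient.
example_effect = modified_effect(-1.0, -0.05, example_intensity)
```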
The NMAs were implemented in a Bayesian framework using OpenBUGS software (version 3.2.3). We assessed convergence based on two chains, through examination of Brooks-Gelman-Rubin diagnostic plots and history plots. All models presented achieved a satisfactory level of convergence. Model fit was assessed by examining the posterior mean residual deviance and deviance information criterion (DIC). We performed a weighted integration of the evidence through fitting both fixed and random effects network meta-analysis models; however, we only present the random effects model due to severe lack of fit for fixed effect models (see Web Appendix 6). Model fit was satisfactory for the random-effects models presented. Inconsistency was assessed by comparing the DICs of our primary analyses (based on NMA models that assume consistency between direct and indirect evidence) and the DICs yielded by inconsistency models (which provide effect estimates based on direct evidence only).
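Convergence across chains can be summarised numerically as well as graphically. The sketch below computes the variance-based Gelman-Rubin potential scale reduction factor, which is related to, but not the same as, the interval-based Brooks-Gelman-Rubin plots produced by OpenBUGS; it is offered as an illustration only, and the function name and example chains are hypothetical.

```python
import math
import statistics


def potential_scale_reduction(chains):
    """Variance-based Gelman-Rubin statistic for chains of equal length.

    `chains` is a list of lists of posterior draws for one parameter.
    Values close to 1 suggest the chains are sampling the same distribution.
    """
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    # Mean of the within-chain sample variances.
    within = statistics.mean([statistics.variance(chain) for chain in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero")
    # Between-chain variance, scaled by the chain length.
    chain_means = [statistics.mean(chain) for chain in chains]
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Hypothetical draws: two chains that overlap, and two that do not.
overlapping = [[1, 2, 3, 4], [1, 2, 3, 4]]
separated = [[1, 2, 3, 4], [11, 12, 13, 14]]
```

With real output, many more draws per chain would be used after discarding a burn-in period, and the statistic would be inspected for every monitored parameter.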
Results
Included studies
We retrieved 91 studies that met our inclusion criteria and reported results on at least one of the relevant outcomes of the review. Figure 1 presents a flow chart summarising the search process and results. A table of characteristics of included studies is provided in Web Appendix 4.
Definition of interventions
The most common comparators across studies were TAU (31 arms) and wait list (35 arms). We classified 10 interventions as attention placebo and 2 interventions as psychological placebo and merged them into a single category. Moreover, ‘no treatment’ was provided in 7 arms across the included studies.
Regarding CBT interventions, we found 86 arms that implemented face-to-face CBT, 7 arms for hybrid CBT, and 29 arms for multimedia CBT. Moreover, we identified the following components across the included studies: cognitive techniques (87 arms), behavioural activation (80 arms), psychoeducation (40 arms), homework (35 arms), problem solving (29 arms), social skills training (29 arms), relaxation (27 arms), goal setting (14 arms), final session (ability to end therapy in a planned manner and to plan for long-term maintenance of gains after treatment ends, 13 arms), mindfulness CBT (5 arms) and acceptance and commitment therapy (ACT, 5 arms). Most CBT interventions included multiple components in their definition. Web Appendix 2 provides a description of other intervention features we extracted including aspects of the treatment sessions, delivery method and multimedia methods.
Risk of bias in included studies
Most studies were judged to be at a low or unclear risk of bias for random sequence generation, allocation concealment, attrition bias and selective reporting. Given the nature of the interventions, most studies were judged to be at high risk of bias for blinding of participants and personnel and for blinding of outcome assessment. A more detailed description of risk of bias assessments is provided in Web Appendix 5 and Web Fig. 1.
Therapy-level NMA results
Change in depression scores
Change in depression scores at short term was the most widely reported outcome (76 studies, 6973 patients). The structure of the network and the NMA results for this outcome are shown in Figs 2a and 2b. There was evidence of a smaller decrease in depression scores for patients allocated to wait list, compared with patients who received TAU (sDIMC = 0.72, 95% CrI 0.09 to 1.35). All CBT interventions yielded larger decreases in depression score compared with TAU, with the biggest effect yielded by face-to-face CBT (sDIMC = −1.11, 95% CrI −1.62 to −0.60) and more uncertainty for hybrid CBT (sDIMC = −1.06, 95% CrI −2.05 to −0.08) and multimedia CBT (sDIMC = −0.59, 95% CrI −1.20 to 0.02).
Change in depression scores at mid-term was reported in 29 studies including 3441 patients (Figs 2c and 2d). There was no evidence to suggest that any of the CBT therapies differed from TAU in depression scores at mid-term, although results suggest that hybrid CBT led to a larger reduction in depression scores compared with TAU (sDIMC = −1.00, −2.13 to 0.12). However, the number of studies was not large enough to obtain precise effect estimates.
Results based on a follow-up period over 12 months were seldom reported across the included studies, which precluded examination of long-term effects.
Change in quality of life scores
Change in quality of life scores at short term was reported in 15 studies including 2425 patients (Figs 2e and 2f). Intervals around effect estimates are wide as a consequence of the small number of studies contributing to the analysis. Nonetheless, there was evidence of a reduced increase in quality of life scores for wait list compared with TAU (sDIMC = −1.23, 95% CrI −2.33 to −0.13), and results also suggest a reduced increase in quality of life scores for multimedia CBT compared with TAU (sDIMC = −0.58, 95% CrI −1.27 to 0.11). Insufficient data were available to produce estimates for mid- and long-term outcomes.
Remission, response, and attrition
The network plot and forest plot of NMA results for remission are displayed in Figs 3a and 3b, respectively. Results are based on 38 studies including 3391 patients and provide evidence that wait list (OR 0.33, 95% CrI 0.11 to 1.02) and placebo (OR 0.27, 95% CrI 0.07 to 1.03) have lower remission rates than TAU. The ORs for CBT interventions suggest higher remission rates compared to TAU, although the credible intervals are wide and all of them include the null.
Response – according to our definition – was only reported in 15 studies with a total of 1144 patients, and none of these studies included a ‘no treatment’ arm (Fig. 3c). Results show wide intervals for all treatment effect estimates (Fig. 3d), with the most precise estimate yielded by face-to-face CBT and suggesting that this therapy might increase the probability of response to treatment compared to TAU (OR 2.55, 95% CrI 0.77 to 9.51).
Attrition was reported in 75 studies including 8075 patients. The structure of the network and NMA results are shown in Figs 3e and 3f. We found no evidence of a clear difference in attrition rates among interventions.
We only present results at short term, as remission and response were not widely reported at longer follow-up periods and attrition is only meaningful as an indicator of treatment acceptability if it occurs during the intervention phase.
Component-level NMA results
For each outcome we fitted a main effects component-level NMA model where we estimated the specific effect each component of CBT adds to the ‘average’ CBT effect. We were unable to include mindfulness CBT and ACT components in this analysis, as they were only included in five studies each.
Results from the Main Effects model (where component effects are assumed additive) for change in standardised depression scores at short term are displayed in Fig. 4. The top half of this figure presents ‘main effects’, which are similar to the therapy effects for the comparators with the addition of an ‘average’ CBT effect, whereas the bottom half estimates the effect modification on the overall CBT effect that can be attributed to interventions with a multimedia component, interventions with cognitive techniques, and so on. The top half of Fig. 4 shows strong evidence that, on average, CBT interventions yielded a larger decrease in depression scores compared to TAU (sDIMC = −1.77, 95% CrI −2.57 to −1.01), and evidence of a smaller decrease for wait list compared to TAU (sDIMC = 0.75, 95% CrI 0.09 to 1.41). Regarding effect modifiers, there was no evidence of a difference between multimedia and face-to-face CBT interventions (sDIMC = 0.36, 95% CrI −0.24 to 0.96). Moreover, results suggest that interventions that reported having included a behavioural activation component yielded a smaller decrease in depression scores compared to other CBT interventions (sDIMC = 0.54, 95% CrI −0.09 to 1.16). Similar results were found for other outcomes (Web Appendix 6).
Full interaction component-level NMA model
We relaxed the assumption of additive component effects by fitting a Full Interaction model, where each combination of content components and delivery mode was considered as a separate CBT intervention. Using the information reported in the primary studies, we were unable to find any combinations of content components that were reported commonly enough to provide precise estimates of treatment effects. Instead, we defined a substantial number of CBT interventions, most of them implemented in only one study, which yielded very imprecise estimates for most intervention effects. Results for change in depression scores at short term are presented in Web Appendix 6, and results for other outcomes are available on request.
Additional analyses
Based on model fit statistics, the therapy effects model presented above gave the best balance between model fit and complexity (according to the DIC). We therefore only explored further effect modifiers in relation to the therapy effects model. Regarding effect modifiers, results suggest that more intense face-to-face CBT interventions might be more effective in decreasing depressive symptoms (β = −0.072, 95% CrI −0.168 to 0.023), whereas for hybrid and multimedia CBT interventions effect modification for intensity was estimated to be close to 0 although with high levels of uncertainty (β = −0.010, −0.229 to 0.210). We found no evidence of effect modification for delivery method (group v. individual), either for face-to-face (β = −0.025, −0.111 to 0.065) or for hybrid and multimedia interventions (β = −0.043, −0.258 to 0.175). The influence of other intervention characteristics could not be examined due to lack of comparative evidence – the vast majority of CBT interventions were tailored and included two-way interactions, and the format of multimedia interventions was too heterogeneous across studies to be examined as a factor. We also ran additional analyses excluding studies assessed at high risk of bias for random sequence generation, attrition and/or selective reporting, and found the results were robust to this.
Regarding other types of bias, our analyses assuming a ‘best case scenario’ for missing data in binary outcomes yielded similar conclusions to the main analyses for remission, although there was no evidence of a difference between placebo and TAU (OR 1.21, 95% CrI 0.48 to 3.02). For response, there was evidence of a higher response rate for face-to-face CBT (OR 2.53, 1.46 to 4.68) and weak evidence (i.e. 95% CrI includes the null) of a higher response rate for multimedia CBT (OR 2.11, 0.96 to 5.30), compared to TAU (Web Appendix 6). Moreover, we found evidence of an association between the inverse standard errors and the effect estimates for depression at short term (β = 0.04, 0.02 to 0.07), suggesting that larger studies report changes in depressive symptoms that are closer to the null. This implies that the evidence we reviewed might be affected by small-sample bias.
We compared the model fit of consistency and inconsistency models (Web Appendix 6) and found no evidence of inconsistency between direct and indirect evidence, with the exception of the response outcome. However, the resulting intervention effect estimates displayed in Fig. 3d were very similar under both models.
We also repeated our analyses restricting to studies reporting depression scores using the Beck Depression Inventory (BDI), which was reported in 65 studies with a total of 4517 patients. Results at short and mid-term were comparable to those using standardised scores across scales (Web Appendix 6), although at mid-term there was stronger evidence (i.e. values within the 95% CrI were further away from the null) of an effect of CBT interventions when restricting to studies reporting the BDI.
Discussion
CBT interventions are complex and currently there is little understanding about which components and/or combinations of components are the most effective in reducing adult depression (Vázquez et al., 2015). Several novel approaches have been proposed for the synthesis of evidence on complex interventions (Higgins et al., 2019). In this review we aimed to address this gap by performing a qualitative assessment of the included studies to identify the content components involved in each CBT intervention, and then undertaking component-level NMA to examine the effectiveness of specific components and combinations of components.
We conducted a comprehensive systematic review and component-level NMA of RCTs examining the effectiveness of CBT interventions for depressed adults. We included a wide range of CBT interventions, differing in the content components and/or delivery method, and examined a number of relevant outcomes (Rush et al., 2006). We found strong evidence that CBT interventions yielded a larger decrease in depression scores compared to TAU at short term. Results for other outcomes, based on smaller numbers of studies, were unclear. We found little evidence of differential effectiveness of face-to-face v. multimedia CBT interventions, and no strong evidence of specific effects of any content components or combinations of components. There was substantial uncertainty around effect estimates for most outcomes and intervention comparisons including attention/psychological placebo, and some differences in treatment effect for which we found weak evidence might be due to chance. Nonetheless, compared with TAU, wait list interventions yielded a smaller decrease in depression scores, smaller increase in quality of life scores and lower remission rates.
The overall beneficial effect of CBT interventions in improving depressive symptoms is not surprising and has been reported before (Butler et al., 2006; Oei and Dingle, 2008; Richards and Richardson, 2012; Cuijpers and Gentili, 2017). Furthermore, our findings are in line with previous studies where multimedia CBT was found to be as effective as face-to-face CBT (Cuijpers et al., 2010), although we emphasise the substantial uncertainty in the treatment effect estimates showing no difference between face-to-face and multimedia CBT. Moreover, it has been claimed that psychological interventions yield long-lasting effects (Cuijpers and Gentili, 2017). In this respect, we found some evidence that mid-term results for hybrid and multimedia interventions are similar to short-term results (albeit with wide intervals), whereas short-term face-to-face effects were not maintained to mid-term.
Limitations
Several limitations need to be acknowledged, which mostly represent common issues faced by systematic reviews in this field and highlight the need for a cautious interpretation of our results. One major problem we found was inconsistent reporting of interventions, with some studies providing sufficient detail about the CBT intervention/s they examined to enable coding of components, while many others only included a vague description often supported by citations to theoretical references (e.g. CBT treatment manuals) or previous applications in a similar context. Thus, our analyses of the specific contribution of different components were limited by the description of the interventions provided in the articles, with the potential risks that some interventions included components that were not reported and that some of the reported components were not received by all patients in the study. The importance of identifying and reporting core components is widely recognised in the field of complex interventions (Durlak and DuPre, 2008; Hoffmann et al., 2014), and we believe that a more consistent and complete reporting of CBT interventions will facilitate the replicability and improve the understanding of the differential effectiveness of these interventions for the treatment of adult depression. Specifically, we call for a wide agreement between clinicians and researchers on the definition of the core content components that might be involved in CBT interventions as a first step, followed by a commitment from the scientific community to refer to this consensus when reporting CBT interventions in primary studies. In this vein, the UCL framework (University College London, 2018) could be a useful starting point.
Relatedly, transparent reporting of TAU in RCTs is often suboptimal and, as a result, TAU is typically a heterogeneous grouping in reviews. TAU is often used as a comparator in RCTs examining one or several CBT interventions and, unlike other reviews in this field, we separated TAU from other comparators in our systematic review to reduce heterogeneity. However, TAU definitions showed substantial variation within and across studies, and the intensity of TAU was often not reported, which potentially introduced additional uncertainty in the estimation of intervention effects. A table with the verbatim descriptions of TAU across studies is provided in Web Appendix 2.
Moreover, most studies included in our systematic review only reported results based on short follow-up periods, which limited our analyses of mid-term and long-term effects, and the number of studies was too small for most outcomes to obtain precise estimates of intervention effects. Further simulation work is warranted to explore the data requirements, particularly when the aim is to estimate interaction effects in component-level NMA. Furthermore, remission was defined in different ways across studies, which possibly introduced additional heterogeneity in the analysis of this outcome. Conversely, we only analysed response rates for studies that reported them according to our definition, which yielded a small number of studies.
Last, our definition of ‘face-to-face’ treatments included therapy delivered on an individual basis or in groups. Whilst there has been some debate in the literature regarding the effectiveness of group v. individual CBT, earlier reviews concluded that, on balance, outcomes are similar for these two formats (Tucker and Oei, Reference Tucker and Oei2007; Oei and Dingle, Reference Oei and Dingle2008), supporting our decision to combine data accordingly. There was substantial heterogeneity in the nature of the interventions, in particular the multimedia interventions. Whilst others have found that supported computerised interventions are more effective than unguided treatments (Richards and Richardson, Reference Richards and Richardson2012), little is known regarding the optimum form and nature of this support. Hence, given the extent of heterogeneity not only in the amount of support but also in the modality of the intervention and combinations of delivery, it was not possible to explore this further.
Implications for future research and clinical practice
Our results have several implications for research and clinical practice. Technology is increasingly used in the context of CBT interventions for depression, and our study found that multimedia and hybrid CBT might be as effective as face-to-face CBT, although results need to be interpreted cautiously due to substantial uncertainty and potential small-sample biases. The effectiveness of specific combinations of content components and multimedia delivery formats remains unclear. Although this uncertainty could be explored in future trials/dismantling studies, prohibitively large samples are likely to be required to have sufficient power to detect the small differences expected (Mohr et al., Reference Mohr, Weingardt, Reddy and Schueller2017). Value of information analyses (Welton et al., Reference Welton, Sutton, Cooper, Abrams and Ades2012) are recommended to assess whether such a trial would represent value for money. Regarding specific content components for CBT interventions, it will be important that there is a full description of techniques consistent with the protocol to enable future work to build on this. It will also be important to determine the optimal number and length of sessions in order to optimize the use of health resources, and to include longer follow-up periods. Last, our results raise concerns about the use of wait list groups as comparators and suggest that other alternatives should be considered.
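To give a rough sense of why trials comparing CBT variants may need prohibitively large samples, the sketch below applies the standard normal-approximation formula for a two-arm parallel trial with a continuous outcome, n per arm = 2(z_{1-alpha/2} + z_{power})^2 / d^2, where d is the standardised mean difference. The effect sizes, the two-sided alpha of 0.05, the 80% power and the function name are illustrative assumptions and are not taken from this review or from the studies it cites.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(d, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect a standardised
    mean difference d in a two-arm trial (normal approximation,
    two-sided test, equal allocation, no attrition)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Hypothetical effect sizes: moderate, small, and very small differences.
for d in (0.5, 0.2, 0.1):
    print(f"d = {d}: about {n_per_arm(d)} participants per arm")
```

Under these assumptions, halving the expected difference roughly quadruples the required sample, so a head-to-head comparison of two active CBT formats with a very small expected difference would need well over a thousand participants per arm before allowing for dropout.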
Throughout this paper we acknowledge that CBT interventions are complex. Our review addressed some of the features of this complexity (Grant and Calderbank-Batista, Reference Grant and Calderbank-Batista2013; Melendez-Torres et al., Reference Melendez-Torres, Bonell and Thomas2015), but other qualitative aspects should be considered to maximize treatment adherence and enhance treatment effectiveness for specific patient profiles. The importance of a clear description of both CBT interventions and comparators has been noted previously (Mansell, Reference Mansell2008), and cannot be emphasised enough. Our study illustrates that CBT interventions, as a class, are suitable for the treatment of adult depression, but that further research and improvements in the reporting of intervention descriptions are needed in order to determine which particular aspects characterise interventions that lead to the best outcomes for most patients.
Author ORCIDs
José A López-López, 0000-0002-9655-3616
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S003329171900120X.
Acknowledgements
This report is independent research funded by the National Institute for Health Research (Programme Grants for Applied Research, Integrated therapist and online CBT for depression in primary care, RP-PG-0514-20012). The views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health and Social Care. This study was also supported by the NIHR Biomedical Research Centre at the University Hospitals Bristol NHS Foundation Trust and the University of Bristol. The study was also supported by the MRC ConDuCT-II Hub for Trials Methodology Research. This study was conducted in collaboration with the Bristol Randomised Trials Collaboration (BRTC), a UKCRC Registered Clinical Trials Unit (CTU) in receipt of National Institute for Health Research CTU support funding. This study was also supported by The Centre for the Development and Evaluation of Complex Interventions for Public Health Improvement (DECIPHer), a UKCRC Public Health Research Centre of Excellence. Joint funding (MR/KO232331/1) from the British Heart Foundation, Cancer Research UK, Economic and Social Research Council, Medical Research Council, the Welsh Government and the Wellcome Trust, under the auspices of the UK Clinical Research Collaboration, is gratefully acknowledged.
Author contributions
NW, DSK, RC, NJW, DMC, and GL conceived the project. SRD, DMC, SD, DT and RC selected articles for inclusion. SRD and AT extracted data and assessed risk of bias. JALL, NJW, NW, DSK, QW, JL and TJP selected results for inclusion. NJW, JALL and DMC planned the statistical analyses. JALL performed the NMAs. DSK and GL provided clinical expertise. JALL, NJW, NW, DMC, SD and SRD wrote the first draft of the paper and all authors revised it critically for important intellectual content. All authors have approved this version for publication.
We are also grateful to a number of colleagues who are involved with the INTERACT study as co-applicants but who have not participated in drafting this manuscript: David Coyle, Simon Gilbody, Paul Lanham, Una Macleod, Irwin Nazareth, Steve Parrott, Roz Shafran, Katrina Turner, Catherine Wevill and Chris Williams.
Conflict of interest
NJW is PI on an MRC grant in collaboration with Pfizer Ltd. Pfizer part fund a junior researcher. The project is purely methodological using historical pain relief data unrelated to this review.
Ethical standards
Not applicable. This is a review of published research, where no primary data were used.