Introduction
Major depression is the most prevalent mental health disorder worldwide. The World Health Organization estimates that 322 million people (4.4% of the global population) suffer from depression (WHO, 2017). Depression is the leading cause of disability worldwide and is associated with considerable mortality and morbidity (Cuijpers et al., 2014b; Vos et al., 2016). Not surprisingly, it has a debilitating impact on individuals' daily functioning, quality of life, and wellbeing. Depression also imposes a substantial burden on society, including high levels of service use, enormous economic costs, and production losses (Berto, D'Ilario, Ruffo, Virgilio, & Rizzo, 2000; Friedrich, 2017; Greenberg & Birnbaum, 2005).
Given depression's detrimental effect on individuals and society, hundreds of randomized controlled trials (RCTs) of its treatment have been conducted in the past four decades. Their results demonstrate that adult depression can be treated effectively with psychological treatments (Cuijpers et al., 2021b; Cuijpers, Karyotaki, de Wit, & Ebert, 2020). However, many people with depression do not actively seek help, and treatment uptake is low in the general population. For instance, in the USA, only 35.3% of individuals with severe depression reported having seen a mental health professional in the previous year (Pratt & Brody, 2014). The low help-seeking rate is associated with several barriers, such as the limited availability of professional resources, financial considerations, low perceived need, mental illness stigma, lack of time, a preference for self-management, and skepticism about treatment effectiveness (Andrade et al., 2014; Cuijpers, 2021; Mojtabai, 2009; Schnyder, Panczak, Groth, & Schultze-Lutter, 2017).
Given this situation, several strategies have been suggested to increase help-seeking for depression, including education and awareness campaigns targeting individual attitudes and mental health literacy (Salerno, 2016) and training for primary care providers (Lipson, Speer, Brunwasser, Hahn, & Eisenberg, 2014). Tailored interventions to increase help-seeking intentions and behaviors (Ebert et al., 2019; Xu et al., 2018) and indirect prevention and treatment (Cuijpers, 2021) may also enhance help-seeking. Systematic screening is another possible way to reach people who do not actively seek help and to provide timely treatment. For instance, it has been suggested that programs combining early recognition of depression with an integrated system for management result in reduction or remission of depression (Siu et al., 2016). However, several other systematic reviews found that the evidence for routine screening is not very strong (Keshavarz et al., 2013; Meijer et al., 2013; Thombs et al., 2013; Thombs, Ziegelstein, Roseman, Kloda, & Ioannidis, 2014), and some national preventive services policy groups therefore recommend against screening programs (NICE, 2009; Thombs, Markham, Rice, & Ziegelstein, 2021).
It thus remains controversial whether screening programs should be recommended to improve the low uptake of treatment.
Despite this lack of consensus, a fundamental question must be answered before screening can be recommended: what happens when treatment is offered to people who do not actively seek help, and is the treatment still effective? A systematic review of six studies investigated the effects of psychotherapy for people in primary care who do not actively seek help (Cuijpers, van Straten, van Schaik, & Andersson, 2009). That meta-analysis indicated that when primary care patients who do not seek help were offered treatment, there was no significant difference between psychotherapy and controls [waiting list, care-as-usual (CAU), and other control groups] [k = 6, d = 0.13; 95% confidence interval (CI) −0.08 to 0.34]. However, another study found that, compared with usual care, screening with adequate treatment for adult depression in primary care could improve outcomes (Pignone et al., 2002). It therefore remains unclear whether psychological treatments are effective when people are not actively seeking help. In addition, the number of trials included in the earlier meta-analysis was too small to draw a definite conclusion. Furthermore, those studies focused only on primary care, and many new trials in primary care and other settings have been conducted since then.
Therefore, we decided to conduct a meta-analysis to examine the effects of psychological treatments for depression among adults, in any setting, who were not actively seeking help but were approached by researchers and screened for trial eligibility. In addition to the main analyses of intervention effects, we explored the proportion of participants who were included in a trial out of the total number of potential participants who completed the screening scales for depression. In doing so, we aimed to estimate how many people need to be screened to recruit enough participants when a trial actively identifies people who do not actively seek help.
Methods
The present meta-analysis was preregistered at the Open Science Framework (https://osf.io/8sf6m) and is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Moher et al., 2009). Amendments to the protocol can be seen in online Supplementary Appendix A.
Eligibility criteria
For the current study, we employed the following inclusion criteria. (1) Patients: studies recruiting participants who do not actively seek help (we defined ‘people who do not actively seek help’ as participants who were recruited through screening instead of advertisements and clinical referrals). (2) Outcome: the primary focus was treating adults with elevated symptoms of depression above a cut-off score on a validated self-report scale, or unipolar depression based on a clinical diagnostic interview. (3) Intervention and comparison: studies comparing psychological treatment with a control condition, i.e. CAU, waiting list, or another inactive treatment. (4) Design: studies employing an RCT design. We excluded studies that recruited participants through advertisements and clinical referrals, and studies conducted in specialized mental health care.
Identification and selection of studies
We used an established database of studies examining psychological treatments for adult depression. This database has been described in detail elsewhere (Cuijpers et al., 2020). For this database, we conducted an ongoing comprehensive search in PubMed, PsycINFO, Embase, and the Cochrane Library (last update: 1 January 2021). The full search string can be found in online Supplementary Appendix B. Moreover, we searched reference lists of earlier meta-analyses on psychological treatments for depression. Titles and abstracts from the search were independently screened by two reviewers. Studies considered potentially relevant by at least one of the reviewers were retrieved as full text. For this specific paper, two reviewers (R.Z. and A.A.) independently screened the full texts in the existing database to find relevant studies. Disagreements were resolved by seeking the opinion of a third, senior reviewer (P.C.).
Data extraction and classification
Data were independently extracted by two reviewers. Discrepancies were discussed and checked by revisiting the original paper. We coded the following information:
(1) Participant characteristics: age group; percentage of female participants; mean age; diagnosis/self-report as an inclusion criterion.
(2) Psychotherapy information: type [defined according to the generic definitions of therapies given in Cuijpers et al. (2020)]; treatment format (individual, group, guided self-help, telephone, or other formats); number of sessions.
(3) General characteristics of the studies: type of control group; setting for screening; year of publication; the country where a study was conducted; western country (yes/no).
(4) Information regarding assessment tools.
(5) Outcomes: data for calculating effect sizes (number of participants in intervention and control groups, means and standard deviations for both groups at post-test); data on long-term outcomes (more than 6 months); data on study dropout (for any reason) for each of the psychotherapy and control conditions.
(6) The number of patients who received the screening scales for depression, the number of patients that completed screening tools, and the number of patients who were randomized in the trial.
Risk of bias assessment
We used the first version of the Cochrane Risk of Bias Assessment Tool to assess the validity of the included studies (Higgins et al., 2011). The assessments covered the following domains: random sequence generation; concealment of allocation to conditions; blinding of assessors or use of self-report scales; appropriate methods for handling incomplete outcome data (rated as high risk of bias when intention-to-treat analyses were not used and/or overall study dropout exceeded 50% and/or the imbalance in missing outcomes between groups exceeded 30%); and selective outcome reporting (rated as positive when prospectively registered primary outcomes were consistently reported in the article) (Miguel, Karyotaki, Cuijpers, & Cristea, 2021). Each domain with a lack of information was rated as high risk. Studies that met at least four of these five quality criteria were rated as having an overall low risk of bias. Two researchers (R.Z. and A.A.) assessed these domains independently, and disagreements were resolved through discussion until consensus was reached.
Outcome measures
Our primary outcome was the effect size of each comparison between psychotherapy and a control group, indicating the difference between the two groups at post-test assessment. We calculated the effect size (Hedges' g) from mean scores, standard deviations, and the number of participants. Because many included studies had small sample sizes, we used Hedges' g, an effect size that adjusts for small-sample bias (Olkin & Hedges, 1985). Effect sizes of 0.2, 0.5, and 0.8 represent small, moderate, and large effects, respectively. If means or standard deviations were not available, we used the procedures of Comprehensive Meta-Analysis software (CMA, version 3.3.070) to calculate the effect size from dichotomous outcomes or other statistics (such as t value, p value, or change scores). If a study (or a comparison between psychotherapy and a control group) included more than one depression assessment, the resulting effect sizes were first pooled within the study or comparison before pooling across studies, so that each study or comparison contributed one effect size to the overall analysis.
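As an illustration of the effect size described above, the following Python sketch computes Hedges' g and an approximate standard error from post-test means, standard deviations, and group sizes. It is not the CMA implementation. The function name `hedges_g`, the sign convention (positive values favor psychotherapy when lower scores mean fewer symptoms), and the large-sample variance approximation are assumptions made for this example.

```python
import math


def hedges_g(m_treat, sd_treat, n_treat, m_ctrl, sd_ctrl, n_ctrl):
    """Return (g, se) for a two-group post-test comparison.

    Assumption: lower scores mean fewer depressive symptoms, so the
    difference is taken as control minus treatment.
    """
    df = n_treat + n_ctrl - 2
    # Pooled within-group standard deviation.
    sd_pooled = math.sqrt(
        ((n_treat - 1) * sd_treat ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    d = (m_ctrl - m_treat) / sd_pooled
    # Small-sample correction factor (common approximation).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample approximation of the variance of d, scaled by j^2.
    var_d = (n_treat + n_ctrl) / (n_treat * n_ctrl) + d ** 2 / (2 * (n_treat + n_ctrl))
    se = math.sqrt(j ** 2 * var_d)
    return g, se


# Hypothetical trial: 30 participants per arm, post-test means 10 v. 13, SD 5.
g, se = hedges_g(10.0, 5.0, 30, 13.0, 5.0, 30)
print(round(g, 3), round(se, 3))
```

The correction factor shrinks the uncorrected standardized difference slightly, and the shrinkage becomes negligible as the sample size grows.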
Meta-analyses
Data analysis (except for the calculation of effect sizes, which was done in CMA) was performed in RStudio for Windows (version 1.3.959), using the ‘meta’ (Balduzzi, Rücker, & Schwarzer, 2019), ‘metafor’ (Viechtbauer, 2010), and ‘dmetar’ (Harrer, Cuijpers, Furukawa, & Ebert, 2022) packages in R (version 4.0.1). As we expected considerable heterogeneity among the studies, we chose a random-effects pooling model in all analyses.
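For readers unfamiliar with random-effects pooling, a minimal inverse-variance sketch in Python is given here. It is not the code used in the analyses, which were run with the R packages named above. The DerSimonian–Laird estimator of the between-study variance is an assumption for this example (the text names it only for the analyses of proportions), and the function name `pool_random_effects` is hypothetical.

```python
import math
from statistics import NormalDist


def pool_random_effects(effects, ses, level=0.95):
    """Inverse-variance random-effects pooling (DerSimonian-Laird tau^2)."""
    w = [1 / se ** 2 for se in ses]
    sum_w = sum(w)
    # Fixed-effect estimate, needed for Cochran's Q.
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of tau^2, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each sampling variance.
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    estimate = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "estimate": estimate,
        "ci_low": estimate - z * se_pooled,
        "ci_high": estimate + z * se_pooled,
        "tau2": tau2,
        "q": q,
        "df": df,
    }


# Hypothetical effect sizes and standard errors from three comparisons.
print(pool_random_effects([0.30, 0.55, 0.80], [0.15, 0.20, 0.25]))
```

When the studies disagree by more than sampling error would predict, the estimated between-study variance is positive, the weights become more equal, and the confidence interval widens relative to a fixed-effect analysis.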
We pooled all included comparisons to obtain an overall effect estimate of psychotherapy. We also calculated numbers-needed-to-treat (NNT) to explain the clinical meaning of the effect size, using the formulae provided by Furukawa (1999). We conservatively set the event rate in the control group at 19% (based on the pooled response rate, defined as a 50% reduction of symptoms, across trials of psychotherapy for depression) (Cuijpers et al., 2014a). Heterogeneity was examined with I² and its 95% CI. In general, heterogeneity can be considered low (25%), moderate (50%), or substantial (75%) (Higgins, Thompson, Deeks, & Altman, 2003).
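The conversion from a standardized effect size to an NNT, and the I² statistic, can be sketched as follows. The NNT function uses one commonly cited form of the conversion attributed to Furukawa, NNT = 1 / [Φ(g + Φ⁻¹(CER)) − CER], where CER is the control event rate. Whether this matches the exact computation behind the reported values is an assumption, so small differences from the NNTs in the Results are to be expected. Both function names are hypothetical.

```python
import math
from statistics import NormalDist


def nnt_from_g(g, control_event_rate=0.19):
    """Convert a standardized mean difference into a number-needed-to-treat.

    Assumes normally distributed outcomes and the stated control event rate.
    """
    if g <= 0:
        return math.inf  # no benefit, so no finite NNT
    nd = NormalDist()
    treated_event_rate = nd.cdf(g + nd.inv_cdf(control_event_rate))
    return 1 / (treated_event_rate - control_event_rate)


def i_squared(q, df):
    """I^2 (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


print(round(nnt_from_g(0.55), 2))
print(round(i_squared(32.0, 1), 1))
```

Under these assumptions, a larger effect size or a higher control event rate (up to 50%) yields a smaller NNT, which is why the choice of a conservative 19% event rate matters for interpretation.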
A series of sensitivity analyses was conducted to examine the effect: (1) after the exclusion of outliers (studies whose 95% CI of the effect size did not overlap with the 95% CI of the pooled effect size); (2) for studies with low risk of bias only; (3) for studies that included multiple comparisons (two sensitivity analyses in which only one effect size per study was included: one with the smallest effect size and one with the largest effect size).
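The outlier rule in sensitivity analysis (1) amounts to a simple interval check, sketched below in Python. The function name `find_outliers` and the example intervals are hypothetical and serve only to make the rule concrete.

```python
def find_outliers(study_cis, pooled_ci):
    """Return indices of studies whose CI does not overlap the pooled CI.

    study_cis: list of (lower, upper) 95% CIs, one per comparison.
    pooled_ci: (lower, upper) 95% CI of the pooled effect size.
    """
    pooled_low, pooled_high = pooled_ci
    return [
        i
        for i, (low, high) in enumerate(study_cis)
        # No overlap: the study CI lies entirely below or entirely above.
        if high < pooled_low or low > pooled_high
    ]


# Hypothetical comparisons checked against a pooled CI of 0.41 to 0.69.
cis = [(0.10, 0.30), (0.40, 0.70), (0.90, 1.40)]
print(find_outliers(cis, (0.41, 0.69)))
```

Comparisons flagged this way are removed and the remaining effect sizes are pooled again.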
Furthermore, subgroup analyses were conducted to test whether the effects are different based on the characteristics of the study (setting, age group, diagnosis at baseline, and risk of bias). All subgroup analyses were performed according to the mixed-effects model, in which studies within subgroups were pooled with the random-effects model, whereas tests for significant differences between subgroups were conducted with the fixed-effects model. We also conducted multivariate meta-regression analyses to explore potential predictors. The outcome variable was the effect size, and the predictors were setting, age group, diagnosis, and risk of bias.
Additionally, funnel plots and Egger's test were used to examine potential publication bias (Egger, Smith, Schneider, & Minder, 1997). When we found indications of publication bias, Duval and Tweedie's trim-and-fill procedure was implemented to adjust for possible bias by detecting missing studies and imputing them (Duval & Tweedie, 2000).
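Egger's test can be illustrated with an ordinary least-squares regression of the standardized effect (g divided by its standard error) on precision (the reciprocal of the standard error). An intercept that differs from zero suggests funnel plot asymmetry. The sketch below is a plain-Python version of this classical formulation and not the routine used in the analyses. It returns the intercept, its standard error, and the t statistic, and leaves the p value (from a t distribution with k − 2 degrees of freedom) to a statistics library. The function name `egger_regression` is hypothetical. The trim-and-fill step is not shown.

```python
import math


def egger_regression(effects, ses):
    """Classical Egger regression: (g / se) regressed on (1 / se)."""
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are needed")
    x = [1 / se for se in ses]                    # precision
    y = [g / se for g, se in zip(effects, ses)]   # standardized effect
    x_bar = sum(x) / k
    y_bar = sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance and the standard error of the intercept.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "se": se_intercept, "t": t_stat, "df": k - 2}


# Hypothetical data in which smaller studies (larger se) show larger effects.
print(egger_regression([0.2, 0.4, 0.6, 0.9], [0.1, 0.2, 0.3, 0.4]))
```

A positive intercept, as in this hypothetical example, is the pattern expected when small studies with large effects are over-represented.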
Finally, study dropout (for any reason) was compared between conditions using the relative risk (RR). We also examined long-term effects in studies that reported follow-up outcomes more than 6 months after randomization.
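For a single trial, the relative risk of dropout and its confidence interval can be computed on the log scale, as in the following sketch. The function name `relative_risk` and the example counts are hypothetical. Trials with zero events in either arm would need a continuity correction, which is not shown.

```python
import math
from statistics import NormalDist


def relative_risk(events_treat, n_treat, events_ctrl, n_ctrl, level=0.95):
    """Return (RR, lower, upper) for dropout in treatment v. control.

    Assumes at least one event in each arm (no continuity correction).
    """
    rr = (events_treat / n_treat) / (events_ctrl / n_ctrl)
    # Standard error of log(RR).
    se_log = math.sqrt(
        1 / events_treat - 1 / n_treat + 1 / events_ctrl - 1 / n_ctrl
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical trial: 20 of 100 drop out of treatment, 10 of 100 of control.
print(relative_risk(20, 100, 10, 100))
```

An RR above 1 indicates more dropout in the psychotherapy condition, and a confidence interval that includes 1 indicates no significant difference.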
Additional analyses
As a secondary outcome, we calculated the pooled proportion of participants randomized in the trial from the total number of potential participants who completed the screening scales for depression. We also examined the pooled proportion of participants randomized in the trial from the total number of people who received the screening scales for depression.
We conducted two separate meta-analyses using the ‘metaprop’ function of the ‘meta’ package in R (version 4.0.1). In these analyses, the logit transformation was first applied to each study's proportion. After pooling the transformed proportions, we back-transformed the synthesized results to the raw proportion scale. Since considerable heterogeneity was expected, we adopted a random-effects model. The between-study variance τ² was estimated with the DerSimonian–Laird method.
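The steps described above (logit transformation, random-effects pooling with a DerSimonian–Laird τ², and back-transformation) can be sketched in plain Python as follows. This is an inverse-variance illustration under stated assumptions, not a reimplementation of ‘metaprop’: it assumes every study has at least one randomized and one non-randomized participant, applies no continuity correction, and uses a normal-approximation confidence interval. The function name `pool_proportions_logit` and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def pool_proportions_logit(events, totals, level=0.95):
    """Pool proportions on the logit scale and back-transform the result.

    events: number randomized per study; totals: number screened per study.
    Returns (proportion, lower, upper, tau2).
    """
    # Logit of each proportion and its approximate sampling variance.
    logits = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    variances = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]

    # DerSimonian-Laird estimate of the between-study variance.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(logits) - 1)) / c) if c > 0 else 0.0

    # Random-effects pooled logit and its confidence interval.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    z = NormalDist().inv_cdf(0.5 + level / 2)

    def expit(value):
        # Inverse of the logit transformation.
        return 1 / (1 + math.exp(-value))

    return expit(pooled), expit(pooled - z * se), expit(pooled + z * se), tau2


# Hypothetical screening yields: randomized / completed screening.
print(pool_proportions_logit([12, 40, 9], [150, 210, 300]))
```

Pooling on the logit scale keeps the back-transformed estimate and its confidence limits between 0 and 1, which pooling raw proportions does not guarantee.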
Results
Selection and inclusion of studies
Our search strategy identified a total of 27 133 records. After removing duplicates, we screened 19 612 records based on titles and abstracts. A total of 3239 studies were retrieved for full-text review, and 3187 studies were excluded from further investigation. The PRISMA flow chart of the study selection process and reasons for exclusion are presented in Fig. 1. As a result, 52 studies (with 61 comparisons between a psychotherapy and a control group) fulfilled all inclusion criteria and were included in the meta-analyses. References are presented in online Supplementary Appendix C.
Characteristics of included studies
The key characteristics of the included studies are summarized in Table 1. In total, 7755 participants were involved in the meta-analysis (intervention groups: N = 3788; control groups: N = 3967). Sample sizes ranged from 19 to 570, and mean age ranged from 21.9 to 81.5 years. Studies were conducted in general medical care (n = 22; 42.3%), perinatal care (n = 14; 26.9%), primary care (n = 5; 9.6%), and other contexts (n = 11; 21.2%). All 52 studies provided the proportion of participants randomized in the trial out of the number who completed the screening tools. The proportion of participants randomized out of the number who received the screening tools was available for 47 studies (online Supplementary Appendix D). A comprehensive description of characteristics can be seen in online Supplementary Appendix E.
Note. Indicators of the columns: g, effect size; s.e., standard error of effect size; therapy, type of therapy (cbt, cognitive behavior therapy; bat, behavioral activation therapy; 3rd, 3rd wave cognitive behavioral therapy; pst, problem-solving therapy; dyn, psychodynamic therapy; ipt, interpersonal psychotherapy; sup, non-directive supportive therapy; other, other type of therapy); Ctr, control group (wl, waiting list; cau, care-as-usual; other, other control condition); settings, setting category (ppd, perinatal care; genmed, general medical care; gp, general practitioner care (primary care); other, in another setting); age group, adults or elderly; M age, mean age; Pr wom, proportion of women; Dx, diagnostic interview (+, yes; −, no); Frm, format (ind, individual; grp, group; gsh, guided self-help; tel, telephone; other, other/mixed format); Nsess, number of sessions; country, country of trial (North A, North America; Aus, Australia; East A, East Asia; EU, Europe; UK, United Kingdom; other, other countries); SG, sequence generation [positive or negative (negative includes unclear)]; AC, allocation concealment [positive or negative (negative includes unclear)]; BA, blinded assessment [positive or negative (negative includes unclear); sr, self-report]; IOD, incomplete outcome data [positive or negative (negative includes unclear)]; SOR, selective outcome reporting [positive or negative (negative includes unclear)].
Quality assessment
Overall, the risk of bias was considerable. Of the 52 studies, 39 reported an adequate sequence generation (75%); 31 reported allocation to conditions by an independent party (59.6%); 51 reported blinding of outcome assessors or used only self-report outcomes (98.1%); and 36 used appropriate methods to handle missing data (69.2%). Only 10 studies (19.2%) were rated as free from selective reporting. In total, 26 studies (50.0%) met at least four quality criteria and were rated as having a low risk of bias. The remaining half of the included studies met fewer than four criteria and were rated as having a high risk of bias.
Overall effects of psychotherapies for depression
As shown in Fig. 2 and Table 2, the overall pooled effect size of 61 comparisons between psychotherapies and control conditions was 0.55 (95% CI 0.41–0.69), which corresponds to an NNT of 5.46. Heterogeneity was high (I² = 75%; 95% CI 68–80).
ES, effect size; NNT, numbers-needed-to-treat; ROB, risk of bias.
a According to the random effects model.
b The p value indicates whether the difference between subgroups is significant.
Exclusion of nine outliers led to a somewhat smaller effect size (g = 0.47; 95% CI 0.39–0.55; NNT = 6.64) and to a considerable reduction of heterogeneity (I² = 30%; 95% CI 2–51). Limiting the analyses to studies with low risk of bias had little impact on the effect size (g = 0.57; 95% CI 0.33–0.82; NNT = 5.25) or on the level of heterogeneity (I² = 82%; 95% CI 75–87).
Because more than one psychotherapy was compared with the same control group in seven studies (five had two psychotherapy arms, and two had three arms), we conducted two sensitivity analyses to examine the effect size of these studies. We found this did not change the overall pooled effect size and the level of heterogeneity (Table 2).
We found strong indications of publication bias. The funnel plot is presented in online Supplementary Appendix F.1. Egger's test of the intercept was significant (intercept: 1.53; 95% CI 0.55–2.51; p = 0.003), and Duval and Tweedie's trim-and-fill procedure resulted in 14 imputed studies. The adjusted effect size was smaller (g = 0.35; 95% CI 0.17–0.52; NNT = 9.29; p < 0.001).
Subgroup and meta-regression analyses
In subgroup analyses (Table 2), setting significantly moderated effect sizes (p = 0.017). Studies in which patients were recruited in primary care yielded the lowest effect size (g = 0.26) compared with studies conducted in other settings (perinatal care: g = 0.65; general medical care: g = 0.40; non-specific settings: g = 0.83). We further conducted a post-hoc analysis to assess whether the number of sessions, which is related to the intensity of the psychotherapy, significantly moderated the effect sizes. We categorized the number of sessions into three categories (6 or fewer; 7–11; 12 or more), but the result was not significant (p = 0.61) (online Supplementary Appendix K). We also found no evidence that effect sizes differed significantly across age group (adults v. elderly, p = 0.050), diagnosis at baseline (cut-off score v. diagnostic interview, p = 0.209), or risk of bias (low v. high, p = 0.844).
Additionally, we conducted multivariate meta-regression analyses. Because of the relatively small number of studies, only one available continuous variable (publication year) and five categorical variables [setting (general medical care v. other), age group (older v. younger adults), diagnosis at baseline (clinical interview v. score above a cut-off on a self-report scale), risk of bias (low v. high), and type of psychotherapy (cognitive behavior therapy v. other)] were entered as predictors. As can be seen in online Supplementary Appendix G, no significant predictors were found (p values >0.19).
Study dropout and long-term effects
We found no significant difference between treatment and control conditions for study dropout (RR = 1.16; 95% CI 1.00–1.35; I² = 37%; 95% CI 12–54; p = 0.054). Twenty-six studies assessed the long-term effects of psychotherapies. We classified the follow-up length into four groups: 6–8 months, 9–12 months, 13–24 months, and more than 24 months after baseline (online Supplementary Appendix H). A small-to-moderate effect size was found for studies assessing the effects of psychotherapy at 6–8 months (g = 0.33; 95% CI 0.14–0.52; NNT = 9.83; I² = 75%; 95% CI 63–83). We also found small but significant effect sizes, with low heterogeneity, for studies assessing longer-term outcomes at 9–12 months (g = 0.24; 95% CI 0.11–0.37; NNT = 13.99; I² = 31%; 95% CI 0–64) and at more than 24 months after baseline (g = 0.18; 95% CI 0.15–0.21; NNT = 18.85; I² = 0%; 95% CI 0). Studies that examined effects at 13–24 months did not show significant effects of the interventions (g = 0.32; 95% CI −0.19 to 0.82; NNT = 10.23; I² = 92%; 95% CI 87–96).
Proportion of participants randomized in the trial from the number of participants who completed the depression questionnaires
The pooled proportion of participants randomized in the trial out of the total number of participants who completed the depression questionnaires was 0.13 (95% CI 0.08–0.19), with high heterogeneity (I² = 99.4%; 95% CI 99.4–99.5). The forest plot is given in online Supplementary Appendix I.2. Removal of 38 outliers affected the proportion only marginally (proportion = 0.12; 95% CI 0.10–0.15). The studies with low risk of bias resulted in a similar proportion (proportion = 0.14; 95% CI 0.07–0.26), and heterogeneity was still high (I² = 99.5%; 95% CI 99.5–99.6) (Table 3). Egger's test of the intercept was not significant (intercept: −2.84; t = 1.61, p = 0.114). According to the funnel plot (online Supplementary Appendix F.3), we found no indication of publication bias. Duval and Tweedie's trim-and-fill method resulted in seven imputed studies. The adjusted proportion decreased to 0.09 (95% CI 0.05–0.14).
Prop, proportion.
In subgroup analyses (Table 3), we found several significant differences between subgroups. There was a significant difference among settings (p = 0.007). The pooled proportion in general medical care (proportion = 0.20) was higher than that in primary care (proportion = 0.09), perinatal care (proportion = 0.06), and other settings (proportion = 0.13). The proportion was significantly lower in people with a diagnosed mood disorder (proportion = 0.04) than in those scoring above a cut-off on a self-report depression measure (proportion = 0.18) (p < 0.0001). However, the proportion was not significantly associated with age group (p = 0.244) or risk of bias (p = 0.457).
Furthermore, we conducted the same analysis for the proportion of participants randomized in the trial out of the number of participants who received the depression questionnaires. The results are reported in online Supplementary Appendix J. The forest plot is given in online Supplementary Appendix I.1. The funnel plot is given in online Supplementary Appendix F.2.
Discussion
This is the first meta-analysis to examine the effectiveness of psychotherapy among patients who do not actively seek help across all settings. Based on 52 studies with 61 comparisons, we found that psychological treatments had a significant, moderate-to-large effect on reducing depressive symptoms compared to control groups (g = 0.55). Although significant publication bias was identified, the effect remained significant, albeit small, after adjustment for the bias (g = 0.35). Restricting the analysis to high-quality studies did not change the effect (g = 0.57). The effect was retained at 12 months' follow-up, where it was small but significant (g = 0.24).
An earlier systematic review of interventions for adults indicated that treating actively identified people who were not seeking help in primary care resulted in a small, non-significant effect size (d = 0.13) (Cuijpers et al., 2009). We could not confirm that finding in the current meta-analysis, which extended to different settings and found a moderate-to-high effect size. The findings of the previous meta-analysis were based on a small number of trials, and the quality of these trials was suboptimal. Therefore, we have more faith in the current findings, because of the broader set of studies (not only primary care), the larger number of studies, and the larger number of high-quality studies. We did find that the setting in which recruitment was conducted is a significant moderator, which suggests that the effects of psychotherapy could differ considerably across settings. Only five studies were conducted in primary care, and they yielded a small and non-significant effect (g = 0.26), which is consistent with the previous study. One plausible explanation is low statistical power, because only a few studies were included. It is also possible that patients in primary care had less severe depressive symptoms than those in secondary care-based studies, leaving little room for improvement with treatment (Bortolotti, Menchetti, Bellini, Montaguti, & Berardi, 2008; Schwenk, Coyne, & Fechner-Bates, 1996; Simon & VonKorff, 1995).
Another potential explanation could be that screening tools are less valid in primary care patients and may not be sensitive enough to detect the change between pre- and post-tests (Cuijpers et al., 2009; Thombs et al., 2011). However, when we checked the instruments used in the five primary care studies, we found that two studies used the Center for Epidemiologic Studies Depression Scale (CES-D), one used the Primary Care Evaluation of Mental Disorders Patient Questionnaire (PRIME-MD PQ), one used the Medical Outcomes Study depression screening inventory, and one used the Geriatric Depression Scale. All of these screening instruments are validated, but only one study used the Patient Health Questionnaire (PHQ), which was specifically designed to screen for depression in primary care. Our findings could also be explained by the intensity of the psychotherapy (Cape, Whittington, Buszewicz, Wallace, & Underwood, 2010). A previous study found that psychotherapy provided within primary care settings for depression is usually brief (Thomas & Corney, 1992). A common treatment length in the UK is six sessions (Stiles, Barkham, Connell, & Mellor-Clark, 2008). We therefore conducted a post-hoc analysis of the number of sessions, which is related to the intensity of the psychotherapy, but found no evidence that effect sizes differed significantly by number of sessions (p = 0.61). Nevertheless, treatment length may still limit the potential effect sizes in primary care: of the five primary care studies in the current meta-analysis, three had six or fewer sessions and two had more than seven sessions.
Additionally, the technical competence or training of the therapists may be related to the small effect size in primary care. We did not explore in detail what kind of clinicians delivered the therapies, because most trials used a mix of professionals and paraprofessionals (Cuijpers, Quero, Dowrick, & Arroll, 2019). The result should also be interpreted with caution because of the limited number of studies. According to a previous meta-analysis, the overall prevalence of depression in primary care has been estimated at 19.5% (Mitchell, Vaze, & Rao, 2009). As with depressed patients in other settings, a large proportion of these patients is unlikely to be actively seeking help. Therefore, appropriate treatment for these patients should be a public health priority. More studies are needed to explore the true effect size of psychotherapies and to draw firm conclusions on treatment effectiveness among those not seeking help in primary care.
We also found that, in different settings, the effects of psychotherapies in people not actively seeking help are consistent with studies examining the effects of psychotherapies in all kinds of participants (Cuijpers, 2017; Cuijpers, Quero, Papola, Cristea, & Karyotaki, 2021c). Psychotherapy for people not actively seeking help in perinatal care resulted in a moderate-to-high effect size (g = 0.65) (Cuijpers, Brännmark, & van Straten, 2008), and in general medical care it yielded a small-to-moderate effect size (g = 0.40) (van Straten, Geraedts, Verdonck-de Leeuw, Andersson, & Cuijpers, 2010). This implies that in most settings, with the exception of primary care, psychological treatments are effective for people with depression who do not actively seek help, and the effects are sustained at 12 months' follow-up. It is widely recognized that depression accompanying medical disorders is associated with lower compliance with treatment, delayed recovery, and increased mortality (Baune, Adrian, & Jacobi, 2007; Evans & Charney, 2003). For women who are depressed during pregnancy, depression may lead to serious health consequences for both the mother and her offspring (Dagher, Bruckheim, Colpe, Edwards, & White, 2021; Field, 2017). Therefore, treating comorbid depression should be one of the priorities in these settings, and our findings are good news for patients with depression who do not actively seek help in these settings. If they are actively identified and offered treatment, they may benefit from it.
Our secondary goal was to explore the proportion of patients who completed the screening questionnaire and ended up in the clinical trial. We found that 13% of patients who completed the screening questionnaire met the criteria for depression and agreed to be randomized in the trial. The setting was a significant moderator: the proportion of screened patients who were randomized was 20% in general medical care, 9% in primary care, and 6% in perinatal care. This is not surprising because the prevalence of depression is higher among patients with medical disorders (Katon, 1996; van Straten et al., 2010).
However, the results of the current study should be interpreted with caution in light of several limitations. First, we found strong indications of publication bias. After adjustment for this bias, the effect of the psychotherapies decreased (g = 0.35), although it remained statistically significant. This indicates that the actual effects might be smaller than the primary analyses suggest. Second, there was a significant and high level of heterogeneity among the studies. In the primary analyses, we found that nine outliers contributed to heterogeneity. In the analyses of secondary outcomes (proportions), the level of heterogeneity was extremely high, which is the rule rather than the exception in meta-analyses of proportions (Cuijpers et al., 2021a; Lim et al., 2018; Sheldon et al., 2021). A possible explanation is that the included studies varied widely in how they recruited people not actively seeking help and in the numbers of people enrolled, screened, and randomized, which is why the resulting proportions differ so much across studies. The high level of heterogeneity implies that the true mean proportion cannot be reliably predicted. Third, it is important to note that the quality of half of the included studies was not optimal. We tried to minimize the impact of study quality by limiting the analyses to studies with a low risk of bias, and the effect size remained moderate-to-high (g = 0.57). Although the outcomes were comparable, the results should still be interpreted with caution.
Although it may be challenging to examine, the quality with which psychotherapy is delivered is an important factor that should also be considered, because it might increase the chances of inflated treatment effects relative to the true effects in the population (Cuijpers et al., 2020; Miguel et al., 2021). However, since the subgroup analyses (grouped by the risk of bias assessment) did not show significant differences in effect size (p = 0.844), this should not be a major concern. Fourth, as mentioned before, we found few studies that actively identified people who do not seek help in primary care and enrolled them in a trial. Therefore, the generalizability of the results to primary care is limited. Fifth, we found that the number of sessions, which is related to the intensity of the psychotherapy, may be related to the small effect size in primary care, but intensity could also be reflected in the duration of the therapy in months or in whether the sessions were held weekly or biweekly. Unfortunately, since our database does not currently include that information, we could not explore the intensity of psychotherapy further.
Despite these limitations, the present study suggests that when people who do not seek help are actively identified and offered treatment, psychotherapies have the potential to decrease depressive symptoms, especially in perinatal care and general medical care settings. Our finding of a moderate-to-high and sustained effect can encourage physicians and researchers to identify depressed patients and offer them timely treatment. However, the evidence is not conclusive given the publication bias, the high levels of heterogeneity, and the risk of bias in half of the included studies. The overall weakness of the results suggests that more high-quality trials exploring the effects of psychological treatments of depression among people not actively seeking help are urgently needed. In addition, since few of the included studies were conducted in primary care, our results on the effects of psychotherapy for depressed primary care patients were not stable. Therefore, further research on the large pool of primary care patients who have depression and might not actively seek help may be necessary.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291722003518
Acknowledgements
We thank Clara Miguel and Marketa Ciharova for their kind help during data extraction.
Author contributions
RZ, EK, SYS, and PC conceptualized and designed the study. RZ conducted the analyses and prepared the first draft of the manuscript. RZ, AA, EK, and PC contributed to the data collection and extraction. All authors reviewed and critically revised the manuscript for important intellectual content.
Financial support
RZ is financially supported by the China Scholarship Council (Grant no. 202007720039) for her Ph.D. The funding source had no role in the design or execution of the research.
Conflict of interest
None.