Introduction
Depression is the leading cause of disability worldwide and its prevalence is increasing (World Health Organization, 2017). Hence, top priorities in clinical mental health research are a better understanding of disease mechanisms and improved treatment efficacy (Elfeddali et al., 2014). Experience sampling and ecological momentary assessment (EMA) techniques (Larson & Csikszentmihalyi, 1983; Shiffman, Stone, & Hufford, 2008) have increased our general understanding of depression by providing novel insights into the daily emotional dynamics and physiological changes that accompany this condition (Myin-Germeys et al., 2009; Telford, McCarthy-Jones, Corcoran, & Rowse, 2012). More recently, researchers have suggested that these real-life self-assessments might also inform individual patients' clinical diagnosis and treatment (van Os, Delespaul, Wigman, Myin-Germeys, & Wichers, 2013a, 2013b; Wichers, 2014). Moreover, self-monitoring with person-specific feedback has been put forward as a treatment in itself: it might reduce depressive symptoms by increasing self-awareness and inducing behavioral change (Myin-Germeys et al., 2018). Basic self-monitoring has been shown to improve emotional self-awareness, which enabled recovery from depression (Kauer et al., 2012). EMA can provide an even more fine-grained film of the dynamics of depressive symptomatology, which can reveal previously implicit dysfunctional patterns and therefore provide new leads for behavioral change (Kramer et al., 2014). While studies have underscored the acceptability and feasibility of ecological momentary interventions (EMIs) with a self-monitoring component, there has been limited research on their efficacy (Colombo et al., 2019; Myin-Germeys, Klippel, Steinhart, & Reininghaus, 2016).
To date, only four EMIs have been evaluated in clinically depressed patients (Colombo et al., 2019) by studies with generally modest evidential strength. Two single-arm pilot studies (n = 8) of mobile applications incorporating symptom self-monitoring reported a decrease in depressive symptoms (Burns et al., 2011: Mobylize!; Mohr et al., 2015: MedLink). A pilot randomized controlled trial (RCT) that did include a control group (n = 14) in addition to an experimental group (n = 14) was not powered to statistically test group differences, but reported ‘potentially meaningful improvements’ in depressive symptoms among regular (but not casual) users (Footnote 1) of their self-monitoring support platform (Burton et al., 2016: Help4Mood). Finally, a larger RCT (n = 102) reported that self-monitoring in addition to pharmacological treatment decreased depressive symptoms. These improvements were maintained over time in the group that received weekly feedback (Kramer et al., 2014: REsource MObilisation Device In Depression, REMOD-ID).
The EMI in the REMOD-ID study was based on a behavioral activation approach: the aim was to open up custom opportunities to increase the experience of positive affect (PA) by increasing personal insights into PA patterns and the context in which it is experienced (Kramer et al., 2014). While results were promising, the intervention had a substantial face-to-face component (6 weekly feedback sessions), which goes against the EMI principle of delivering psychological support in daily life (Colombo et al., 2019; Heron & Smyth, 2010) and could have driven treatment effects. Moreover, the intervention was evaluated in the relative absence of psychotherapy, a common part of depression treatment. Therefore, it remains unclear whether an EMI based on self-monitoring and person-specific feedback can add beneficial effects to regular depression treatment.
The first aim of the current study was to evaluate the efficacy of an EMI for depression in routine clinical practice. Early self-monitoring with personalized feedback might help patients obtain more insight into the processes involved in their depressive symptoms and day-to-day functioning, which might help them make the most of potential waitlist periods and commence treatment programs with a head start (Bastiaansen et al., 2018). Therefore, the patients who participated in this study started self-monitoring as soon as possible after clinical intake. Given that, for many patients, the essence of recovery is to rise above the presumed limitations associated with mental illness (Huber et al., 2011), we not only investigated the impact of an EMI on depressive symptomatology (primary outcome), but also on social functioning and feelings of empowerment; outcome domains that were also reported in complementary articles on the REMOD-ID study (Simons et al., 2015; Snippe et al., ).
The term EMI merely describes a method; the approach and content of self-monitored items and feedback vary from system to system (Burns et al., 2011; Burton et al., 2016; Kramer et al., 2014; Mohr et al., 2015), with unknown effects on efficacy. Hence, the second aim of the study was to examine the impact of EMI content on its efficacy. Participants who were randomized into one of the two intervention modules engaged in similar procedures, but the self-monitored items and feedback had a different focus: PA and activities in the ‘Do’-module (reminiscent of the REMOD-ID study), and negative affect (NA) and thinking patterns in the ‘Think’-module. These two EMI modules are both conceivably beneficial, as they link up with the two main angles of psychotherapy for depression: behavioral activation through positive reinforcement of activities and cognitive therapy to help individuals recognize and replace negative thinking patterns (Beck, Rush, Shaw, & Emery, 1987).
Our study is the first to examine the effects of – two different – EMI modules as an add-on to regular depression treatment. We hypothesized that the EMI groups would show more or faster positive changes over time compared to the control group (i.e. treatment-as-usual, TAU). We also looked into differences between the two EMI modules but did not have clear expectations of which one of the two would outperform the other based on the current literature.
Methods
Study design
The ZELF-i study was designed as a pragmatic RCT to allow evaluation of the intervention in real-life care facilities. The study protocol has been published elsewhere (Bastiaansen et al., 2018) and is briefly explained below. The study was approved by the Medical Ethical Committee of the University Medical Center Groningen (UMCG, no. 2015/530). The trial was registered prospectively in the Dutch Trial Register (Nederlands Trial Register, NTR5707, http://www.trialregister.nl) on 1 February 2016.
Participants
We recruited adult patients referred for depressive complaints to five general or specialized outpatient teams at four secondary mental health care organizations in the Netherlands. Specialized teams for affective disorders were asked to assess every new admission for eligibility, while general teams only assessed admissions listed as depression. Eligibility criteria were broad so as to include a sample representative of clinical practice. Inclusion criteria were: (a) a clinical diagnosis of depression and primary indication for depression treatment by the mental health care professional (hereafter: practitioner); (b) age between 18 and 65 years; and (c) written informed consent. Exclusion criteria were (based on practitioners' appraisals): (a) crisis intervention warranted; (b) presence of psychotic or manic symptoms; and (c) incapability of following research procedures due to inadequate Dutch language proficiency, significant auditory or visual impairments or mental retardation.
Randomization comprised a two-stage procedure. First, randomization was stratified per treatment location to account for clinical features that may influence outcomes: reported current psychotherapy (yes v. no) and antidepressant use in the 8 weeks prior to study entry (new/switch v. no/maintenance), similar to the REMOD-ID study. This stratification required the generation of 20 random allocation sequences (one for each of the four strata times five study locations) in stage two. For each sequence, block randomization was used to achieve balance in the allocation of participants to the study arms. That is, each of the three conditions was present twice (in random order) in a block of six. Participants were individually assigned to the control group or one of the two EMI groups (allocation ratio 1:1:1) by research assistants who did not have access to the random allocation sequences. The allocation was implemented by sequentially numbered sealed envelopes.
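The stratified block randomization described above can be sketched as follows. This is a minimal illustration, not the trial's actual allocation software: the seed, the number of blocks per sequence, and the location labels are hypothetical, while the three arms, the block size of six, and the four strata per location follow the text.

```python
import random
from itertools import product

ARMS = ["Control", "Do", "Think"]

def permuted_block(rng, arms=ARMS, repeats=2):
    """Return one block in which every arm appears `repeats` times, in random order."""
    block = list(arms) * repeats
    rng.shuffle(block)
    return block

def allocation_sequence(rng, n_blocks):
    """Concatenate `n_blocks` permuted blocks into one allocation sequence."""
    sequence = []
    for _ in range(n_blocks):
        sequence.extend(permuted_block(rng))
    return sequence

def build_sequences(locations, seed=2016, n_blocks=5):
    """One sequence per (location, stratum); seed and n_blocks are hypothetical."""
    rng = random.Random(seed)
    strata = list(product(
        ["psychotherapy: yes", "psychotherapy: no"],
        ["antidepressant: new/switch", "antidepressant: no/maintenance"],
    ))
    return {(loc, stratum): allocation_sequence(rng, n_blocks)
            for loc in locations for stratum in strata}

# Five study locations times four strata gives 20 allocation sequences.
sequences = build_sequences([f"location {i}" for i in range(1, 6)])
print(len(sequences))
```

Because every block of six contains each condition exactly twice, the arms stay balanced within each stratum after every completed block, which is what yields the 1:1:1 allocation ratio.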
The sample size calculation was based on the primary outcome measure (depressive symptom severity) and indicated that 40 participants were needed in each of the three study groups (Bastiaansen et al., 2018, pp. 7–8). In the previous RCT (Kramer et al., 2014), the Inventory of Depressive Symptomatology Self Report (IDS-SR) scores for the experimental group showed an initial 3-point drop 8 weeks after baseline (Cohen's f = 0.125). With a sample size of 40 per group, an alpha of 0.05, an intraclass correlation of 0.6 (between pre- and post-intervention measurements), and six measurements, the power to detect a group × time interaction of f = 0.125 was 97% (G*Power 3.1: Faul, Erdfelder, Lang, & Buchner, 2007). We considered smaller effects irrelevant and hence stopped recruitment when each study group included 40 ‘completers’, that is, participants who completed the baseline, EMI (in case of the treatment groups), and post-EMA assessment. Data acquisition took place between May 2016 (first study intake) and March 2019 (last follow-up).
Intervention
Both EMA intervention modules comprised 28 consecutive days of systematic self-monitoring in combination with 4 weekly digital feedback reports and one face-to-face session to discuss the fourth and final feedback report with a research assistant. Eligible patients were enrolled in the study as soon as possible after clinical intake. Circumstances were otherwise kept as ‘natural’ as possible; regular treatment was not adapted and started upon availability, regardless of the study phase.
Self-assessments
Participants filled out brief questionnaires via a web application (RoQua, https://www.roqua.nl) on their smartphones. The measurements were set at five fixed moments during waking hours with an interval of 3 hours, programmed to optimally fit a participant's daily routine (Footnote 2). Each measurement comprised a momentary part, a module-specific retrospective part (past 3 hours) and a module-specific prospective (next 3 hours) part (for the full item list see Bastiaansen et al., 2018).
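As a rough illustration of this sampling scheme, the sketch below derives the five daily measurement times from a participant-specific first prompt. The function name and the 09:00 start time are hypothetical; the five prompts, the 3-hour interval, and the 28-day duration come from the text.

```python
from datetime import datetime, timedelta

def daily_schedule(first_prompt, n_prompts=5, interval_hours=3):
    """Return the fixed prompt times (HH:MM) for one day, starting at `first_prompt`."""
    start = datetime.strptime(first_prompt, "%H:%M")
    return [(start + timedelta(hours=interval_hours * k)).strftime("%H:%M")
            for k in range(n_prompts)]

# Hypothetical participant whose first prompt is set at 09:00.
schedule = daily_schedule("09:00")
total_prompts = len(schedule) * 28  # 28 consecutive days of self-monitoring

print(schedule)
print(total_prompts)
```

Under these settings each participant is scheduled for 140 measurements over the intervention period, with the last daily prompt 12 hours after the first.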
In both intervention modules, each measurement started with questions on momentary well-being, momentary affect (6 PA and 6 NA items), and momentary physical state, and ended with the question how much the participants were bothered by the measurement. In between, participants in the Do-module retrospectively recorded experienced pleasure, motivation, physical activity, busyness, time spent at home, in pleasant social contexts, and outdoors, and performed activities; and prospectively recorded anticipatory pleasure and motivation. Items deliberately focused (Footnote 3) on positive contexts and activities to help participants monitor changes in their behavioral patterns, and ultimately increase their activity level, especially in pleasurable activities. Participants in the Think-module retrospectively recorded how much they focused on feelings, the amount of brooding, the occurrences of specific negative and positive events, and the presence of both negative and positive thoughts; and prospectively recorded worrying. Items were chosen to increase personal insights into daily events and participants' reactions to them with the ultimate goal of reducing NA.
In both modules, the morning measurement additionally included a question about sleep. Furthermore, the evening measurement also included four questions on how participants experienced the past day (retrospective well-being, coping, motivation, and mindfulness). Questions were mainly rated on visual analogue scales [usually ranging from not at all (0) to very much (100)], and in some cases on dichotomous scales (e.g. for activities and minor events).
Feedback reports
Standardized feedback reports (Online Supplementary Appendix A and https://osf.io/m6hvg/) were generated based on individual data and emailed to the participant after each week of EMA measurements with each successive report containing richer information. In line with behavioral activation, the Do-module reports comprised various graphs showing PA and activity patterns. In line with cognitive therapy, graphs in the Think-module focused on events, thinking patterns, and NA over time. The fourth report additionally included feedback on temporal relationships between sets of variables [e.g. PA and physical activity (Do-module; Footnote 4), or NA and rumination (Think-module)], but only for participants who filled out more than 75% of the measurements. This fourth and final feedback report was discussed with a research assistant in a face-to-face session in the week after the last EMA measurement (Online Supplementary Appendix A). Participants were encouraged to share the feedback report with their (future) therapist.
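The compliance rule for the extended fourth report can be made concrete with a small sketch. It assumes that compliance is computed against all 140 scheduled measurements (5 per day for 28 days) and that the cut-off is strict, following the "more than 75%" wording; both are assumptions, and the function names are hypothetical.

```python
SCHEDULED = 5 * 28  # five daily measurements over 28 days

def compliance_pct(completed, scheduled=SCHEDULED):
    """Percentage of scheduled measurements that were completed."""
    return 100 * completed / scheduled

def gets_temporal_feedback(completed, scheduled=SCHEDULED, threshold_pct=75):
    """True if strictly more than `threshold_pct` percent was completed.

    Integer arithmetic avoids rounding issues exactly at the cut-off.
    """
    return 100 * completed > threshold_pct * scheduled

print(gets_temporal_feedback(105))  # exactly 75%: not above the cut-off
print(gets_temporal_feedback(106))  # just above 75%
```

With these assumptions, a participant would need at least 106 completed measurements to receive feedback on temporal relationships.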
Measures
Demographic and clinical characteristics were queried at study intake (baseline). Participants completed online self-report questionnaires on depressive symptoms, psychosocial functioning, and empowerment at baseline (at the study site), in the week after the 28-day intervention period (post-EMA) and at four follow-up measurements 1, 2, 3, and 6 months post-EMA (at home). Participants in the treatment groups additionally completed an evaluation questionnaire (on site) at the post-EMA measurement. For the full list of questionnaires see Bastiaansen et al. (2018).
Depressive symptom severity
Change in depressive symptom severity was measured by the total score on the 30-item IDS-SR (Rush, Gullion, Basco, Jarrett, & Trivedi, 1996) across the six time points. The IDS-SR includes all Diagnostic and Statistical Manual of Mental Disorders, fourth edition diagnostic criterion items for major depressive disorder, as well as commonly associated symptoms such as irritability. Each symptom item is scored on a scale from 0 to 3, with higher scores denoting greater symptom severity. The IDS-SR has good psychometric properties with high concurrent and internal validity and is sensitive to treatment change (Rush et al., 2003). In our study, Cronbach's alpha for the 30 items was 0.84 at baseline.
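For readers unfamiliar with the internal consistency statistic reported here, the sketch below computes Cronbach's alpha from an item-by-respondent score matrix using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The toy data are invented for illustration and are not study data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, a list of respondents' item-score lists."""
    k = len(rows[0])  # number of items
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Invented toy responses on a 0-3 scale (4 respondents, 3 items).
consistent = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
noisy = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3]]

print(cronbach_alpha(consistent))  # items move together perfectly
print(cronbach_alpha(noisy))       # lower, because items disagree somewhat
```

Alpha approaches 1 when items covary strongly relative to their individual variances, which is why it is read as an index of how consistently a set of items measures one construct.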
Social functioning
The Outcome Questionnaire-45 (OQ-45) is a 45-item self-report scale that measures subjective discomfort (SD), disturbance in interpersonal relations (IR) with partners, family and friends, and functioning in social roles (SR) such as work and school (Lambert et al., 1996). Each item is scored on a 5-point scale from never (0) to almost always (4). We administered – at each of the six time points – the Dutch version of the OQ, whose psychometric properties are adequate and similar to the original instrument (de Jong et al., 2007). In our analyses, we used the 11-item IR subscale (Cronbach's α = 0.69 at baseline) and the 9-item SR subscale (Cronbach's α = 0.65). Higher values on the IR and SR subscales indicate more disturbances in IR (range: 0–44) and SR functioning (range: 0–36), respectively.
Empowerment
The Netherlands Empowerment List (NEL; Boevink, Kroon, & Giesen, 2008) is a 40-item self-rating scale to assess patient empowerment, developed by the Dutch Trimbos Institute in collaboration with patients (for an English translation see van den Berg, van Amstel, Ottevanger, Gielissen, & Prins, 2013). Previous research has shown construct validity is satisfactory and internal consistency is high (Boevink et al., 2008). The NEL incorporates six dimensions: professional help, social support, own wisdom, sense of belonging, self-management, and community inclusion. Items are formulated in positive statements of strengths as perceived by the individual and are rated on 5-point scales ranging from 1 (‘strongly disagree’) to 5 (‘strongly agree’). To prevent confounding by treatment status, we had to adjust our original plan by excluding the professional help subscale from the total empowerment score (36 items, range: 36–180, Cronbach's α = 0.87 at baseline).
Start to treatment
We extracted information from the electronic patient records to determine care use and the time (in days) between study intake and the first psychotherapy session.
Statistical analysis
To reduce experimenter bias, analyses and data handling procedures were preregistered (https://osf.io/6kwre). The results of all preregistered analyses and (the rationale for) any deviations from the original preregistration are described in this article and related materials. The analysis code is openly available online (https://osf.io/m6hvg/).
The data had a hierarchical structure, with multiple assessments of the IDS-SR, OQ-45 and NEL being clustered within participants. We used R (R Core Team, 2018) and the lme4 (Bates, Mächler, Bolker, & Walker, 2015) and lmerTest (Kuznetsova, Brockhoff, & Christensen, 2017) packages to perform a multilevel regression analysis for each of the four outcome measures (IDS-SR, IR, SR, and NEL). Models included time (in months, not weeks as noted in the preregistration), group, and the two-way interaction between time and group as fixed effects; quadratic trends (time² and group × time²) were added to the model if they improved model fit. Models additionally included a random intercept and a random slope for time, which effectively allowed participants to vary in their experienced symptoms at baseline and in trajectories of change over time. We used full information maximum likelihood estimation, which can deal with data missing at random relatively well.
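Written out, the model with quadratic trends might take the following form for outcome y of participant i at measurement t. This is a sketch under assumptions not stated in the text: group is dummy-coded with the control group as reference, and the symbols (β, u, ε, Σ) are introduced here for illustration only.

```latex
\begin{aligned}
y_{ti} ={} & \beta_0 + \beta_1\,\mathrm{time}_{ti} + \beta_2\,\mathrm{time}_{ti}^2
           + \beta_3\,\mathrm{Do}_i + \beta_4\,\mathrm{Think}_i \\
         & + \beta_5\,(\mathrm{Do}_i \times \mathrm{time}_{ti})
           + \beta_6\,(\mathrm{Think}_i \times \mathrm{time}_{ti}) \\
         & + \beta_7\,(\mathrm{Do}_i \times \mathrm{time}_{ti}^2)
           + \beta_8\,(\mathrm{Think}_i \times \mathrm{time}_{ti}^2) \\
         & + u_{0i} + u_{1i}\,\mathrm{time}_{ti} + \varepsilon_{ti},
\qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}),
\quad
\varepsilon_{ti} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

In this notation, the interaction coefficients β5 to β8 carry the test of efficacy: they express how the linear and quadratic change over time in each EMI group deviates from that in the control group. For outcomes where quadratic terms did not improve fit, the terms involving time² drop out.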
Our main analyses were based on the intention-to-treat principle: participants were compared within the groups to which they were initially randomized, independently of having received the allocated treatment, having dropped out of the study or having violated the initial protocol (for whatever reason). That is, participants were included in the main analyses regardless of the number of completed self-assessments, the number of feedback reports that were read, and whether the post-EMA feedback session was attended. In addition, we examined the efficacy of the add-on tool exclusively among participants who did not drop out of the intervention by means of a per-protocol analysis.
Results
Sample characteristics
Participant flow throughout the study is shown in Fig. 1. Approximately half of the eligible participants did not participate in the study; they either directly declined at clinical intake or were ‘lost before study start’ (i.e. they initially indicated interest but could not be reached by the research team or eventually did not attend or finish the study intake). The target sample size (n = 120 completers) was reached amply: 130 of the 161 patients randomized to one of the three study arms were completers. Ten controls did not complete the post-EMA measurement. One partial completer finished the 28-day intervention period but did not attend the post-EMA feedback session. Twenty participants dropped out of the intervention due to practical or time constraints (Do: 4, Think: 6), negative effects from completing the measurements (Do: 0, Think: 4), or for unknown reasons (Do: 3, Think: 3). There were no statistically significant differences (all p > 0.31) between participants who fully completed the intervention period and those who did not in baseline depressive symptoms, IR and SR functioning, and empowerment (Online Supplementary Appendix B). The three groups were very comparable in socio-demographic and clinical characteristics at baseline (Table 1). Regarding TAU, almost all participants (97%, n = 155) received a form of psychotherapy at one point during the study period. Participants mostly received (group or individual) cognitive behavioral therapy (n = 108) in combination with a wide array of other treatments. Sixty-seven participants (42%) used antidepressant medication at study start and 21 (13%) started, stopped or changed their medication during the intervention period.
Note. Numbers represent percentages or mean ± standard deviation. IDS-SR = Inventory of Depressive Symptomatology Self Report, OQ-45 = Outcome Questionnaire-45, NEL = Netherlands Empowerment List.
a Educational level – low: no/primary/low secondary, middle: high school/low vocational, high: higher vocational/university.
b Date of the first psychotherapy session at one of the study locations (as recorded in patients' files) minus the date of the study intake. We excluded the additionally preregistered self-report question on the start of psychotherapy, which proved difficult to answer (e.g. due to confusion with other appointments) and yielded many discrepancies with the patient files.
c Participants indicated whether they used antidepressants in the 8 weeks prior to study intake and whether the usage was stable (maintenance) or changed. In case of change, participants subsequently indicated which changes occurred (multiple responses possible).
d Change in antidepressant use between study intake and post-EMA assessment.
e Four participants (Do: 2, Think: 2, Control: 0) scored below the IDS-SR criterion for remission (i.e. 14: Meesters, Duijzer, Nolen, Schoevers, & Ruhé, 2016).
f The total empowerment score has been calculated excluding the professional help subscale (range: 36–180), which was not applicable at baseline for 24 participants in the Do-module, 25 participants in the Think-module, and 23 controls.
Treatment adherence
Response compliance was high for the self-assessments (Table 2): after removal of dropouts, the average percentage of completed measurements was approximately 76% in both intervention modules. For half of the participants, response compliance was higher than 75%; they received additional feedback on temporal relationships in their final feedback report. The majority of participants read all weekly feedback reports and 92% intended to discuss the feedback with their therapist. However, at the 2-month follow-up (FU2), only 57% reported having actually done so.
Note. Numbers represent percentage, mean ± standard deviation, or median [25th;75th percentile]. FU2 = the follow-up assessment 2 months after the post-EMA assessment.
a The sample size differs per variable due to missing data or different measurement levels (i.e. diary duration is based on the number of valid measurements rather than the number of participants).
b Reasons why participants did not discuss the feedback report with their therapist were: treatment did not start yet (n = 8), report not useful (n = 5), did not know how to discuss it (n = 4), did not finish the intervention (n = 2), or other (n = 7, e.g. forgotten, never came up, did not get around to it yet).
Treatment evaluation
Detailed information on the patient-perceived feasibility and usability of the intervention can be found in Online Supplementary Appendix C. In brief, participants were positive about the usability of the web application. The feedback reports were rated fairly positively on comprehensibility and usefulness (≈60–70 out of 100) and contained the right amount of information according to the majority (86%). The face-to-face feedback session with a research assistant was perceived as very useful (≈80 out of 100). Of the 90 (out of 110) participants who completed the intervention, 86% would recommend it to others.
Treatment outcomes
All mixed model assumptions were satisfied (Online Supplementary Appendices D-G). Figure 2 displays the results of the multilevel regression analyses of the interaction between time and group for each of the four outcome measures.
Contrary to our primary hypothesis, neither of the intervention groups showed a significantly greater or faster decline over time in depression severity compared to the control group (Fig. 2a, Do v. Control: B linear = 0.1, t(538) = 0.1, p = 0.94, and B quadratic = 0.0, t(470) = 0.3, p = 0.79; Think v. Control: B linear = 0.7, t(540) = 0.8, p = 0.44, and B quadratic = −0.1, t(470) = −0.5, p = 0.59). The Do-module and Think-module did not differ significantly from each other either (B linear = −0.7, t(539) = −0.7, p = 0.48, and B quadratic = 0.1, t(471) = 0.8, p = 0.50). Results were very similar for the per-protocol analysis (all interaction terms p > 0.29), which only included participants who completed the intervention (Do-module: n = 48/55, Think-module: n = 42/55). Overall, depression severity showed an early decline, which levelled off after FU1 (i.e. a combination of linear and quadratic trends). The average decline in depression severity from baseline to FU4 was 11 points on the IDS-SR with large between-person differences (s.d. = 14). Full analysis details, including the random effects (i.e. variation in individual effects), are presented in Online Supplementary Appendix D.
Full analysis details for disturbances in IR and SR functioning can be found in Online Supplementary Appendices E and F, respectively. All groups showed a (modest) linear decrease in disturbances in IR (Fig. 2b) and in SR functioning (Fig. 2c) over time, and did not differ significantly from one another in the decline over time (IR scale, Do v. Control: B linear = 0.2, t(85) = 0.9, p = 0.36; Think v. Control: B linear = −0.1, t(84) = −0.3, p = 0.73; Do v. Think: B linear = 0.2, t(86) = 1.3, p = 0.20; all per-protocol p > 0.17; SR scale, Do v. Control: B linear = −0.5, t(533) = −1.2, p = 0.25, and B quadratic = 0.1, t(476) = 1.5, p = 0.14; Think v. Control: B linear = −0.3, t(535) = −0.8, p = 0.41, and B quadratic = 0.0, t(477) = 0.8, p = 0.42; Do v. Think: B linear = −0.1, t(531) = −0.3, p = 0.74, and B quadratic = 0.0, t(477) = 0.7, p = 0.51; all per-protocol p > 0.13).
All groups showed an overall linear increase in empowerment over time, and did not differ significantly from one another (Fig. 2d, Do v. Control: B linear = −0.4, t(106) = −0.7, p = 0.46; Think v. Control: B linear = −0.1, t(105) = −0.2, p = 0.83; Do v. Think: B linear = −0.3, t(106) = −0.5, p = 0.60; all per-protocol p > 0.56). Results were very similar if we used the empowerment measure – as was done in a previous study (Simons et al., 2015) – with ‘imputed’ scores for the professional help scale (although the Akaike information criterion favored a model with additional quadratic terms). Details for all empowerment analyses can be found in Online Supplementary Appendix G.
Post-hoc analyses
Detailed information on post hoc analyses can be found in Online Supplementary Appendix H. First, we explored depressive symptom trajectories in subgroups that were constructed based on compliance, and observed a more favorable course in the highly compliant group (⩾75%) compared to the less compliant group. Given the intermediate trajectory of the control group (who did not complete any repeated self-assessments), compliance is more likely to be a marker for a favorable course than its cause. Second, we explored whether the EMI might have added value in the absence of TAU: we observed that although participants who engaged in the EMI while waiting for psychotherapy showed early symptom declines, similar declines were seen in controls waiting for psychotherapy. Formal testing was impossible because subgroups were small and the waitlist condition was not ‘clean’: before the start of psychotherapy patients turned out to often have had other appointments (e.g. diagnostic testing, medication consults). Third, we merged the figure comprising depressive outcome data of the experimental group and control group of the previous RCT (Kramer et al., 2014) with our outcome data. Upon reviewer request, we also reran our intention-to-treat analyses including covariates (educational level, type of TAU, etc.) to study the impact on efficacy estimates. Estimates were essentially the same as those of the original model, with none of the interaction terms reaching significance. This indicates that the original comparison of treatment groups was reasonable. Furthermore, we added separate comparisons between baseline and each of the five follow-up measurements (post-EMA, FU1-4), in which each measurement (e.g. FU1) was regressed onto the baseline measure with the group as an added predictor. Group differences were small for each of the individual comparisons (for all outcome measures); none of the group comparisons reached an uncorrected significance level of 0.05, let alone after Bonferroni adjustment (p < 0.01).
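The adjusted threshold follows from dividing the nominal significance level by the number of comparisons. Assuming the correction was applied across the five baseline-to-follow-up comparisons (post-EMA and FU1-4), which the text implies but does not state explicitly, the arithmetic is:

```latex
\alpha_{\text{Bonferroni}} = \frac{\alpha}{m} = \frac{0.05}{5} = 0.01
```

Here m denotes the number of comparisons per outcome measure; the symbol is introduced for illustration.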
Discussion
Overall, the study participants showed significant improvements over time in depressive symptoms, social functioning, and empowerment. However, the EMI groups did not show more or faster changes over time than the control group. Furthermore, the trajectories of the two EMI modules were very similar. Hence, we did not find statistical evidence that EMIs based on self-monitoring and person-specific feedback could augment the efficacy of regular depression treatment.
Our results seem to stand in sharp contrast with the only previous RCT (Kramer et al., Reference Kramer, Simons, Hartmann, Menne-Lothmann, Viechtbauer, Peeters and Wichers2014: REMOD-ID), which concluded that an EMI could be an effective therapeutic tool for depression. However, differential outcomes between these studies appear to relate less to the response in the experimental groups – the symptom decline in these groups was rather comparable across studies (Online Supplementary Appendix H) – than to the symptom trajectories of the control groups: a flat line for depressed patients receiving pharmacotherapy only (REMOD-ID) and a decline for those receiving TAU including psychotherapy (ZELF-i). In neither study did the EMI groups seem to outperform the latter control group.
More specifically, the EMI group from the REMOD-ID study that did not receive psychotherapy showed changes comparable to our control group, which received no EMI but psychotherapy. Together, these studies suggest that EMIs activate mechanisms similar to those of psychotherapy. This is quite conceivable given that the EMIs in both studies were based on common behavioral or cognitive strategies that have been found effective in the treatment of depression (Cuijpers, Karyotaki, de Wit, & Ebert, Reference Cuijpers, Karyotaki, de Wit and Ebert2019). Although cognitive and behavioral therapies are more comprehensive, EMIs might be seen as a sophisticated extension of the self-assessment homework that psychotherapy usually already entails: paper-and-pencil activity monitoring and ABC (antecedent-beliefs-consequences) worksheets. Whereas targeted depression mechanisms may be similar, the EMI approach specifically aims to mobilize patients as active agents in their recovery process (Myin-Germeys et al., Reference Myin-Germeys, Kasanova, Vaessen, Vachon, Kirtley, Viechtbauer and Reininghaus2018; van Os et al., Reference van Os, Delespaul, Wigman, Myin-Germeys and Wichers2013a, Reference Van Os, Delespaul, Wigman, Myin-Germeys and Wichersb), and hence might be expected to enhance empowerment. However, we did not find evidence for an EMI-specific empowerment increase. In the REMOD-ID study, empowerment increases in the experimental group were not significantly different from the control group either (p = 0.061; Simons et al., Reference Simons, Hartmann, Kramer, Menne-Lothmann, Höhn, van Bemmel and Wichers2015).
If the activated mechanisms of EMI and psychotherapy are similar, EMIs might primarily be effective when interventions involving face-to-face sessions are not – yet – available. The REMOD-ID findings are not conclusive in this respect, despite the comparison with passive pharmacotherapy, because its EMI involved repeated face-to-face feedback sessions itself. Our study could not provide a definitive answer on whether the EMI had added value in the absence of TAU because subgroups that had to wait for psychotherapy often did have intermediate consultations. Although additional studies in groups with no actual access to psychotherapy could be useful in this matter, we believe a more fruitful endeavor would be to examine whether EMIs could partly substitute standard psychological treatment without sacrificing efficacy. The only non-inferiority trial in depressed patients to date tentatively suggests that a blended treatment – including four face-to-face sessions and a self-monitoring smartphone application – could possibly treat nearly twice as many depressed patients as a full behavioral activation treatment, with comparable results (Ly et al., Reference Ly, Topooco, Cederlund, Wallin, Bergström, Molander and Andersson2015). Thus, perhaps the promise of EMI might not be increased efficacy, but a more efficient way of delivering care, which could reach many more patients in need.
Although we did not find evidence for an EMI effect on depressive symptoms, empowerment, or social functioning, we cannot exclude the possibility that our EMI modules had an impact on other domains. In fact, the high percentage of study participants who reported that they would recommend the intervention seems to suggest that they did experience some utility from it (although we should note that such a subjective measure comes with a risk of overestimation due to, for instance, agreement bias: Chang, Gillespie, & Shaverdian, Reference Chang, Gillespie and Shaverdian2019). EMIs might, for instance, help patients acquire better self-insight. A recent study showed that the experimental groups of the REMOD-ID study improved in emotion differentiation (Widdershoven et al., Reference Widdershoven, Wichers, Kuppens, Hartmann, Menne-Lothmann, Simons and Bastiaansen2019). Furthermore, an intervention study in adolescents with emotional problems, which included a self-report measure of emotional self-awareness, found that self-monitoring had a direct effect on emotional self-awareness (Kauer et al., Reference Kauer, Reid, Crooke, Khor, Hearps, Jorm and Patton2012: mobiletype). Moreover, changes in depressive symptoms were mediated by increases in emotional self-awareness in the intervention group. Thus, although an EMI may not lead to better mood per se, it might advance self-insight, which could be a first step on the road to recovery. Future research needs to evaluate whether we are studying the right outcome domains. Qualitative work on the impact of EMIs from patients' perspectives might be a good starting point.
Strong suits of this study are the rigorous study design (RCT) comprising two highly comparable intervention modules, the inclusion of clinical as well as functional outcome measures across multiple time points, the good treatment adherence, and the naturalistic setting, which allows the generalization of results to regular clinical practice. A drawback of the naturalistic setting is the resulting heterogeneity of the TAU condition (in both timing and content) that may have added noise and leaves open the possibility that the EMI is effective under certain conditions. In addition, given that therapist guidance might bolster the effectiveness of smartphone interventions (Linardon, Cuijpers, Carlbring, Messer, & Fuller-Tyszkiewicz, Reference Linardon, Cuijpers, Carlbring, Messer and Fuller-Tyszkiewicz2019), the EMI might have had a stronger effect if it had been more integrated with psychotherapy: about half of the patients who received the EMI indicated they did not discuss their feedback reports with their therapist. Furthermore, our final sample comprised only patients who were willing and able to participate in a research study in addition to starting their regular treatment. It is unlikely that this inevitable selection influenced the main outcomes of our trial. If anything, one would expect an overestimation – not absence – of a treatment effect. Finally, it is important to highlight that this study examined only one type of EMI; other clinical applications are conceivable and might be more effective.
To conclude, we did not find statistical evidence that the EMI impacted clinical or functional outcomes beyond the effects of TAU, regardless of module content. This does not rule out that EMIs could have a positive effect on other domains or provide a more efficient way of delivering care. However, EMIs' promise of effectiveness has not materialized yet.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291720004845.
Acknowledgements
The study was conceived within the framework of a partnership between the University Medical Center Groningen and Friesland Mental Health Care Services. The authors gratefully acknowledge support from these organizations and two other Dutch mental health care providers: Lentis/PsyQ, and Synaeda Psycho Medisch Centrum. The authors thank Wibke Franzen, Wendy Folkersma, Jan-Reindert Voogdt, Simone Beem, Vera Veerman, Marieke Wichers, Peter Groot, and patients in care at Friesland Mental Health Care Services for contributions to the development of the intervention, Ando Emerencia and Elske Bos for implementing automated time-series analysis procedures, and Henk van der Veen and Rivka de Vries for the development of the automatized feedback reports. Moreover, the authors thank Renee Stelwagen, Esther Alberts and all other research assistants for their support throughout the data collection. This work was supported by grants to JAB from the Gratama foundation (2015–05) and the charitable foundation Stichting tot Steun VCVGZ (239) in collaboration with the Dutch Depression Foundation. Smartphones were kindly provided by the iLab of the department of Psychiatry of the UMCG (http://ilab-psychiatry.nl/en_US/). The funding bodies had no role in the design of the study, data collection, analysis, or interpretation of data, nor in writing the manuscript.
Conflict of interest
None.
Ethical standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.
Author note
Maaike Meurs is now at NIVEL, Netherlands Institute for Health Services Research, Utrecht, The Netherlands.