
Predicting the naturalistic course in anxiety disorders using clinical and biological markers: a machine learning approach

Published online by Cambridge University Press:  11 June 2020

Wicher A. Bokma*
Affiliation:
Department of Psychiatry, Amsterdam UMC, Vrije Universiteit, Amsterdam Public Health research institute, The Netherlands GGZ inGeest Specialized Mental Health Care, Amsterdam, The Netherlands
Paul Zhutovsky
Affiliation:
Department of Psychiatry, Amsterdam UMC, Location AMC, University of Amsterdam, Amsterdam Neuroscience, Amsterdam, The Netherlands
Erik J. Giltay
Affiliation:
Department of Psychiatry, Leiden University Medical Center (LUMC), Leiden, The Netherlands
Robert A. Schoevers
Affiliation:
Department of Psychiatry, University Medical Center Groningen, Groningen, The Netherlands
Brenda W.J.H. Penninx
Affiliation:
Department of Psychiatry, Amsterdam UMC, Vrije Universiteit, Amsterdam Public Health research institute, The Netherlands GGZ inGeest Specialized Mental Health Care, Amsterdam, The Netherlands
Anton L.J.M. van Balkom
Affiliation:
Department of Psychiatry, Amsterdam UMC, Vrije Universiteit, Amsterdam Public Health research institute, The Netherlands GGZ inGeest Specialized Mental Health Care, Amsterdam, The Netherlands
Neeltje M. Batelaan
Affiliation:
Department of Psychiatry, Amsterdam UMC, Vrije Universiteit, Amsterdam Public Health research institute, The Netherlands GGZ inGeest Specialized Mental Health Care, Amsterdam, The Netherlands
Guido A. van Wingen
Affiliation:
Department of Psychiatry, Amsterdam UMC, Location AMC, University of Amsterdam, Amsterdam Neuroscience, Amsterdam, The Netherlands
*Author for correspondence: Wicher A. Bokma, E-mail: [email protected]

Abstract

Background

Disease trajectories of patients with anxiety disorders are highly diverse and approximately 60% remain chronically ill. The ability to predict disease course in individual patients would enable personalized management of these patients. This study aimed to predict recovery from anxiety disorders within 2 years applying a machine learning approach.

Methods

In total, 887 patients with anxiety disorders (panic disorder, generalized anxiety disorder, agoraphobia, or social phobia) were selected from a naturalistic cohort study. A wide array of baseline predictors (N = 569) from five domains (clinical, psychological, sociodemographic, biological, lifestyle) were used to predict recovery from anxiety disorders and recovery from all common mental disorders (CMDs: anxiety disorders, major depressive disorder, dysthymia, or alcohol dependency) at 2-year follow-up using random forest classifiers (RFCs).

Results

At follow-up, 484 patients (54.6%) had recovered from anxiety disorders. RFCs achieved a cross-validated area under the receiver operating characteristic curve (AUC) of 0.67 when using the combination of all predictor domains (sensitivity: 62.0%, specificity: 62.8%) for predicting recovery from anxiety disorders. Classification of recovery from CMDs yielded an AUC of 0.70 (sensitivity: 64.6%, specificity: 62.3%) when using all domains. In both cases, the clinical domain alone provided comparable performance. Feature analysis showed that prediction of recovery from anxiety disorders was primarily driven by anxiety features, whereas recovery from CMDs was primarily driven by depression features.

Conclusions

The current study showed moderate performance in predicting recovery from anxiety disorders over a 2-year follow-up for individual patients and indicates that anxiety features are most indicative of anxiety improvement and depression features of improvement in general.

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2020. Published by Cambridge University Press

Introduction

Anxiety disorders are characterized by highly heterogeneous clinical course trajectories. After 2 years, the prognosis varies across disorders, with remission rates of 72.5% for panic disorder without agoraphobia, 69.7% for generalized anxiety disorder, 53.5% for social phobia and 52.7% for panic disorder with agoraphobia (Hendriks, Spijker, Licht, Beekman, & Penninx, 2013). Remitted patients experience a relatively benign course with moderate remaining symptom severity, disability and a low subjective need for care (Batelaan, Rhebergen, Spinhoven, van Balkom, & Penninx, 2014; Spinhoven et al., 2016; van Beljouw, Verhaak, Cuijpers, van Marwijk, & Penninx, 2010). However, around 60% of patients have persistent symptoms, relapses, or chronic disease up to 6 years after the diagnosis (Batelaan et al., 2014; Spinhoven et al., 2016). Disease course in these patients is often characterized by substantial levels of disability. Predicting long-term disease course can be seen as an important step towards personalized medicine (Steyerberg, 2009). This would make targeted treatment efforts viable, in which treatments are tailored towards the individual risk for a poor disease outcome (McGorry, Ratheesh, & O'Donoghue, 2018). However, in anxiety disorders, there is a lack of robust course predictors. For instance, different DSM anxiety disorder diagnoses were shown to be poorly predictive of subsequent course (Batelaan et al., 2014).
In current clinical practice, in the absence of valid risk prediction models, course prediction relies solely on clinicians' opinions, which show poor accuracy (Randall, Sareen, Chateau, & Bolton, 2019).

Several clinical, psychological, biological, sociodemographic and lifestyle markers are related to the disease course. For instance, higher baseline severity of anxiety symptoms, presence of somatic or psychiatric comorbidity, and higher levels of disability are linked to worse outcomes at 1-year (van Beljouw et al., 2010), 2-year (Batelaan et al., 2014; Hendriks et al., 2013; Scholten et al., 2013), 6-year (Spinhoven et al., 2016), and 12-year follow-up (Bruce et al., 2005). In contrast, some authors suggest the same factors lead to better initial treatment results (Baldwin & Tiwari, 2009; Rodriguez et al., 2006). Also, a chronic duration of anxiety was linked to worse outcomes in most studies (Batelaan et al., 2014; Hendriks et al., 2013; Scholten et al., 2013; Spinhoven et al., 2016), while not showing any effect on disease course in another study (Nay, Brown, & Roberson-Nay, 2013).
Most studies showed that a younger age at onset was associated with a chronic course (Batelaan et al., 2014; Beesdo-Baum et al., 2012; Rodriguez et al., 2006), while others showed no such age effect (Nay et al., 2013; Scholten et al., 2013). Inconsistent findings are likely due to methodological differences between studies. Other factors possibly related to worse disease course were duration of untreated illness (Baldwin & Tiwari, 2009), the use of anti-anxiety medication (Bruce et al., 2005; Scholten et al., 2013), and presence of childhood trauma (Asselmann & Beesdo-Baum, 2015; Batelaan et al., 2014; Scholten et al., 2013).
Psychological factors that negatively impact anxiety disorder disease course up to 6-year follow-up included high neuroticism (Asselmann & Beesdo-Baum, 2015; Scholten et al., 2013; Spinhoven et al., 2016), low extraversion (Spinhoven et al., 2016), high anxiety sensitivity (Asselmann & Beesdo-Baum, 2015; Scholten et al., 2013), high levels of worrying (Spinhoven et al., 2016), and low mastery (Asselmann & Beesdo-Baum, 2015; Scholten et al., 2013). Only a few studies linked biological parameters to disease course in anxiety disorders: C-reactive protein (CRP) levels were longitudinally associated with anxiety symptoms (Copeland, Shanahan, Worthman, Angold, & Costello, 2012), increasing cortisol levels were linked to higher 6-month anxiety severity in girls (Schiefelbein & Susman, 2006), and lower Brain-Derived Neurotrophic Factor (BDNF) levels were found in patients with a poor response to treatment (Kobayashi et al., 2005).
However, most research into biological parameters for anxiety disorders was done cross-sectionally, showing that anxiety disorder status is linked to higher CRP levels (Copeland et al., 2012; Pitsavos et al., 2006; Vogelzangs, Beekman, De Jonge, & Penninx, 2013), higher metabolic syndrome markers (Carroll et al., 2009; Kahl et al., 2015; Perez-Cornago, Ramírez, Zulet, & Martinez, 2014), higher tumour necrosis factor-α (TNF-α) levels (Hoge et al., 2009; Pitsavos et al., 2006), and lower BDNF levels (Molendijk et al., 2012). Findings were inconsistent for cortisol and interleukin-6 (IL-6): anxiety symptoms were linked to both higher (Zoccola, Dickerson, & Yim, 2011) and lower (O'Donovan et al., 2010) cortisol, as well as higher (Hoge et al., 2009; O'Donovan et al., 2010; Pitsavos et al., 2006) and lower (Vogelzangs et al., 2013) IL-6 measurements.
Finally, sociodemographic and lifestyle factors such as education years (van Beljouw et al., 2010), age (Asselmann & Beesdo-Baum, 2015; Catarino et al., 2018), partner status (Asselmann & Beesdo-Baum, 2015; Batelaan et al., 2014), social support (van Beljouw et al., 2010), smoking status (Bruce et al., 2005), nicotine dependency (Nay et al., 2013), current financial problems (Nay et al., 2013), employment status (van Beljouw et al., 2010), and income (van Beljouw et al., 2010) were associated with anxiety disorder disease course. In spite of these many variables that predict disease course at the group level, it is not known whether this translates to accurate predictions for individual patients. Currently, no encompassing model exists with sufficient sensitivity and specificity in disease course prediction to be feasible for use at the level of the individual patient.

A possible explanation for the lack of accuracy in course prediction in anxiety disorders is the complex, multicausal aetiology of anxiety disorders. Univariable and multivariable analyses of predictors of disease course showed low levels of explained variance (Bokma, Batelaan, Hoogendoorn, Penninx, & van Balkom, 2020). Furthermore, inference is typically done at the group level, which does not allow generalizable statements about the single individual. Multivariable machine learning (ML) methods provide a possible solution to this problem, as they are well suited to problems with high numbers of predictors in complex, multicausal disorders (Iniesta, Stahl, & McGuffin, 2016). ML may have great potential in psychiatry for the prediction of disease course trajectories (Hahn, Nierenberg, & Whitfield-Gabrieli, 2017). Prediction of the disease course can be regarded as a 'classification' problem, which can be solved using supervised algorithms (Deo, 2015). These algorithms are trained on patients with known predictor and outcome variables to derive a function that can be applied to unseen patients to predict their outcome based on the values of their predictor variables. In anxiety disorders, supervised algorithms were applied a few times cross-sectionally, to relate predictors from various domains to current disease status (Woo, Chang, Lindquist, & Wager, 2017) or to predict short-term treatment effects (Lueken & Hahn, 2016). To the best of our knowledge, however, no studies applied supervised ML algorithms to predict the disease course in anxiety disorders.

The aim of this study was to predict long-term anxiety disorder course, using an ML approach applied to clinical, psychological, biological, sociodemographic and lifestyle baseline data. Specifically, we investigated the utility of a random forest classifier (RFC) (Breiman, 2001) to predict clinical course in patients with any baseline anxiety disorder. Our main outcome was recovery from anxiety disorders at 2-year follow-up. As a secondary outcome, recovery from all common mental disorders (CMDs) at 2-year follow-up was used. CMDs include anxiety disorders, but also depressive disorders and substance use disorders, as these disorders often co-occur and show diagnostic instability over time (Hovenkamp-Hermelink et al., 2016; Lamers et al., 2011; Scholten et al., 2016; Verduijn et al., 2017), and recovery from one but not the other does not index a major improvement in health. Finally, we assessed which predictor domains contributed most to disease course predictions. We hypothesized that RFCs using a wide array of baseline data from different domains would yield adequate 2-year recovery predictions for both outcomes. Furthermore, we hypothesized that the combination of the five domains would yield the best predictions.

Methods

Study sample

The participants in this study were selected from the multi-site Netherlands Study of Depression and Anxiety (NESDA), an ongoing naturalistic cohort study into the course of depression and anxiety. The baseline sample consists of 2981 participants who were recruited from the community, primary care and specialized mental health care centres. All participants had a lifetime or current depressive disorder or anxiety disorder diagnosis (n = 2329, 78.1%) or were healthy controls (n = 652, 21.9%). NESDA allowed for the presence of comorbid psychiatric disorders, with the exception of psychotic disorders, obsessive-compulsive disorder, post-traumatic stress disorder, bipolar disorders, or severe substance use disorders. A further exclusion criterion was insufficient proficiency in the Dutch language. Baseline data collection was performed in 2004–2007 and was followed by 1-year, 2-year, 4-year, 6-year, and 9-year follow-up measurements. Full descriptions of the design of NESDA were published previously (Penninx et al., 2008). The study protocol was approved by the Ethical Review Boards of all participating institutes and written informed consent was obtained from all participants.

For the purpose of this study, patients with current (6-month) panic disorder (PD, with or without agoraphobia), generalized anxiety disorder (GAD) or social anxiety disorder (SAD) diagnoses at baseline were selected (n = 1206). In our sample, psychiatric comorbidity was allowed. The diagnosis was established according to DSM-IV criteria with the Composite International Diagnostic Interview (CIDI, version 2.1) (American Psychiatric Association, 2000; Wittchen, 1994; World Health Organization, 1998). Of these patients, 212 were excluded due to missing diagnostic information at 2-year follow-up. A further 107 patients were removed because more than 20% of their baseline predictor variables were missing. This yielded a final sample of 887 anxiety disorder patients with sufficient data available. Excluded patients showed comparable symptom severity at baseline: mean anxiety severity (Beck's Anxiety Inventory; BAI) was 20.35 ± 11.74 v. 18.30 ± 10.48 (t = 1.81, p = 0.07) and mean depression severity (Inventory of Depressive Symptomatology-Self Report; IDS-SR) was 30.71 ± 12.65 v. 29.39 ± 12.65 (t = 0.97, p = 0.33). Excluded patients were younger (mean age: 38.25 ± 12.05 v. 41.92 ± 12.20 years, t = 4.62, p < 0.001) and had fewer years of education (mean: 11.03 ± 3.15 v. 11.88 ± 3.35, t = 3.97, p < 0.001), consistent with differences across the whole NESDA sample (Lamers et al., 2012). Gender did not differ between excluded and included patients (% female in excluded sample 68.2%, in included sample 66.8%, χ2 = 0.22, p = 0.64).
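The sample flow in this paragraph can be verified with simple arithmetic. The sketch below is illustrative only: the helper `too_many_missing` and the use of `None` as a missing-value marker are assumptions made for this example and are not taken from the NESDA pipeline.

```python
# Counts reported in the paragraph above.
BASELINE_ANXIETY_PATIENTS = 1206
MISSING_FOLLOWUP_DIAGNOSIS = 212
TOO_MANY_MISSING_PREDICTORS = 107

final_sample = (
    BASELINE_ANXIETY_PATIENTS
    - MISSING_FOLLOWUP_DIAGNOSIS
    - TOO_MANY_MISSING_PREDICTORS
)


def too_many_missing(values, threshold=0.20):
    """Return True if more than `threshold` of a patient's predictors are missing.

    `None` marks a missing value here; this encoding is an assumption.
    """
    n_missing = sum(v is None for v in values)
    return n_missing / len(values) > threshold


print(final_sample)  # 887
```

A patient with exactly 20% missing predictors is kept under this reading of "more than 20%".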

Investigated classifications

Two distinct classification tasks predicting outcomes at 2-year follow-up were performed. Both were binary classification tasks predicting (1) recovery from anxiety disorders or (2) recovery from all CMDs. Anxiety disorders were defined as either PD, agoraphobia, GAD, or SAD. Recovery from anxiety disorders was deemed present if no anxiety disorder diagnoses persisted at follow-up. These diagnoses referred to all follow-up anxiety disorders, not only the index disorder(s). Anxiety disorders, dysthymia, major depressive disorder (MDD) and alcohol dependency are sometimes collectively referred to as CMDs (Ormel et al., 2013; Vollebergh et al., 2001). For the purpose of this study, we defined recovery from all CMDs as the absence of any anxiety disorder, MDD, dysthymia or alcohol dependency diagnosis at follow-up. Assessment of CMDs is relevant for three reasons: population-based studies show that depressive disorders and alcohol dependency are the most commonly occurring comorbidities in anxiety disorders (Alonso & Lépine, 2007; Judd et al., 1998; Wittchen, Kessler, Pfister, & Lieb, 2000); rates of diagnostic instability across anxiety disorders, depressive disorders and alcohol dependency are high (Gustavson et al., 2018; Hovenkamp-Hermelink et al., 2016; Scholten et al., 2016); and recovery from one but not the other does not imply a major improvement in health.
We assessed recovery from anxiety disorders as a primary outcome measure and recovery from all CMDs as a secondary outcome measure. These two outcome measures describe recovery in a narrow and a broad perspective (Verduijn et al., 2017).
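The two outcome definitions can be written as a small labelling function. This is a minimal sketch under stated assumptions: the diagnosis strings and the function name `recovery_labels` are hypothetical and only mirror the definitions given in the text.

```python
# Disorder groupings as defined in the text; the string labels are illustrative.
ANXIETY_DISORDERS = {"PD", "agoraphobia", "GAD", "SAD"}
COMMON_MENTAL_DISORDERS = ANXIETY_DISORDERS | {"MDD", "dysthymia", "alcohol dependency"}


def recovery_labels(followup_diagnoses):
    """Return (recovered_from_anxiety, recovered_from_all_cmds) for one patient.

    `followup_diagnoses` lists every diagnosis present at 2-year follow-up,
    not only the index disorder(s).
    """
    present = set(followup_diagnoses)
    recovered_anxiety = present.isdisjoint(ANXIETY_DISORDERS)
    recovered_cmd = present.isdisjoint(COMMON_MENTAL_DISORDERS)
    return recovered_anxiety, recovered_cmd


# A patient whose anxiety disorder remitted but who has MDD at follow-up
# counts as recovered on the narrow outcome only.
print(recovery_labels(["MDD"]))  # (True, False)
```

Because anxiety disorders are a subset of the CMDs, recovery on the broad outcome always implies recovery on the narrow one.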

Baseline predictor variables

At baseline, a wide array of putative predictors from five domains (clinical, psychological, sociodemographic, biological and lifestyle) were selected, yielding a total of 651 variables. In our analyses, only information at the individual item level was used. Total summary scores for questionnaires were not calculated, as these would be correlated with the individual items. The exception was the NEO Five-Factor Inventory (NEO-FFI), as its domains (e.g. neuroticism) are of specific clinical relevance. Items were excluded if more than 20% of patients were missing the corresponding item. This resulted in the inclusion of 569 predictors at baseline (see Table 1). If a variable did not apply to a patient, it was re-coded as a new category for ordinal or nominal variables or as 0 for continuous variables (all continuous variables were positive). This encoding retained the variable for classification while marking it with a value that does not occur naturally, indicating that the variable did not apply to this patient. All remaining missing values were imputed using median/mode imputation calculated on the training set (see below) to obtain a full data set. No variable had more than 10% missing values before imputation was applied. Additional information about measurement instruments, variable scoring and collection can be found in the Supplementary Methods. We investigated the predictive capability of all domains individually and the combination of all five domains.
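The two-step handling of "does not apply" values and truly missing values can be sketched as follows. This is not the authors' code: the markers `NOT_APPLICABLE` and `MISSING`, the category name `"not_applicable"`, and all function names are assumptions chosen for illustration. The key point it shows is that the fill value is estimated on the training set only and then reused for the test set.

```python
from statistics import median, mode

NOT_APPLICABLE = "n/a"  # hypothetical marker for "variable does not apply"
MISSING = None          # hypothetical marker for a truly missing value


def recode_not_applicable(column, continuous):
    """Recode 'does not apply' as 0 (continuous) or as its own category."""
    fill = 0 if continuous else "not_applicable"
    return [fill if v == NOT_APPLICABLE else v for v in column]


def fit_imputation_value(train_column, continuous):
    """Median (continuous) or mode (categorical) of observed training values."""
    observed = [v for v in train_column if v is not MISSING]
    return median(observed) if continuous else mode(observed)


def impute(column, fill_value):
    """Replace missing entries with a value learned on the training set."""
    return [fill_value if v is MISSING else v for v in column]


# Continuous example, e.g. a duration in months (values are made up).
train = recode_not_applicable([12, "n/a", None, 30], continuous=True)
test = recode_not_applicable([None, 5], continuous=True)
fill = fit_imputation_value(train, continuous=True)  # median of [12, 0, 30]
print(impute(train, fill), impute(test, fill))  # [12, 0, 12, 30] [12, 5]
```

Estimating the fill value on the training folds alone keeps information from the test patients out of model fitting.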

Table 1. Included baseline predictor variables across the five predictor domains

Machine learning algorithm

RFCs (Breiman, 2001) were used in all analyses. RFCs have been shown to perform well on many different machine learning problems (Fernández-Delgado, Cernadas, Barro, & Amorim, 2014), specifically in biomedical sciences (Olson, Cava, Mustahsan, Varik, & Moore, 2018). An RFC is built as an ensemble of many decision trees (Breiman, Friedman, Olshen, & Stone, 1984) which themselves are trained by considering random subsamples of variables and patients for each tree. Such a procedure leads to improved and robust prediction performance in comparison to individual trees (Breiman, 2001). Details on hyperparameters used in the analysis can be found in the Supplementary Methods. All analyses were implemented using the scikit-learn (version 0.20.2) (Pedregosa et al., 2011) and imbalanced-learn (version 0.4.3) (Lemaître, Nogueira, & Aridas, 2017) toolboxes in the Python programming language (version 3.7.2).
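As a rough illustration of what fitting such a classifier looks like in scikit-learn, the sketch below trains a random forest on synthetic data. The data, the number of trees and the other settings are assumptions for this example; the study's actual hyperparameters are given in its Supplementary Methods and are not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a baseline predictor matrix (not NESDA data).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, random_state=0
)

rfc = RandomForestClassifier(
    n_estimators=200,     # number of trees in the ensemble (assumed value)
    max_features="sqrt",  # scikit-learn draws a random variable subset at each split
    bootstrap=True,       # each tree is fit on a bootstrap sample of patients
    random_state=0,
)
rfc.fit(X[:240], y[:240])

# Predicted probability of the positive class for 60 held-out patients.
proba = rfc.predict_proba(X[240:])[:, 1]
print(proba.shape)  # (60,)
```

Averaging over many decorrelated trees is what gives the ensemble its robustness relative to a single decision tree.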

Evaluation

To evaluate the performance of our classifiers, 10-times-repeated 10-fold cross-validation was applied. In this procedure, the data set is repeatedly (n = 100) divided into disjoint training (90% of data) and test (10% of data) sets, and the RFC is fit only on the training data and evaluated on the independent test data. The final performance is obtained as an average across all test set evaluations. We measured performance as the area under the receiver operating characteristic curve (AUC). In addition, we calculated sensitivity, specificity, balanced accuracy (the average of sensitivity and specificity) and positive/negative predictive values. To further validate our classification performance, label-permutation tests (n = 1000) of average AUC values were performed (Ojala & Garriga, 2010). The obtained p values were Bonferroni-corrected across six tests (the five individual domains and the combination of all domains) and alpha was set to 0.05.
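The evaluation loop described above can be sketched with scikit-learn as follows. The synthetic data, the small forest, the use of stratified folds and the 0.5 probability threshold are all assumptions made for this example; the source does not state them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import RepeatedStratifiedKFold

# Synthetic stand-in data (not NESDA data).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

# 10 repeats of 10-fold cross-validation give 100 train/test splits.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
aucs, sens, spec = [], [], []
for train_idx, test_idx in cv.split(X, y):
    rfc = RandomForestClassifier(n_estimators=25, random_state=0)
    rfc.fit(X[train_idx], y[train_idx])  # fit on the training folds only
    proba = rfc.predict_proba(X[test_idx])[:, 1]
    pred = (proba >= 0.5).astype(int)    # assumed decision threshold
    aucs.append(roc_auc_score(y[test_idx], proba))
    tn, fp, fn, tp = confusion_matrix(y[test_idx], pred, labels=[0, 1]).ravel()
    sens.append(tp / (tp + fn))
    spec.append(tn / (tn + fp))

# Final performance: average over all 100 test-set evaluations.
mean_auc = float(np.mean(aucs))
balanced_accuracy = float((np.mean(sens) + np.mean(spec)) / 2)
print(len(aucs))  # 100
```

A label-permutation test would repeat this whole loop on shuffled outcome labels to build a null distribution for the average AUC; that step is omitted here for brevity.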

To systematically compare the performance of different predictor domains, patients were distributed in exactly the same way for each of the classifications, i.e. the train and test set of any cross-validation iteration included the same patients for each predictor domain. This allowed the calculation of normalized average differences in AUC scores across cross-validation iterations for each pair of predictor domains (including the combination of all domains). Non-parametric sign-flipping tests (n = 10 000) were then employed to derive p values, which were Bonferroni-corrected for 30 comparisons with alpha set to 0.05.
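A sign-flipping test on paired per-iteration AUC differences can be sketched in plain Python. This is a generic version of the technique, not the authors' implementation: it uses the plain mean difference as the test statistic (the paper's normalization is described only in its Supplementary Methods), and the function name and example values are hypothetical.

```python
import random


def sign_flip_p_value(differences, n_permutations=10_000, seed=0):
    """Two-sided sign-flipping test of a zero mean paired difference."""
    rng = random.Random(seed)
    n = len(differences)
    observed = abs(sum(differences) / n)
    exceed = 0
    for _ in range(n_permutations):
        # Under the null, each paired difference is equally likely to be +d or -d.
        flipped = sum(d if rng.random() < 0.5 else -d for d in differences)
        if abs(flipped / n) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p value is never exactly zero.
    return (exceed + 1) / (n_permutations + 1)


# Made-up AUC differences (domain A minus domain B) on identical test folds.
diffs = [0.03, 0.05, 0.02, 0.04, 0.06, 0.01, 0.05, 0.03, 0.04, 0.02]
p = sign_flip_p_value(diffs)
p_bonferroni = min(1.0, p * 30)  # correction for 30 comparisons, as in the text
```

Using identical folds for every domain is what makes the differences paired, so that fold-to-fold variation cancels out of the comparison.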

Variable importance

In addition to their strong classification performance, RFCs allow the importance of each variable to the classification task to be quantified (Breiman, 2001). However, the standard calculation of variable importance has been shown to be biased (Strobl, Boulesteix, Zeileis, & Hothorn, 2007) and a permutation-based variable importance scheme has been suggested instead (Altmann, Toloşi, Sander, & Lengauer, 2010; Hapfelmeier & Ulm, 2013; Strobl et al., 2007). Following this approach, we calculated p values for each variable by permuting (n = 1000) every variable separately. The computed p values were then corrected according to the false discovery rate (FDR) (Benjamini & Hochberg, 2000) and significance was set to 0.05. Given that variable importance was calculated in every cross-validation iteration, important variables were defined as variables which were consistently significant under FDR for at least 50% of all cross-validation iterations. This very stringent procedure for identifying important variables was employed to calculate valid variable importance information specific to the classification task. Variable importance was only investigated for the classifications using the data from the combination of all domains. In addition, we investigated differences in the average rankings of important variables between the two classification tasks. A detailed description of this approach can be found in the Supplementary Methods.
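The selection rule (FDR correction within each cross-validation iteration, then a 50% consistency threshold across iterations) can be sketched as below. The code assumes the per-variable permutation p values have already been computed and uses the standard Benjamini-Hochberg step-up procedure; the function names and example p values are hypothetical, and the exact variant used by the authors is described in their Supplementary Methods.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected under BH-FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k / m * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


def consistently_important(p_values_per_iteration, alpha=0.05, min_fraction=0.5):
    """Indices of variables significant under FDR in at least `min_fraction` of iterations."""
    n_iter = len(p_values_per_iteration)
    counts = [0] * len(p_values_per_iteration[0])
    for p_values in p_values_per_iteration:
        for i, is_rejected in enumerate(benjamini_hochberg(p_values, alpha)):
            counts[i] += is_rejected
    return [i for i, c in enumerate(counts) if c / n_iter >= min_fraction]


# Made-up p values for two variables across four cross-validation iterations.
per_iteration = [[0.001, 0.9], [0.001, 0.9], [0.9, 0.9], [0.9, 0.001]]
print(consistently_important(per_iteration))  # [0]
```

Requiring significance in at least half of the iterations filters out variables that look important only in a few particular data splits.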

Results

At 2-year follow-up, 484 patients (54.6%) recovered from anxiety disorders, and 362 patients (40.8%) did not have any CMD. Baseline clinical, psychological, sociodemographic, biological and lifestyle variables are provided for patients with and without anxiety disorders at follow-up (Table 2) and for patients with and without CMD at follow-up (online Supplementary Table 1). Various clinical and psychological variables showed differences between the two groups. By contrast, biological and lifestyle status did not differ between the two groups.

Table 2. Baseline characteristics of anxiety disorder sample, group comparisons between patients who had no anxiety disorder (n = 484) at 2-year follow-up and patients who did (n = 403)

PD, panic disorder; SAD, social anxiety disorder; GAD, generalized anxiety disorder; MDD, major depressive disorder; FQ, Fear Questionnaire; PSWQ, Penn State Worry Questionnaire; SSI, Suicidal Ideation Scale; 4DSQ, Four-Dimensional Symptom Questionnaire; IDS-SR, Inventory of Depressive Symptomatology-SR; ISR, Insomnia Rating Scale; BAI, Beck's Anxiety Inventory; LCI, life chart interview; NEO-FFI, NEO Five-Factor Inventory; LEIDS, Leiden Index of Depression Sensitivity; ASI, Anxiety Sensitivity Index; BMI, Body Mass Index; CRP, c-reactive protein; IL-6, interleukin-6; TNF-α, tumour necrosis factor-α; BDNF, Brain-Derived Neurotrophic Factor. p values shown in bold are <0.05.

a Childhood life events (<16 years of age) were parental divorce, being placed in a juvenile prison, raised in a foster family, placed in a child home, death of a parent.

b Childhood trauma included emotional neglect, psychological abuse, physical abuse, and sexual abuse.

c As measured with the AUDIT. Scores above 8 are reflective of hazardous drinking, scores at 13 or higher (females) and 15 or higher (males) are indicative of probable alcohol dependency.

Recovery from anxiety disorders

Classification performance

Results of our evaluation of the RFC when predicting recovery from anxiety disorders are reported in Table 3 and Fig. 1a. AUC values for the predictor domains ranged from 0.49 to 0.67, with significant (pBonferroni < 0.05) AUC values obtained for the clinical (0.67) and psychological (0.65) domains, as well as for the combination of all domains (0.67). Classification accuracies were low to moderate, with the highest accuracy achieved by the combination of all domains (62.4%) with a sensitivity of 62.0% and specificity of 62.8%. In addition, we investigated the performance of the RFC for subgroups of patients who had any comorbidity (MDD, dysthymia, or alcohol dependency; n = 252 recovered, n = 248 persistent) at baseline and for patients who did not (n = 232 recovered, n = 155 persistent). For that, the RFC trained on all data domains and all patients of the training set was evaluated within the two subgroups on the test set separately. The RFC obtained an average AUC of 0.64 within the no-comorbidity group and an AUC of 0.68 within the comorbidity group, showing slightly increased performance for predictions within the comorbidity group.
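The reported figures in this paragraph are internally consistent, which a short check makes explicit. The sketch only recombines numbers stated in the text; the helper name `balanced_accuracy` is an assumption for this example.

```python
def balanced_accuracy(sensitivity, specificity):
    """Average of sensitivity and specificity (both given in percent)."""
    return (sensitivity + specificity) / 2


# Combination of all domains, recovery from anxiety disorders.
print(round(balanced_accuracy(62.0, 62.8), 1))  # 62.4

# Subgroup counts reported in the text.
comorbid = {"recovered": 252, "persistent": 248}
no_comorbid = {"recovered": 232, "persistent": 155}

total_recovered = comorbid["recovered"] + no_comorbid["recovered"]
total_persistent = comorbid["persistent"] + no_comorbid["persistent"]
print(total_recovered, total_persistent)  # 484 403
```

The recovered and persistent totals match the 484 and 403 patients reported for the full sample of 887.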

Table 3. Evaluation of the 2-year recovery from anxiety disorders classification [mean (s.d.)]

AUC, area-under-receiver-operator-curve; PPV, positive predictive value; NPV, negative predictive value; *pBonferroni < 0.05.

p values shown in bold are <0.05.

Fig. 1. Classification performance of random forest classifiers. Performance is quantified by area-under-the-receiver-operator-curve (AUC) values calculated for each test set of all cross-validation iterations and is shown in box-and-whisker plots for all data domains. (a) Performance of the recovery from anxiety disorders prediction. (b) Performance of the recovery from all common mental disorders prediction. Asterisks mark a significant classification performance according to label-permutation tests (n = 1000) and Bonferroni correction for six tests. The dashed line indicates chance-level performance.
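The significance procedure named in the caption (label-permutation tests with n = 1000, followed by Bonferroni correction for six tests) can be illustrated with a simplified sketch. It rests on a stated assumption that differs from a full analysis: a complete label-permutation test would retrain the classifier on every permuted label set, whereas here the scores are held fixed and only the labels are shuffled, to keep the example short. The function names and the toy data are hypothetical.

```python
import random


def auc(labels, scores):
    """Pairwise AUC; ties between a positive and a negative case count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_permutations=1000, seed=0):
    """Share of label shufflings whose AUC is at least as high as the observed AUC."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any link between labels and scores
        if auc(shuffled, scores) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling itself, so p is never exactly 0.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


def bonferroni(p_value, n_tests=6):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Toy data (made up): ten recovered patients with high scores, ten persistent with low scores.
toy_labels = [1] * 10 + [0] * 10
toy_scores = [0.60 + 0.03 * i for i in range(10)] + [0.10 + 0.03 * i for i in range(10)]
observed_auc, p_raw = permutation_p_value(toy_labels, toy_scores)
p_adjusted = bonferroni(p_raw, n_tests=6)
```

With 1000 permutations the smallest attainable unadjusted p value under this estimator is 1/1001, so the smallest Bonferroni-adjusted value for six tests is 6/1001, which is still below 0.05.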

Domain comparisons

When comparing different domains according to their AUC a clear ordering was observed: The clinical domain outperformed every other domain except for the combination of all domains (pBonferroni < 0.05), the psychological domain outperformed the sociodemographic, biological, and lifestyle domains (pBonferroni < 0.05), the sociodemographic domain outperformed the biological and lifestyle domains (pBonferroni < 0.05), and the biological domain outperformed the lifestyle domain (pBonferroni < 0.05). The combination of all domains was better than any domain except for the clinical domain (pBonferroni < 0.05).

Variable importance

Consistently selected significant variables (N = 17) identified through a permutation-based variable importance calculation of the RFC are reported in online Supplementary Table 2. Only variables from the clinical and psychological domain were selected. These variables were derived from different measurement instruments (BAI, IDS-SR, Fear Questionnaire (FQ), NEO-FFI, WHO-Disability Assessment (WHO-DAS), Four-Dimensional Symptom Questionnaire (4DSQ), Mastery scale) but all referred to characteristic anxiety symptoms, with an emphasis on anxious arousal items.
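The basic idea behind a permutation-based variable importance can be sketched as follows. This is a generic illustration and an assumption about the general technique, not the procedure used in the study: it shuffles one variable at a time and records the resulting drop in accuracy of a fixed prediction function, and it omits both the random forest itself and the significance-testing step. The names `permutation_importance` and `toy_predict` and all data values are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the prediction equals the label."""
    return sum(1 for r, y in zip(rows, labels) if predict(r) == y) / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=100, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # destroys the link between feature j and the outcome
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_predict(row):
    """Stand-in model that uses only the first feature and ignores the second."""
    return 1 if row[0] > 0.5 else 0


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
toy_rows = [[0.9, 0.1], [0.8, 0.7], [0.7, 0.3], [0.2, 0.9], [0.1, 0.4], [0.3, 0.6]]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_importances = permutation_importance(toy_predict, toy_rows, toy_labels)
```

In this toy setting the second feature has an importance of exactly zero, because the stand-in model never reads it, while shuffling the first feature lowers accuracy on average.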

Recovery from all common mental disorders

Classification performance

Results of the second classification procedure predicting recovery from CMDs are reported in Table 4 and Fig. 1B. AUC values ranged from 0.53 to 0.70, with significant (pBonferroni < 0.05) AUC values obtained for the clinical (0.70), psychological (0.67), and sociodemographic (0.65) domains, as well as for the combination of all domains (0.70). The highest accuracy was achieved by the combination of all domains (63.4%), with a sensitivity of 64.6% and a specificity of 62.3%. As in the prediction of recovery from anxiety disorders, we investigated the performance of the RFC for subgroups of patients who had (n = 164 recovered, n = 336 persistent) or did not have (n = 198 recovered, n = 189 persistent) any comorbidities at baseline. For that, the RFC trained on the combination of all domains and all patients of the training set was evaluated within the two subgroups on the test set separately. The RFC obtained an AUC of 0.62 within the no-comorbidity group and an AUC of 0.73 within the comorbidity group. As in the prediction of recovery from anxiety disorders, the RFC performed better for patients with comorbidities at baseline.
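The subgroup evaluation described above (one classifier trained on all training patients, then scored separately within each test-set subgroup) can be sketched as below. The sketch assumes that test-set scores from an already trained model are available; the function names, the group labels and the toy values are hypothetical and serve only to show the bookkeeping.

```python
def auc(labels, scores):
    """Pairwise AUC; ties between a positive and a negative case count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("each subgroup needs both outcome classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(labels, scores, groups):
    """AUC computed separately within each subgroup of the test set."""
    result = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        result[group] = auc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


# Toy test set (made up): 1 = recovered, 0 = persistent.
toy_labels = [1, 0, 1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.1, 0.8, 0.3, 0.6, 0.7, 0.5, 0.2]
toy_groups = ["comorbidity"] * 4 + ["no_comorbidity"] * 4
toy_result = auc_by_subgroup(toy_labels, toy_scores, toy_groups)
```

Because the model is not refitted per subgroup, differences between the subgroup AUC values reflect how well a single model ranks patients within each group, not the benefit of a subgroup-specific model.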

Table 4. Evaluation of the 2-year recovery from all common mental disorders classification [mean (s.d.)]

AUC, area-under-receiver-operator-curve; PPV, positive predictive value; NPV, negative predictive value; *pBonferroni < 0.05.

p values shown in bold are <0.05.

Domain comparisons

The best performing domains for this classification were the same as in the recovery from anxiety disorders classification. The clinical domain and the combination of all domains did not differ in their performance but outperformed any other domain during the classification. The order for the performance of the other domains was the same as with the recovery from anxiety disorders classification.

Variable importance

48 variables were identified as being consistently selected significant variables contributing to the classification (online Supplementary Table 3). In this classification, selected variables included a larger set of measures related to mood disorders and not only anxiety symptomatology. With one exception (sociodemographic) all variables were again selected from the clinical or psychological domain.

Difference in important variables between prediction analyses

Variables which were more (or less) important in the prediction of recovery from anxiety disorders than the prediction of all CMDs are reported in online Supplementary Table 4. These results confirmed the importance of anxiety-related variables for the prediction of recovery from anxiety, and the importance of depression-related variables for the prediction of recovery from all CMDs.

Transfer analysis

We replicated the classification of recovery from anxiety disorders at 2-year follow-up in a transfer learning setting: in such an approach we utilized the labels indicating recovery of CMDs during the training of the RFC classifier (training set) but subsequently evaluated its performance on the test set using the recovery from anxiety disorder labels. The result of this analysis can be seen in online Supplementary Table 5. Utilizing the transfer learning approach led to improved performance in predicting anxiety disorder recovery (AUC = 0.71 v. AUC = 0.67 for both training and testing on anxiety disorder recovery labels using either only the clinical or the combination of all domains). The increased performance was observed due to an increase in sensitivity of the classification for correctly identifying recovered anxiety patients. For all individual domains and the combination of them, sensitivity increased by 7.6 ± 1.9 when training on the CMDs labels first. Specificity only decreased slightly (mean decrease: 2.7 ± 0.8) which led to the improved overall performance.
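The label handling in this transfer setting can be illustrated with a small sketch: the model is fitted with the CMD recovery labels of the training set and scored against the anxiety recovery labels of the test set. A nearest-centroid rule stands in for the RFC purely to keep the example self-contained; that substitution, the function names and all data values are assumptions made for illustration.

```python
def fit_centroids(rows, labels):
    """Nearest-centroid stand-in for the classifier: mean feature vector per class."""
    centroids = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda cls: dist2(centroids[cls]))


def sensitivity_specificity(y_true, y_pred):
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def transfer_evaluate(train_rows, train_cmd_labels, test_rows, test_anxiety_labels):
    """Train on CMD recovery labels, evaluate against anxiety recovery labels."""
    model = fit_centroids(train_rows, train_cmd_labels)
    predictions = [predict(model, r) for r in test_rows]
    return sensitivity_specificity(test_anxiety_labels, predictions)


# Toy data (made up): one feature per patient, 1 = recovered, 0 = persistent.
train_rows = [[1.0], [2.0], [8.0], [9.0]]
train_cmd_labels = [0, 0, 1, 1]
test_rows = [[1.0], [3.0], [7.0], [9.0]]
test_anxiety_labels = [0, 1, 1, 1]
toy_sensitivity, toy_specificity = transfer_evaluate(
    train_rows, train_cmd_labels, test_rows, test_anxiety_labels
)
```

The only difference from a conventional evaluation is which label vector is passed at each stage; the features and the train/test split are unchanged.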

Discussion

One of the most important goals in personalized medicine is providing individual disease course predictions. Our results show that individual prediction of 2-year course in anxiety disorders is possible using various predictors but it is only moderately successful. The main outcome measure was recovery from anxiety disorders and our predictions reached a balanced accuracy of 62.4% with an AUC of 0.67. The current performance by itself does not warrant implementation of our models in routine psychiatric care as it would yield too many false positives/negatives. However, predictive properties of clinician opinion in predicting disease course in anxiety disorders are not available and therefore it remains unclear which predictive performance threshold is needed for a statistical model to surpass clinician opinion and become an improvement over current routine care.

Our study yielded two models with comparable accuracy for predicting 2-year anxiety disorder course: one consisting of predictors from all five domains and one consisting of predictors only from the clinical domain. Biological, lifestyle, and sociodemographic predictors did not contribute significantly to course prediction. This is surprising as these domains were previously shown to be related to anxiety disorder aetiology. Our results thereby suggest that the underlying aetiology is of less importance to course prediction after the development of threshold disorders and that after anxiety disorders have developed, phenotypical characteristics have more impact on subsequent disease course. This is evident from the individual features that contributed most to the classification. All of these features reflected symptoms, psychological states or traits associated with the emotions of fear and anxiety, such as the presence of ‘phobic symptoms’, difficulty ‘walking alone in a busy street’ or ‘dealing with people you don't know’, ‘feeling tense’, ‘not liking to be where the action is’, and ‘feeling faint or lightheaded’. A previous NESDA study that aimed to predict the naturalistic course in depression showed similar performance to the current study when 2-year follow-up MDD diagnosis was correctly classified with an AUC of 0.66 and balanced accuracy of 62% (Dinga et al., 2018). In this study, clinical features were most important as well, though the nature of those items was related to depression.

As anxiety disorders and other psychiatric disorders frequently co-occur and show diagnostic instability over time, a secondary outcome was assessed. This broad perspective model was trained on recovery from all CMDs and showed marginally higher accuracy (63.4%) and AUC (0.70) in comparison with the main narrow perspective outcome. As in the narrow perspective, omitting all domains except the clinical domain did not lead to a significant loss of predictive power (accuracy = 62.2% and AUC = 0.70). The individual features that were most consistently chosen during the classification again were almost exclusively from the clinical and psychological domains. Symptoms, psychological traits, and psychological states associated with depression and worrying contributed most to the classification, for instance ‘feeling down’, ‘feeling sad’, having ‘a desire to die’, ‘suffering from worry’, ‘feeling tense’, and ‘having little control about the things that happen’. This suggests that predictions for recovery from all CMDs were largely driven by co-occurring depressive symptoms. Our decision to investigate the CMDs classification was also supported by the results of the additional transfer analysis, which showed improved performance (accuracy = 63.3% and AUC = 0.71 for the combination of all domains data) when using the recovery from all CMDs labelling during training and the recovery from anxiety labels during model evaluation. This analysis showed that patients suffering from any mental disorder at 2-year follow-up – anxiety or not – constituted a more homogeneous group while patients who fully recovered were more easily identified than patients only recovering from anxiety disorders (but having an additional CMD instead). This suggests that applying a broad perspective in future attempts in clinical prediction is more feasible for anxiety disorders.

Previous ML studies in anxiety disorders were invariably small in sample size and most focused on predicting immediate treatment response using neuroimaging data (Ball, Stein, Ramsawh, Campbell-Sills, & Paulus, 2014; Doehrmann et al., 2013; Hahn et al., 2014; Pantazatos, Talati, Schneier, & Hirsch, 2014; Whitfield-Gabrieli et al., 2016). Some studies used clinical, biological and/or neuroimaging data to distinguish between different types of anxiety disorders and healthy controls (Carpenter, Sprechmann, Calderbank, Sapiro, & Egger, 2016; Frick et al., 2014; Hilbert, Lueken, Muehlhan, & Beesdo-Baum, 2017; Pantazatos et al., 2014). To the best of our knowledge, this is the first study into individual long-term course prediction in anxiety disorders. A strength of this study is the use of a large dataset with a high number of variables from a variety of predictor domains, most of which were previously related to disease course at the group level. In addition, using RFCs allowed for combining large numbers of predictors into an overall model and allowed the identification of the most contributing predictors, providing insight into the possible processes involved with recovery in anxiety disorders.

In spite of the wide array of predictors, the current study showed only moderate accuracy. This has a number of explanations. First, NESDA is a naturalistic cohort study in which the exposure to environmental stressors and treatment regimens varied across patients during the 2-year follow-up period. These different exposures will have impacted the 2-year outcomes. Furthermore, different data types might improve predictive accuracy. For instance, previous ML studies showed the strong potential of neuroimaging data to predict treatment response in anxiety disorders (Ball et al., 2014; Doehrmann et al., 2013; Hahn et al., 2014; Pantazatos et al., 2014; Whitfield-Gabrieli et al., 2016), sometimes exceeding predictions made using clinical data (Ball et al., 2014; Doehrmann et al., 2013). Our study did not encompass neuroimaging data, as these were only available in a subset of NESDA participants (Janssen, Mourão-Miranda, & Schnack, 2018). Other examples include gait analysis (Zhao et al., 2019), actigraphy (Merikangas et al., 2019), or social media data (Reece & Danforth, 2017). Additionally, more frequent data collection might improve predictive accuracy (Kubben, Dumontier, & Dekker, 2019), which has now been implemented in the most recent wave of NESDA (Difrancesco et al., 2019). However, it is worth noting that our analyses showed that using a large set of variables from various domains (either combined or independently) did not outperform the clinical domain alone. Finally, future studies could explore differences in predictive performance across different patient subgroups, by analyzing separate patient groups consisting of different anxiety disorders, or groups with different comorbidity patterns separately.

Clinical care for anxiety disorders would benefit greatly from improved course prediction as it would pave the way for targeted treatments. The current study showed moderate accuracy in predicting recovery from anxiety disorders over a 2-year follow-up for individual patients. Items from the clinical and psychological domain were the most contributing predictors, while biological, lifestyle, and sociodemographic predictors contributed less. The limited performance while using a wide array of predictors does not justify application in routine clinical care. The results from our study can, however, be used as a benchmark for future studies, with future studies likely resulting in further enhancements of the predictive properties. It has long been argued that statistical modelling will exceed clinician opinion in prediction problems (Ayres, 2007; Meehl, 1954), with clinician interpretation of statistical models likely yielding the best predictive power (Kuhn & Johnson, 2013). As a result, statistical models will increasingly become an addition to clinician opinion. Eventually, targeted treatment regimens and secondary prevention strategies will become more feasible if predictive models further evolve. This study provides an important first step towards valid long-term ML-based predictions in anxiety disorders.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S0033291720001658.

Acknowledgements

This study was supported by the Netherlands Organization for Scientific Research (NWO/ZonMW Vidi 016.156.318) and the AMC Research Council (150622). The infrastructure for the NESDA study (www.nesda.nl) is funded through the Geestkracht program of the Netherlands Organisation for Health Research and Development (ZonMw, grant number 10-000-1002) and financial contributions by participating universities and mental health care organizations (VU University Medical Center, GGZ inGeest, Leiden University Medical Center, Leiden University, GGZ Rivierduinen, University Medical Center Groningen, University of Groningen, Lentis, GGZ Friesland, GGZ Drenthe, Rob Giel Onderzoekscentrum).

Conflict of interest

Dr Penninx reports grants from Dutch Ministry of Health/NWO, research funds from Janssen pharmaceuticals, and research funds from Boehringer-Ingelheim, during the conduct of the study. The other authors do not have potential conflicts of interest.

Ethical standards

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

Footnotes

*

These authors contributed equally to this work.

References

Alonso, J., & Lépine, J. (2007). Overview of key data from the European Study of the Epidemiology of Mental Disorders (ESEMeD). Journal of Clinical Psychiatry, 68(suppl 2), 3–9.
Altmann, A., Toloşi, L., Sander, O., & Lengauer, T. (2010). Permutation importance: A corrected feature importance measure. Bioinformatics (Oxford, England), 26(10), 1340–1347. doi:10.1093/bioinformatics/btq134.
American Psychiatric Association (2000). Diagnostic and statistical manual of mental disorders DSM-IV-TR (4th ed). New York, NY, US: American Psychiatric Association.
Asselmann, E., & Beesdo-Baum, K. (2015). Predictors of the course of anxiety disorders in adolescents and young adults. Current Psychiatry Reports, 17(2), 1–8. doi:10.1007/s11920-014-0543-z.
Ayres, I. (2007). Super crunchers: Why thinking-By-numbers is the new way to be smart. New York, NY, US: Bantam.
Baldwin, D. S., & Tiwari, N. (2009). The pharmacologic treatment of patients with generalized anxiety disorder: Where are we now and where are we going? CNS Spectrums, 14(2 Suppl 3), 5–12.
Ball, T. M., Stein, M. B., Ramsawh, H. J., Campbell-Sills, L., & Paulus, M. P. (2014). Single-subject anxiety treatment outcome prediction using functional neuroimaging. Neuropsychopharmacology, 39(5), 1254–1261. doi:10.1038/npp.2013.328.
Batelaan, N. M., Rhebergen, D., Spinhoven, P., van Balkom, A. J., & Penninx, B. W. J. H. (2014). Two-Year course trajectories of anxiety disorders: Do DSM classifications matter? The Journal of Clinical Psychiatry, 75(09), 985–993. doi:10.4088/JCP.13m08837.
Beesdo-Baum, K., Knappe, S., Fehm, L., Höfler, M., Lieb, R., Hofmann, S. G., & Wittchen, H. U. (2012). The natural course of social anxiety disorder among adolescents and young adults. Acta Psychiatrica Scandinavica, 126(6), 411–425. doi:10.1111/j.1600-0447.2012.01886.x.
Benjamini, Y., & Hochberg, Y. (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics. Journal of Educational and Behavioral Statistics, 25(1), 60–83. doi:10.3102/10769986025001060.
Bokma, W. A., Batelaan, N. M., Hoogendoorn, A. W., Penninx, B. W. J. H., & van Balkom, A. J. L. M. (2020). A clinical staging approach to improving diagnostics in anxiety disorders: Is it the way to go? Australian & New Zealand Journal of Psychiatry, 54(2), 173–184.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. doi:10.1023/A:1010933404324.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software.
Bruce, S. E., Yonkers, K. A., Otto, M. W., Eisen, J. L., Weisberg, R. B., Pagano, M., … Keller, M. B. (2005). Influence of psychiatric comorbidity on recovery and recurrence in generalized anxiety disorder, social phobia, and panic disorder: A 12–year prospective study. American Journal of Psychiatry, 162(6), 1179–1187. doi:10.1176/appi.ajp.162.6.1179.
Carpenter, K. L. H., Sprechmann, P., Calderbank, R., Sapiro, G., & Egger, H. L. (2016). Quantifying risk for anxiety disorders in preschool children: A machine learning approach. PLoS One, 11(11), 1–20. doi:10.7910/DVN/N42LWG.
Carroll, D., Phillips, A. C., Thomas, G. N., Gale, C. R., Deary, I., & Batty, G. D. (2009). Generalized anxiety disorder is associated with metabolic syndrome in the Vietnam experience study. Biological Psychiatry, 66(1), 91–93. doi:10.1016/j.biopsych.2009.02.020.
Catarino, A., Bateup, S., Tablan, V., Innes, K., Freer, S., Richards, A., … Blackwell, A. D. (2018). Demographic and clinical predictors of response to internet-enabled cognitive–behavioural therapy for depression and anxiety. BJPsych Open, 4(5), 411–418. doi:10.1192/bjo.2018.57.
Copeland, W. E., Shanahan, L., Worthman, C., Angold, A., & Costello, E. J. (2012). Generalized anxiety and C-reactive protein levels: A prospective, longitudinal analysis. Psychological Medicine, 42(12), 2641–2650. doi:10.1017/S0033291712000554.
Deo, R. C. (2015). Machine learning in medicine. Circulation, 132(20), 1920–1930. doi:10.1161/CIRCULATIONAHA.115.001593.
Difrancesco, S., Lamers, F., Riese, H., Merikangas, K. R., Beekman, A. T. F., Hemert, A. M., … Penninx, B. W. J. H. (2019). Sleep, circadian rhythm, and physical activity patterns in depressive and anxiety disorders: A 2-week ambulatory assessment study. Depression and Anxiety, 36, 975–986. doi:10.1002/da.22949.
Dinga, R., Marquand, A. F., Veltman, D. J., Beekman, A. T. F., Schoevers, R. A., van Hemert, A. M., … Schmaal, L. (2018). Predicting the naturalistic course of depression from a wide range of clinical, psychological, and biological data: A machine learning approach. Translational Psychiatry, 8(1), 241. doi:10.1038/s41398-018-0289-1.
Doehrmann, O., Ghosh, S. S., Polli, F. E., Reynolds, G. O., Horn, F., Keshavan, A., … Gabrieli, J. D. (2013). Predicting treatment response in social anxiety disorder from functional magnetic resonance imaging. JAMA Psychiatry, 70(1), 87–97. doi:10.1001/2013.jamapsychiatry.5.
Fernández-Delgado, M., Cernadas, E., Barro, S., & Amorim, D. (2014). Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15, 3133–3181.
Frick, A., Gingnell, M., Marquand, A. F., Howner, K., Fischer, H., Kristiansson, M., … Furmark, T. (2014). Classifying social anxiety disorder using multivoxel pattern analyses of brain function and structure. Behavioural Brain Research, 259, 330–335. doi:10.1016/j.bbr.2013.11.003.
Gustavson, K., Knudsen, A. K., Nesvåg, R., Knudsen, G. P., Vollset, S. E., & Reichborn-Kjennerud, T. (2018). Prevalence and stability of mental disorders among young adults: Findings from a longitudinal study. BMC Psychiatry, 18(1), 1–15. doi:10.1186/s12888-018-1647-5.
Hahn, T., Kircher, T., Straube, B., Wittchen, H.-U., Konrad, C., Ströhle, A., … Lueken, U. (2014). Predicting treatment response to cognitive behavioral therapy in panic disorder with agoraphobia by integrating local neural information. JAMA Psychiatry, 72, 68–74. doi:10.1001/jamapsychiatry.2014.1741.
Hahn, T., Nierenberg, A. A., & Whitfield-Gabrieli, S. (2017). Predictive analytics in mental health: Applications, guidelines, challenges and perspectives. Molecular Psychiatry, 22(1), 37–43. doi:10.1038/mp.2016.201.
Hapfelmeier, A., & Ulm, K. (2013). A new variable selection approach using Random Forests. Computational Statistics and Data Analysis, 60(1), 50–69. doi:10.1016/j.csda.2012.09.020.
Hendriks, S. M., Spijker, J., Licht, C. M. M., Beekman, A. T. F., & Penninx, B. W. J. H. (2013). Two-year course of anxiety disorders: Different across disorders or dimensions? Acta Psychiatrica Scandinavica, 128(3), 212–221. doi:10.1111/acps.12024.
Hilbert, K., Lueken, U., Muehlhan, M., & Beesdo-Baum, K. (2017). Separating generalized anxiety disorder from major depression using clinical, hormonal, and structural MRI data: A multimodal machine learning study. Brain and Behavior, 7(3), 1–11. doi:10.1002/brb3.633.
Hoge, E. A. A., Brandstetter, K., Moshier, S., Pollack, M. H. H., Wong, K. K. K., & Simon, N. M. M. (2009). Broad spectrum of cytokine abnormalities in panic disorder and posttraumatic stress disorder. Depression and Anxiety, 26, 447–455. doi:10.1002/da.20564.
Hovenkamp-Hermelink, J. H. M., Riese, H., Van Der Veen, D. C., Batelaan, N. M., Penninx, B. W. J. H., & Schoevers, R. A. (2016). Low stability of diagnostic classifications of anxiety disorders over time: A six-year follow-up of the NESDA study. Journal of Affective Disorders, 190, 310–315. doi:10.1016/j.jad.2015.10.035.
Iniesta, R., Stahl, D., & McGuffin, P. (2016). Machine learning, statistical learning and the future of biological research in psychiatry. Psychological Medicine, 46, 2455–2465. doi:10.1017/S0033291716001367.
Janssen, R. J., Mourão-Miranda, J., & Schnack, H. G. (2018). Making individual prognoses in psychiatry using neuroimaging and machine learning. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 3(9), 798–808. doi:10.1016/j.bpsc.2018.04.004.
Judd, L. L., Kessler, R. C., Paulus, M. P., Zeller, P. V., Wittchen, H. U., & Kunovac, J. L. (1998). Comorbidity as a fundamental feature of generalized anxiety disorders: Results from the National Comorbidity Study (NCS). Acta Psychiatrica Scandinavica. Supplementum, 393, 6–11.
Kahl, K. G., Schweiger, U., Correll, C., Müller, C., Busch, M. L., Bauer, M., & Schwarz, P. (2015). Depression, anxiety disorders, and metabolic syndrome in a population at risk for type 2 diabetes mellitus. Brain and Behavior, 5(3), e00306. doi:10.1002/brb3.306.
Kobayashi, K., Shimizu, E., Hashimoto, K., Mitsumori, M., Koike, K., Okamura, N., … Iyo, M. (2005). Serum brain-derived neurotrophic factor (BDNF) levels in patients with panic disorder: As a biological predictor of response to group cognitive behavioral therapy. Progress in Neuro-Psychopharmacology and Biological Psychiatry, 29(5), 658–663. doi:10.1016/j.pnpbp.2005.04.010.
Kubben, P., Dumontier, M., & Dekker, A. (2019). Fundamentals of clinical data science. Cham, Switzerland: Springer Open.
Kuhn, M., & Johnson, K. (2013). Applied predictive modeling (5th ed.). New York: Springer. doi:10.1007/978-1-4614-6849-3.
Lamers, F., Hoogendoorn, A. W., Smit, J. H., van Dyck, R., Zitman, F. G., Nolen, W. A., & Penninx, B. W. (2012). Sociodemographic and psychiatric determinants of attrition in the Netherlands Study of Depression and Anxiety (NESDA). Comprehensive Psychiatry, 53(1), 6370. doi:10.1016/j.comppsych.2011.01.011.CrossRefGoogle Scholar
Lamers, F., van Oppen, P., Comijs, H. C., Smit, J. H., Spinhoven, P., van Balkom, A. J. L. M., … Penninx, B. W. J. H. (2011). Comorbidity patterns of anxiety and depressive disorders in a large cohort study: The Netherlands Study of Depression and Anxiety (NESDA). The Journal of Clinical Psychiatry, 72(3), 341348. doi:10.4088/JCP.10m06176blu.CrossRefGoogle Scholar
Lemaître, G., Nogueira, F., & Aridas, C. K. (2017). Imbalanced-learn: A python toolbox to tackle the curse of imbalanced datasets in machine learning. Journal of Machine Learning Research, 18, 15. Retrieved from http://www.jmlr.org/papers/volume18/16-365/16-365.pdf.Google Scholar
Lueken, U., & Hahn, T. (2016). Functional neuroimaging of psychotherapeutic processes in anxiety and depression: From mechanisms to predictions. Current Opinion in Psychiatry, 29(1), 2531. doi:10.1097/YCO.0000000000000218.CrossRefGoogle ScholarPubMed
McGorry, P. D., Ratheesh, A., & O'Donoghue, B. (2018). Early intervention—an implementation challenge for 21st century Mental Health Care. JAMA Psychiatry, 75(6), 545546. doi:10.1001/jamapsychiatry.2018.0621.CrossRefGoogle ScholarPubMed
Meehl, P. E. (1954). Clinical. v. statistical prediction. Minneapolis, MN, US: University of Minnesota.Google Scholar
Merikangas, K. R., Swendsen, J., Hickie, I. B., Cui, L., Shou, H., Merikangas, A. K., … Zipunnikov, V. (2019). Real-time mobile monitoring of the dynamic associations among motor activity, energy, mood, and sleep in adults with bipolar disorder. JAMA Psychiatry, Feb, 190198. doi:10.1001/jamapsychiatry.2018.3546.CrossRefGoogle Scholar
Molendijk, M. L., Bus, B. A., Spinhoven, P., Penninx, B. W., Prickaerts, J., Oude Voshaar, R. C., & Elzinga, B. M. (2012). Gender specific associations of serum levels of brain-derived neurotrophic factor in anxiety. World Journal of Biological Psychiatry, 13(7), 535543. doi:10.3109/15622975.2011.587892.CrossRefGoogle ScholarPubMed
Nay, W., Brown, R., & Roberson-Nay, R. (2013). Longitudinal course of panic disorder with and without agoraphobia using the national epidemiologic survey on alcohol and related conditions (NESARC). Psychiatry Research, 208(1), 5461. doi:10.1016/j.psychres.2013.03.006.CrossRefGoogle Scholar
O ’Donovan, A., Hughes, B. M., Slavich, G. M., Lynch, L., Cronin, M.-T., O ’Farrelly, C., & Malone, K. M. (2010). Clinical anxiety, cortisol and interleukin-6: Evidence for specificity in emotion–biology relationships. Brain. Behavior, and Immunity, 24, 10741077. doi:10.1016/j.bbi.2010.03.003.CrossRefGoogle Scholar
Ojala, M., & Garriga, G. C. (2010). Permutation tests for studying classifier performance. Journal of Machine Learning Research, 11, 18331863.Google Scholar
Olson, R. S., Cava, W. L., Mustahsan, Z., Varik, A., & Moore, J. H. (2018). Data-driven advice for applying machine learning to bioinformatics problems. Pacific Symposium on Biocomputing, 23, 192203. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/29218881%0Ahttp://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC5890912.Google ScholarPubMed
Ormel, J., Jeronimus, B. F., Kotov, R., Riese, H., Bos, E. H., Hankin, B., … Oldehinkel, A. J. (2013). Neuroticism and common mental disorders: Meaning and utility of a complex relationship. Clinical Psychology Review, 33(5), 686697. doi:10.1016/j.cpr.2013.04.003.CrossRefGoogle ScholarPubMed
Pantazatos, S. P., Talati, A., Schneier, F. R., & Hirsch, J. (2014). Reduced anterior temporal and hippocampal functional connectivity during face processing discriminates individuals with social anxiety disorder from healthy controls and panic disorder, and increases following treatment. Neuropsychopharmacology, 39(2), 425434. doi:10.1038/npp.2013.211.CrossRefGoogle ScholarPubMed
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., … Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 28252830. doi:10.1007/s13398-014-0173-7.2.Google Scholar
Penninx, B. W. J. H., Beekman, A. T. F., Smit, J. H., Zitman, F. G., Nolen, W. A., Spinhoven, P., … Van Dyck, R. (2008). The Netherlands Study of Depression and Anxiety (NESDA): Rationale, objectives and methods. International Journal of Methods in Psychiatric Research, 17(3), 121140. doi:10.1002/mpr.256.CrossRefGoogle ScholarPubMed
Perez-Cornago, A., Ramírez, M. J., Zulet, , & Martinez, J. A. (2014). Effect of dietary restriction on peripheral monoamines and anxiety symptoms in obese subjects with metabolic syndrome. Psychoneuroendocrinology, 47, 98–106. doi:10.1016/j.psyneuen.2014.05.003.
Pitsavos, C., Panagiotakos, D. B., Papageorgiou, C., Tsetsekou, E., Soldatos, C., & Stefanadis, C. (2006). Anxiety in relation to inflammation and coagulation markers, among healthy adults: The ATTICA study. Atherosclerosis, 185(2), 320–326. doi:10.1016/j.atherosclerosis.2005.06.001.
Randall, J. R., Sareen, J., Chateau, D., & Bolton, J. M. (2019). Predicting future suicide: Clinician opinion v. a standardized assessment tool. Suicide and Life-Threatening Behavior, 49(4), 941–951. doi:10.1111/sltb.12481.
Reece, A. G., & Danforth, C. M. (2017). Instagram photos reveal predictive markers of depression. EPJ Data Science, 6(1), 1–16. doi:10.1140/epjds/s13688-017-0110-z.
Rodriguez, B. F., Weisberg, R. B., Pagano, M. E., Bruce, S. E., Spencer, M. A., Culpepper, L., & Keller, M. B. (2006). Characteristics and predictors of full and partial recovery from generalized anxiety disorder in primary care patients. Journal of Nervous and Mental Disease, 194(2), 91–97. doi:10.1097/01.nmd.0000198140.02154.32.
Schiefelbein, V. L., & Susman, E. J. (2006). Cortisol levels and longitudinal cortisol change as predictors of anxiety in adolescents. Journal of Early Adolescence, 26(4), 397–413. doi:10.1177/0272431606291943.
Scholten, W. D., Batelaan, N. M., Penninx, B. W. J. H., Van Balkom, A. J. L. M., Smit, J. H., Schoevers, R. A., & Van Oppen, P. (2016). Diagnostic instability of recurrence and the impact on recurrence rates in depressive and anxiety disorders. Journal of Affective Disorders, 195, 185–190. doi:10.1016/j.jad.2016.02.025.
Scholten, W. D., Batelaan, N. M., van Balkom, A. J., Penninx, B. W., Smit, J. H., & Van Oppen, P. (2013). Recurrence of anxiety disorders and its predictors. Journal of Affective Disorders, 147(1–3), 180–185. doi:10.1016/j.jad.2012.10.031.
Spinhoven, P., Batelaan, N. M., Rhebergen, D., van Balkom, A. L., Schoevers, R., & Penninx, B. W. (2016). Prediction of 6-yr symptom course trajectories of anxiety disorders by diagnostic, clinical and psychological variables. Journal of Anxiety Disorders, 44, 92–101. doi:10.1016/j.janxdis.2016.10.011.
Steyerberg, E. W. (2009). Clinical prediction models: A practical approach to development, validation, and updating. New York, NY, US: Springer.
Strobl, C., Boulesteix, A.-L., Zeileis, A., & Hothorn, T. (2007). Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics, 8(1), 1–21. doi:10.1186/1471-2105-8-25.
van Beljouw, I. M., Verhaak, P. F., Cuijpers, P., van Marwijk, H. W., & Penninx, B. W. (2010). The course of untreated anxiety and depression, and determinants of poor one-year outcome: A one-year cohort study. BMC Psychiatry, 10, 86. doi:10.1186/1471-244X-10-86.
Verduijn, J., Verhoeven, J. E., Milaneschi, Y., Schoevers, R. A., van Hemert, A. M., Beekman, A. T. F., & Penninx, B. W. J. H. (2017). Reconsidering the prognosis of major depressive disorder across diagnostic boundaries: Full recovery is the exception rather than the rule. BMC Medicine, 15(1), 1–9. doi:10.1186/s12916-017-0972-8.
Vogelzangs, N., Beekman, A. T. F., De Jonge, P., & Penninx, B. W. J. H. (2013). Anxiety disorders and inflammation in a large adult cohort. Translational Psychiatry, 3(e249), 1–8. doi:10.1038/tp.2013.27.
Vollebergh, W. A. M., Iedema, J., Bijl, R. V., de Graaf, R., Smit, F., & Ormel, J. (2001). The structure and stability of common mental disorders. Archives of General Psychiatry, 58(6), 597. doi:10.1001/archpsyc.58.6.597.
Whitfield-Gabrieli, S., Ghosh, S. S., Nieto-Castanon, A., Saygin, Z., Doehrmann, O., Chai, X. J., … Gabrieli, J. D. E. (2016). Brain connectomics predict response to treatment in social anxiety disorder. Molecular Psychiatry, 21(5), 680–685. doi:10.1038/mp.2015.109.
Wittchen, H. (1994). Reliability and validity studies of the WHO-composite international diagnostic interview (CIDI): A critical review. Journal of Psychiatric Research, 28, 57–84.
Wittchen, H. U., Kessler, R. C., Pfister, H., & Lieb, M. (2000). Why do people with anxiety disorders become depressed? A prospective-longitudinal community study. Acta Psychiatrica Scandinavica, 102(Suppl 406), 14–23.
Woo, C.-W., Chang, L. J., Lindquist, M. A., & Wager, T. D. (2017). Building better biomarkers: Brain models in translational neuroimaging. Nature Neuroscience, 20(3), 365–377. doi:10.1038/nn.4478.
World Health Organization (1998). World health organization, composite international diagnostic interview (CIDI), core version 2.1. Geneva: World Health Organization.
Zhao, N., Zhang, Z., Wang, Y., Wang, J., Li, B., Zhu, T., & Xiang, Y. (2019). See your mental state from your walk: Recognizing anxiety and depression through Kinect-recorded gait data. PLoS ONE, 14(5), 1–13. doi:10.1371/journal.pone.0216591.
Zoccola, P. M., Dickerson, S. S., & Yim, I. S. (2011). Trait and state perseverative cognition and the cortisol awakening response. Psychoneuroendocrinology, 36(4), 592–595. doi:10.1016/j.psyneuen.2010.10.004.
Table 1. Included baseline predictor variables across the five predictor domains

Table 2. Baseline characteristics of anxiety disorder sample, group comparisons between patients who had no anxiety disorder (n = 484) at 2-year follow-up and patients who did (n = 403)

Table 3. Evaluation of the 2-year recovery from anxiety disorders classification [mean (s.d.)]

Fig. 1. Classification performance of random forest classifiers. Performance is quantified by area under the receiver operating characteristic curve (AUC) values calculated for each test set of all cross-validation iterations and is shown in box-and-whisker plots for all data domains. (a) Performance of the recovery from anxiety disorders prediction; (b) performance of the recovery from all common mental disorders prediction. Asterisks mark significant classification performance according to label-permutation tests (n = 1000) with Bonferroni correction for six tests. The dashed line indicates chance-level performance.
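
As an illustration of the significance testing named in the caption, the following is a minimal sketch of a label-permutation test on the AUC with a Bonferroni-adjusted threshold. It is not the authors' pipeline: it is pure Python, it shuffles labels against a fixed set of classifier scores rather than retraining a random forest inside cross-validation for every permutation, and the function names, the significance level of 0.05, and the toy data are all assumptions made for the example.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_permutations=1000, seed=0):
    """One-sided p-value: share of label shufflings whose AUC reaches the observed AUC."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between labels and scores
        if auc(shuffled, scores) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_permutations + 1)


def bonferroni_significant(p_value, n_tests=6, alpha=0.05):
    """Compare a p-value against the Bonferroni-adjusted threshold alpha / n_tests."""
    return p_value < alpha / n_tests


# Hypothetical toy data: 1 = recovered, 0 = not recovered, with made-up model scores.
toy_labels = [1] * 10 + [0] * 10
toy_scores = [0.9, 0.8, 0.85, 0.7, 0.75, 0.95, 0.6, 0.65, 0.88, 0.55,
              0.3, 0.2, 0.4, 0.1, 0.35, 0.45, 0.25, 0.15, 0.5, 0.05]
observed_auc, p = permutation_p_value(toy_labels, toy_scores)
print(observed_auc, p, bonferroni_significant(p))
```

With 1000 permutations the smallest attainable p-value is 1/1001, which is below the adjusted threshold of 0.05/6 for six tests, so this number of permutations is in principle sufficient to detect significance after correction.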

Table 4. Evaluation of the 2-year recovery from all common mental disorders classification [mean (s.d.)]

Supplementary material: File

Bokma et al. supplementary material