Introduction
Schizophrenia is a potentially severe psychiatric disorder, characterized by delusions, hallucinations, and disorganized thinking (Hu et al., Reference Hu, Zong, Mann, Zheng, Liao, Li, He, Chen and Tang2017), which affects 2–3% of the world's population (Rajji et al., Reference Rajji, Miranda and Mulsant2014; Nowak et al., Reference Nowak, Sabariego, Switaj and Anczewska2016). The aetiology and neuropathology of this complex disorder are not well understood, and an accurate diagnosis can be slow due to the lack of reliable biomarkers. When neuroimaging became widely available two decades ago, it was hoped that this would lead to the development of imaging-based biomarkers that could be used to inform diagnostic and prognostic assessment of individual patients. The development of such biomarkers, however, has proved elusive due to the presence of complex, distributed, and subtle alterations that vary from one individual to another depending on their unique clinical profile. Traditional analytical techniques, which provide average estimates at the group level, have proved inadequate for detecting such alterations and dealing with the high degree of clinical heterogeneity amongst patients. In order to address this challenge, in the past decade, the neuroimaging community has begun to make use of an alternative analytic approach known as machine learning (ML). The advantage of ML relative to traditional analytical techniques is two-fold: firstly, ML methods are multivariate and therefore take the inter-correlation between input variables (e.g. voxels) into account; secondly, ML methods allow inferences at the individual rather than group level, and therefore generate results with greater potential of being translated into clinical tests.
Several studies have applied ML to neuroimaging data acquired from people at various stages of psychiatric illness with varying degree of success (Orrù et al., Reference Orrù, Pettersson-Yeo, Marquand, Sartori and Mechelli2012; Pettersson-Yeo et al., Reference Pettersson-Yeo, Benetti, Marquand, Dell'acqua, Williams, Allen, Prata, McGuire and Mechelli2013; Kim et al., Reference Kim, Calhoun, Shim and Lee2016, ). The vast majority of studies have used whole-brain images as input for single-subject classification without performing any a priori feature extraction (Lueken et al., Reference Lueken, Hilbert, Wittchen, Reif and Hahn2015; Rehme et al., Reference Rehme, Volz, Feis, Bomilcar-Focke, Liebig, Eickhoff, Fink and Grefkes2015). However, the human brain is a highly interconnected network, and the emergence of psychiatric illness is thought to be underpinned by a disruption of normal functional integration amongst cortical and subcortical regions. Therefore, a number of studies have employed a second approach that involves estimating functional integration across the whole brain and using this as input for single-subject classification (Shen et al., Reference Shen, Wang, Liu and Hu2010; Iidaka, Reference Iidaka2015). In addition, following recent advances in graph-based theoretical analysis, it is now possible to estimate the topological properties of the human brain in health and disease (Bullmore and Sporns, Reference Bullmore and Sporns2009; Bullmore and Bassett, Reference Bullmore and Bassett2011). This has revealed that the human brain has a small-world organization (characterized by a high local specialization and a high global integration between the brain regions) (Salvador et al., Reference Salvador, Suckling, Coleman, Pickard, Menon and Bullmore2005; He et al., Reference He, Chen and Evans2007), and that these networks are anatomically and functionally disrupted in individuals with psychiatric diseases (Lynall et al., Reference Lynall, Bassett, Kerwin, McKenna, Kitzbichler, Muller and Bullmore2010; Pettersson-Yeo et al., Reference Pettersson-Yeo, Allen, Benetti, McGuire and Mechelli2011). Several recent studies, therefore, have employed a third approach that involves using graph-based analytic methods to estimate the topological properties of the brain in patients and then using this information as input for single-subject classification (Cheng et al., Reference Cheng, Newman, Goni, Kent, Howell, Bolbecker, Puce, O'Donnell and Hetrick2015; Khazaee et al., Reference Khazaee, Ebrahimzadeh and Babajani-Feremi2016).
Because none of the existing studies on schizophrenia has employed all three approaches, at present, it is unclear which type of alteration best captures the neural correlates of the disorder at the level of the individual patient. This information would be important both from a clinical translation and theoretical modelling perspective. From a clinical translation perspective, understanding which type of approach is most discriminant between patients and controls would help us develop more accurate ML algorithms. From a theoretical modelling perspective, such understanding would shed light on the neural correlates of schizophrenia and inform current neurobiological models of the illness.
The present article, therefore, aims to elucidate the neural correlates of schizophrenia by comparing the diagnostic accuracy of three distinct metrics of brain function: pre-processed whole-brain images, connectome-wide functional connectivity and graph-based metrics. In order to assess the reliability of the findings, we perform this comparison using three ML methods of varying degrees of complexity and abstraction: logistic regression (LR) (lower complexity), support vector machine (SVM) (medium complexity) and deep learning (DL) technology (higher complexity). We used resting-state functional magnetic resonance imaging (rs-fMRI) data acquired at five different sites from a total of 295 patients with schizophrenia and 452 healthy controls. Our main dependent variable of interest was the accuracy of classification for the comparison between patients and controls. In light of the current understanding of psychiatric disorders as abnormalities in connectome-wide functional connectivity (Rubinov and Bullmore, Reference Rubinov and Bullmore2013) and topological properties (Suo et al., Reference Suo, Lei, Li, Li, Dai, Wang, He, Zhu, Kemp and Gong2018), we hypothesized that (i) the use of measures of functional integration (i.e. connectome-wide functional connectivity and graph-based metrics) would allow diagnostic classification with higher level of accuracy than whole-brain images. In addition, given that whole-brain functional integration may provide a richer characterization of network-level functioning than topological properties, we hypothesized that (ii) the use of connectome-wide matrices would allow a higher level of accuracy than graph-based analytic metrics. Finally, based on a previous large-scale investigation (Sabuncu et al., Reference Sabuncu and Konukoglu2015), we hypothesized that (iii) these findings would be consistent across different ML methods including logistic regression, SVM and DL technology, thereby supporting the reliability of our conclusions.
Materials and methods
Participants
We used five datasets, each including subjects with a diagnosis of schizophrenia and healthy controls. For information on participants in each dataset, see online Supplemental Information.
The combination of the five datasets yielded 295 patients with schizophrenia and 452 healthy controls. The demographic and clinical characteristics of the participants are presented in Table 1.
a Data are presented as mean±standard deviation.
b Patients were diagnosed with established schizophrenia if duration of illness was more than 24 months.
c Data available for 49 of 56 patients.
d Data available for 28 of 32 patients.
e Data available for 50 of 68 patients.
f Data available for 88 of 90 patients.
g Data available for 50 of 56 patients.
h Data available for 24 of 32 patients.
SCZ, schizophrenia; CON, control; EST, established; FE, first episode; PANSS, Positive and Negative Syndrome Scale; SAPS, Scale for the Assessment of Positive Symptoms; SANS, Scale for the Assessment of Negative Symptoms; M, male; F, female; R, right; L, left; B, ambidextrous; An, antipsychotic medication; Dn, drug-naïve; NA, not available.
MRI acquisition
At each site, the rs-fMRI images were acquired by the Echo-Planar Imaging (EPI) sequence. For information on MRI acquisition for each dataset, see online Supplemental Information.
Data pre-processing
Image pre-processing was performed using SPM8 (http://www.fil.ion.ucl.ac.uk/spm). The first 10 volumes were discarded to minimize the impact of the instability in the initial MRI signal. The remaining images were corrected for intra-volume acquisition time delay and inter-volume geometric displacement of head motion. None of the subjects included in the present investigation showed excessive head motion during scanning (defined as translational movement >1.5 mm and/or rotation >1.5°). After these corrections, the images were spatially normalized to a 3 × 3 × 3 mm3 Montreal Neurological Institute (MNI) 152 template and then linearly detrended and temporally bandpass filtered (0.01–0.08 Hz) to remove low-frequency drift and high-frequency physiological noise. Finally, the global signal, the white matter signal, the cerebrospinal fluid (CSF) signal and the motion parameters were regressed out (Fox et al., Reference Fox, Zhang, Snyder and Raichle2009). The resulting images were then used as input features for the subsequent ML analyses, in order to examine the diagnostic value of pre-processed whole-brain images.
Network construction
The graph theoretical network construction was performed using GRETNA software (http://www.nitrc.org/projects/gretna/) (Wang et al., Reference Wang, Wang, Xia, Liao, Evans and He2015). First, to define the brain nodes, the whole brain was divided into 90 cortical and subcortical regions of interest – each representing a network node – using the automated anatomical labelling atlas. Next, to define the edges of the network, we extracted the mean time series of each region, and calculated Pearson's correlations of the mean time series between all pairs of nodes. This process resulted in a 90 × 90 weighted correlation matrix for each subject. This correlation matrix was then used as input feature for the subsequent ML analyses, in order to examine the diagnostic value of connectome-wide functional connectivity.
Network analysis and graph-based metrics extraction
To address the problem that networks of different subjects differed in the number of edges (Wen et al., Reference Wen, Zhu, He, Kochan, Reppermund, Slavin, Brodaty, Crawford, Xia and Sachdev2011), we applied a range of sparsity thresholds, S, to the correlation matrices consistent with previous studies (Lei et al., Reference Lei, Li, Li, Chen, Huang, Lui, Li, Bi and Gong2015; Suo et al., Reference Suo, Lei, Li, Chen, Li, Li, Huang, Lui, Li, Kemp and Gong2015). Here S was defined as the ratio of the number of existing edges divided by the maximum possible number of edges in a network. This thresholding strategy results in each graph having the same number of edges, thereby enabling the investigation of between group differences in relative network organization (Zhang et al., Reference Zhang, Wang, Wu, Kuang, Huang, He and Gong2011). Here the use of a range of sparsity thresholds S generated a threshold range of 0.10 < S < 0.34 with an interval of 0.01.
For each sparsity level, we calculated both global and nodal network metrics. The global metrics were of two kinds: (i) small-world parameters [for definitions see (Watts and Strogatz, Reference Watts and Strogatz1998)] including the clustering coefficient C p, characteristic path length L p [calculated as the harmonic mean distance between all possible pairs of regions to address the disconnected graphs dilemma (Newman, Reference Newman2003)], normalized clustering coefficient γ, normalized characteristic path length λ, and small-worldness σ; and (ii) network efficiency parameters [for definitions see (Latora and Marchiori, Reference Latora and Marchiori2001)] including the local efficiency E loc and global efficiency E glob. The nodal centrality metrics were the nodal degree, nodal efficiency, and nodal betweenness. Finally, for each network metric, we calculated the area under the curve (AUC) over the sparsity range from S 1 to S n with an interval of ΔS, where S 1 = 0.10, S n = 0.34 and ΔS = 0.01. The AUC provides a summarized scalar for the topological characterization of brain networks that is independent of a single threshold selection and sensitive to topological alterations in brain disorders (Zhang et al., Reference Zhang, Wang, Wu, Kuang, Huang, He and Gong2011). The AUC values for the above global and nodal metrics were then used as input features for the subsequent ML analyses, in order to examine the diagnostic value of graph-based metrics.
Controlling for age and gender effects
For each measure of brain function (i.e. whole-brain images, connectome-wide matrices or graph-based metrics) in each dataset, we build a regression model that represented how the measure varied with age and gender in the control sample, and then subtracted age- and gender-related variance from the actual measures. This was done using the Gaussian process regression method and kernel function that were used in a previous investigation (Kostro et al., Reference Kostro, Abdulkadir, Durr, Roos, Leavitt, Johnson, Cash, Tabrizi, Scahill, Ronneberger, Kloppel and Track2014), with the regression model based on control subjects only. This ensured that the residuals represented variation due to disease-related effects only, and could not be explained by age or gender effects. These residuals were then used as features for the ML analyses.
ML models
In the present study, we used three ML methods – LR, SVM and DL to perform single-subject classification. These methods were chosen in light of their widespread use amongst the neuroimaging community and their varying degrees of complexity and abstraction. In particular, LR is considered to be the ‘simplest’ ML method; SVM is associated with medium levels of complexity and abstraction, and is the most widely used ML method in psychiatric neuroimaging; and DL is thought to have the greatest potential of detecting hidden patterns in the data, due to its higher levels of complexity and abstraction. These methods are described in detail in online Supplemental Information.
Measuring the performance of ML models
In order to determine which type of input data would allow the highest accuracy of classification, we applied the three ML approaches (i.e. LR, SVM and DL) to the whole-brain normalized images (4D volume with time as the fourth dimension), the connectome-wide matrices (90 × 90 Pearson correlation matrix) and the graph-based analytic metrics (i.e. seven global metrics C p, L pγ, h λ, σ; E loc, E glob and 270 nodal metrics including nodal degree, nodal efficiency, and nodal betweenness × 90 brain regions). For each type of data and each type of ML method, a stratified 5-folds cross-validation was performed to measure the mean balanced accuracy of single-subject classification. This involved dividing the entire dataset into five folds that preserved the relative proportion of the two classes, and then using four folds as training set and the remaining fold as test set. To estimate the significance for each ML model and data features, we performed a nonparametric permutation test to calculate a p value for the balanced accuracy (Golland and Fischl, Reference Golland and Fischl2003). This involved repeating the classification procedure 1000 times with different random permutations of the group labels. We then counted the number of times the balanced accuracy was higher for the permuted labels than the real labels, and divided this number by 1000 to calculate a p value. Figure 1 shows an overview of the employed classification approach showing the main steps in the pipeline.
Results
Classification performance
The results of the single-subject classification of patients and healthy controls, including accuracies, sensitivities, specificities, and p values, are reported in Table 2. It can be seen that when using whole-brain normalized images data, the mean balanced accuracy of classification was at chance level; in particular, the average balanced accuracy of classification across the five datasets was about 53% for all three ML methods.
a Sensitivity and specificity were computed considering the patient group as the positive class.
b Statistical significance was estimated using the permutation method (1000 permutations).
SVM, support vector machine; LR, logistic regression; DL, deep learning
When using graph-based analytic metrics, the mean balanced accuracy of classification across the five datasets was around 69% (68.25%, 72.00%, and 68.61% for LR SVM, and DL, respectively).
In contrast, using connectome-wide matrices, it was possible to discriminate between patients and controls at single-subject level with above chance mean balanced accuracy; in particular, the average balanced accuracy of classification across the five datasets was 80.97%, 81.74%, and 81.03% for LR SVM, and DL, respectively.
This pattern of results suggests that (i) connectome-wide matrices provide superior balanced accuracy of classification between patients and controls relative to graph-based analytic metrics; and (ii) the superiority of connectome-wide matrices is expressed across a range of ML method with varying degrees of complexity and abstraction.
Which regions provided the greatest contribution to single-subject classification?
Having identified connectome-wide functional connectivity as the most powerful measure for capturing neurophysiological alternations in schizophrenia, we proceeded to examine the regions contributing to its superior performance. In the LR and SVM models, we computed the mean absolute value of the weights of the model across the different folds of the cross-validation, while in the DL models we used the Kim's approach (Kim et al., Reference Kim, Calhoun, Shim and Lee2016) where they used a linear combination of the weights across layers. For each region, the mean value of the weights for its functional connectivity with the remaining 89 regions was calculated. The 10 brain regions with the highest mean values, computed by averaging the weights across the five datasets, are reported in Table 3 and represented in Fig. 2. It can be seen that the inferior temporal gyrus, the temporal pole, the precentral gyrus and the thalamus featured consistently across the different ML methods. In contrast, other regions (e.g. putamen) were detected in some cases (e.g. DL) but not others (e.g. LR and SVM).
All the brain regions are from AAL (automated anatomical labelling).
R, right; L, left; SVM, support vector machine; LR, logistic regression; DL, deep learning.
Exploring cross-site generalizability using leave-one-site-out cross-validation
In the present investigation, the cross-validation was carried out within each site and the average accuracy corresponded to the mean accuracy across all five sites. The reason for this approach is that, since different sites employed different recruitment criteria, different scanners and different scanning parameters, we expected site-related differences to be significant and larger than the differences between patients and controls. For completeness, however, we also used leave-one-site-out cross-validation, which involved training an algorithm based on four datasets and then testing it based on the remaining dataset; as expected, the accuracy using this alternative approach was lower than the accuracies for each individual site (50% for whole-brain images, 62.54–68.1% for functional connectivity, and 55–64.85% for graph-based metrics).
Discussion
In the present article, we aimed to elucidate the neural correlates of schizophrenia by examining the relative diagnostic value of pre-processed whole-brain images, connectome-wide functional connectivity and graph-based metrics. Consistent with our first hypothesis, measures of functional integration (i.e. connectome-wide functional connectivity and graph-based metrics) allowed diagnostic classification with higher levels of accuracy than pre-processed whole-brain images. Critically, whole-brain images differ from connectome-wide functional connectivity and graph-based metrics in that they do not contain temporal information (i.e. functional connectivity). This pattern of results, therefore, provides support for the emerging notion that functional connectivity based on resting-state functional neuroimaging data, can be a powerful tool for characterizing brain disorders at the level of the individual (Camchong et al., Reference Camchong, MacDonald, Bell, Mueller and Lim2011; Lei et al., Reference Lei, Li, Li, Chen, Huang, Lui, Li, Bi and Gong2015; Suo et al., Reference Suo, Lei, Li, Chen, Li, Li, Huang, Lui, Li, Kemp and Gong2015). For instance, Wee et al. (Reference Wee, Yap, Denny, Browndyke, Potter, Welsh-Bohmer, Wang and Shen2012) used this approach to classify patients with amnestic mild cognitive impairment and healthy controls, achieving an accuracy of 86%. More recently, Iidaka and colleagues demonstrated similar results in autism spectrum disorder, by developing an algorithm that could classify patients and controls with approximately 90% accuracy (Iidaka, Reference Iidaka2015). With respect to schizophrenia, Skatun et al. (Reference Skatun, Kaufmann, Doan, Alnaes, Cordova-Palomera, Jonsson, Fatouros-Bergman, Flyckt, Melle, Andreassen, Agartz and Westlye2017) were able to discriminate between 182 patients with a schizophrenia spectrum diagnosis and 348 healthy controls with an accuracy of up to 80%, by using a whole-brain data-driven definition of network nodes and regularized partial correlations which revealed differences in functional connectivity within frontal, sensory, and subcortical networks between groups. In addition, Cheng et al. (Reference Cheng, Newman, Goni, Kent, Howell, Bolbecker, Puce, O'Donnell and Hetrick2015), using measures of betweenness centrality extracted from resting-state functional MRI data, were able to classify 19 schizophrenic patients and 29 non-psychiatric controls with an accuracy of around 80%. Here we expand the existing literature by demonstrating that, amongst measures of functional integration, connectome-wide functional connectivity has greater diagnostic value than graph-based metrics, consistent with our second hypothesis. Nevertheless, the average diagnostic accuracy of connectome-wide functional connectivity was not very high, at around 81%. A possible explanation is that this might be due to the limited sample size of our datasets compared to some of the previous studies; e.g. Iidaka (Reference Iidaka2015) used rs-fMRI data to compare 312 patients with autism spectrum disorder and 328 controls with typical development. However, previous studies using a similar or smaller number of participants have reported comparable or even higher accuracies, suggesting that sample size may not be the only explanation. For example Cheng and colleagues used a total sample of 48 (19 patients, 29 controls) and reported accuracies up to 80% (Cheng et al., Reference Cheng, Newman, Goni, Kent, Howell, Bolbecker, Puce, O'Donnell and Hetrick2015; Khazaee et al., Reference Khazaee, Ebrahimzadeh and Babajani-Feremi2015); whereas Khazaee used 40 subjects (20 patients, 20 controls) and reported accuracies up to 100% (Cheng et al., Reference Cheng, Newman, Goni, Kent, Howell, Bolbecker, Puce, O'Donnell and Hetrick2015; Khazaee et al., Reference Khazaee, Ebrahimzadeh and Babajani-Feremi2015). An alternative explanation is that our average diagnostic accuracy of about 81% is an accurate reflection of the limited diagnostic value of connectome-wide functional connectivity when patients with a diverse range of symptoms and different durations of illness are combined into a single diagnostic group. A further source of heterogeneity is the type, dose, and duration of antipsychotic medications in four of the five datasets. We suggest that future studies might achieve higher diagnostic and prognostic accuracies by focusing on sub-groups of patients with similar clinical presentations; this, however, would require larger sample sizes than those used in the present investigation.
Graph-based theoretical analysis is thought to provide a powerful framework for characterizing topological properties of brain networks (Bullmore and Sporns, Reference Bullmore and Sporns2009; Biswal et al., Reference Biswal, Mennes, Zuo, Gohel, Kelly, Smith, Beckmann, Adelstein, Buckner, Colcombe, Dogonowski, Ernst, Fair, Hampson, Hoptman, Hyde, Kiviniemi, Kotter, Li, Lin, Lowe, Mackay, Madden, Madsen, Margulies, Mayberg, McMahon, Monk, Mostofsky, Nagel, Pekar, Peltier, Petersen, Riedl, Rombouts, Rypma, Schlaggar, Schmidt, Seidler, Siegle, Sorg, Teng, Veijola, Villringer, Walter, Wang, Weng, Whitfield-Gabrieli, Williamson, Windischberger, Zang, Zhang, Castellanos and Milham2010). Because many of these topological properties, including small-worldness, efficiency and nodal degree, have been shown to be altered in patients with schizophrenia relative to healthy controls (Fornito et al., Reference Fornito, Zalesky, Pantelis and Bullmore2012; Drakesmith et al., Reference Drakesmith, Caeyenberghs, Dutt, Zammit, Evans, Reichenberg, Lewis, David and Jones2015), one might expect this information to have high diagnostic value at the level of the individual. However, in the present investigation, the use of graph-based metrics led to average accuracies that were just about higher than chance level (69%). This suggests that, while the extraction of graph-based analytic metrics from whole-brain time series reveals some important functional features at the group level, important information about network-level functioning at the individual level is lost during the computing. An alternative explanation is that our methodological approach, in which all graphic metrics were considered as simple vectors and were treated equally, was not optimal for the investigation of the diagnostic value of this specific feature. Future studies could examine the impact of using an alternative methodological approach, in which the different graphic metrics are associated with different a priori weights; here the main challenge would be to decide which a priori weights should be assigned to which graphic metrics which is unclear from the existing literature.
Consistent with our third hypothesis, our findings were consistent across different ML algorithms associated with varying degree of complexity and abstraction, including logistic regression, SVM and deep neural networks. This replicates a previous large-scale investigation (Sabuncu et al., Reference Sabuncu and Konukoglu2015) reporting that the choice of measurement type used for classification has a much larger effect on the final accuracy than the choice of the ML algorithm. Previous studies have suggested that DL – a type of ML capable of high orders of complexity and abstraction (LeCun et al., Reference LeCun, Bengio and Hinton2015) which has already produced several promising results in medical image analysis (Kim et al., Reference Kim, Calhoun, Shim and Lee2016; Havaei et al., Reference Havaei, Davy, Warde-Farley, Biard, Courville, Bengio, Pal, Jodoin and Larochelle2017) – may yield higher classifier accuracy than SVM (Pinaya et al., Reference Pinaya, Gadelha, Doyle, Noto, Zugman, Cordeiro, Jackowski, Bressan and Sato2016; Vieira et al., Reference Vieira, Pinaya and Mechelli2017). However, in the present study, DL did not perform better than shallow ML methods such as SVM or even LR despite significant differences in computing time and required computational resources. This might be explained by the relatively small sample size in each site, as DL models involve the estimation of a higher number of parameters that in turn require the use of a larger sample size.
Taken collectively, the results of the present investigation are consistent with the so-called ‘dysconnectivity hypothesis’ of schizophrenia, which attempts to elucidate the link between the symptoms of the illness with the underlying molecular and neuronal pathophysiology. Specifically, this hypothesis suggests that schizophrenia is best understood in terms of system-level aberrant neuromodulation of synaptic efficacy that mediates the influence of intrinsic and extrinsic connectivity (Friston et al., Reference Friston, Brown, Siemerkus and Stephan2016). This aberrant neuromodulation is not equally distributed across the brain but is thought to affect certain neural circuitries more than others (Pettersson-Yeo et al., Reference Pettersson-Yeo, Allen, Benetti, McGuire and Mechelli2011). Here we found that the thalamus and temporal regions were amongst the areas providing the greatest contribution to classification according to all three ML methods. Alterations in thalamic functional connectivity are well-known features of psychotic disorders (Woodward and Heckers, Reference Woodward and Heckers2016), and include both reduced prefrontal-thalamic connectivity and increased sensorimotor-thalamic (Woodward et al., Reference Woodward, Karbasforoushan and Heckers2012). These alterations are also evident in individuals at clinical high risk for schizophrenia, especially those who later go on to convert to psychosis (Anticevic et al., Reference Anticevic, Haut, Murray, Repovs, Yang, Diehl, McEwen, Bearden, Addington, Goodyear, Cadenhead, Mirzakhanian, Cornblatt, Olvet, Mathalon, McGlashan, Perkins, Belger, Seidman, Tsuang, van Erp, Walker, Hamann, Woods, Qiu and Cannon2015), suggesting that they may represent a marker of future risk. Alterations in temporal connectivity have also been reported in several previous studies of people with psychosis as well as their unaffected relatives (Xiao et al., Reference Xiao, Wang, Liu, Meng, He and Luo2017). The thalamus and temporal regions have also been reported to show abnormal (i.e. increased) functional connectivity in patients with schizophrenia (Cetin et al., Reference Cetin, Christensen, Abbott, Stephen, Mayer, Canive, Bustillo, Pearlson and Calhoun2014). Our investigation extends these findings, which were based on group-level statistics, by suggesting that thalamic and temporal functional dysconnectivity is key for differentiating patients and controls at the level of the individual patients.
The results of the present study are unlikely to be explained by the differing effects of age or gender on brain function in patients and controls, as variation due to these factors was removed before estimating disease-related effects. Nevertheless, the present study has a number of limitations. First, our data were acquired at five different sites using different scanners and acquisition parameters; on the other hand, the use of independent datasets allowed us to demonstrate the replicability of our findings. Second, the available clinical information varied between the different datasets due to the use of different research protocols between sites. This means it was not possible to examine the relationship between clinically relevant factors and neural alterations across the five datasets. Third, in the present study, we did not use feature selection which has been shown to improve performance in previous studies (Mwangi et al., Reference Mwangi, Tian and Soares2014); future studies could try to evaluate the impact of alternative feature selection strategies on the results. Fourth, the graph theoretical analysis was implemented using a widely popular approach based on Pearson's correlations. However, there are alternative approaches, such as partial correlation matrices and binary topology metrics, which could be considered in future studies. Fifth, the development of diagnostic biomarkers for schizophrenia has limited clinical utility in real-words clinical practice (Orrù et al., Reference Orrù, Pettersson-Yeo, Marquand, Sartori and Mechelli2012). Of much greater clinical utility would be the development of biomarkers for predicting the onset of the illness or response to treatment; this, however, will require the availability of follow-up clinical data which were not available in our groups.
In conclusion, despite the above limitations, the present study demonstrates that connectome-wide functional connectivity allows the identification of individual patients with schizophrenia with greater accuracy than pre-processed whole-brain images and graph-based metrics. Functional dysconnectivity of thalamic and temporal regions was key for differentiating patients and controls at the level of the individual patients. This pattern of results is consistent with the dysconnectivity hypothesis of schizophrenia, which states that the neural basis of the disorder is best understood in terms of system-level functional connectivity alterations. Future studies could investigate the extent to which this finding is specific to schizophrenia or a trans-diagnostic feature of psychiatric disorders.
Finally, it is important to acknowledge that, at present, neuroimaging is still far from becoming a useful tool in the day-to-day clinical practice of clinical psychiatry. One of the key challenges is the poor generalizability of the findings across different datasets. For example, when using leave-one-site-out cross-validation, we found poor generalizability across the five sites, possibly due to the use of different recruitment criteria, scanners and scanning parameters. Nevertheless, the present study enhances the current understanding of network-level abnormalities in psychosis, which in turn could inform the development of diagnostic and prognostic neuroimaging-based markers in the future.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291719001934
Acknowledgements
The authors thank all the participants for their cooperation. They assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. This work was supported by a grant from the European Commission (PSYSCAN – Translating neuroimaging findings from research into clinical practice; ID: 603196); a Wellcome Trust's Innovator Award (208519/Z/17/Z) to Andrea Mechelli; Funds for International Cooperation and Exchange of the National Natural Science Foundation of China (Grant No. 81220108013) jointly awarded to Qiyong Gong and Andrea Mechelli; a Newton International Fellowship from the Royal Society to Dr Du Lei (ID: NF151455); and an European Research Council Award (REA-677467) to Gary Donohoe. Collection of dataset 4 was funded by Science Foundation Ireland (SFI 12/1365). Collection of dataset 5 was funded by the National Natural Science Foundation of China (Grant Nos. 81761128023, 81501452, 81227002, and 81621003).
Conflict of interest
None.