Introduction
It is well recognized that sleep dysfunction is difficult to disentangle from disruption of circadian rhythms. For example, a recent review of experimental approaches designed to demonstrate independent sleep and circadian rhythm mechanisms concludes that the conceptual framework of independent mechanisms is simplistic and that sleep and circadian rhythms jointly serve the homeostasis of essential physiological variables (Franken & Dijk, 2024). This conclusion is supported by a recent analysis of gene expression data (Jan et al. 2024). However, knowing that sleep disruption has a negative impact on depression through disrupting circadian physiology still leaves open the question of whether circadian disruption could be a causal driver of depression, or of particular depressive subtypes, in some people. In addition, circadian phase can vary among individuals, which can lead to behavioral differences known as chronotypes that can also influence depression and mental health (Jones et al. 2019). Traditionally, circadian phase is measured by the gold-standard tool, Dim Light Melatonin Onset, which is defined as the start of melatonin production in the evening under dim light conditions. However, very frequent collection of blood samples over several hours can be inconvenient for participants and is labor- and cost-intensive, and hence leads to studies of small sample size. Moreover, such studies may be unethical for those with major depression.
Circadian rhythms are genetically governed by “clock genes,” which regulate rhythmic changes throughout the whole body (Piggins, 2002). In mammals, the core clock genes are expressed in all cells and circadian oscillations have been identified in different tissue and cell types (Takahashi, 2017; Zhang et al. 2014). The primary transcriptional–translational negative feedback loop (TTFL) involves the CLOCK and BMAL1 (also named ARNTL) genes. Their products function as a heterodimeric transcriptional activator that binds to regulatory elements containing E-box motifs of other clock genes such as PER (Period) and CRY (Cryptochrome). These proteins then translocate into the nucleus and repress their own transcription by interacting with the CLOCK-BMAL1 heterodimer. With the decline of the protein level of PER and CRY, the repression is relieved and a new rhythmic cycle starts (Takahashi, 2017). The TTFL results in thousands of clock-controlled genes showing a circadian pattern in expression level (Takahashi, 2017; Zhang et al. 2014). At the organismal level, the circadian system functions as a hierarchical network with the “central clock” located in the suprachiasmatic nucleus (SCN) of the hypothalamus, which synchronizes clocks present in peripheral tissues. The tissue- or cell-specific internal clock can be shifted from the central clock but coordinates with it (Menet & Hardin, 2014; Yeung & Naef, 2018). Hence, clock-controlled genes show rhythmic patterns of expression in many tissues, but the times of maximum and minimum expression can vary between tissues (Zhang et al. 2014; Talamanca et al. 2023; Mure et al. 2018).
Experimental studies have demonstrated that some genes have robust circadian patterns at cellular and tissue levels (Zhang et al. 2014; Panda et al. 2002; Ruben et al. 2018; Noya et al. 2019; Weger et al. 2021) regardless of disruption to sleep, while the expression of other rhythmic genes can be disrupted when sleep desynchrony regimes are enforced. Specifically, in a study of insufficient sleep (26 participants studied in both unrestricted and restricted sleep conditions), the number of genes with a rhythmic expression profile after 7 nights of only 5 hours of sleep was reduced from 1,855 (8.6%) to 1,481 (6.9%) (Möller-Levet et al. 2013). In a forced multi-day desynchrony study, 22 volunteers were scheduled to a 28-hour sleep–wake schedule with associated fasting–feeding and dark–dim light cycles (Archer et al. 2014). The study showed that delaying sleep by 4 hours for 3 consecutive days led to a six-fold reduction of rhythmic transcripts in the human blood transcriptome (from 6.4% to just 1%), whereas the centrally driven circadian rhythm of melatonin was unaffected. The genes (N = 39 transcripts) that remained rhythmic after sleep desynchrony are interpreted as those associated with signals from the SCN, and represent cellular, metabolic, and homeostatic blood-specific processes.
In contrast, the transcripts driven by sleep alone (N = 234) or by both circadian rhythmicity and the sleep–wake cycle (N = 286) were linked with the regulation of transcription and translation which, in turn, provide powerful reinforcement of rhythmicity in peripheral tissue. In a 90-day constant bed rest protocol (20 male participants, 2 weeks baseline, 60-day bed rest, 2 weeks recovery), 91% of the transcriptome was shown to have changed compared to the baseline state, and 76% of the transcriptome was still affected after 10 days of recovery; the most impacted transcripts were associated with mRNA translation and immune function (Archer et al. 2024). Together, these results imply that decoupling of the sleep–wake cycle from circadian rhythmicity (as occurs in jetlag, shift work and likely mood disorders) results in a profound disruption of temporal organization at the level of the transcriptome.
While the SCN is the central “pacemaker,” the hypothalamus-pituitary-adrenal (HPA) axis plays a key role in controlling circadian dynamics in peripheral tissues (Li et al. 2024). The HPA axis regulates many other physiological processes (e.g., immune response, cell cycle, energy metabolism), including the cortisol and stress responses (Belvederi Murri et al. 2014). The extent to which peripheral and central rhythmicity are decoupled in depression associated with sleep and circadian rhythm disruption (SCRD) is unclear. An analysis of time-of-death gene expression data from 6 brain regions of patients with major depressive disorder (MDD; N = 34) and controls (N = 55) reported that gene expression rhythmicity was attenuated in MDD patients in terms of peak timing (Li et al. 2013). Emerging computational approaches applied to omics data that allow estimation of the internal body clock regardless of sample collection time are logistically attractive and could facilitate collection of larger clinical cohorts to study circadian changes under different conditions. Hence, we now review novel computational methods applied to omics data that have clear potential for advancing mechanistic understanding in circadian-associated mood disorders.
Characterizing internal circadian time using omics data and computational tools
Many computational tools have been developed to infer circadian phase from gene expression data (microarray/bulk RNA-seq). These methods provide circadian pattern metrics (i.e., period, amplitude, and phase) quantified using statistical approaches. Period (tau) refers to the time between successive peaks in expression, amplitude is the difference between the highest and the lowest point over a cycle (sometimes divided by 2), and phase shift refers to a horizontal shift in the peaks of expression relative to a reference, with the period between peaks unchanged. Since many genes show rhythmic patterns (but with peaks at different times), rhythmic parameters can be inferred through their joint analysis. Generating predictors that can be applied to single-timepoint data is an active area of research. Such predictors are trained either on data sets that have longitudinal sampling within individuals (not possible for many tissues in humans) or on large cross-sectional data sets where tissue from different individuals is sampled across the 24-hour period. Training methods are classified as supervised or unsupervised.
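To make the three metrics concrete, the following is a minimal sketch of a single-component cosinor fit, a standard regression approach that is not attributed here to any of the tools reviewed. It assumes a fixed 24-hour period and fits y = mesor + a·cos(ωt) + b·sin(ωt) by least squares; the function name `fit_cosinor` and the synthetic data are illustrative assumptions. Note that the cosinor amplitude corresponds to the "divided by 2" convention mentioned above (half the peak-to-trough difference).

```python
import numpy as np

def fit_cosinor(times_h, expression, period_h=24.0):
    """Least-squares fit of y = mesor + a*cos(w*t) + b*sin(w*t) with fixed period."""
    t = np.asarray(times_h, dtype=float)
    y = np.asarray(expression, dtype=float)
    omega = 2.0 * np.pi / period_h
    # Design matrix: intercept, cosine and sine terms
    design = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    (mesor, a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    # Amplitude is half the peak-to-trough difference of the fitted curve
    amplitude = float(np.hypot(a, b))
    # Phase expressed as the clock time (hours) at which the fitted curve peaks
    peak_time_h = float((np.arctan2(b, a) / omega) % period_h)
    return float(mesor), amplitude, peak_time_h

# Synthetic (made-up) gene sampled every 4 h for 48 h, peaking at hour 8
times = np.arange(0, 48, 4)
expr = 10.0 + 3.0 * np.cos(2.0 * np.pi * (times - 8.0) / 24.0)
mesor, amplitude, peak_time = fit_cosinor(times, expr)
print(mesor, amplitude, peak_time)
```

Fixing the period at 24 hours keeps the model linear in its parameters. Estimating the period as well would require a nonlinear fit or a scan over candidate periods.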
Supervised learning methods (e.g., molecular time-table (Ueda et al. 2004), ZeitZeiger (Hughey et al. 2016), BIO_CLOCK (Agostinelli et al. 2016)) use ground-truth data sets with known time of sampling to generate prediction models, which are tested in other independent ground-truth data sets and then applied to data sets where sampling time is unknown. These methods identify a core set of “time-indicating genes” (e.g., 13 genes in ZeitZeiger) and train statistical models on the expression of these core genes. A limitation of supervised learning methods is the need for ground-truth samples, since few data sets have samples documented with a collection “timestamp.” Unsupervised machine learning methods (e.g., CYCLOPS (Anafi et al. 2017), CHIRAL (Talamanca et al. 2023)) use underlying modeling and joint analysis across multiple genes to infer circadian phase. That is, because many genes are rhythmic in their expression in a coordinated manner, the expression levels of many genes together allow circadian phase to be inferred more accurately than from the expression of a single gene. For example, CHIRAL applies a new mathematical method for inferring circadian time to the Genotype-Tissue Expression (GTEx) data, which allows a comprehensive analysis of rhythmic patterns across the whole organism (since GTEx data comprise gene expression data from multiple organs from the same post-mortem donors). Briefly, the algorithm first assigns a tissue internal phase (TIP) to each sample in each tissue type using a selected set of seed genes. Donor internal phase is then estimated from the TIPs, under the assumption that each TIP is determined by the donor and the tissue. With this approach, differences across donors and tissues can be captured.
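The joint-analysis idea can be illustrated with a toy example in the spirit of the molecular time-table approach. It is a sketch under stated assumptions, not the published implementation of any method named above. It assumes a hypothetical panel of time-indicating genes whose peak times are already known from ground-truth data, and that standardized expression of each gene follows a cosine around its peak. The sampling time of a single sample is then taken as the candidate time whose expected profile correlates best with the observed one. The function name, gene panel and expression values are all invented for illustration.

```python
import numpy as np

def estimate_sample_time(expression, peak_times_h, period_h=24.0, step_h=0.1):
    """Grid-search the time whose cosine template best matches one sample."""
    x = np.asarray(expression, dtype=float)
    peaks = np.asarray(peak_times_h, dtype=float)
    candidates = np.arange(0.0, period_h, step_h)
    # templates[i, g]: expected standardized expression of gene g at candidate time i
    templates = np.cos(2.0 * np.pi * (candidates[:, None] - peaks[None, :]) / period_h)
    # Pearson correlation between the sample and each candidate template
    x_c = x - x.mean()
    t_c = templates - templates.mean(axis=1, keepdims=True)
    corr = (t_c @ x_c) / (np.linalg.norm(t_c, axis=1) * np.linalg.norm(x_c))
    return float(candidates[np.argmax(corr)])

# Hypothetical panel of six genes peaking every 4 h; sample truly taken at 14.5 h
peak_times = np.array([0.0, 4.0, 8.0, 12.0, 16.0, 20.0])
true_time = 14.5
sample = np.cos(2.0 * np.pi * (true_time - peak_times) / 24.0)
estimated_time = estimate_sample_time(sample, peak_times)
print(estimated_time)
```

With noisy real data, the estimate improves as more genes with well-spread peak times are included, which is the rationale for analyzing many rhythmic genes jointly rather than any single gene.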
Technological advances that allow generation of large data sets that can be interrogated for circadian phase have been accompanied by an explosion in computational methods (Table 1). With adequate variation in sampling times across individuals, methods such as CYCLOPS and CHIRAL can achieve high accuracy, with a mean absolute error of 1–2 h (Talamanca et al. 2023). When new methods are presented, comparisons with standard methods are provided (and show improvements in accuracy of assignments in ground-truth data sets or in computational efficiency), but independent systematic comparisons of methods across multiple data sets are now needed. Hughes et al. (2017) provide “Guidelines for genome-scale analysis of biological rhythms” and a web-based application (CircaInSilico) to generate ground-truth synthetic genome biology data to facilitate benchmarking of methods.
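When benchmarking predicted against recorded sampling times, the error should respect the circular nature of clock time: a prediction of 23.5 h for a sample taken at 0.5 h is off by 1 h, not 23 h. The sketch below shows one way a mean absolute error in hours could be computed with wrap-around. It is an illustrative assumption about how such a metric may be defined, not a description of how the cited studies computed theirs, and the example values are made up.

```python
import numpy as np

def circular_mae_hours(predicted_h, true_h, period_h=24.0):
    """Mean absolute error between clock times, taking the shorter way round."""
    diff = (np.asarray(predicted_h, dtype=float) - np.asarray(true_h, dtype=float)) % period_h
    # A raw difference of 23 h on a 24-h clock is really a 1-h error
    wrapped = np.minimum(diff, period_h - diff)
    return float(wrapped.mean())

# Made-up predictions and recorded sampling times (hours)
predicted = [23.5, 1.0, 12.0]
recorded = [0.5, 23.0, 13.5]
mae = circular_mae_hours(predicted, recorded)
print(mae)  # individual errors are 1.0, 2.0 and 1.5 h
```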
Both supervised and unsupervised methods need samples collected across the 24-hour period, with better prediction accuracy achieved if each time point has more samples. While such time-series sample collections are achieved in sleep laboratory studies or in post-mortem cohorts (i.e., across samples the full 24-hour period is represented), samples collected in clinics or volunteer studies are likely to represent only a fraction of the 24-hour period. A new method has been developed to predict circadian time from bulk RNA-seq without needing to collect samples in a complete time series.
Beyond using bulk tissue gene expression data to study circadian biology, chronobiology at the single-cell level will give much deeper insights into potential differences in circadian pattern across different cell types. Many computational tools have been developed to infer cell-cycle state, which is a periodic biological process (Leng et al. 2015; Liu et al. 2017; Riba et al. 2022). However, circadian rhythms and the cell cycle are independent processes. Characterization of circadian rhythm using single-cell data has rarely been documented but is likely to become an area of active research as more single-cell gene expression data sets are generated. One published method applies an unsupervised Bayesian algorithm to scRNA-seq data to estimate circadian phase, using prior knowledge of core clock genes to initialize the model (Auerbach et al. 2022). The model uses the posterior distribution of cell circadian phase to identify new rhythmic genes. This offers an opportunity to identify cell-type-specific circadian patterns and, ultimately, cell therapies for chronobiological disorders. Disruption of circadian clocks in tumor vs non-tumor tissue in adenocarcinomas has already been demonstrated (Ananthasubramaniam & Venkataramanan, 2024).
Although gene expression data are currently the most abundant, computational methods have been developed or adapted for application to proteomic (Weger et al. 2021; Specht et al. 2023) and metabolomic (Minami et al. 2009) data. Understanding circadian rhythms in genes, proteins and metabolites targeted directly or indirectly by drugs underpins the emerging field of chronopharmacology (Li et al. 2024).
Potential for application of computational methods to infer sleep and circadian disruption in depression
Methods to infer circadian phase from gene expression measured in a single routinely collected blood sample (the easiest tissue to access) offer the possibility of cost-effective investigation of circadian parameters in people with severe mental illness compared to those without. These scalable technologies will allow generation of the large data sets that are needed to draw robust conclusions. To take advantage of these new methods, key data sets need to be generated. First, post-mortem brain samples are needed to demonstrate circadian disruption in a cell type relevant to depression (e.g., excitatory neurons, inhibitory neurons, astrocytes). Second, matched post-mortem blood and brain samples are needed to demonstrate that circadian disruption is detectable in blood. If these two steps were established, they would pave the way to identifying disruption of circadian functions in a subset of living patients. Investigations could be conducted longitudinally within a person, comparing episodes of mental ill health to periods of euthymia and good health. Since a meta-analysis of randomized clinical trials (Scott et al. 2021) has shown that improved sleep quality leads to better mental health, quantification of the changes in circadian rhythm before and after sleep improvement using omics data could provide a direct test of the causal role of circadian disruption separated from sleep disruption. Over the next 5–10 years we expect to see rapid growth in both statistical and computational methods and in data sets. This emerging area of research could help provide the evidence base to address the question posed by Hickie et al. (2024).
Data availability statement
No data are reported in this article type.
Author contributions
Scope: all authors. First draft: BB & NRW. Final draft: all authors.
Financial support
We acknowledge funding from the National Health & Medical Research Council (1173790, 1185377, 2008197), the Wellcome Trust (UNS143926), and the National Institutes of Health (R01AG078241-01A1, 1R01MH121545-01).
Competing interests
None.
Ethical standards
Ethical approval and consent are not relevant to this article type.