Why so GLUMM? Detecting depression clusters through graphing lifestyle-environs using machine-learning methods (GLUMM)

J.F. Dipnall; J.A. Pasco; M. Berk; L.J. Williams; S. Dodd; F.N. Jacka; D. Meyer

doi:10.1016/j.eurpsy.2016.06.003

Published online by Cambridge University Press: 23 March 2020

J.F. Dipnall*: Affiliation:
Impact strategic research centre, school of medicine, Deakin university, PO Box 281, Geelong, Victoria 3220, Australia; Department of statistics, data science and epidemiology, Swinburne university of technology, Swinburne, Australia
J.A. Pasco: Affiliation:
Impact strategic research centre, school of medicine, Deakin university, PO Box 281, Geelong, Victoria 3220, Australia; Melbourne clinical school-western campus, the university of Melbourne, Saint-Albans, VIC, Australia; Department of epidemiology and preventive medicine, Monash university, Melbourne, VIC, Australia; University hospital of Geelong, Geelong, VIC, Australia
M. Berk: Affiliation:
Impact strategic research centre, school of medicine, Deakin university, PO Box 281, Geelong, Victoria 3220, Australia; University hospital of Geelong, Geelong, VIC, Australia; Department of psychiatry, the university of Melbourne, Parkville, VIC, Australia; Florey institute of neuroscience and mental health, Parkville, VIC, Australia; Orygen, the National centre of excellence in youth mental health, Parkville, VIC, Australia
L.J. Williams: Affiliation:
Impact strategic research centre, school of medicine, Deakin university, PO Box 281, Geelong, Victoria 3220, Australia
S. Dodd: Affiliation:
Impact strategic research centre, school of medicine, Deakin university, PO Box 281, Geelong, Victoria 3220, Australia; University hospital of Geelong, Geelong, VIC, Australia; Department of psychiatry, the university of Melbourne, Parkville, VIC, Australia
F.N. Jacka: Affiliation:
Impact strategic research centre, school of medicine, Deakin university, PO Box 281, Geelong, Victoria 3220, Australia; Department of psychiatry, the university of Melbourne, Parkville, VIC, Australia; Centre for adolescent health, Murdoch children's research institute, Melbourne, Australia; Black Dog institute, Sydney, Australia
D. Meyer: Affiliation:
Department of statistics, data science and epidemiology, Swinburne university of technology, Swinburne, Australia
*Corresponding author. Impact strategic research centre, school of medicine, Deakin university, PO Box 281, Geelong, Victoria 3220, Australia. E-mail address: [email protected] (J.F. Dipnall).

Abstract

Background

Key lifestyle-environ risk factors are operative for depression, but it is unclear how these risk factors cluster. Machine-learning (ML) algorithms can learn, extract and map underlying patterns to identify groupings of depressed individuals without constraints. The aim of this research was to use a large epidemiological study to identify and characterise depression clusters through “Graphing lifestyle-environs using machine-learning methods” (GLUMM).

Methods

Two ML algorithms were implemented: an unsupervised self-organised mapping (SOM) algorithm to create the GLUMM clusters and a supervised boosted regression algorithm to describe the clusters. Ninety-six “lifestyle-environ” variables were drawn from the National health and nutrition examination survey (NHANES, 2009–2010). Multivariate logistic regression validated the clusters and controlled for possible sociodemographic confounders.
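
To make the unsupervised step more concrete, the following is a minimal sketch of how a self-organising map assigns observations to map nodes. It is not the authors' implementation: the function names (`train_som`, `best_matching_unit`), the 1 x 2 map size, the decay schedules and the two synthetic four-variable profiles are all assumptions chosen for illustration, and no NHANES data are used.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    # One weight vector per map node, initialised from randomly chosen rows.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])    # pull node towards the sample
            step += 1
    return weights


# Hypothetical binary profiles (illustrative only, not study variables).
group_a = [1.0, 1.0, 0.0, 0.0]
group_b = [0.0, 0.0, 1.0, 1.0]
data = [group_a] * 10 + [group_b] * 10

weights = train_som(data, rows=1, cols=2)
print(best_matching_unit(weights, group_a), best_matching_unit(weights, group_b))
```

Each observation is assigned to its best matching node, and nearby nodes are updated together so that similar profiles end up close on the map. In practice the node weight vectors are often grouped further (for example by hierarchical clustering) to obtain a small number of clusters; whether and how that was done here is described in the full article, not in this abstract.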

Results

The SOM identified two GLUMM cluster solutions, each containing one dominant depressed cluster (GLUMM5-1 and GLUMM7-1, respectively). Equal proportions of members in each of these clusters rated as highly depressed (17%). Alcohol consumption and demographics validated the clusters. Boosted regression identified GLUMM5-1 as more informative than GLUMM7-1. Members were more likely to: have problems sleeping; eat unhealthily; have lived ≤ 2 years in their home; live in an old home; perceive themselves as underweight; be exposed to work fumes; have experienced sex at ≤ 14 years; and not perform moderate recreational activities. Both GLUMM5-1 (OR: 7.50, P < 0.001) and GLUMM7-1 (OR: 7.88, P < 0.001) were positively associated with depression, with significant interactions for those married or living with a partner (P = 0.001).
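
As a reading aid for the odds ratios (OR) reported above, the sketch below computes an unadjusted odds ratio and an approximate 95% Wald confidence interval from a 2 x 2 table. The counts are hypothetical and chosen only to make the arithmetic easy to follow. The study's own estimates come from multivariate logistic regression on survey data, which this simple calculation does not reproduce.

```python
import math


def odds_ratio(a, b, c, d):
    """Unadjusted odds ratio for a 2 x 2 table.

    a: cluster members who are depressed
    b: cluster members who are not depressed
    c: non-members who are depressed
    d: non-members who are not depressed
    """
    return (a * d) / (b * c)


def wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval computed on the log-odds scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts (not taken from the study).
a, b, c, d = 20, 80, 10, 160
print(odds_ratio(a, b, c, d))   # odds of depression in members vs non-members
print(wald_ci(a, b, c, d))
```

An odds ratio above 1 means the odds of depression are higher among cluster members than among non-members. In a logistic regression the same quantity is obtained by exponentiating the fitted coefficient for cluster membership.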

Conclusion

Using ML-based GLUMM to form ordered depressive clusters from multitudinous lifestyle-environ variables enabled a deeper exploration of the heterogeneous data and a better understanding of the relationships between complex mental health factors.

Keywords

Depression; Psychiatry; Machine learning; Boosted regression; Cluster; Lifestyle

Type: Original article
Information: European Psychiatry, Volume 39, January 2017, pp. 40–50

DOI: https://doi.org/10.1016/j.eurpsy.2016.06.003
Copyright: © Elsevier Masson SAS 2017

Footnotes

These authors contributed equally to this work.

Abbreviations: DIPIT: Data integration protocol in ten-steps; GLUMM: Graphing lifestyle-environs using machine-learning methods; GLUMM5-1: GLUMM solution 5 cluster 1; GLUMM5-2: GLUMM solution 5 cluster 2; GLUMM7-1: GLUMM solution 7 cluster 1; GLUMM7-3: GLUMM solution 7 cluster 3; GLUMM7-4: GLUMM solution 7 cluster 4; ML: Machine-learning; MART: Multiple additive regression trees; NCHS: National center for health statistics; NHANES: National health and nutrition examination survey; PHQ-9: Patient health questionnaire-9; SOMs: Self-organizing maps
