Advances in methods for characterising dietary patterns: a scoping review

Joy M. Hutchinson; Amanda Raffoul; Alexandra Pepetone; Lesley Andrade; Tabitha E. Williams; Sarah A. McNaughton; Rebecca M. Leech; Jill Reedy; Marissa M. Shams-White; Jennifer E. Vena; Kevin W. Dodd; Lisa M. Bodnar; Benoît Lamarche; Michael P. Wallace; Megan Deitchler; Sanaa Hussain; Sharon I. Kirkpatrick

doi:10.1017/S0007114524002587

Published online by Cambridge University Press: 10 March 2025

Joy M. Hutchinson: Affiliation:
School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
Amanda Raffoul: Affiliation:
Department of Nutritional Sciences, University of Toronto, Toronto, ON, Canada
Alexandra Pepetone: Affiliation:
School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
Lesley Andrade: Affiliation:
School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
Tabitha E. Williams: Affiliation:
School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
Sarah A. McNaughton: Affiliation:
Health and Well-Being Centre for Research Innovation, School of Human Movement and Nutrition Sciences, University of Queensland, St. Lucia, QLD, Australia
Rebecca M. Leech: Affiliation:
Institute for Physical Activity and Nutrition, School of Exercise and Nutrition Sciences, Deakin University, Geelong, VIC, Australia
Jill Reedy: Affiliation:
National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
Marissa M. Shams-White: Affiliation:
Population Science Department, American Cancer Society, Washington, DC, USA; Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD, USA
Jennifer E. Vena: Affiliation:
Alberta’s Tomorrow Project, Alberta Health Services, Edmonton, AB, Canada
Kevin W. Dodd: Affiliation:
Division of Cancer Prevention, National Cancer Institute, Bethesda, MD, USA
Lisa M. Bodnar: Affiliation:
School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
Benoît Lamarche: Affiliation:
Centre Nutrition, santé et société (NUTRISS), Institut sur la nutrition et les aliments fonctionnels (INAF), Université Laval, Québec City, QC, Canada
Michael P. Wallace: Affiliation:
Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
Megan Deitchler: Affiliation:
Intake – Center for Dietary Assessment, FHI Solutions, Washington, DC, USA
Sanaa Hussain: Affiliation:
School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
Sharon I. Kirkpatrick*: Affiliation:
School of Public Health Sciences, University of Waterloo, Waterloo, ON, Canada
*: Corresponding author: Sharon Kirkpatrick; Email: [email protected]

Abstract

There is a growing focus on understanding the complexity of dietary patterns and how they relate to health and other factors. Approaches that have not traditionally been applied to characterise dietary patterns, such as latent class analysis and machine learning algorithms, may offer opportunities to characterise dietary patterns in greater depth than previously considered. However, there has not been a formal examination of how this wide range of approaches has been applied to characterise dietary patterns. This scoping review synthesised literature from 2005 to 2022 applying methods not traditionally used to characterise dietary patterns, referred to as novel methods. MEDLINE, CINAHL and Scopus were searched using keywords including latent class analysis, machine learning and least absolute shrinkage and selection operator. Of 5274 records identified, 24 met the inclusion criteria. Twelve of twenty-four articles were published since 2020. Studies were conducted across seventeen countries. Nine studies used approaches with applications in machine learning, such as classification models, neural networks and probabilistic graphical models, to identify dietary patterns. The remaining studies applied methods such as latent class analysis, mutual information and treelet transform. Fourteen studies assessed associations between dietary patterns characterised using novel methods and health outcomes, including cancer, cardiovascular disease and asthma. There was wide variation in the methods applied to characterise dietary patterns and in how these methods were described. The extension of reporting guidelines and quality appraisal tools relevant to nutrition research to consider specific features of novel methods may facilitate consistent reporting and enable synthesis to inform policies and programs.

Keywords

Dietary patterns; Scoping review; Novel methods; Machine learning; Latent class analysis; Diet quality; Health outcomes

Type: Scoping Review
Information: British Journal of Nutrition, First View, pp. 1–15

DOI: https://doi.org/10.1017/S0007114524002587
Creative Commons: To the extent this is a work of the US Government, it is not subject to copyright protection within the United States.
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © National Institutes of Health, National Institutes of Health, and the Author(s), 2025

Dietary intake is among the top risk factors for chronic diseases^{(Reference Afshin, Sur and Fay1,Reference English, Ard and Bailey2)} . Research examining dietary intake has historically focused on single foods, nutrients or other dietary constituents^{(Reference Mozaffarian, Rosenberg and Uauy3)}. As the focus of public health nutrition shifted from the prevention of deficiency to the prevention of chronic diseases, research likewise shifted towards the examination of dietary patterns, aiming to capture how foods and beverages are consumed in real life^{(Reference Mozaffarian, Rosenberg and Uauy3–Reference Schulz, Oluwagbemigun and Nöthlings5)}. Humans typically do not consume foods or nutrients on their own, but in the context of a broader dietary pattern^{(Reference Mozaffarian, Rosenberg and Uauy3,Reference Reedy, Subar and George4)} . Accordingly, food-based dietary guidelines are now typically focused on patterns of intake rather than single dietary components^{(Reference Herforth, Arimond and Álvarez-Sánchez6)}. It is likely the synergistic and antagonistic relationships among the multiple foods, beverages and other dietary components that humans consume that influence health rather than individual components^{(Reference Reedy, Subar and George4)}. In addition to this multidimensionality, dietary patterns are dynamic, changing from meal to meal, day to day and across the life course^{(Reference Reedy, Subar and George4,7)} . Further, dietary patterns are shaped by culture, social position and other contextual factors^{(Reference Imamura, Micha and Khatibzadeh8,Reference Delormier, Frohlich and Potvin9)} . However, incorporating the domains of multidimensionality, dynamism and contextual factors into dietary patterns analysis is a difficult task.

Traditional approaches to identify dietary patterns, including ‘a priori’ and ‘a posteriori’ approaches, are useful for understanding overall dietary patterns or the diet quality of populations and population subgroups^{(Reference Ocké10)}. For example, ‘a priori’ methods like the Healthy Eating Index-2020 and the Healthy Eating Food Index-2019 are generally investigator driven^{(Reference Shams-White, Pannucci and Lerman11,Reference Brassard, Munene and Pierre12)} and consider multiple components such as fruits and vegetables and whole grains as inputs, but typically compress the multidimensional construct of dietary patterns to a single unidimensional score reflecting overall diet quality^{(Reference Kirkpatrick, Reedy and Krebs-Smith13,Reference Bodnar, Cartus and Kirkpatrick14)} . ‘A posteriori’ approaches are data-driven and have also been widely used to identify dietary patterns. Commonly applied data-driven approaches include clustering methods (e.g., k-means, Ward’s method), principal component analysis and factor analysis, providing opportunities to identify dietary patterns through statistical modelling or clustering algorithms rather than relying on researcher hypotheses^{(Reference Hu15)}. These approaches compress dietary components to key food groupings typically expressed as single scores^{(Reference Ocké10,Reference Michels and Schulze16)} . By reducing the dimensionality of dietary patterns, these methods are limited in their ability to explain the wide variation in dietary intakes^{(Reference Reedy, Subar and George4)}. 
Methods traditionally employed to characterise dietary patterns using ‘a priori’ and ‘a posteriori’ approaches thus address multidimensionality to some extent, but do not allow for explorations of dietary patterns in their totality because they miss potential synergistic or antagonistic associations among dietary components^{(Reference Reedy, Subar and George4,Reference Bodnar, Cartus and Kirkpatrick14,Reference Reedy, Krebs-Smith and Hammond17)}.

Novel methods that have not traditionally been used to identify dietary patterns, such as probabilistic graphical modelling, latent class analysis and machine learning algorithms (e.g., random forest, neural networks), may capture complexities like dietary synergy. There is no clear delineation between traditional and novel methods, and specifically defining what is novel is challenging given it naturally implies an evolution of methods. Nonetheless, there is a growing interest among nutrition researchers in the application of methods that have not typically been used to capture dietary complexity, with these methods often centred in machine learning⁽¹⁸⁾. To date, there have been perspectives and narrative reviews on the application of machine learning in nutrition^{(Reference Kirk, Kok and Tufano19–Reference Côté and Lamarche21)}, and a recent systematic review of studies that applied machine learning approaches to assess food consumption^{(Reference Oliveira Chaves, Gomes Domingos and Louzada Fernandes22)}. However, there has not been an assessment of studies applying novel methods to characterise dietary patterns. Given the rapid adoption of these methods within the field of health^{(Reference Le Glaz, Haralambous and Kim-Dufor23–Reference Morgenstern, Buajitti and O’Neill26)}, it is increasingly important for researchers to have a basic understanding of available methods and how they are being applied in the field. This will facilitate the synthesis of evidence from a range of methodological inputs to inform food-based dietary guidelines and other policies and programs that promote health. The objective of this scoping review was therefore to describe the use of novel methods not traditionally used to characterise dietary patterns in the published literature.

Methods

The review was conducted in accordance with the JBI Manual for Evidence Synthesis⁽²⁷⁾, which was developed using the Arksey and O’Malley framework^{(Reference Arksey and O’Malley28)}. Reporting follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews^{(Reference Tricco, Lillie and Zarin29)}.

Defining novel methods

The novel methods considered were based on a preliminary search of the literature and the expertise of the research team and included systems methods (e.g., agent-based modelling, system dynamics), least absolute shrinkage and selection operator, machine learning algorithms, copulas and data-driven statistical modelling approaches (e.g., treelet transformations, principal balances and coordinates). Novel methods could also include those that have been used previously in nutrition research if applied in new ways to characterise dietary patterns (e.g., linear programming used to model a modified dietary pattern rather than to test scenarios). Methods that were not considered to be novel were those that have been applied to assess dietary patterns in numerous studies and have been considered by prior reviews and commentaries^{(Reference English, Ard and Bailey2,Reference Ocké10,Reference Newby and Tucker30)} , including regression, ‘a priori’ approaches such as investigator-driven indices, and routinely used data-driven approaches, including factor analysis and cluster analysis^{(Reference Ocké10,Reference Krebs-Smith, Subar and Reedy31)} .

Identifying relevant studies

Articles were eligible for inclusion if they were primary research articles; focused on dietary intake as an exposure or outcome, including examination of dietary patterns (i.e., multiple dietary components in combination rather than single nutrients, foods or other dietary components); used one or more novel methods, as described above, to characterise dietary patterns; were published in English; and focused on humans. Ineligible articles included commentaries, reviews and studies focused on individual foods or human milk rather than dietary patterns.

Searches of three research databases, MEDLINE (via PubMed), the Cumulative Index to Nursing and Allied Health Literature (CINAHL) and Scopus, were conducted in March 2022. These health-focused, specialised and multidisciplinary databases were selected based on consultation with a research librarian (JS) to ensure a range of possibly relevant study types were included. The search strategies were developed in consultation with the research librarian using keywords and subject headings to capture diet-related constructs (e.g., dietary intake, patterns, recommendations, feeding behaviour, food habits) and novel methods to characterise dietary patterns (e.g., machine learning, network science and system dynamics model). No date limits were applied to the searches, and articles were included until the end of the search in March 2022. The search strategies for MEDLINE, CINAHL and Scopus are available in online Supplementary File 1.

Study selection

Two independent reviewers (two of AP, AR, SH, SIK) screened each record at the title and abstract and full-text screening stages using Covidence⁽³²⁾, with one consistent reviewer (AP) participating throughout the entire process. At the title and abstract screening stage, an initial pilot screening (twenty-five records) generated 100 % agreement (AR and AP) and 92 % agreement (AP and SH). A second pilot screening (100 records) generated 91 % agreement (AR and AP) and 93 % agreement (AP and SH). When applicable, discrepancies were discussed by reviewers and if needed deferred to a third reviewer (SIK) for decision. Following pilot screening, the reviewers independently reviewed the remaining articles (96 % agreement, Kappa = 0·83).

The reviewers were intentionally liberal during the title and abstract screening stage because of the breadth of possible novel methods. This required iteratively revisiting the inclusion criteria. For example, reduced rank regression was initially considered to be novel but was found to be prevalent in the literature based on title and abstract screening and was excluded during full-text review. Further, articles that used ‘a posteriori’ methods to identify dietary intake but did not specify the exact method in the title or abstract were included for full-text review.

Pilot screening of full-text reviews (fifty records) generated 82 % agreement (AR and AP) and 96 % agreement (AP and SIK); after discrepancies were discussed, two reviewers independently screened the remaining full-text articles (93 % agreement, Kappa = 0·60). The combination of high agreement between reviewers and a relatively low Cohen’s Kappa is described as Cohen’s paradox and can arise when far more studies are excluded than included, as was the case here^{(Reference Gwet33–Reference Belur, Tompson and Thornton35)}.
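The relationship between percent agreement and Cohen’s Kappa described above can be illustrated with a minimal sketch. The counts below are hypothetical and are not the screening counts from this review; the function name `cohen_kappa` is likewise an illustrative assumption. The sketch shows that the same percent agreement can yield a much lower Kappa when most records are excluded.

```python
def cohen_kappa(both_include, a_only, b_only, both_exclude):
    """Return (percent agreement, Cohen's kappa) for two reviewers.

    Arguments are counts from a 2x2 table of include/exclude decisions.
    """
    n = both_include + a_only + b_only + both_exclude
    observed = (both_include + both_exclude) / n
    # Marginal proportion of "include" decisions for each reviewer
    a_include = (both_include + a_only) / n
    b_include = (both_include + b_only) / n
    # Agreement expected by chance if the reviewers decided independently
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts (not from the review): 100 records, 93 agreements each.
# Skewed case: most records are excluded by both reviewers.
skewed = cohen_kappa(both_include=5, a_only=3, b_only=4, both_exclude=88)
# Balanced case: inclusions and exclusions are roughly equal.
balanced = cohen_kappa(both_include=46, a_only=3, b_only=4, both_exclude=47)

print(f"skewed:   agreement={skewed[0]:.2f}, kappa={skewed[1]:.2f}")
print(f"balanced: agreement={balanced[0]:.2f}, kappa={balanced[1]:.2f}")
```

Because chance agreement is high when nearly all records fall in one category, the skewed case leaves little room for agreement beyond chance, which lowers Kappa despite identical percent agreement.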

Data extraction

Data extraction was completed by JMH and TEW using a pre-specified Excel template, with all extracted data subsequently verified by LA. Data extraction fields (online Supplementary File 2) included information pertaining to authorship, study title, journal, year of publication, funding source, contextual details (e.g., study location), sample size and participant characteristics (e.g., age). Details relating to study methods (e.g., analysis input variables, measurement of dietary intake and analytic approaches) and results (e.g., findings related to dietary patterns and if applicable, health risk and outcomes) were also extracted.

Results

Summary of search

A total of 5274 unique articles were identified after removing duplicates. Of these, 436 were identified as potentially relevant based on the title and abstract review and underwent full-text screening (Figure 1). Studies excluded during full-text screening included those that did not include methods defined as novel, those that did not focus on dietary patterns, commentaries, narrative reviews, systematic reviews, studies that were not published in English, studies that were not conducted with humans and theses/dissertations. A final pool of twenty-four articles describing twenty-four unique studies met the inclusion criteria.

Figure 1. PRISMA diagram illustrating the screening process for a scoping review exploring innovative methods for the analysis of dietary intake data and characterisation of dietary patterns. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Characteristics of included studies

Across the twenty-four included studies, data from seventeen countries were represented (Table 1). Half of the studies were published between 2005 and 2019^{(Reference Biesbroek, van der A and Brosens36–Reference Solans, Coenders and Marcos-Gragera47)}, and the remaining twelve were published between 2020 and March 2022^{(Reference Dalmartello, Decarli and Ferraroni48–Reference Schwedhelm, Lipsky and Shearrer59)}. Three studies used data from subsets of the European Prospective Investigation into Cancer and Nutrition^{(Reference Biesbroek, van der A and Brosens36,Reference Iqbal, Buijsse and Wirth45,Reference Schwedhelm, Knüppel and Schwingshackl46)} , two studies used waves of data from the National Health and Nutrition Examination Survey^{(Reference Wright, McKenna and Nugent54,Reference Farmer, Lee and Powell-Wiley58)} and two studies used data from the ELSA-Brasil cohort study (Table 2)^{(Reference Bezerra, Bahamonde and Marchioni42,Reference de Almeida Alves, Molina and da Fonseca57)} . Sample sizes ranged from 250 to over 73 000 participants. Nineteen studies were conducted using data from cohort or cross-sectional studies, and five studies applied a case–control design.

Table 1. Study characteristics across included studies applying novel methods to characterise dietary patterns

* Some studies included more than one country.

Table 2. Characteristics of studies (n 24) identified in a scoping review of novel analytic methods to characterise dietary patterns

† Etude Epidémiologique auprès de femmes de la Mutuelle Générale de l’Education Nationale.

‡ FFQ.

§ Brazilian Longitudinal Study of Adult Health.

|| European Prospective Investigation into Cancer and Nutrition.

¶ National Health and Nutrition Examination Survey.

** 24 h dietary recall.

†† Protection Against Allergy: Study in Rural Environments.

‡‡ Finnish, rural-suburban birth cohort.

§§ Portuguese Elderly and Nutritional Status Surveillance System.

|||| Early Life Exposure in Mexico to Environmental Toxicants.

¶¶ Coronary Artery Risk Development in Young Adults.

The majority (n 15) of studies used FFQ to assess dietary intake^{(Reference Biesbroek, van der A and Brosens36,Reference Fonseca, Gaio and Lopes37,Reference Oliveira, Rodríguez-Artalejo and Gaio39–Reference Harrington, Dahly and Fitzgerald43,Reference Iqbal, Buijsse and Wirth45,Reference Solans, Coenders and Marcos-Gragera47–Reference Hoang, Lee and Kim49,Reference Samieri, Sonawane and Lefèvre-Arbogast52,Reference Xia, Zhao and Zhang55–Reference de Almeida Alves, Molina and da Fonseca57)} . Six studies used 24-h recalls^{(Reference Schwedhelm, Knüppel and Schwingshackl46,Reference Madeira, Severo and Oliveira51,Reference Shang, Li and Xu53,Reference Wright, McKenna and Nugent54,Reference Farmer, Lee and Powell-Wiley58,Reference Schwedhelm, Lipsky and Shearrer59)} , two studies used food records/diaries^{(Reference Hearty and Gibney44,Reference Hose, Pagani and Karvonen50)} and one study used a FFQ and a 24-h recall^{(Reference Jamiołkowski, Szpak and Pawłowska38)}. Among the studies using 24-h recalls and records/diaries, one used data from a single recall that was combined with data from a FFQ^{(Reference Jamiołkowski, Szpak and Pawłowska38)}. The remaining studies including records or recalls averaged or combined data from two or more days of intake. Dietary input variables were created by selecting specific items of interest from questionnaires or condensing foods into groupings, ranging from nine to sixty-two food groupings^{(Reference Biesbroek, van der A and Brosens36–Reference Schwedhelm, Lipsky and Shearrer59)}. Apart from averaging recalls or records, none of the included studies applied substantial efforts to mitigate measurement error present in dietary intake data. 
Several studies noted potential misreporting as a limitation, and five studies specifically noted that findings may have been influenced by measurement error present in self-reported dietary assessment instruments^{(Reference Wu, Sánchez and Goodrich40,Reference Harrington, Dahly and Fitzgerald43,Reference Madeira, Severo and Oliveira51,Reference Samieri, Sonawane and Lefèvre-Arbogast52,Reference Schwedhelm, Lipsky and Shearrer59)}.

Novel methods applied to identify dietary patterns

The types of methods used and how they were implemented to identify dietary patterns varied widely (Table 3). Nine studies applied approaches that have applications in machine learning, including classification models, neural networks and probabilistic graphical models (Table 4)^{(Reference Biesbroek, van der A and Brosens36,Reference Jamiołkowski, Szpak and Pawłowska38,Reference Hearty and Gibney44–Reference Schwedhelm, Knüppel and Schwingshackl46,Reference Hoang, Lee and Kim49,Reference Shang, Li and Xu53,Reference Zhao, Naumova and Bobb56,Reference Schwedhelm, Lipsky and Shearrer59)}. The earliest study included in this review was published in 2005 and applied neural networks to characterise dietary patterns^{(Reference Jamiołkowski, Szpak and Pawłowska38)}. Fifteen studies applied other novel methods, including latent class analysis, mutual information and treelet transform^{(Reference Fonseca, Gaio and Lopes37,Reference Oliveira, Rodríguez-Artalejo and Gaio39–Reference Harrington, Dahly and Fitzgerald43,Reference Solans, Coenders and Marcos-Gragera47,Reference Dalmartello, Decarli and Ferraroni48,Reference Hose, Pagani and Karvonen50–Reference Samieri, Sonawane and Lefèvre-Arbogast52,Reference Wright, McKenna and Nugent54,Reference Xia, Zhao and Zhang55,Reference de Almeida Alves, Molina and da Fonseca57,Reference Farmer, Lee and Powell-Wiley58)}. Two studies identified dietary patterns using more than one novel method^{(Reference Hearty and Gibney44,Reference Shang, Li and Xu53)}. Five studies included comparisons of different novel methods, though these were typically versions of the same model^{(Reference Hearty and Gibney44,Reference Iqbal, Buijsse and Wirth45,Reference Solans, Coenders and Marcos-Gragera47,Reference Hoang, Lee and Kim49,Reference Shang, Li and Xu53)}. For example, Solans et al. compared three models for compositional data analysis and reported that the best-performing model incorporated both investigator- and data-driven methods^{(Reference Solans, Coenders and Marcos-Gragera47)}.
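For readers unfamiliar with compositional data analysis, the following minimal sketch shows one standard log-ratio transformation, the centred log-ratio (clr). It is offered as general background under stated assumptions: the intake values are hypothetical, and the specific models compared by Solans et al. are not reproduced here. The sketch illustrates that compositional methods treat intakes as relative parts of a whole, so the result is the same whether intakes are expressed in grams or as shares.

```python
from math import log


def clr(parts):
    """Centred log-ratio transform of a composition with strictly positive parts."""
    if any(p <= 0 for p in parts):
        # Zeros need separate handling (e.g., replacement) before a log-ratio transform
        raise ValueError("clr is undefined for zero or negative parts")
    logs = [log(p) for p in parts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


# Hypothetical daily intake of three food groups, in grams and as shares of the total
grams = [500.0, 300.0, 200.0]
shares = [0.5, 0.3, 0.2]

clr_grams = clr(grams)
clr_shares = clr(shares)

print([round(v, 3) for v in clr_grams])
print([round(v, 3) for v in clr_shares])
```

The transformed values sum to zero and depend only on the ratios between components, which is why the two inputs give matching results.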

Table 3. Description of dietary patterns (n 24) identified in a scoping review of novel analytic methods to characterise dietary patterns

* We considered socio-demographic characteristics that are related to social position or are indicators of equity including age, sex, gender, race/ethnicity, marital status, education, employment status, smoking status as examples. We did not include physical activity, BMI or alcohol consumption.

Table 4. Novel methods applied to identify dietary patterns across included studies*

* Studies may have used more than one novel approach to characterise dietary patterns.

In twelve studies, two to eight distinct dietary patterns, such as the ‘prudent’ pattern or ‘Western’ pattern, were identified using methods such as latent class analysis, treelet transform, random forest with classification tree analysis and multivariate finite mixture models^{(Reference Biesbroek, van der A and Brosens36–Reference Jamiołkowski, Szpak and Pawłowska38,Reference Affret, Severi and Dow41–Reference Harrington, Dahly and Fitzgerald43,Reference Dalmartello, Decarli and Ferraroni48,Reference Hose, Pagani and Karvonen50,Reference Madeira, Severo and Oliveira51,Reference Wright, McKenna and Nugent54,Reference de Almeida Alves, Molina and da Fonseca57,Reference Farmer, Lee and Powell-Wiley58)} . Six studies applied network methods, including probabilistic graphical models and mutual information, to identify networks of dietary patterns among populations^{(Reference Iqbal, Buijsse and Wirth45,Reference Schwedhelm, Knüppel and Schwingshackl46,Reference Hoang, Lee and Kim49,Reference Samieri, Sonawane and Lefèvre-Arbogast52,Reference Xia, Zhao and Zhang55,Reference Schwedhelm, Lipsky and Shearrer59)} .
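As a rough illustration of how mutual information can be used to build a network of dietary components, the sketch below computes pairwise mutual information between binary consumption indicators and keeps pairs above a cut-off as edges. The data, the food group names and the 0.1-bit threshold are hypothetical assumptions for illustration only; the included studies may have estimated mutual information and selected edges differently.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete variables."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with counts to limit rounding error
        ratio = count * n / (count_x[x] * count_y[y])
        mi += p_xy * log2(ratio)
    return mi


# Hypothetical consumed (1) / not consumed (0) indicators for eight participants
foods = {
    "coffee": [1, 1, 1, 1, 0, 0, 0, 0],
    "sugar": [1, 1, 1, 1, 0, 0, 0, 0],  # always consumed together with coffee
    "tea": [1, 0, 1, 0, 1, 0, 1, 0],    # unrelated to coffee in this sample
}

pairwise_mi = {
    (a, b): mutual_information(foods[a], foods[b])
    for a, b in combinations(foods, 2)
}

THRESHOLD_BITS = 0.1  # hypothetical cut-off for drawing an edge
network_edges = [pair for pair, mi in pairwise_mi.items() if mi > THRESHOLD_BITS]

print(pairwise_mi)
print(network_edges)
```

Unlike a correlation coefficient, mutual information does not assume a linear relationship, which is one reason it may be attractive for describing how foods are consumed together.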

Dynamism, or how dietary patterns vary across time, was incorporated into four studies’ characterisation or analysis of dietary patterns. Three studies incorporated stratification by meals to consider dynamism^{(Reference Hearty and Gibney44,Reference Schwedhelm, Knüppel and Schwingshackl46,Reference Schwedhelm, Lipsky and Shearrer59)} . In two studies using graphical models, separate networks were created for each meal to provide insights into how patterns of intake vary throughout the day^{(Reference Schwedhelm, Knüppel and Schwingshackl46,Reference Schwedhelm, Lipsky and Shearrer59)} . Hearty and Gibney used decision trees and neural networks and ran models by meals based on sixty-two food groups to predict diet quality^{(Reference Hearty and Gibney44)}. Additionally, one study considered dynamism by using ANOVA and chi-square tests to descriptively show how a variety of characteristics were associated with stable or changing dietary patterns characterised using latent class analysis^{(Reference Harrington, Dahly and Fitzgerald43)}.

Fourteen studies examined relationships between dietary patterns characterised using novel methods and variables indicative of health risk or outcomes, such as periodontitis, cardiovascular disease and metabolic syndrome (Table 3) ^{(Reference Biesbroek, van der A and Brosens36–Reference Wu, Sánchez and Goodrich40,Reference Dalmartello, Decarli and Ferraroni48–Reference Zhao, Naumova and Bobb56)} . Six studies included longitudinal analysis of the relationship between dietary patterns and health outcomes^{(Reference Biesbroek, van der A and Brosens36,Reference Jamiołkowski, Szpak and Pawłowska38,Reference Wu, Sánchez and Goodrich40,Reference Hose, Pagani and Karvonen50,Reference Samieri, Sonawane and Lefèvre-Arbogast52,Reference Shang, Li and Xu53)} . Most studies that examined health risk or outcomes first identified dietary patterns using a novel method and then investigated relationships with health outcomes using regression models^{(Reference Biesbroek, van der A and Brosens36–Reference Wu, Sánchez and Goodrich40,Reference Dalmartello, Decarli and Ferraroni48,Reference Hose, Pagani and Karvonen50,Reference Madeira, Severo and Oliveira51,Reference Wright, McKenna and Nugent54)} . In contrast, some studies incorporated variables indicative of health outcomes or risk directly into the machine learning models^{(Reference Hoang, Lee and Kim49,Reference Zhao, Naumova and Bobb56)} . For example, Zhao et al. ^{(Reference Zhao, Naumova and Bobb56)} applied Bayesian kernel machine regression, a machine learning model designed to incorporate high-dimensional data, to jointly model the relationship between several dietary components and cardiovascular disease risk. Similarly, Hoang et al. ^{(Reference Hoang, Lee and Kim49)} included health variables within mixed graphical models, though directionality of diet-health relationships could not be ascertained given the cross-sectional nature of the data. 
In two case–control studies, dietary patterns were identified using mutual information to estimate dietary pattern networks, with stratification by health outcomes^{(Reference Samieri, Sonawane and Lefèvre-Arbogast52,Reference Xia, Zhao and Zhang55)}.

Nineteen studies considered socio-demographic characteristics, such as sex, age, race/ethnicity, education and income^{(Reference Biesbroek, van der A and Brosens36,Reference Fonseca, Gaio and Lopes37,Reference Oliveira, Rodríguez-Artalejo and Gaio39–Reference Harrington, Dahly and Fitzgerald43,Reference Iqbal, Buijsse and Wirth45,Reference Dalmartello, Decarli and Ferraroni48–Reference Farmer, Lee and Powell-Wiley58)} . In one case, socio-demographic characteristics were included in models used to characterise dietary patterns^{(Reference Hoang, Lee and Kim49)}. Two studies stratified by socio-demographic characteristics, examining dietary patterns by sex^{(Reference Iqbal, Buijsse and Wirth45)} or age groups^{(Reference Bezerra, Bahamonde and Marchioni42)}. Studies that used case–control designs typically considered socio-demographic characteristics through matching^{(Reference Samieri, Sonawane and Lefèvre-Arbogast52,Reference Xia, Zhao and Zhang55)} . In the remaining studies that considered socio-demographic characteristics, these were incorporated in regression models to explore how dietary patterns characterised using novel methods were associated with health and other characteristics.

Two studies included comparisons of novel methods and traditional statistical approaches^{(Reference Biesbroek, van der A and Brosens36,Reference de Almeida Alves, Molina and da Fonseca57)}. For instance, Biesbroek et al.^{(Reference Biesbroek, van der A and Brosens36)} found that dietary patterns identified through reduced rank regression were more strongly associated with coronary artery disease than those identified through random forest with classification tree analysis.

Discussion

The application of novel methods to dietary pattern research is rapidly expanding, with the aim of better understanding the complexity of dietary patterns and how they are related to health and other factors. Many studies used methods that characterise distinct dietary patterns based on the population being studied, such as the ‘prudent’ pattern or the ‘Western’ pattern. Most studies used cross-sectional data, limiting opportunities to examine the effect of dietary patterns on health.

Methods newly being applied in this field offer promising capacity to better understand the totality of dietary patterns and synergistic relationships among dietary components when compared with traditional approaches that do not assume synergy^{(Reference Reedy, Subar and George4,Reference Bodnar, Cartus and Kirkpatrick14)} . Given the large variation in how dietary patterns were characterised using novel methods, multidimensionality and potential synergistic relationships between dietary components were considered and presented in a range of ways, from latent classes to networks. Several studies incorporated dynamism into their consideration of dietary patterns, though in most cases this was through stratification, for example, by meal, rather than through direct use of novel methods^{(Reference Harrington, Dahly and Fitzgerald43,Reference Hearty and Gibney44,Reference Schwedhelm, Knüppel and Schwingshackl46,Reference Schwedhelm, Lipsky and Shearrer59)} . In these cases, it was a combination of input variables, stratification by time and the novel method that enabled explorations of dynamism.

The methods highlighted have a range of strengths and limitations for the characterisation of dietary patterns. Methods that focused on the classification of distinct patterns allowed for the assessment of relationships between these patterns and health outcomes or other indicators of interest but explored the interrelationships between dietary components to a lesser degree^{(Reference Biesbroek, van der A and Brosens36–Reference Jamiołkowski, Szpak and Pawłowska38,Reference Affret, Severi and Dow41–Reference Harrington, Dahly and Fitzgerald43,Reference Dalmartello, Decarli and Ferraroni48,Reference Hose, Pagani and Karvonen50,Reference Madeira, Severo and Oliveira51,Reference Wright, McKenna and Nugent54,Reference de Almeida Alves, Molina and da Fonseca57,Reference Farmer, Lee and Powell-Wiley58)}. Other methods such as compositional data analysis, mutual information and probabilistic graphical models can be used to consider joint relationships among dietary components to better understand multidimensionality^{(Reference Iqbal, Buijsse and Wirth45–Reference Solans, Coenders and Marcos-Gragera47,Reference Hoang, Lee and Kim49,Reference Samieri, Sonawane and Lefèvre-Arbogast52,Reference Xia, Zhao and Zhang55,Reference Schwedhelm, Lipsky and Shearrer59)}. For example, Gaussian graphical models provide the opportunity to visualise the dietary pattern through a network of dietary components, with relationships between variables indicating conditional dependencies^{(Reference Iqbal, Buijsse and Wirth45,Reference Hoang, Lee and Kim49,Reference Schwedhelm, Lipsky and Shearrer59)}. However, studies making use of these methods included further analyses, such as the development of a score from dietary pattern networks, to assess relationships with health outcomes.
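To make the notion of conditional dependence in a Gaussian graphical model concrete, the following is a minimal sketch in Python. It is an illustration only and does not reproduce the analysis of any included study. The food names, the simulated intakes, the helper functions `partial_correlations` and `network_edges` and the edge threshold of 0.1 are all assumptions introduced here. The sketch estimates partial correlations by directly inverting the sample covariance matrix, whereas applied analyses typically rely on regularised estimation.

```python
import numpy as np


def partial_correlations(X):
    """Partial correlation between each pair of columns of X,
    conditional on all remaining columns (via the precision matrix)."""
    cov = np.cov(X, rowvar=False)
    precision = np.linalg.inv(cov)  # plain inversion; no regularisation
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def network_edges(pcor, names, threshold=0.1):
    """List pairs whose absolute partial correlation meets the
    (hypothetical) threshold; these form the edges of the network."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(pcor[i, j]) >= threshold:
                edges.append((names[i], names[j], round(float(pcor[i, j]), 2)))
    return edges


# Simulated standardised intakes (hypothetical): butter depends on bread,
# and jam depends on butter only, so bread and jam are correlated
# overall but conditionally independent given butter.
rng = np.random.default_rng(0)
n = 5000
bread = rng.normal(size=n)
butter = 0.8 * bread + rng.normal(scale=0.6, size=n)
jam = 0.8 * butter + rng.normal(scale=0.6, size=n)

names = ["bread", "butter", "jam"]
X = np.column_stack([bread, butter, jam])

marginal = np.corrcoef(X, rowvar=False)
pcor = partial_correlations(X)
edges = network_edges(pcor, names)

print("Marginal correlation, bread-jam:", round(float(marginal[0, 2]), 2))
print("Partial correlation, bread-jam:", round(float(pcor[0, 2]), 2))
print("Network edges:", edges)
```

In this simulated example, bread and jam are clearly correlated when considered pairwise, yet no edge links them in the network because their association is explained by butter. This is the sense in which network edges indicate conditional rather than marginal dependencies.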

Trade-offs between novel and traditional methods should be considered when selecting the most appropriate methods for a given study. Though potential benefits such as a greater ability to discern multidimensionality may be desirable, these must be weighed against the implications for interpretability and computational costs. The application of novel methods may not always yield insights beyond those gained from traditional approaches. For example, Biesbroek et al.^{(Reference Biesbroek, van der A and Brosens36)} found that random forest models did not outperform reduced rank regression when examining associations of dietary patterns with coronary artery disease. Conversely, a study that was not included in this review because it first identified dietary patterns using a traditional method – principal component analysis – found that machine learning algorithms were better able to classify the identified dietary patterns according to cardiometabolic risk compared to traditional approaches^{(Reference Panaretos, Koloverou and Dimopoulos60)}.

Several socio-demographic characteristics are indicators of systemic health inequity and have been shown to be associated with dietary patterns among populations^{(Reference Hanson and Connor61–Reference Hiza, Casavale and Guenther63)}. The degree to which studies incorporated socio-demographic characteristics into their consideration of dietary patterns or relationships between dietary patterns and health varied, with adjusted regression models applied after dietary patterns were characterised as the most common approach. Consistent with nutrition research more broadly^{(Reference Hiza, Casavale and Guenther63–Reference Brassard, Munene and Pierre65)}, there was little consideration of possible interactions among socio-demographic characteristics in relation to dietary patterns. Methods particularly suited to pattern recognition and complexity could be leveraged to simultaneously explore potential joint relationships among facets of social identity and dietary patterns^{(Reference Wang, Ramaswamy and Russakovsky66)} and advance our understanding of how broader systems of oppression and intersecting characteristics contribute to dietary patterns^{(Reference Doan, Olstad and Vanderlee62)}.

Beyond the inclusion of socio-demographic characteristics in models, considering equity from the beginning of study design is critical given potential bias in data and algorithms that can have immense implications for those who already experience inequities because of factors such as structural racism^{(Reference Robinson, Renson and Naimi67–Reference Rajkomar, Hardt and Howell69)}. The included studies did not explicitly discuss the incorporation of equity into study design, and many conducted secondary analyses of existing datasets. The use of directed acyclic graphs has been identified as a potential solution to mitigate some possible issues with bias through careful model design^{(Reference Robinson, Renson and Naimi67)} and has been applied in other domains of nutrition research using novel methods^{(Reference Bodnar, Cartus and Kirkpatrick14)}. Engaging individuals with lived experience and integrating interdisciplinary teams with broad expertise that can combine content knowledge with data-driven approaches can help to mitigate potential bias in algorithms^{(Reference Wang, Ramaswamy and Russakovsky66)}.

The level of description of methods varied, and it was sometimes challenging to decipher the specifics of how novel methods were applied. Although the Strengthening the Reporting of Observational Studies in Epidemiology-Nutritional Epidemiology (STROBE-nut) reporting guidelines provide guidance for transparently reporting nutritional epidemiology and dietary assessment research^{(Reference Lachat, Hawwash and Ocké70)}, they were not designed specifically for the methods used in the studies considered in this review or the ways in which these methods are being applied in dietary patterns research. Other reporting guidelines, such as the Consolidated Standards of Reporting Trials (CONSORT), have been extended to consider the application of artificial intelligence (AI)^{(Reference Liu, Cruz Rivera and Moher71)}. Motivations for the CONSORT extension included inadequate reporting of studies using AI and the lack of full consideration of potential sources of bias specific to AI within existing reporting guidelines^{(Reference Liu, Cruz Rivera and Moher71)}. Items added in CONSORT-AI pertain, for example, to the role of AI in the study, the nature of the data used in AI systems and how humans interacted with AI systems^{(Reference Liu, Cruz Rivera and Moher71)}. The extension of reporting guidelines such as STROBE-nut to consider applications of AI, including machine learning, and other methods that are becoming more commonly used may facilitate consistent and complete reporting and improved comparability of studies. Reporting guidelines should continue to emphasise strategies applied to mitigate measurement error in dietary intake data^{(Reference Lachat, Hawwash and Ocké70)}, as studies using novel methods are not immune to the effects of error on findings^{(Reference Spicker, Nazemi and Hutchinson72)}. Along with reporting guidelines, the development of tailored quality appraisal tools may facilitate synthesis of high-quality evidence to inform recommendations about dietary patterns and health.

This review provides a snapshot of a rapidly evolving field^{(Reference Lampignano, Tatoli and Donghia73,Reference Slurink, Corpeleijn and Bakker74)}, with the involvement of an interdisciplinary team of researchers contributing to a robust consideration of emerging methods in dietary patterns research. While prior reviews have provided perspectives on the potential applications of machine learning within the field of nutrition^{(Reference Kirk, Kok and Tufano19–Reference Côté and Lamarche21)}, this review focused on dietary patterns in particular and considered approaches beyond machine learning that have not traditionally been used in this area, broadening the scope compared to prior reviews^{(Reference Oliveira Chaves, Gomes Domingos and Louzada Fernandes22,Reference Kirk, Catal and Tekinerdogan75)}. The search terms were informed by preliminary searching, though it is unlikely that all relevant articles applying novel methods to characterise dietary patterns were captured, in part because of the wide range of descriptors used for these methods and the lack of reporting standards. As well, determining whether a method is novel is somewhat subjective. Methods such as factor analysis and principal component analysis once revolutionised dietary pattern analysis, providing data-driven approaches to identify patterns^{(Reference Hu15)}. Now, they are widely applied and recognised as limited in their capabilities to capture complexity compared to some newer approaches. Further, the search terms skewed toward multidimensionality rather than dynamism, potentially overlooking some studies focusing on variation of dietary patterns over time or across eating occasions. Nonetheless, this review documents an acceleration of the application of a range of novel methods to dietary patterns research and captures a broad scope of methods being used to characterise these patterns, highlighting the need for researchers to develop the lexicon and knowledge needed to interpret the emerging literature.

Conclusion

The findings of this review indicate a strong motivation to apply novel methods, including but not limited to machine learning, to improve understanding of dietary patterns and how they relate to health and other factors. The application of these methods may help us to learn about complex relationships that may not be possible to discern through traditional approaches. However, these methods may not be suitable for every question and do not necessarily overcome the limitations of more traditional approaches.

Given the proliferation of these methods, it is becoming increasingly worthwhile for nutrition researchers to have at least a basic understanding of novel methods such as machine learning and latent class analysis, so they can interpret the results of emerging studies. The development and implementation of reporting guidelines and quality appraisal mechanisms for studies that apply novel methods may improve the capacity for synthesis of evidence generated to inform strategies that promote improved population health and well-being.

Acknowledgements

We thank research librarian Jackie Stapleton (JS) of the University of Waterloo for support with the search strategy.

This review was funded by the Canadian Institutes of Health Research, a University of Waterloo Research Incentive Fund award, an Ontario Ministry of Research and Innovation Early Researcher Award held by S. I. K. and Microsoft AI for Good. J. M. H. was funded by a Vanier Canada Graduate Scholarship. R. M. L. is funded by a National Health and Medical Research Council Emerging Leadership Fellowship (APP1175250). L. M. B. was funded by the National Institutes of Health (R01 HD102313, MPI Bodnar LM, Naimi AI).

S. I. K. conceived of the review and planned it with the co-authors; A. R., A. P., S. H., S. I. K. conducted the search and screening; J. M. H. and T. E. W. conducted extraction; L. A. conducted verification; J. M. H. wrote the first draft of the manuscript with support from A. P. to write the methods; all co-authors provided critical input to the manuscript and all co-authors read and approved the final manuscript.

R. M. L. is a statistical editor for the British Journal of Nutrition. Other authors declare none.

Posted as a preprint: https://doi.org/10.1101/2024.06.20.24309251

Supplementary material

For supplementary material/s referred to in this article, please visit https://doi.org/10.1017/S0007114524002587

References

Afshin, A, Sur, PJ, Fay, KA, et al. (2019) Health effects of dietary risks in 195 countries, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. The Lancet 393, 1958–1972.

English, LK, Ard, JD, Bailey, RL, et al. (2021) Evaluation of dietary patterns and all-cause mortality: a systematic review. JAMA Netw Open (Internet) 4, e2122277. https://pubmed.ncbi.nlm.nih.gov/34463743/

Mozaffarian, D, Rosenberg, I & Uauy, R (2018) History of modern nutrition science—implications for current research, dietary guidelines, and food policy. BMJ (Internet) 361, k2392. https://www.bmj.com/content/361/bmj.k2392

Reedy, J, Subar, AF, George, SM, et al. (2018) Extending methods in dietary patterns research. Nutrients (Internet) 10, 571. https://pubmed.ncbi.nlm.nih.gov/29735885/

Schulz, CA, Oluwagbemigun, K & Nöthlings, U (2021) Advances in dietary pattern analysis in nutritional epidemiology. Eur J Nutr 60, 4115–4130.

Herforth, A, Arimond, M, Álvarez-Sánchez, C, et al. (2019) A global review of food-based dietary guidelines. Adv Nutr 10, 590.

National Academies of Sciences, Engineering, and Medicine; Health and Medicine Division; Food and Nutrition Board; Committee to Review the Process to Update the Dietary Guidelines for Americans (2017) Redesigning the Process for Establishing the Dietary Guidelines for Americans (Internet). Washington, DC: National Academies Press. https://www.ncbi.nlm.nih.gov/books/NBK469837/ (accessed July 2023).

Imamura, F, Micha, R, Khatibzadeh, S, et al. (2015) Dietary quality among men and women in 187 countries in 1990 and 2010: a systematic assessment. Lancet Global Health 3, e132–42.

Delormier, T, Frohlich, K & Potvin, L (2009) Food and eating as social practice – understanding eating patterns as social phenomena and implications for public health. Sociol Health Illness 31, 215–228.

Ocké, MC (2013) Evaluation of methodologies for assessing the overall diet: dietary quality scores and dietary pattern analysis. Proc Nutr Soc 72, 191–199.

Shams-White, MM, Pannucci, TE, Lerman, JL, et al. (2023) Healthy Eating Index-2020: review and update process to reflect the Dietary Guidelines for Americans, 2020–2025. J Acad Nutr Diet 123, 1280–1288.

Brassard, D, Munene, LAE, Pierre, SS, et al. (2022) Development of the Healthy Eating Food Index (HEFI)-2019 measuring adherence to Canada’s Food Guide 2019 recommendations on healthy food choices. Appl Physiol Nutr Metab 47, 595–610.

Kirkpatrick, SI, Reedy, J, Krebs-Smith, SM, et al. (2018) Applications of the Healthy Eating Index for surveillance, epidemiology, and intervention research: considerations and caveats. J Acad Nutr Diet 118, 1603.

Bodnar, LM, Cartus, AR, Kirkpatrick, SI, et al. (2020) Machine learning as a strategy to account for dietary synergy: an illustration based on dietary intake and adverse pregnancy outcomes. Am J Clin Nutr 111, 1235–1243.

Hu, FB (2002) Dietary pattern analysis: a new direction in nutritional epidemiology. Curr Opin Lipidol 13, 3–9.

Michels, KB & Schulze, MB (2005) Can dietary patterns help us detect diet-disease associations? Nutr Res Rev 18, 241–248.

Reedy, J, Krebs-Smith, SM, Hammond, RA, et al. (2017) Advancing the science of dietary patterns research to leverage a complex systems approach. J Acad Nutr Diet 117, 1019–1022.

National Academies of Sciences, Engineering, and Medicine; Health and Medicine Division; Food and Nutrition Board; Alice Vorosmarti and Joe Alper, Rapporteurs (2024) The Role of Advanced Computation, Predictive Technologies, and Big Data Analytics in Food and Nutrition Research (Internet). Washington, DC: National Academies Press. https://www.nationalacademies.org/our-work/the-role-of-advanced-computation-predictive-technologies-and-big-data-analytics-in-research-related-to-food-and-nutrition-a-workshop-series (accessed 10 May 2024).

Kirk, D, Kok, E, Tufano, M, et al. (2022) Machine learning in nutrition research. Adv Nutr 13, 2573–2589.

Morgenstern, JD, Rosella, LC, Costa, AP, et al. (2021) Perspective: big data and machine learning could help advance nutritional epidemiology. Adv Nutr 12, 621–631.

Côté, M & Lamarche, B (2021) Artificial intelligence in nutrition research: perspectives on current and future applications. Appl Physiol Nutr Metab 1–8.

Oliveira Chaves, L, Gomes Domingos, AL, Louzada Fernandes, D, et al. (2023) Applicability of machine learning techniques in food intake assessment: a systematic review. Crit Rev Food Sci Nutr 63, 902–919.

Le Glaz, A, Haralambous, Y, Kim-Dufor, DH, et al. (2021) Machine learning and natural language processing in mental health: systematic review. J Med Internet Res 23, e15708.

Payedimarri, AB, Concina, D, Portinale, L, et al. (2021) Prediction models for public health containment measures on COVID-19 using artificial intelligence and machine learning: a systematic review. Int J Environ Res Public Health 18, 4499.

Sidey-Gibbons, JAM & Sidey-Gibbons, CJ (2019) Machine learning in medicine: a practical introduction. BMC Med Res Method 19, 64.

Morgenstern, JD, Buajitti, E, O’Neill, M, et al. (2020) Predicting population health with machine learning: a scoping review. BMJ Open 10, e037860.

JBI (2020) Chapter 11: Scoping Reviews. JBI Manual for Evidence Synthesis (Internet) JBI. https://jbi-global-wiki.refined.site/space/MANUAL/4687342/Chapter+11%3A+Scoping+reviews (accessed 12 September 2023).

Arksey, H & O’Malley, L (2005) Scoping studies: towards a methodological framework. Int J Social Res Method 8, 19–32.

Tricco, AC, Lillie, E, Zarin, W, et al. (2018) PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med 169, 467–473.

Newby, PK & Tucker, KL (2004) Empirically derived eating patterns using factor or cluster analysis: a review. Nutr Rev 62, 177–203.

Krebs-Smith, SM, Subar, AF & Reedy, J (2015) Examining dietary patterns in relation to chronic disease: matching measures and methods to questions of interest. Circulation 132, 790–793.

Covidence (n.d.) Covidence - Better Systematic Review Management (Internet). https://www.covidence.org/ (accessed 12 September 2023).

Gwet, KL (2008) Computing inter-rater reliability and its variance in the presence of high agreement. Br J Math Stat Psychol 61, 29–48.

Cicchetti, DV & Feinstein, AR (1990) High agreement but low kappa: II. Resolving the paradoxes. J Clin Epidemiol 43, 551–558.

Belur, J, Tompson, L, Thornton, A, et al. (2021) Interrater reliability in systematic review methodology: exploring variation in coder decision-making. Sociol Methods Res 50, 837–865.

Biesbroek, S, van der A, DL, Brosens, MCC, et al. (2015) Identifying cardiovascular risk factor-related dietary patterns with reduced rank regression and random forest in the EPIC-NL cohort. Am J Clin Nutr 102, 146–154.

Fonseca, MJ, Gaio, R, Lopes, C, et al. (2012) Association between dietary patterns and metabolic syndrome in a sample of Portuguese adults. Nutr J 11, 64.

Jamiołkowski, J, Szpak, A & Pawłowska, D (2005) Dietary habits of men from Podlasie region of Poland in the years 1987–1998 analysed with self-organizing neural networks. Rocz Akad Med Bialymst 50, 220–224.

Oliveira, A, Rodríguez-Artalejo, F, Gaio, R, et al. (2011) Major habitual dietary patterns are associated with acute myocardial infarction and cardiovascular risk markers in a southern European population. J Am Diet Assoc 111, 241–250.

Wu, Y, Sánchez, BN, Goodrich, JM, et al. (2019) Dietary exposures, epigenetics and pubertal tempo. Environ Epigenet 5, dvz002.

Affret, A, Severi, G, Dow, C, et al. (2017) Socio-economic factors associated with a healthy diet: results from the E3N study. Public Health Nutr 20, 1574–1583.

Bezerra, IN, Bahamonde, NMSG, Marchioni, DML, et al. (2018) Generational differences in dietary pattern among Brazilian adults born between 1934 and 1975: a latent class analysis. Public Health Nutr 21, 2929–2940.

Harrington, JM, Dahly, DL, Fitzgerald, AP, et al. (2014) Capturing changes in dietary patterns among older adults: a latent class analysis of an ageing Irish cohort. Public Health Nutr 17, 2674–2686.

Hearty, AP & Gibney, MJ (2008) Analysis of meal patterns with the use of supervised data mining techniques--artificial neural networks and decision trees. Am J Clin Nutr 88, 1632–1642.

Iqbal, K, Buijsse, B, Wirth, J, et al. (2016) Gaussian graphical models identify networks of dietary intake in a German adult population. J Nutr 146, 646–652.

Schwedhelm, C, Knüppel, S, Schwingshackl, L, et al. (2018) Meal and habitual dietary networks identified through Semiparametric Gaussian Copula Graphical Models in a German adult population. PLoS One 13, e0202936.

Solans, M, Coenders, G, Marcos-Gragera, R, et al. (2019) Compositional analysis of dietary patterns. Stat Methods Med Res 28, 2834–2847.

Dalmartello, M, Decarli, A, Ferraroni, M, et al. (2020) Dietary patterns and oral and pharyngeal cancer using latent class analysis. Int J Cancer 147, 719–727.

Hoang, T, Lee, J & Kim, J (2021) Network analysis of demographics, dietary intake, and comorbidity interactions. Nutrients 13, 3563.

Hose, AJ, Pagani, G, Karvonen, AM, et al. (2021) Excessive unbalanced meat consumption in the first year of life increases asthma risk in the PASTURE and LUKAS2 birth cohorts. Front Immunol 12, 651709.

Madeira, T, Severo, M, Oliveira, A, et al. (2021) The association between dietary patterns and nutritional status in community-dwelling older adults-the PEN-3S study. Eur J Clin Nutr 75, 521–530.

Samieri, C, Sonawane, AR, Lefèvre-Arbogast, S, et al. (2020) Using network science tools to identify novel diet patterns in prodromal dementia. Neurology 94, e2014–25.

Shang, X, Li, Y, Xu, H, et al. (2020) Leading dietary determinants identified using machine learning techniques and a healthy diet score for changes in cardiometabolic risk factors in children: a longitudinal analysis. Nutr J 19, 105.

Wright, DM, McKenna, G, Nugent, A, et al. (2020) Association between diet and periodontitis: a cross-sectional study of 10 000 NHANES participants. Am J Clin Nutr 112, 1485–1491.

Xia, Y, Zhao, Z, Zhang, S, et al. (2020) Complex dietary topologies in non-alcoholic fatty liver disease: a network science analysis. Front Nutr 7, 579086.

Zhao, Y, Naumova, EN, Bobb, JF, et al. (2021) Joint associations of multiple dietary components with cardiovascular disease risk: a machine-learning approach. Am J Epidemiol 190, 1353–1365.

de Almeida Alves, M, Molina, MDCB, da Fonseca, MDJM, et al. (2022) Different statistical methods identify similar population-specific dietary patterns: an analysis of Longitudinal Study of Adult Health (ELSA-Brasil). Br J Nutr 128, 2249–2257.

Farmer, N, Lee, LJ, Powell-Wiley, TM, et al. (2020) Cooking frequency and perception of diet among US adults are associated with US healthy and healthy Mediterranean-style dietary related classes: a latent class profile analysis. Nutrients 12, 3268.

Schwedhelm, C, Lipsky, LM, Shearrer, GE, et al. (2021) Using food network analysis to understand meal patterns in pregnant women with high and low diet quality. Int J Behav Nutr Phys Act 18, 101.

Panaretos, D, Koloverou, E, Dimopoulos, AC, et al. (2018) A comparison of statistical and machine-learning techniques in evaluating the association between dietary patterns and 10-year cardiometabolic risk (2002–2012): the ATTICA study. Br J Nutr 120, 326–334.

Hanson, KL & Connor, LM (2014) Food insecurity and dietary quality in US adults and children: a systematic review. Am J Clin Nutr 100, 684–692.

Doan, N, Olstad, DL, Vanderlee, L, et al. (2022) Investigating the intersections of racial identity and perceived income adequacy in relation to dietary quality among adults in Canada. J Nutr 13, 67S–75S.

Hiza, HAB, Casavale, KO, Guenther, PM, et al. (2013) Diet quality of Americans differs by age, sex, race/ethnicity, income, and education level. J Acad Nutr Diet 113, 297–306.

Olstad, DL, Nejatinamini, S, Victorino, C, et al. (2021) Trends in socioeconomic inequities in diet quality between 2004 and 2015 among a nationally representative sample of children in Canada. J Nutr 151, 3781–3794.

Brassard, D, Munene, LAE, Pierre, SS, et al. (2022) Evaluation of the Healthy Eating Food Index (HEFI)-2019 measuring adherence to Canada’s Food Guide 2019 recommendations on healthy food choices. Appl Physiol Nutr Metab (Internet) 47, 582–594. https://pubmed.ncbi.nlm.nih.gov/35030069/

Wang, A, Ramaswamy, VV & Russakovsky, O (2022) Towards intersectionality in machine learning: including more identities, handling underrepresentation, and performing evaluation. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (Internet). New York, NY, USA: Association for Computing Machinery; pp. 336–49. (FAccT ’22). https://dl.acm.org/doi/10.1145/3531146.3533101 (accessed 01 April 2024).

Robinson, WR, Renson, A & Naimi, AI (2020) Teaching yourself about structural racism will improve your machine learning. Biostatistics 21, 339–344.

Rojas, JC, Fahrenbach, J, Makhni, S, et al. (2022) Framework for integrating equity into machine learning models: a case study. Chest 161, 1621–1627.

Rajkomar, A, Hardt, M, Howell, MD, et al. (2018) Ensuring fairness in machine learning to advance health equity. Ann Intern Med 169, 866–872.

Lachat, C, Hawwash, D, Ocké, MC, et al. (2016) Strengthening the Reporting of Observational Studies in Epidemiology-Nutritional Epidemiology (STROBE-nut): an extension of the STROBE statement. PLoS Med (Internet) 13. https://pubmed-ncbi-nlm-nih-gov.proxy.lib.uwaterloo.ca/27270749/

Liu, X, Cruz Rivera, S, Moher, D, et al. (2020) Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med 26, 1364–1374.

Spicker, D, Nazemi, A, Hutchinson, J, et al. (2023) Challenges for predictive modeling with neural network techniques using error-prone dietary intake data (Internet) arXiv. http://arxiv.org/abs/2311.09338 (accessed 20 February 2024).

Lampignano, L, Tatoli, R, Donghia, R, et al. (2023) Nutritional patterns as machine learning predictors of liver health in a population of elderly subjects. Nutr Metab Cardiovasc Dis 33, 2233–2241.

Slurink, IA, Corpeleijn, E, Bakker, SJ, et al. (2023) Dairy consumption and incident prediabetes: prospective associations and network models in the large population-based Lifelines Study. Am J Clin Nutr 118, 1077–1090.

Kirk, D, Catal, C & Tekinerdogan, B (2021) Precision nutrition: a systematic literature review. Comput Biol Med 133, 104365.

Breiman, L (2001) Random forests. Mach Learn 45, 5–32.

Song, YY & Lu, Y (2015) Decision tree methods: applications for classification and prediction. Shanghai Arch Psychiatry 27, 130–135.

Grossi, E & Buscema, M (2007) Introduction to artificial neural networks. Eur J Gastroenterol Hepatol 19, 1046.

Bianchi, D, Calogero, R & Tirozzi, B (2007) Kohonen neural networks and genetic classification. Math Comput Modell 45, 34–60.

Drton, M & Maathuis, MH (2017) Structure learning in graphical modeling. Ann Rev Stat Appl 4, 365–393. https://doi.org/10.1146/annurev-statistics-060116-053803

Altenbuchinger, M, Weihs, A, Quackenbush, J, et al. (2020) Gaussian and mixed graphical models as (multi-)omics data analysis tools. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1863, 194418.

Koller, D & Friedman, N (2009) Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA: MIT Press.

Liu, H, Han, F, Yuan, M, et al. (2012) High-dimensional semiparametric Gaussian copula graphical models. Ann Stat 40, 2293–2326.

Freund, Y & Schapire, RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55, 119–139.

Bobb, JF, Valeri, L, Claus Henn, B, et al. (2015) Bayesian kernel machine regression for estimating the health effects of multi–pollutant mixtures. Biostatistics 16, 493–508.

Aitchison, J (1982) The statistical analysis of compositional data. J Royal Stat Soc Ser B (Methodol) 44, 139–177.

Rodrigues, LA, Daunis-i-Estadella, P, Figueras, G, et al. (2011) Flying in compositional morphospaces: evolution of limb proportions in flying vertebrates. In Compositional Data Analysis: Theory and Applications, pp. 235–254 [Pawlowsky-Glahn, V and Buccianti, A, editors]. Chichester: Wiley.

Zhang, CH & Huang, J (2008) The sparsity and bias of the Lasso selection in high-dimensional linear regression. Ann Stat 36, 1567–1594.

Oliveira, A, Lopes, C, Torres, D, et al. (2021) Application of a latent transition model to estimate the usual prevalence of dietary patterns. Nutrients 13, 133.

Fahey, MT, Thane, CW, Bramwell, GD, et al. (2007) Conditional Gaussian mixture modelling for dietary pattern analysis. J Royal Stat Soc Ser A (Stat Soc) 170, 149–166.

Cover, T & Thomas, J (2005) Elements of Information Theory (Internet) Wiley. https://books-scholarsportal-info.proxy.lib.uwaterloo.ca/en/read?id=/ebooks/ebooks2/pda/2011–12–01/1/13568.9780471748816#page=1 (accessed 23 August 2023).

Gorst-Rasmussen, A, Dahm, C, Dethlefsen, C, et al. (2011) Exploring dietary patterns by using the treelet transform. Am J Epidemiol 173, 1097–1104.

Lee, AB, Nadler, B & Wasserman, L (2008) Treelets—an adaptive multi-scale basis for sparse unordered data. Ann Appl Stat 2, 435–471.

Table 1. Study characteristics across included studies applying novel methods to characterise dietary patterns

Table 2. Characteristics of studies (n 24) identified in a scoping review of novel analytic methods to characterise dietary patterns

Table 3. Description of dietary patterns (n 24) identified in a scoping review of novel analytic methods to characterise dietary patterns

Table 4. Novel methods applied to identify dietary patterns across included studies*
