Clostridioides difficile (formerly Clostridium difficile) is the most common cause of healthcare-associated diarrhea in the United States and is associated with increased morbidity, mortality, and a mean attributable cost of US$21,448 [1,2]. Elderly individuals, those receiving antibiotics, and those with a prolonged hospital stay are at increased risk of healthcare facility–onset (HO) C. difficile infection (CDI) [3]. Numerous studies have developed diagnostic risk-prediction models to identify the inpatients at greatest risk of developing HO-CDI. These models offer two potential advantages: they can help clinicians (1) optimize patients’ exposure to high-risk antibiotics during hospitalization and (2) target high-risk patients for preventive treatment. However, the accuracy, quality, and clinical utility of these prediction models remain uncertain, which may contribute to their limited adoption in clinical settings.
Prediction models can be categorized as diagnostic or prognostic. Diagnostic models predict the occurrence of a disease state (eg, development of HO-CDI), whereas prognostic models predict outcomes within a specific disease state (eg, recurrence and mortality in patients with CDI) [4]. In the field of infectious diseases, both types of models are commonly used to guide clinical decision making (eg, identifying individuals who would benefit most from Paxlovid) [5]. However, risk models for CDI have not been widely adopted. Understanding the reasons for limited uptake can aid future researchers in model design and reporting. A review of all clinical prediction models published in 6 high-impact journals in 2008 found significant issues, including unclear study designs, lack of prospective validation cohorts, dichotomization of continuous variables, clear overfitting, and absent calibration [6]. The objective of our study was to systematically review and critically assess the methodology, reporting, and generalizability of current diagnostic risk-prediction models for HO-CDI in adults.
Methods
The study procedures, methods, and reporting adhere to the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement and the TRIPOD-SRMA guidelines [7,8]. The guideline checklist is available in the Supplementary Material (online).
Study selection
Eligible articles included those presenting new multivariable diagnostic models, those evaluating prior models with new data for predicting the risk of developing HO-CDI, and those extending previously published models (considered a new model). We defined ‘diagnostic model’ as any statistical model predicting the risk or odds of inpatients developing HO-CDI ≥48 hours after admission [9]. We included studies of any design (eg, prospective or retrospective), language, and publication date before January 21, 2022. We excluded studies that included pediatric patients (aged <18 years) or community-associated CDI, performed solely univariate analysis, used a control group of patients with diarrhea, or described diagnostic models for individuals with an existing diagnosis of CDI.
Search strategy
We conducted simultaneous searches of Medline (Ovid), EMBASE (Ovid), and the Cochrane Library from inception until January 20, 2022. A comprehensive search strategy was developed with a medical librarian (N.N.) using the following terms: (Clostridioides OR Clostridium OR C difficile) AND (predict* OR risk*) AND (model* OR tool*). Indexing terms and keywords were combined with Boolean and proximity operators. No search limits were applied, and an additional 18 records were identified by manually searching a narrative review. The complete search strategy for all queried databases is available in the Supplementary Material (online).
The article selection process occurred in 2 rounds using Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia). Records with abstract-only content (conference proceedings) were excluded because they contained insufficient detail to analyze model methodology. Two independent evaluators (W.P. and J.F.) screened and assessed search results, resolving discrepancies through joint review and consultation with an independent third researcher (A.D.). Full texts of all titles that met the inclusion criteria by majority vote were obtained and evaluated to make a final decision to include or exclude the article. In both rounds, reviewers referenced the systematic review question to determine study inclusion. Articles selected for review were evaluated using prespecified model evaluation criteria informed by a literature review and expert opinion, in addition to the PRISMA guidelines [7].
Evaluation of development and validation of diagnostic models
Data were independently extracted (W.P. and J.F.) using the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) [10]. Discrepancies were resolved through discussion and consultation with a third author (A.D.). Extracted data included sources, countries, study populations, participant types (eg, ICU, medicine ward), the outcome to be predicted, candidate predictors, number of events, sample sizes, missing data, and details of model development, performance, evaluation, data presentation, interpretation, and discussion.
To evaluate the risk-prediction models, we defined a priori the criteria for a successful prediction model: (1) clearly state both the target population and the outcome of interest; (2) ensure a representative sample of adequate size for both model development and validation; (3) use statistical methods appropriate for the research question (eg, decision trees, logistic regression) and use cross-validation, bootstrapping (random sampling with replacement), or independent validation sets for internal and external validity; and (4) evaluate discrimination and calibration using measures such as the area under the receiver operating characteristic (ROC) curve and a calibration plot or table comparing predicted and observed outcome probabilities. We also assessed model performance using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and positive/negative likelihood ratios (LR+/LR−).
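To make these performance measures concrete, the following minimal R sketch (R was used for our analyses) computes them from a confusion matrix; the simulated data and the 0.5 classification threshold are illustrative assumptions rather than values from any reviewed model.

```r
# Minimal sketch: threshold-based performance measures on simulated data.
# The 0.5 threshold and the data-generating process are assumptions.
set.seed(1)
risk    <- runif(200)               # hypothetical predicted risks
outcome <- rbinom(200, 1, risk)     # simulated observed HO-CDI events
pred    <- as.integer(risk >= 0.5)  # illustrative classification threshold

tp <- sum(pred == 1 & outcome == 1); fp <- sum(pred == 1 & outcome == 0)
tn <- sum(pred == 0 & outcome == 0); fn <- sum(pred == 0 & outcome == 1)

sens   <- tp / (tp + fn)            # sensitivity
spec   <- tn / (tn + fp)            # specificity
ppv    <- tp / (tp + fp)            # positive predictive value
npv    <- tn / (tn + fn)            # negative predictive value
lr_pos <- sens / (1 - spec)         # LR+
lr_neg <- (1 - sens) / spec         # LR-
round(c(sens = sens, spec = spec, ppv = ppv, npv = npv,
        lr_pos = lr_pos, lr_neg = lr_neg), 2)
```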
Risk of bias and applicability assessment
Two reviewers (W.P. and J.F.) independently assessed each study for risk of bias (ROB) using PROBAST (Prediction model Risk Of Bias ASsessment Tool) [11]. ROB refers to the potential bias introduced by the study design, conduct, or analysis on the predictive performance of the model. PROBAST provides signaling questions that assess ROB in 4 domains (participants, predictors, outcome, and analysis) and concerns regarding applicability in 3 domains (participants, predictors, and outcome); each domain is rated as high, low, or unclear. Concerns for applicability arise when study elements do not align with the review question (eg, studies conducted in large academic medical centers in major cities might not be applicable to rural settings).
Statistical analyses
Categorical factors were summarized as frequencies and percentages, whereas continuous measures were described as means, standard deviations (SD), medians, and interquartile ranges (IQRs). For each model, we calculated the events per variable (EPV), the ratio of the number of outcome events to the number of candidate model coefficients, and we used a correlation coefficient to relate the number of final-model coefficients to the number of events. An EPV >10 is generally accepted to maintain bias and variability at acceptable levels [12]. No meta-analysis (quantitative synthesis of the data) was conducted. All analyses were performed using R version 4.1.0 statistical software (R Core Team, Vienna, Austria).
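As a minimal sketch of these calculations (the per-study counts below are illustrative, not our extracted data):

```r
# Illustrative per-study counts; not the values extracted in this review.
events     <- c(120, 300, 85, 500)  # outcome events per study
candidates <- c(25, 12, 40, 6)      # candidate predictors per study
coefs      <- c(8, 5, 11, 4)        # coefficients retained in each final model

epv <- events / candidates          # events per variable; EPV > 10 is the usual rule of thumb
cor(coefs, log(events))             # correlation of final-model size with log(events)
```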
Results
Study selection
After full-text review, 11 articles were eligible (Fig. 1). Two groups developed an index model that was validated on external data in a separate, later study (one of these groups proposed an additional, updated model) [13–16]. Two studies reported 2 risk-prediction models each; thus, our review assessed a total of 12 risk-prediction models from 11 articles [17,18]. We used 11 as the denominator when referring to studies and 12 when referring to prediction models.
Study details
In total, 10 studies (91%) were conducted in North America and 1 (9%) in Europe; no studies from other regions were identified. Moreover, 6 studies (55%) were conducted at a single healthcare facility, and 5 (45%) were multicenter studies spanning 2 to 6 centers.
Data sources
All studies used retrospective data from electronic medical records for model development. Three models (25%) were validated with bootstrapped simulations (Table 1) [18,19]. Most studies (n = 9, 82%) did not explicitly state how missing values were handled; of the remaining 2, 1 used complete case analysis and 1 imputed missing values [15,20].
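For context, bootstrap validation typically estimates the optimism of an apparent performance measure by refitting the model on resampled data. The following sketch of optimism-corrected AUROC uses the pROC package on simulated data; the single-predictor model and 200 resamples are assumptions for illustration.

```r
library(pROC)

set.seed(7)
n   <- 500
x   <- rnorm(n)
y   <- rbinom(n, 1, plogis(-2 + x))   # simulated development cohort
dat <- data.frame(x, y)

fit     <- glm(y ~ x, family = binomial, data = dat)
auc_app <- as.numeric(auc(dat$y, predict(fit, type = "response")))  # apparent AUROC

optimism <- replicate(200, {
  boot <- dat[sample(n, replace = TRUE), ]   # bootstrap resample
  f    <- glm(y ~ x, family = binomial, data = boot)
  as.numeric(auc(boot$y, predict(f, type = "response"))) -               # AUROC on resample
    as.numeric(auc(dat$y, predict(f, newdata = dat, type = "response"))) # AUROC on original data
})
auc_app - mean(optimism)                     # optimism-corrected AUROC
```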
Participants
The number of participants in the model development and validation set(s) was clearly defined for all studies. The median number of participants used in model development was 49,231 (IQR, 367–68,809), and the median number of events used to develop the models was 303 (IQR, 146–504) (Table 2).
Outcome
Most studies (n = 10, 91%) identified patients with CDI using either a toxin enzyme immunoassay (EIA) targeting glutamate dehydrogenase (GDH) or toxin A/B and/or a nucleic acid amplification test (NAAT) for the toxin B gene. Only 2 studies used a 2-step algorithm (EIA and NAAT) to identify patients. One study considered only patients who had a positive cytotoxicity assay (CTA) (Supplementary Table S1 online) [21]. Also, 8 studies (73%) defined HO-CDI as CDI ≥48 hours after hospital admission; 2 studies (18%) defined HO-CDI as CDI ≥72 hours after admission; and 1 (9%) did not provide a clear definition (Supplementary Table S2 online).
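To make the 2-step approach concrete, the sketch below encodes one common arbitration scheme (GDH/toxin EIA screening with NAAT arbitration of discordant results); the exact rules varied by study, so this logic is an illustrative assumption.

```r
# Hypothetical 2-step CDI testing algorithm: EIA screen, NAAT arbitration.
# The arbitration rules are illustrative assumptions, not a study protocol.
classify_cdi <- function(gdh_pos, toxin_pos, naat_pos) {
  if (gdh_pos && toxin_pos)   return("CDI")     # concordant positive screen
  if (!gdh_pos && !toxin_pos) return("no CDI")  # concordant negative screen
  if (naat_pos) "probable CDI (toxigenic strain detected)" else "no CDI"
}

classify_cdi(gdh_pos = TRUE, toxin_pos = FALSE, naat_pos = TRUE)
```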
Variable selection
In total, 6 studies (55%) limited predictors to those present at admission (static), whereas 5 studies (45%) included clinical values that fluctuated throughout the hospital admission (dynamic). All but 1 model discretized some or all continuous variables. Most studies (n = 7, 64%) selected variables using stepwise approaches (forward or backward), whereas the others used methods less susceptible to selection bias (L2 shrinkage or clustering methods) (Table 3). Final models included from 2 to 20 coefficients (median, 5.5; IQR, 4–11.25). The most frequent predictors across all models were age (67%), receipt of ‘high-risk antibiotics’ as defined by literature available at the time of model publication (42%), and a history of either hospitalization or CDI (each 33%) (Fig. 2 and Supplementary Table S3 online). The median EPV of the included studies was 9.1 (IQR, 0.8–18.6), and 4 studies had an EPV <10 [12]. The number of coefficients in each model was modestly correlated with the log of the number of events in the sample (r = 0.67) (Fig. 3).
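For contrast, the sketch below applies both selection strategies to simulated data: stepwise selection by AIC (via MASS::stepAIC) versus L2-penalized (ridge) logistic regression (via the glmnet package). The data-generating process and tuning choices are assumptions.

```r
library(glmnet)
library(MASS)   # stepAIC

set.seed(3)
n <- 400; p <- 10
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- rbinom(n, 1, plogis(-2 + X[, 1] + 0.5 * X[, 2]))  # only x1 and x2 are informative
dat <- data.frame(X, y)

# Stepwise selection: prone to optimism and unstable variable choices
step_fit <- stepAIC(glm(y ~ ., family = binomial, data = dat), trace = FALSE)
names(coef(step_fit))

# L2 (ridge) penalization: shrinks all coefficients instead of selecting a subset
cv <- cv.glmnet(X, y, family = "binomial", alpha = 0)   # alpha = 0 selects ridge
coef(cv, s = "lambda.min")
```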
Model validation, performance, and presentation
All studies reported the area under the ROC curve (AUROC) as a discriminatory measure, with point estimates ranging from 0.68 (fair) to 0.95 (excellent) for final models (median, 0.80; IQR, 0.75–0.80). Most studies (n = 8, 73%) did not report any calibration metrics (agreement between predictions and observations), and 3 studies displayed calibration plots to demonstrate the accuracy of their models over the entire probability range (Table 1) [15,18,19]. Of the 7 models reporting values on both derivation and validation sets, 3 showed a performance decline from development to validation, and 2 reported unexpected improvements in performance on their validation data sets (Supplementary Table S4 online). Most studies had a high NPV but a low PPV, with prevalence ranging from 4.1 to 68.1 per 1,000 persons (Supplementary Table S5 online).
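A decile-based calibration plot of the kind displayed by those 3 studies can be sketched as follows (the simulated predictions and outcomes are assumptions):

```r
set.seed(11)
n    <- 2000
pred <- plogis(rnorm(n, -3, 1))   # hypothetical predicted risks
obs  <- rbinom(n, 1, pred)        # simulated observed outcomes

bins     <- cut(pred, quantile(pred, 0:10 / 10), include.lowest = TRUE)
pred_bar <- tapply(pred, bins, mean)   # mean predicted risk per decile
obs_bar  <- tapply(obs, bins, mean)    # observed event rate per decile

plot(pred_bar, obs_bar, xlab = "Mean predicted risk",
     ylab = "Observed event rate", main = "Calibration by risk decile")
abline(0, 1, lty = 2)                  # line of perfect calibration
```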
ROB and applicability to review question
Individual domain-specific and overall judgements for the ROB and concerns for applicability are displayed graphically in Figure 4. Here, we briefly summarize the key results from each domain.
- Participants: In the participants domain, almost all studies (n = 9, 82%) had a low ROB. Only 1 study had a high ROB: the screening procedure was not specified, nor were the exclusion criteria justified in the text [14].
- Predictors: Most studies (n = 10, 91%) had a low ROB introduced by the predictors or their assessment. One study used variables highly specific to cirrhosis, which are unlikely to generalize externally [17].
- Outcome: Most studies (n = 7, 64%) had a low ROB introduced by the outcome or its determination. Three studies had a high ROB due to the inclusion of diarrhea (a predictor in their models) in their outcome definition, and a fourth study changed case definitions over the course of data collection.
- Analysis: All studies were at high risk of bias in the analysis domain. Most studies (n = 7, 64%) used univariable analysis to select predictors [13,14,16,17,19,21,22]. All but 2 studies had either unclear or high ROB due to the handling of missing data (either omitting the information or using only complete cases), and almost all did not account for model overfitting and optimism.
All studies had a high overall ROB (n = 11, 100%); however, most studies had low concern for applicability (n = 7, 64%) (Supplementary Fig. S1 online).
Discussion
In this systematic review, we assessed the quality and clinical utility of 12 diagnostic prediction models for HO-CDI. All studies clearly identified the study design, participant selection process, and outcome definition. Most studies relied on either toxin EIA or NAAT as a standalone test to identify patients with CDI, and only 2 studies used a 2-step diagnostic algorithm. The most common predictors across all models included advanced age, receipt of high-risk antibiotics, history of hospitalization, and history of CDI. We identified several methodological issues and reporting deficiencies in the prediction models. Specifically, most studies did not report the occurrence and handling of missing data or model calibration results, and none of the models was externally validated. These issues, along with other limitations, contributed to the high ROB and limited the reliability and clinical utility of these models.
We evaluated models developed using inpatient data to predict the diagnosis of HO-CDI and assessed their quality and clinical utility. All the reviewed studies exhibited a high ROB according to the PROBAST review tool. Notably, only 2 studies explicitly described how they handled missing data [15,20], raising concerns about bias and the reliability of conclusions drawn from these data [23]. One study used complete case analysis without providing a compelling argument that data were missing completely at random, which can bias coefficient estimates [24]. Almost all models discretized some or all continuous predictors, diminishing their predictive power and external validity [25].
What is known in the literature and what our study adds
A narrative review published in 2021 discussed models predicting the risk of incident or recurrent CDI [26]. The researchers emphasized the need to develop a validated incident-CDI model and highlighted the key challenges to model portability. Our report builds on these findings by systematically evaluating and reporting the risk of bias in each study. A systematic review published in 2022 evaluated machine learning models for predicting CDI and its outcomes [27]. It highlighted that (1) most studies did not use an algorithmic definition to identify CDI electronically and the definitions used were highly variable, and (2) none of the studies reported validation of those definitions. Our systematic review is more comprehensive, focusing on all types of diagnostic risk-prediction models for HO-CDI, and it underscores the high risk of bias and reporting issues present in these models. We identified 4 broad areas for improvement in future studies.
Firstly, our systematic review identified several methodological issues. Prediction models are prone to overfitting their training data or learning spurious correlations [28]. We evaluated the risk of overfitting by calculating the EPV for each study. The calculated EPV in most studies was <10; however, we may have overestimated it because it was calculated using the number of candidate predictors instead of the number of parameters. A threshold of 10 was proposed by Peduzzi et al, but effect strength and correlation structure may influence this value [12,29]. Regardless, the median EPV in this analysis was less than the proposed threshold, with the notable exceptions of Cooper et al [22] in 2013 and Davis et al [20] in 2018 (EPV, 81.3 and 92.6, respectively). Most models in our study used a stepwise approach to variable selection and model building, which can introduce bias in the estimation of the regression coefficients [29,30]. It has been suggested that variable selection should be accompanied by a stability investigation, a step not clearly described in any of the studies [29].
Secondly, all studies exhibited a high ROB in the analysis phase of model development, as assessed by PROBAST. Many studies (n = 7, 64%) used a significance threshold (eg, P < .05) as the criterion for including predictors in the final model. However, this method can result in errors of inclusion or exclusion because individual predictors are tested for significance only with respect to the outcome, not in the context of information from other predictors [31,32].
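A small simulation illustrates this pitfall: a proxy variable that merely correlates with a true predictor passes a univariable screen but carries no independent information (the data-generating process is an assumption for illustration).

```r
set.seed(42)
n  <- 1000
x1 <- rnorm(n)                         # true predictor
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)    # proxy: correlated with x1, no direct effect
y  <- rbinom(n, 1, plogis(-2 + 1.2 * x1))

# Univariable screen: both predictors appear significant
summary(glm(y ~ x1, family = binomial))$coefficients["x1", ]
summary(glm(y ~ x2, family = binomial))$coefficients["x2", ]

# Multivariable model: the proxy's apparent effect disappears
summary(glm(y ~ x1 + x2, family = binomial))$coefficients
```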
Thirdly, most studies (n = 9, 82%) did not clearly report how they handled missing data [13–19,22,33]. Excluding missing data without examining the pattern of missingness or using imputation methods can introduce bias in the regression coefficients of prediction models [34]. Consequently, all studies in our analyses were judged to have an overall high ROB.
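As one established alternative to complete case analysis, the sketch below applies multiple imputation with the mice package and pools estimates across imputations using Rubin's rules; the data set and imputation settings are illustrative assumptions.

```r
library(mice)

set.seed(5)
dat <- data.frame(
  age = rnorm(300, 65, 10),
  abx = rbinom(300, 1, 0.4),   # high-risk antibiotic exposure (illustrative)
  cdi = rbinom(300, 1, 0.1)    # HO-CDI outcome (illustrative)
)
dat$age[sample(300, 60)] <- NA # introduce missingness for the example

imp  <- mice(dat, m = 5, method = "pmm", seed = 5, printFlag = FALSE)
fits <- with(imp, glm(cdi ~ age + abx, family = binomial))
summary(pool(fits))            # estimates pooled by Rubin's rules
```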
Lastly, model performance was difficult to compare because of inconsistent reporting. Although the AUROC is the most frequently reported metric for evaluating discrimination, individual performance varied widely. In addition, most models did not report independent derivation and validation sets and lacked calibration metrics. The high NPV and low PPV observed in most models reflect the low prevalence of CDI in the included studies. These methodological issues raise concerns about the reliability and generalizability of the models’ findings.
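The dependence of predictive values on prevalence follows directly from Bayes’ theorem; the sketch below evaluates PPV and NPV at the prevalence extremes reported across the included studies (the sensitivity and specificity values are assumptions).

```r
# PPV and NPV as functions of prevalence (Bayes' theorem).
# Sensitivity and specificity values are illustrative assumptions.
sens <- 0.80
spec <- 0.80
prev <- c(0.0041, 0.0681)  # 4.1 and 68.1 per 1,000 persons

ppv <- sens * prev / (sens * prev + (1 - spec) * (1 - prev))
npv <- spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
round(cbind(prev, ppv, npv), 3)
```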
Our study had several limitations. We did not include abstracts and conference proceedings in our systematic review because they did not have sufficient data on model development or validation for ROB assessment. It is also possible that some studies were published outside the searched databases, although we attempted to minimize bias by manually searching references and including gray literature. The included studies varied in terms of design, diagnostic tests used to detect CDI, definition of CDI, and analytical methods. In many studies, the data were either insufficient or unavailable for further evaluation.
Future recommendations
Future studies should consider using a diagnostic algorithm to identify patients with active CDI because it may improve model performance. As recommended by the TRIPOD guidelines, diagnostic models should undergo validation on an external prospective cohort to ensure their external validity and generalizability [35]. Lastly, the use of tools such as PROBAST can help identify sources of bias that often distort model performance.
In conclusion, the diagnostic models for HO-CDI exhibited considerable variation in their ability to estimate the risk of CDI. Most of these models demonstrated methodological shortcomings and provided inadequate information about calibration for feasible implementation in other clinical settings. Future studies should externally validate their models to improve generalizability and clinical uptake. A robust model can help identify high-risk patients, enabling clinicians to optimize exposure to high-risk antibiotics during hospitalization and to target patients for preventive treatment.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/ice.2023.185
Acknowledgments
Analytic code is available via GitHub at https://github.com/pattwm16/CDIModelSysReview. Any other materials used in the review are available via reasonable request to the authors.
Financial support
No financial support was provided relevant to this article.
Conflicts of interest
A.D. has received research support from Clorox and Seres Therapeutics not related to this study and is a consultant for Merck. All other authors have no conflicts of interest relevant to this article.