The coronavirus disease 2019 (COVID-19) hospitalization rate is a key surveillance metric used by public health officials to estimate the population burden of severe infections with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). COVID-19 hospitalization rates are also used by the US Centers for Disease Control and Prevention (CDC), in conjunction with case counts and the percentage of inpatient beds occupied by COVID-19 patients, to estimate community COVID-19 risk levels that in turn inform recommendations such as indoor masking. 1
The CDC initially defined “COVID-19 hospitalizations” as any person hospitalized within 14 days of a positive PCR result for SARS-CoV-2, regardless of the patient’s presenting syndrome or reason for admission. 2 This definition initially served well for estimating the burden of severe illness, but widespread vaccination, universal testing, prolonged PCR positivity after infection, increasing rates of prior infection, and new and potentially milder SARS-CoV-2 variants such as Omicron have challenged the validity of this measure as a severity indicator. High community infection rates will lead to some patients hospitalized for reasons other than COVID-19 testing positive for SARS-CoV-2, including patients with mild, asymptomatic, or resolving infections. These so-called “incidental” SARS-CoV-2–positive patients are still counted by the traditional CDC definition as COVID-19 hospitalizations without differentiating them from patients hospitalized specifically for COVID-19.
Public health agencies and hospital officials have therefore proposed, and in many cases implemented, alternative definitions to identify hospitalizations specifically due to COVID-19 illness. These typically require receipt of SARS-CoV-2 therapeutics (eg, dexamethasone or remdesivir) or need for supplemental oxygen in addition to a positive PCR. Reference Fillmore, La and Zheng3–Reference Fatima5 Large cohort studies have also used different approaches for defining COVID-19 hospitalizations, including a positive PCR alone, Reference Gershengorn, Patel and Shukla6–Reference Myers, Mark and Ley9 International Classification of Disease, Tenth Revision, Clinical Modification (ICD-10-CM) codes for COVID-19, Reference Song, Zhang, Patterson, Barnes and Haas10–Reference Musshafen, El-Sadek, Lirette, Summers and Compretta18 institutional definitions, or combinations of these. Reference Bhavani, Verhoef and Maier19–Reference Chomistek, Liang and Doherty25 Despite the many definitions in use, few data are available that compare estimates of COVID-19 hospitalizations, severity of illness, mortality, and trends between definitions or that assess their accuracy in identifying primary or contributing versus incidental infections.
In this study, we assessed the impact of different commonly used definitions for COVID-19 hospitalization in a regional health system on case counts, disease severity, and in-hospital mortality in the period when the Omicron variant was predominant versus the period preceding the Omicron variant. In addition, we evaluated how each definition performed at identifying hospitalizations due to COVID-19 versus hospitalizations with incidental COVID-19 using detailed medical record reviews.
Study design and methods
We performed a retrospective cohort study using electronic health record (EHR) and administrative data from 2 large academic hospitals and 3 community hospitals within the Mass General Brigham healthcare system: Massachusetts General Hospital, Brigham and Women’s Hospital, Salem Hospital/Northshore Medical Center, Newton Wellesley Hospital, and Brigham and Women’s Faulkner Hospital. The study included all hospitalizations of adults aged ≥18 years, including inpatient admissions and observation stays, as well as emergency department (ED) visits ending in death, with admission dates between March 1, 2020, and March 1, 2022. ED visits ending in death were included to ensure capture of all potential cases of severe disease. Transfers between study hospitals and readmissions on the same date as discharge were treated as continuous encounters. The study was approved with a waiver of informed consent by the institutional review board at Mass General Brigham (protocol no. 2020P001631).
COVID-19 hospitalization definitions
We assessed 6 definitions for COVID-19 hospitalizations modeled after existing strategies currently being used for public health surveillance, clinical monitoring, and/or research 1,Reference Fillmore, La and Zheng3,Reference Murray and Wachter4,26,Reference Shappell, Klompas, Kanjilal, Chan and Prevalence27 :
1. PCR only: positive PCR for SARS-CoV-2 between 14 days prior to admission and discharge.
2. PCR plus hypoxemia: positive PCR (using the same timeframe of 14 days preadmission through discharge) and the patient either required supplemental oxygen for any amount of time or had at least 1 oxygen saturation <94% recorded in vital signs during hospitalization.
3. PCR plus dexamethasone: positive PCR and received at least 1 dose of dexamethasone during hospitalization.
4. PCR plus remdesivir: positive PCR and received at least 1 dose of remdesivir during hospitalization.
5. COVID-19 flag: presence of an institutional EHR-based COVID-19 flag maintained for ≥5 days with start and end dates overlapping with the hospitalization. Institutional COVID-19 flags were triggered by a positive PCR result but could be removed by local infection control officials if review of patients’ clinical syndrome, initial and repeat PCR tests, cycle thresholds, prior history of known infections, and SARS-CoV-2 anti-nucleocapsid antibody status suggested that the positive PCR was more indicative of a remote infection or a false-positive result. Reference Rhee, Baker and Kanjilal28
6. ICD-10: an ICD-10 discharge diagnosis code for COVID-19 (U07.1 or J12.82).
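The 6 definitions above can be read as simple rules applied to each encounter. The following is a minimal sketch of one way to express them, not the study's actual implementation. The `Encounter` class and all of its field names are hypothetical, and the way the "≥5 days" flag duration is counted (end date minus start date) is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List, Optional, Set

# ICD-10 codes named in definition 6.
COVID_ICD10_CODES = {"U07.1", "J12.82"}


@dataclass
class Encounter:
    # All field names are hypothetical; a real EHR extract will differ.
    admit: date
    discharge: date
    positive_pcr_dates: List[date] = field(default_factory=list)
    supplemental_oxygen: bool = False
    min_spo2: Optional[float] = None      # lowest recorded oxygen saturation (%)
    dexamethasone_doses: int = 0
    remdesivir_doses: int = 0
    flag_start: Optional[date] = None     # institutional COVID-19 flag dates
    flag_end: Optional[date] = None
    icd10_codes: Set[str] = field(default_factory=set)


def has_qualifying_pcr(e: Encounter) -> bool:
    """Positive PCR between 14 days before admission and discharge."""
    window_start = e.admit - timedelta(days=14)
    return any(window_start <= d <= e.discharge for d in e.positive_pcr_dates)


def classify(e: Encounter) -> Dict[str, bool]:
    """Return which of the 6 definitions the encounter meets."""
    pcr = has_qualifying_pcr(e)
    hypoxemia = e.supplemental_oxygen or (e.min_spo2 is not None and e.min_spo2 < 94)
    # Assumption: flag duration is end minus start, and overlap is inclusive.
    flag = (
        e.flag_start is not None
        and e.flag_end is not None
        and (e.flag_end - e.flag_start).days >= 5
        and e.flag_start <= e.discharge
        and e.flag_end >= e.admit
    )
    return {
        "pcr_only": pcr,
        "pcr_hypoxemia": pcr and hypoxemia,
        "pcr_dexamethasone": pcr and e.dexamethasone_doses >= 1,
        "pcr_remdesivir": pcr and e.remdesivir_doses >= 1,
        "covid_flag": flag,
        "icd10": bool(e.icd10_codes & COVID_ICD10_CODES),
    }


# Illustrative encounter (made-up values, not study data).
example = Encounter(
    admit=date(2022, 1, 10),
    discharge=date(2022, 1, 15),
    positive_pcr_dates=[date(2022, 1, 2)],
    min_spo2=92.0,
    remdesivir_doses=1,
    icd10_codes={"U07.1", "I50.9"},
)
print(classify(example))
```

Because each rule is evaluated independently, a single encounter can meet several definitions at once, which is why the cohorts identified by the definitions overlap.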
Outcomes
Each definition was applied across all encounters to calculate crude hospitalization counts, percentage of hospitalizations related to COVID-19, and in-hospital mortality rates by month. We also assessed how outcome estimates varied by definition in the period preceding the Omicron variant (March 1, 2020–December 16, 2021) versus the period when the Omicron variant was predominant (December 17, 2021–March 1, 2022). We determined these periods in accordance with the emergence and predominance of the Omicron variant in Massachusetts. 29
Assessing accuracy of definitions for primary, contributing, or incidental COVID-19
From 15,436 PCR-positive encounters, 100 cases were randomly selected for structured medical record reviews using a standardized data abstraction tool in REDCap version 12.0.19 software (Vanderbilt University, 2022) (Supplementary Material online). Of these, 50 cases were selected at random from each of the periods before and after the emergence of the Omicron variant. All available notes, medication records, laboratory and microbiology results, radiology reports and images, and pathology reports were reviewed.
Each case was adjudicated into 1 of 3 categories: (1) primary COVID-19 admission, (2) contributing COVID-19 admission, and (3) incidental COVID-19 admission. A primary COVID-19 admission was defined as an encounter in which the patient presented with a syndrome definitely or probably due to SARS-CoV-2 infection (eg, COVID-19 pneumonia or COVID-related myocarditis). A contributing COVID-19 admission was defined as any encounter not meeting primary COVID-19 criteria but likely triggered by or related to SARS-CoV-2 infection (eg, exacerbation of underlying disease such as congestive heart failure, chronic lung disease, or arrhythmia) or an encounter during which the patient presented for non–COVID-19 reasons, developed COVID-19 after admission, and had complications from the infection such as medically prolonged stay, ICU transfer, or death. Notably, receipt of a COVID-19 therapeutic in and of itself was not considered evidence of primary or contributing COVID-19 hospitalization. Incidental COVID-19 hospitalizations were those in which SARS-CoV-2 was not relevant to the syndrome causing admission and did not cause complications. Positive tests deemed to be false positives or residual RNA from previous infection were categorized as “incidental.” Reference Rhee, Baker and Kanjilal28 Please see Supplementary Table 1 (online) for complete descriptions of these categories and representative examples.
The first 15 cases were reviewed independently by 2 physician reviewers (C.S. and C.R.); interrater reliability for classifying COVID-19–relevance categories was moderate to strong (agreement on 13 of 15 cases; Krippendorff α, 0.77). All 15 cases were discussed between the 2 reviewers to make final adjudications for the 2 discrepant classifications and to ensure a standardized process moving forward. The remaining 85 cases were reviewed by 1 physician (C.S.); the 12 cases in which classifications were unclear were subsequently discussed with 2 additional reviewers (C.R. and M.K.) to achieve consensus.
Statistical analysis
Patient characteristics and outcomes were compared across groups using χ² tests for categorical variables and ANOVA tests for continuous variables. Comparisons between the periods before and after the emergence of the Omicron variant were performed for each definition by calculating incidence of COVID-19 cases per 100 admissions and incidence of ICU admissions, need for mechanical ventilation, and in-hospital deaths per 100 COVID-19 cases. We then calculated incidence rate ratios (IRRs) comparing the period after with the period before the emergence of the Omicron variant.
The proportions of COVID-19 hospitalizations due to primary or contributing COVID-19 versus incidental COVID-19 per medical record review were compared across definitions. For each definition, we calculated the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and positive and negative predictive values.
Analyses were conducted using Stata version 17 software (StataCorp LLC, College Station, TX). For all analyses, P < .05 was considered statistically significant.
Results
Study cohort
The study cohort included 306,387 hospital encounters associated with 197,434 unique individuals. Overall, 15,436 (5.0%) of 306,387 encounters met the primary PCR-based definition (positive PCR between 14 days prior to admission and discharge). Compared to patients in hospital encounters without a positive PCR test, patients in encounters meeting the PCR definition were slightly older (median age 62 vs 60 years), more likely to be male (51.8% vs 43.9%), and less likely to be of white race (59.8% vs 75.2%) (Table 1).
Note. DC, discharge; IQR, interquartile range; ICD-10, International Classification of Disease, Tenth Revision.
a Comorbidities were derived using the Elixhauser index. 38,Reference Elixhauser, Steiner, Harris and Coffey39 Cancer includes solid tumor with and without metastases and lymphoma. Diabetes includes diabetes with and without complications. Neurologic disease includes movement disorders, seizures, and other neurologic conditions. Kidney disease includes moderate and severe renal failure.
COVID-19 hospitalizations, clinical characteristics, and outcomes across definitions
Clinical characteristics and outcomes for all 6 definitions are shown in Table 2. The proportions of encounters meeting criteria for COVID-19 hospitalization were 1.5% for PCR plus dexamethasone, 1.9% for PCR plus remdesivir, 3.9% for PCR plus hypoxemia, 5.0% for PCR only, 5.1% for institutional COVID-19 flag, and 5.2% for COVID-19 ICD-10 codes. These proportions varied substantially over time (Fig. 1). In-hospital mortality rates ranged from 8.3% for PCR only to 12.2% for PCR plus dexamethasone, compared with 2.2% for all non–COVID-19 encounters.
Note. PCR, polymerase chain reaction assay; Hypox, hypoxemia; Dex, dexamethasone; RDV, remdesivir; EHR, electronic health record; ICD-10, International Classification of Disease, Tenth Revision; ICU, intensive care unit; LOS, length of stay; MV, mechanical ventilation.
Before and after the predominance of the Omicron variant
Overall, 30,273 (9.9%) of 306,387 encounters occurred during the Omicron variant period versus 276,114 (90.1%) during the period preceding the Omicron variant. However, among PCR-positive encounters, 3,424 of 15,436 (22.2%) occurred during the Omicron period. Median duration of mechanical ventilation was substantially shorter for COVID-19 encounters during the Omicron period across all 6 definitions: 4 versus 11 days for PCR only and 6 versus 13 days for PCR plus dexamethasone. Incidence rate ratios and their respective confidence intervals for ICU admission and the need for mechanical ventilation were <1 across all definitions during the Omicron period, indicating lower risk of these outcomes after the emergence of the Omicron variant than before. In-hospital mortality was lower during the Omicron period for PCR only, PCR plus hypoxemia, institutional flag, and ICD-10 definitions. However, mortality was similar during the periods before and after the emergence of the Omicron variant for PCR plus dexamethasone and PCR plus remdesivir definitions (Table 3). A sensitivity analysis limiting the period before the emergence of the Omicron variant to November 1, 2020, to December 16, 2021 (ie, when the use of dexamethasone and remdesivir to treat COVID-19 was well established) yielded similar results (Supplementary Table 4 [online]).
Note. IRR, incidence rate ratio; CI, confidence interval; PCR, polymerase chain reaction assay.
a For COVID-19 hospitalization, incidence is the number of COVID-19 cases per 100 encounters; for ICU admission, mechanical ventilation, and in-hospital mortality, incidence is the number of outcome cases per 100 COVID-19 encounters based on given definition.
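As a worked illustration of the incidence and IRR definitions, the sketch below uses the PCR-only counts reported in the text (3,424 PCR-positive encounters among 30,273 Omicron-period encounters; the remaining 12,012 among 276,114 earlier encounters). The function name is hypothetical, and the confidence interval method (a Wald interval on the log scale, assuming Poisson counts) is an assumption; the study does not state how its intervals were computed, so these values need not match Table 3.

```python
import math
from typing import Tuple


def incidence_rate_ratio(
    events_after: int, denom_after: int,
    events_before: int, denom_before: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """IRR (after vs before) with an approximate 95% CI.

    Assumption: Wald interval on the log scale, treating event counts
    as Poisson. This is one common approach, not necessarily the study's.
    """
    rate_after = events_after / denom_after
    rate_before = events_before / denom_before
    irr = rate_after / rate_before
    se_log = math.sqrt(1 / events_after + 1 / events_before)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Counts taken from the text for the PCR-only definition.
omicron_cases, omicron_encounters = 3424, 30273
pre_cases = 15436 - omicron_cases          # 12,012 PCR-positive encounters
pre_encounters = 276114

irr, lower, upper = incidence_rate_ratio(
    omicron_cases, omicron_encounters, pre_cases, pre_encounters
)
print(f"Incidence per 100 encounters, Omicron: {100 * omicron_cases / omicron_encounters:.1f}")
print(f"Incidence per 100 encounters, pre-Omicron: {100 * pre_cases / pre_encounters:.1f}")
print(f"IRR = {irr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

Under these assumptions the IRR for PCR-positive hospitalization is about 2.6, consistent with the much higher case incidence during the Omicron period described in the text.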
Distinguishing primary or contributing versus incidental infections
Among 100 cases reviewed, 45 met criteria for primary COVID-19 hospitalizations, 17 were COVID-19 contributing, and 38 were COVID-19 incidental. Among the incidental COVID-19 hospitalizations, 19 cases had PCR results deemed to be a false positive or residual RNA from a previous recovered infection based upon the patient’s clinical syndrome, PCR cycle threshold values, repeat test results, and/or timing of recent infections. Proportions of primary, contributing, and incidental cases differed significantly (P < .001) between the periods before and after the emergence of the Omicron variant: 30 (60%) of 50 cases were primary before the Omicron variant versus 15 (30%) of 50 cases after the Omicron variant; 8 (16%) of 50 cases were contributing before the Omicron variant versus 9 (18%) of 50 cases after the Omicron variant; and 12 (24%) of 50 cases were incidental before the Omicron variant versus 26 (52%) of 50 cases after the Omicron variant.
The performance characteristics for each definition are summarized in Figure 2 and Table 4. PCR plus remdesivir had the highest positive predictive value (PPV) (90.0%; 95% confidence interval [CI], 76.3–97.2) and AUROC (0.74; 95% CI, 0.66–0.82) for a COVID-19 primary or contributing hospitalization. However, this definition had only moderate sensitivity (58.1%; 95% CI, 44.8–70.5) and negative predictive value (NPV) (56.7%; 95% CI, 43.2–69.4). The ICD-10–based definition had the highest sensitivity (98.4%; 95% CI, 91.3–100) and NPV (93.8%; 95% CI, 69.8–99.8) but poor specificity (39.5%; 95% CI, 24.0–56.6) and a fair PPV (72.6%; 95% CI, 61.8–81.8). In general, the performances of the other definitions were poor to moderate, with AUROCs for primary or contributing hospitalizations ranging from 0.57 (95% CI, 0.47–0.66) for PCR plus hypoxemia to 0.69 (95% CI, 0.61–0.77) for ICD-10 codes. The performances for each definition stratified by periods before and after the emergence of the Omicron variant can be found in Supplementary Table 2 (online).
Note. CI, confidence interval; PCR, polymerase chain reaction assay; ICD-10, International Classification of Disease, Tenth Revision; AUROC, area under the receiver operating curve; NPV, negative predictive value; PPV, positive predictive value.
a Sensitivity, specificity, AUROC, and NPV left blank for definition 1 as all reviewed encounters met the PCR-only criteria.
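The relationship among the reported performance measures can be illustrated with a small calculation. The 2×2 counts below are not reported in the text; they are back-calculated here from the reported percentages and the chart review totals (62 primary or contributing and 38 incidental cases), so they should be treated as an assumption. The function name is hypothetical. For a single binary definition, the AUROC equals the mean of sensitivity and specificity.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard accuracy measures for a binary definition vs chart review."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # AUROC of a single binary classifier is the mean of
        # sensitivity and specificity (balanced accuracy).
        "auroc": (sensitivity + specificity) / 2,
    }


# Assumed counts, back-calculated from the reported percentages
# (not taken directly from the study tables).
counts = {
    "pcr_remdesivir": dict(tp=36, fp=4, fn=26, tn=34),
    "icd10": dict(tp=61, fp=23, fn=1, tn=15),
}

for name, c in counts.items():
    m = diagnostic_metrics(**c)
    summary = ", ".join(f"{k} {100 * v:.1f}%" for k, v in m.items() if k != "auroc")
    print(f"{name}: {summary}, AUROC {m['auroc']:.2f}")
```

With these assumed counts, the sketch reproduces the reported values for both definitions, which shows how a definition can combine a high PPV with modest sensitivity (PCR plus remdesivir) or high sensitivity with poor specificity (ICD-10).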
Discussion
We found substantial variation in COVID-19 hospitalization counts and outcomes across 6 commonly used definitions for COVID-19 hospitalizations. Crude COVID-19 hospitalization counts varied more than 3-fold between the most inclusive definition (based on ICD-10) and the most restrictive definition (PCR plus dexamethasone). Definitions based upon receipt of COVID-19 therapeutics identified encounters with significantly higher rates of ICU admission, mechanical ventilation, and death. None of the definitions we examined reliably differentiated between primary or contributing versus incidental COVID-19 hospitalizations compared to detailed chart review. The most accurate definition (PCR plus remdesivir) had a very high positive predictive value but identified fewer than two-thirds of primary or contributing cases.
Our findings demonstrate the challenge of conducting surveillance for severe SARS-CoV-2 infections in the contemporary context. Hospitalization with a positive PCR assay alone is a poor proxy for severe SARS-CoV-2 infections at this stage of the pandemic. None of the alternative definitions we assessed, however, were both sensitive and specific for severe SARS-CoV-2 infection.
Some of the definitions we evaluated may nonetheless still be useful depending on the purpose of the analysis. The original PCR-based CDC definition, ICD-10–based definition, and institutional COVID-19 flag-based definition were sensitive and produced large cohorts, albeit with lower severity of illness overall and low specificity for primary or contributing COVID-19 infections. These metrics mirror trends in the prevalence of COVID-19 in the local community and accurately reflect the absolute count of cases being managed by hospitals. They are imperfect measures, however, of the incidence of severe COVID-19, particularly during the Omicron period.
Conversely, definitions that incorporated hypoxemia or receipt of anti–COVID-19 therapeutics identified smaller cohorts with higher severity of illness, had greater specificity for primary or contributing infections, and yielded more stable mortality estimates across the periods before and after the emergence of the Omicron variant. The low sensitivity of these definitions renders them poor proxies for estimating the total burden of severe disease, but their high specificity may make them useful candidates for tracking relative changes in the burden of severe disease over time. These characteristics may also make these definitions useful as inclusion criteria for observational studies of inpatient COVID-19 cohorts. Hospitalizations flagged by these definitions are enriched for COVID-19 primary or contributing hospitalizations and have higher incidence rates of many common study outcomes such as ICU admission or death. However, we urge caution for 2 reasons. First, there is a risk of selection bias that can distort the magnitude and direction of measured associations if components of the definition used for inclusion qualify as “collider” variables. Reference Admon, Bohnert, Cooke and Taylor30 Second, the performance of these definitions will likely change over time as indications, availability, and alternative therapies evolve. Public health agencies and researchers can also consider using multiple definitions with different sensitivities and specificities to provide both “conservative” and “liberal” estimates of the burden of severe COVID-19.
In this study, ICD-10 codes had high sensitivity and good negative predictive value but poor specificity and moderate positive predictive value. Because our medical record reviews were conducted among PCR-positive hospitalizations, the true positive predictive value of ICD-10 codes might be even lower. This finding contrasts with early assessments of COVID-19 ICD-10 codes which reported excellent positive and negative predictive values for ICD-10 codes compared to PCR data in all patients and critically ill patients, respectively. Reference Wu, D’Souza and Quan8,Reference Khera, Mortazavi and Sangha24,Reference Kadri, Gundrum and Warner31–Reference Bosch, Law, Peterson and Walkey34 In retrospect, the excellent performance for ICD-10 codes in these studies was likely due to the use of PCR positivity as the gold standard for COVID-19 hospitalization, as well as the newness of the epidemic, focal use of testing, low healthcare utilization for non–COVID-19 care (hence fewer incidental cases), and fewer false-positive results due to prior infections. We advise caution when interpreting studies which identify COVID-19 hospitalizations using ICD-10 codes during the current era.
The finding that all definitions had poor-to-moderate AUROCs for distinguishing incidental versus primary or contributing COVID-19 underscores the complexity and variability of COVID-19 presentations and the challenge of disentangling the attributable morbidity of SARS-CoV-2 in specific patient encounters. EHR-based approaches using the simple definitions assessed in our study, perhaps unsurprisingly, were ill-equipped to identify such nuance. Our study draws attention to the need to develop better surveillance definitions that more accurately capture and characterize the full spectrum of COVID-19–associated illness in hospitalized patients. Algorithms that incorporate a wider array of EHR data may better distinguish primary versus incidental COVID-19 hospitalizations, but this comes at the cost of generalizability and the broad applicability that is essential for public health surveillance. Reference Klann, Strasser and Hutch35
Many frequently reported COVID-19 outcomes such as need for ICU admission, use of mechanical ventilation, and in-hospital death were significantly less common after than before the emergence of the Omicron variant despite much higher case incidence rates during the Omicron period. Prior studies have speculated that this is due to higher rates of population immunity from vaccination and prior infections, a broader armamentarium of therapeutics, and/or lower intrinsic severity for the Omicron variant versus prior variants. Reference Modes, Directo and Melgar36,Reference Iuliano, Brunkard and Boehmer37 Our findings also suggest that a fourth contributing factor is the dramatic increase in community incidence during the initial Omicron surge, which led to a large increase in the number of hospitalized patients with incidental COVID-19 and to a consequent decrease in the percentage of COVID-19 hospitalizations with severe disease.
Our study had several limitations. It was conducted using EHR data from a single healthcare system; larger studies with more geographic diversity are needed. Also, it included only adult patients; these results cannot be extended to pediatric populations. Only a small number of cases were manually reviewed to characterize each definition’s capacity to distinguish primary or contributing versus incidental infections. Determining the role of COVID-19 in hospitalization can be subjective, but we mitigated this subjectivity by using a standardized data collection tool and by discussing difficult cases among 3 clinicians to reach consensus. Furthermore, the performance of the definitions we evaluated likely fluctuated over the examined period and will continue to change in the future as new variants emerge, therapeutic strategies evolve, and reinfections become more common. Therefore, ongoing periodic reassessment of definitions for COVID-19 hospitalization will be needed to determine their appropriateness to inform public health surveillance, policy recommendations, and research.
In conclusion, estimates of COVID-19 admissions, severity of illness, in-hospital mortality, and trends are significantly affected by how COVID-19 hospitalizations are defined. The traditional PCR-based definition identifies many incidental cases and is associated with less severe illness compared to definitions that incorporate hypoxemia or COVID-19 therapeutics. Most definitions showed lower in-hospital mortality rates after the emergence of the Omicron variant than before, but definitions that required dexamethasone or remdesivir did not. Medical record reviews demonstrated that no definition accurately differentiated between primary or contributing versus incidental hospitalizations, although positive PCR plus remdesivir or dexamethasone had a high positive predictive value for primary or contributing hospitalizations. An ICD-10–based definition had excellent sensitivity and good negative predictive value but poor specificity and only moderate positive predictive value. These findings have important implications for public health surveillance and research, including highlighting the need for improved surveillance definitions that better capture and characterize the full spectrum of COVID-19–associated disease in hospitalized patients.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/ice.2022.300
Acknowledgments
Financial support
C.N.S. received grant support from the National Institutes of Health (grant no. 1F32GM143862-01). M.K. and C.R. received grant support from the Centers for Disease Control and Prevention (grant no. 6U54CK000484-04-02).
Conflicts of interest
C.R. reports royalties from UpToDate, Inc., and consulting fees from Cytovale and Pfizer on unrelated topics. M.K. reports royalties from UpToDate, Inc. The other authors report no potential conflict of interest.