The Pediatric Heart Network’s Residual Lesion Score study was designed to test the validity of a quality assessment tool, the Residual Lesion Score. The Residual Lesion Score assesses residual cardiac defects after congenital heart surgery based upon echocardiographic criteria that capture the individual components of specific operations. It also includes clinical events such as unplanned re-interventions prior to discharge for major residua in the anatomic areas relevant to the surgical procedure. Reference Nathan, Trachtenberg and Van Rompay1 The previously developed Technical Performance Score Reference Larrazabal, Del Nido and Jenkins2–Reference Nathan and Karamichalis4 was utilised as a template for the development of the Residual Lesion Score modules for the Residual Lesion Score study using RAND Delphi methodology.
The RAND modified Delphi technique Reference Brook, McCormack, Moore and Seigal5 is used to examine the validity and the feasibility of candidate quality indicators. The nine-point scale has been used for more than two decades at RAND in developing explicit indicators for evaluating appropriateness and quality. Reference Brook, McCormack, Moore and Seigal5 These methods require individuals who rate measures to place them into one of three categories: valid criterion for quality, equivocal criterion for quality, or invalid criterion for quality. Within each category, ratings span three points of the nine-point scale to allow for some variation within category. The scale is ordinal; for example, a 9 is better than an 8. Because quantities (e.g., risk-benefit ratios) are not assigned to each number on the scale, the difference between an 8 and a 9 is not necessarily the same as the difference between a 5 and a 6. Explicit ratings are used because, in small groups, some members tend to dominate the discussion, and this can lead to a skewed or biased decision that is not reflective of the group. Reference McGlynn, Kosecoff, Brook, Goodman and Baratz6
We describe the use of this methodology for the development of the Residual Lesion Score for five common congenital cardiac procedures that are routinely performed in infancy: repair of tetralogy of Fallot with pulmonary stenosis, repair of complete atrioventricular septal defect, arterial switch operation with or without ventricular septal defect closure for dextro-transposition of the great arteries, repair of coarctation of aorta or hypoplastic or interrupted aortic arch with ventricular septal defect, and Norwood procedure for single ventricle anatomy.
Methodology
An 11-member expert panel was convened consisting of 6 paediatric cardiac surgeons and 5 paediatric cardiologists from major paediatric centres across North America (cardiac surgeons: Bacha, Gaynor, Kanter, Ohye, Pizarro, Tweddell; cardiologists: Atz, Colan, Schwartz, Shirali, Tani). In addition, two chairpersons (Nathan, Newburger) and a consultant (Gurvitz) were integral to the process. The consultant was an expert in the RAND Delphi methodology and ensured that the methodology was appropriately followed. The primary objective of the expert panel meeting was to finalise the sub-components of the Residual Lesion Score modules for each of five congenital cardiac operations and produce the ultimate sub-components of the discharge and post cardiopulmonary bypass intraoperative Residual Lesion Score modules.
The Technical Performance Score modules, which had previously been developed based on the consensus of cardiac surgeons and cardiologists at a single centre, were used as the framework for development of the Residual Lesion Score. Reference Larrazabal, Del Nido and Jenkins2–Reference Nathan and Karamichalis4 In work from single-centre studies, as well as in a secondary analysis of the multicentre Single Ventricle Reconstruction trial, the Technical Performance Score modules were associated with both in-hospital and post-discharge outcomes. Reference Nathan, Sadhwani and Gauvreau7–Reference Sengupta, Gauvreau, Kaza, Hoganson, Del Nido and Nathan24 The Residual Lesion Score modules were derived from these Technical Performance Score modules. A detailed literature search was performed to assess appropriateness of each sub-component to serve as a quality measure of surgical repair. As with the Technical Performance Score modules, the Residual Lesion Score module for each procedure was divided into its sub-components, and each sub-component was categorised as Class 1 (optimal, no residual lesion), 2 (adequate, minor residual lesion), or 3 (inadequate, major residual lesion, or unplanned reintervention), based on specific echocardiographic and clinical criteria. The use of three categories of residual lesions gave us the ability to distinguish excellent repairs with no residual lesions (Class 1) from those with minor residual lesions (Class 2). It also allowed us to separate out and focus on patients with intermediate residual lesions (Class 2) by separating them from those with major residual lesions (Class 3). Class 2 represents the group for which there was no recommendation as to whether immediate intervention (in the operating room or prior to discharge) would be beneficial or not. 
The Residual Lesion Score modules for the five procedures studied in the Pediatric Heart Network protocol were revised by investigators from Pediatric Heart Network sites in a series of conference calls in preparation for their use in a prospective Pediatric Heart Network Residual Lesion Score study. The components of the consensus-derived Residual Lesion Score modules were then finalised by the RAND modified Delphi method. Reference Brook, McCormack, Moore and Seigal5 Importantly, the RAND modified Delphi methodology allowed the expert panel to modify sub-components but did not allow the panel to add entirely new sub-components to the Residual Lesion Score at the in-person meeting.
Panel process and criteria for measure inclusion
The role of the 11-member expert panel was to optimise the modules in two steps, an initial email score followed by an in-person panel discussion to finalise the score. Each panellist was asked to rate approximately 12 sub-components per operation. The chairpersons and consultant were responsible for collating the email scores, providing the required information to the panellists, and staying within the framework of the RAND Delphi methodology during panel discussions at the in-person meeting (Fig 1).
A draft of the Residual Lesion Score modules (both intraoperative and discharge Residual Lesion Score), a summary of supporting evidence, and rating sheets were sent to all members of the expert panel via email. Expert panel members were asked to return their scoring sheets in 3 weeks. Each panellist was asked to rate all sub-components of the Residual Lesion Score modules for each of the five operations on a nine-point scale for two dimensions, validity and feasibility.
Validity
A sub-component of the Residual Lesion Score module was considered valid if the following criteria were met:
There was adequate scientific evidence or, where evidence was insufficient, expert professional consensus to support the clinical importance of the indicator, and there were identifiable health benefits to patients who have minor or no residua as specified by the indicator.
Based on the panellists' professional experience, physicians/surgeons with significantly higher rates of adherence to the indicator would be considered higher-quality providers.
The majority of factors that determined adherence to an indicator were under the control of the physician/surgeon or were subject to influence by the physician/surgeon.
Validity of the sub-component (i.e., the indicator) was scored as follows: 1–3: if the sub-component is not a valid criterion for evaluating residual lesions after repair; 4–6: if the sub-component is an uncertain or equivocal criterion for evaluating residual lesions after repair; and 7–9: if the sub-component is clearly a valid criterion for evaluating residual lesions after repair.
Feasibility
A sub-component of the Residual Lesion Score module was considered feasible if the information necessary to determine Residual Lesion Score Class of the sub-component was likely to be found on the average echocardiogram and clinical records, and failure to have such information available would be considered a marker of poor quality, and all sub-component measurements (echocardiographic and clinical) were likely to be reliable and unbiased. Reliability was defined as the degree to which assessment would be free from random error.
Feasibility of the sub-component (i.e., indicator) was scored as follows: 1–3: if it was not feasible to measure the sub-component; 4–6: if there was considerable variability in feasibility of measurement of the sub-component; and 7–9: if it was clearly feasible to measure the sub-component.
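The three rating bands used for both validity and feasibility can be written as a small lookup. The Python sketch below is illustrative only: the function name `rating_band` and the band labels are assumptions introduced here and are not part of the study's tooling.

```python
def rating_band(rating: int) -> str:
    """Map a 1-9 panel rating to its band (labels are illustrative)."""
    if not isinstance(rating, int) or not 1 <= rating <= 9:
        raise ValueError("rating must be an integer from 1 to 9")
    if rating <= 3:
        return "low"        # not valid / not feasible
    if rating <= 6:
        return "equivocal"  # uncertain validity / variable feasibility
    return "high"           # clearly valid / clearly feasible
```

Because the scale is ordinal, the sketch only places a rating in a band and does not treat the distance between ratings as meaningful.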
Email scoring
The panel members were encouraged to comment on the Residual Lesion Score sub-components for each of the five operations, in addition to scoring. Prior to the in-person meeting, the expert panel members received a de-identified summary of the validity and feasibility ratings, with each Residual Lesion Score sub-component receiving a separate rating (Fig 2). This summary included a rating “line” that had markings from 1 to 9, for each indicator. The distribution of all panel member ratings was displayed above the rating line. Each panel member’s own rating for each indicator was marked with a caret below the line, enabling the member to compare that rating with the other panellists’ ratings. Additionally, the panellists’ comments about the various sub-components were included in this summary.
In-person expert panel meeting
At the two-day expert panel meeting, the validity and feasibility ratings for each sub-component previously evaluated by email scoring were reviewed. The panel focused most of the discussion on sub-components with indeterminate scores or wide variation in scoring among panellists. These discussions were facilitated by the Panel Chairpersons. During the meeting, after discussion of each of the sub-components, sub-components could be modified for clarity, and the expert panellists were asked to re-rate the sub-components. At the conclusion of discussion, the expert panellists submitted their final ratings for the sub-component measures for each procedural category.
Analysis of in-person scoring
The median was used to measure the central tendency for the panel members’ ratings, and the mean absolute deviation from the median to measure the dispersion of the ratings. The final disposition of each Residual Lesion Score sub-component measure was based on its median validity and feasibility scores. To be included in the “final” Residual Lesion Score, each sub-component was required to have a median rating of 7–9 on validity and 4–9 on feasibility (Table 1).
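The summary statistics and the inclusion rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: the function names `summarise` and `is_included` are assumptions, and the example ratings are made-up numbers rather than study data.

```python
from statistics import median

def summarise(ratings):
    """Return (median, mean absolute deviation from the median)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    med = median(ratings)
    mad = sum(abs(r - med) for r in ratings) / len(ratings)
    return med, mad

def is_included(validity_ratings, feasibility_ratings):
    """Inclusion rule from the text: median validity 7-9 and median feasibility 4-9."""
    v_med, _ = summarise(validity_ratings)
    f_med, _ = summarise(feasibility_ratings)
    return 7 <= v_med <= 9 and 4 <= f_med <= 9

# Hypothetical ratings from an 11-member panel (illustrative values only)
validity = [7, 8, 8, 9, 7, 6, 8, 9, 7, 8, 8]
feasibility = [5, 6, 4, 7, 5, 6, 5, 8, 4, 5, 6]

v_med, v_mad = summarise(validity)
print(f"validity: median={v_med}, MAD={v_mad:.2f}")
print("included:", is_included(validity, feasibility))
```

With an odd number of raters, such as 11, the median is always one of the submitted ratings. With an even number it can fall between two integers, and how such a value would be handled is not specified in the text.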
To determine agreement and disagreement among panellists, we used a statistical definition that could be applied regardless of the number of ratings available. This approach frames the definitions of “agreement” and “disagreement” in terms of the distribution of ratings in a hypothetical population of repeated ratings by similarly selected individuals.
Finalising the discharge and intraoperative Residual Lesion Score modules
Based on the analysis of the in-person scoring, the final modules of the intraoperative and discharge Residual Lesion Score were developed. Reference Nathan, Trachtenberg and Van Rompay1 The components in these modules were then used to design electronic data collection forms for use by the site echocardiographers and the core lab readers. Of note, the core lab data collection forms included some additional variables that will be used during analysis to allow derivation of an empiric (data-driven) Residual Lesion Score.
Results
The distribution of the validity and feasibility scores in the email and in-person scoring rounds is provided in Table 2.
Values are median, mean absolute deviation (range). A = agreement is high (accept); I = indeterminate agreement (if occurs in first round requires further discussion with panellist in second round, but acceptable if occurs in second round); D = disagreement (reject).
Procedures: Arch/VSD = repair of coarctation of aorta, hypoplastic or interrupted arch and ventricular septal defect; ASO, ASO/VSD = arterial switch operation for d-transposition of great arteries with or without ventricular septal defect; CAVSD = complete atrioventricular septal defect repair; Norwood = Stage I Norwood procedure for single ventricles; TOF/PS = tetralogy of Fallot pulmonary stenosis repair.
Abbreviations: ASD = atrial septal defect; BTS = Blalock Taussig Thomas shunt; DC = discharge; IO = intraoperative; LAVV = left atrioventricular valve; LPA = left pulmonary artery; LVOT = left ventricular outflow tract; MPA = main pulmonary artery; PDA = patent ductus arteriosus; PV = pulmonary valve; RAVV = right atrioventricular valve; TAP = transannular patch; VS = valve sparing; VSD = ventricular septal defect; RLS = Residual Lesion Score.
Some common themes were noted during review and analysis of scores for validity and feasibility. During email scoring, the panellists scored two separate intraoperative modules, one for transoesophageal echocardiograms and one for epicardial echocardiograms. The decision was made in the in-person meeting to have a single intraoperative module that would encompass both transoesophageal and epicardial echocardiograms, where applicable. Valve assessment was made more granular by splitting the sub-component into two separate sub-components, one each for stenosis and regurgitation. Additional clarity on types of ventricular septal defect was introduced by classifying them as either muscular or non-muscular defects. As a group, the panel concluded that assessments of fenestrated atrial septal defect and ventricular septal defects were neither valid nor feasible measures of residual lesions, and therefore recommended removal of these Residual Lesion Score sub-components, with the exception of intraoperative Residual Lesion Score for repair of tetralogy of Fallot. Additionally, for the intraoperative Residual Lesion Score, the panel decided to exclude complete atrioventricular conduction block as a sub-component for all procedures except the arterial switch operation category.
Tetralogy of Fallot pulmonary stenosis repair
In this category, during email scoring, some sub-components failed to meet validity and/or feasibility criteria (Table 2). At the in-person meeting, after additional discussion, the panel decided to combine limited transannular patch with transannular patch. Some sub-components were made more granular: pulmonary valve assessment in valve-sparing repair was split into two sub-components for assessment of stenosis and regurgitation. Similarly, right atrioventricular valve function was split into two sub-components for assessment of stenosis and regurgitation. On analysis of in-person scoring, some sub-components were excluded from the final Residual Lesion Score modules because of failure to meet validity and/or feasibility criteria, as detailed in Table 2.
Complete atrioventricular septal defect repair
In this category, in the initial email scoring, left ventricular outflow tract assessment was the only sub-component that did not meet validity criteria for discharge Residual Lesion Score. Discussion during the in-person meeting also resulted in elimination of fenestrated atrial septal defect and left ventricular outflow tract sub-components in both intraoperative and discharge Residual Lesion Score modules.
Arterial switch operation/arterial switch operation with ventricular septal defect repair
The email rating identified neopulmonary valve and right atrioventricular valve assessment for regurgitation or stenosis for discharge Residual Lesion Score as not valid. In the in-person meeting, the subaortic and subpulmonary outflow tracts were split from a single sub-component into two separate sub-components that would be assessed in all patients whether or not there was an intervention in these areas during the arterial switch operation. As a result of discussion during the in-person meeting, the panel decided to eliminate left atrioventricular valve assessment in the discharge Residual Lesion Score module, but to retain it in the intraoperative Residual Lesion Score, given that it can be a surrogate for systemic ventricular dysfunction related to ischaemia from coronary insufficiency.
Arch/ventricular septal defect repair
The email scoring raised concerns about the clinical indicators of arch repair from the side. After discussion in the in-person meeting, these sub-components were excluded from the final module.
Norwood
The email scoring raised concerns about validity of measuring adequacy of coronary blood flow for both discharge and intraoperative Residual Lesion Score, and the feasibility of assessing the proximal and distal arch during intraoperative imaging. During the in-person meeting, clinical indicators of systemic outflow reconstruction were included as a sub-component for the intraoperative Residual Lesion Score. In addition, assessment of neoaortic valve stenosis was excluded from both discharge and intraoperative Residual Lesion Score as an invalid measure of outcome.
Discussion
The RAND Delphi technique is a research methodology that obtains and then sharpens the opinions of experts when there is no indisputable answer. RAND Delphi methodology has been used successfully in the development of quality indicators for adults with CHD, Reference Gurvitz, Marelli, Mangione-Smith and Jenkins25 the Risk Adjustment in Congenital Heart Surgery, Reference Jenkins, Gauvreau, Newburger, Spray, Moller and Iezzoni26 and the Society of Thoracic Surgeons - European Association for Cardiothoracic Surgery Congenital Heart Surgery Mortality Categories. Reference O'Brien, Clarke and Jacobs27,Reference Jacobs, Jacobs and Thibault28 In the present work, we used RAND Delphi methodology to develop Residual Lesion Score modules in a multicentre study that sought to build consensus about the technical success of repair of congenital heart lesions amidst varying practices and beliefs.
The RAND Delphi process enabled a detailed consideration of what components of each of the five operations were both clinically meaningful and could be quantified. In general, the two main reasons for disagreement were unclear language or unclear scientific evidence, which led to differences in interpretation of the validity of the Residual Lesion Score sub-component. The expert panel meeting allowed further refinement of each Residual Lesion Score sub-component based on group input. The collective wisdom of the experts in the field helped develop modules that were both “valid” as a measure of severity of residual lesions and “feasible” to measure accurately using echocardiography.
The importance of validating the Residual Lesion Score for the five procedural categories in a multicentre environment is that, if validated with minimal changes to sub-components, it will allow development of modules for the evaluation of all anatomic areas that may be repaired during congenital cardiac operations. These modules can then be finalised using the RAND Delphi methodology described above. Once there is expert consensus on a “library” of modules, the modules can be combined, as needed, into a very large number of scoring tools, like the score discussed here. This type of flexibility in an evaluative tool and the ability for the tool to evolve as more data are added are key for the assessment of the large number of unique procedures that are possible in the surgical management of the highly heterogeneous CHDs. The alternative of using prospective studies to develop data-driven modules is not feasible, given the rarity of some of the procedures and the pace with which surgical techniques and imaging technologies change over time.
Limitations
There are some inherent limitations with the process used to develop the Residual Lesion Score modules. While every attempt was made to use peer-reviewed published data, such data were of limited quantity. Categorisation of the majority of sub-components was based on the expertise of an 11-member panel, whose opinions may not accurately match those of the much larger group of paediatric cardiologists and cardiac surgeons who provide day-to-day clinical care to patients with CHD. Variations in imaging modalities and imaging practices within each modality cannot be completely accounted for when using expert opinion, despite every effort being made to be inclusive of experts from centres of varying sizes and varying geographical locations in the final panel.
Conclusions
The RAND Delphi methodology allowed us to utilise available evidence and expert opinion to develop Residual Lesion Score modules for five important and common infant cardiac surgeries. It enabled us to develop Residual Lesion Score sub-component scores that were clinically important and measurable, based on echocardiographic and clinical criteria. As the Residual Lesion Score is tested and validated prospectively, it will provide opportunities for further refinement as empiric data become available. Using empiric data, the cut points between classes 1, 2, and 3 will be refined and then internally validated using available study data. Once empiric scores are developed, comparing these scores to the original score will allow utilisation of similar methodology to develop a Residual Lesion Score for the multitude of congenital cardiac operations. By making the score available to all caregivers, the Residual Lesion Score can be used internally at each site as a quality improvement tool. Additionally, it can also be used between sites as a quality metric and can guide collaborative learning for improvement of outcomes.
Acknowledgements
We would like to particularly acknowledge the sage contribution of Dr James Tweddell to this process. His clear thinking will be greatly missed.
Financial support
This study was supported by grants (U24HL135691, U10HL068270, HL109818, HL109778, HL109816, HL109743, HL109741, HL109673, HL068270, HL109781, HL135665, HL135680) from the National Heart, Lung, and Blood Institute, National Institutes of Health. Meena Nathan was supported by a K23 grant-NHLBI/NIH HL119600.
Conflicts of interest
The authors have no disclosures or conflicts of interest to report. The views expressed in this article are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute, the National Institutes of Health, or the US Department of Health and Human Services.
Ethical standards
Not applicable, as no human patients or animals were involved in this work.