Introduction
Delivering high-quality care to patients depends on having accurate test results whose clinical implications are understood. While these requirements apply throughout medicine, the question of how best to ensure the quality of genetic tests used in clinical care, in particular, has vexed scientists and regulators alike for roughly two decades. Numerous federal advisory committees, expert scientific bodies, and professional societies have weighed in on the issue, proposed a variety of approaches, and identified a number of governmental and non-governmental entities to regulate the quality of single-gene tests.Reference Holtzman, Watson, Pratt, Leonard and Kaul1 Over time, the understanding and clinical use of genetic tests have increased dramatically, but the challenge of ensuring that patients get accurate results whose clinical impact is understood has not yet been met. One need only look at discrepant results from different laboratories and the number of variants of uncertain significance to see the enormity of the current challenges.Reference Van Driest2
As difficult as the issues attending single-gene tests are, genomic tests — which make possible the examination of multiple variants across an individual's genome that can be analyzed individually or in combination to inform patient care — present a whole new level of complexity.3 An ongoing challenge for genomic tests, including those using next-generation sequencing (NGS) or genome sequencing technology, is determining when the results of such testing are of sufficient quality to inform clinical decision-making. Indeed, simply gaining consensus on the meaning and appropriate parameters of “quality” in this context — let alone on which entities should be responsible for serving as quality gatekeepers — is difficult.
This article seeks both to identify the key components of quality in the NGS context and to evaluate the extent to which existing federal regulatory “checkpoints” for NGS test quality are sufficient to ensure the quality of genomic data and interpretation intended for use in diagnosis and treatment of patients. As part of this analysis, we also identify the role of professional organizations, which have an important part to play in driving quality that is explored at greater length in a companion piece to this article.Reference Burke4 This article does not provide a general overview of state regulatory efforts to ensure quality or the potentially practice-informing impact of state liability rules, even though these sources of law are often quite important. It does, however, briefly note interactions between New York State's laboratory regulations and the federal Clinical Laboratory Improvement Amendments (CLIA) and Food and Drug Administration (FDA) regulatory frameworks. Our focus here is on clinical testing. Later work can address the questions of ensuring quality in direct-to-consumer genetic testing and return of research results, even as we acknowledge that clinicians increasingly face requests to interpret and act upon genetic information originating from both of these contexts.
Part I presents background, explaining that no single entity oversees the process of ensuring quality in genomic analysis and interpretation. Part II then defines the components of quality, including analytic validity, clinical validity, and clinical utility, as well as the attendant challenges. Part III analyzes current oversight, focusing on oversight by the Centers for Medicare & Medicaid Services (CMS) and FDA, current approaches to clinical utility, and the role of non-governmental entities. Part IV then presents our findings and recommendations for governing quality to advance integration of genomic testing into clinical care.
I. Background
Genomic data are generated for a range of clinical reasons and in different patient populations. For example, genomic tests may be performed as part of the evaluation of a child with unexplained developmental delay or other features that suggest a genetic disorder may be involved.5 Sequencing panels consisting of dozens to hundreds of genes are increasingly used to characterize cancers to refine prognosis and therapy.Reference Gornick, Hertz and McLeod6 Also, as the cost of sequencing has decreased, the pressure to adopt genome-based approaches has increased, even if an ordering physician requests, and a laboratory interprets and includes in the final report, only a limited subset of the data generated by sequencing. Genome-based approaches, by their nature, generate a tremendous amount of uninterpreted data, most of which are not pertinent to answering the particular clinical question for which testing was ordered. Nevertheless, clinical laboratories and providers are currently encouraged to report results beyond the original clinical indication. The American College of Medical Genetics and Genomics (ACMG) recommends that clinical laboratories performing genome or exome sequencing should give patients the option to receive the results when pathogenic variants that are considered medically actionable are discovered in any of 59 specified disease-associated genes, regardless of the clinical indication for testing.Reference Kalia7 Some prominent scientists envision a day when genome sequencing and analysis are included as part of routine healthcare screening.Reference Collins and Church8
Under the current U.S. regulatory scheme, oversight of genetic and genomic test quality is distributed among different governmental and non-governmental actors, with no single entity in charge of the entire process. Additionally, applicable legal requirements may differ depending on the methodology and setting of testing, as discussed below.
II. Defining Quality
A necessary prerequisite to assessing the adequacy of current regulations to ensure genomic testing quality is to define the parameters of such “quality.” We focus here on quality in the traditional clinical genetic testing context while highlighting the added quality challenges that arise when performing genomic sequencing. In the clinical context, discussions of genetic test quality generally focus on the domains of analytical validity (including analytical verification), clinical validity, and clinical utility.Reference Joseph, Micheel, Strande, Sung and Teutsch9
Analytical validity often refers to how well a test detects, identifies, calculates, or analyzes the presence or absence of a particular gene change.Reference Burke10 Analytical verification is seen by some to be part of analytical validity,Reference Rehm11 while others view verification and validation as separate processes.Reference Lathrop12
Clinical validity usually refers to how well a genetic variant is correlated with the presence, absence, or risk of disease.Reference Fabsitz13 In the diagnostic setting, clinical validity is often described in terms of “sensitivity” (the proportion of affected individuals who have an abnormal test result), “specificity” (the proportion of unaffected people who do not have the abnormal result), “positive predictive value” (the probability that individuals who test positive actually have the condition), and “negative predictive value” (the probability that individuals who test negative do not have the condition).14 In the context of predictive testing (i.e., testing of asymptomatic individuals to identify genetic susceptibility to future disease), clinical validity usually is viewed as the measure of the accuracy with which the test predicts future clinical conditions.15 Genetic tests for cancer syndromes, for example, have relatively low sensitivity, because most women with breast cancer do not have germline mutations in BRCA1 or BRCA2, and somewhat higher specificity, because unaffected women are unlikely to have such variants, though some do. As a result, and for reasons discussed in more depth below, such tests have less clinical validity for purposes of prediction.
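To make these four measures concrete, the following minimal sketch (in Python, using invented counts rather than data from any actual study) computes them from a hypothetical two-by-two table. It also illustrates why a highly specific test applied to a low-prevalence population can still have a modest positive predictive value.

```python
# Illustrative only: computing the four clinical-validity metrics from a
# hypothetical 2x2 table of test results versus true disease status.

def clinical_validity(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from raw counts."""
    sensitivity = tp / (tp + fn)   # affected individuals with an abnormal result
    specificity = tn / (tn + fp)   # unaffected individuals with a normal result
    ppv = tp / (tp + fp)           # probability a positive result is a true positive
    npv = tn / (tn + fn)           # probability a negative result is a true negative
    return sensitivity, specificity, ppv, npv

# Invented counts for a population of 10,000 in which only 1% are affected:
# even with 95% specificity, most positive results here are false positives.
sens, spec, ppv, npv = clinical_validity(tp=90, fp=495, fn=10, tn=9405)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} ppv={ppv:.2f} npv={npv:.3f}")
```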
Clinical utility usually refers to the risks and benefits resulting from genetic test use and encompasses considerations of (a) whether a test (and subsequent interventions taken based on the result) leads to better clinical management or an improved health outcome among people with a positive test result, and (b) the potential risks posed by such testing.16 As such, clinical utility is a vital measure, but it is also more difficult to define precisely because its meaning can vary from one user to another. Some definitions of the term equate utility with “actionability,” and take the position that for a test to be clinically useful, there must be established therapeutic or preventive interventions available or other available actions that may change the course of the disease.17 However, a test result may also provide diagnostic or prognostic information that improves clinical management, even if the disease course is not changed. In addition, some definitions also encompass the concept of personal utility, based on the view that information may have value to patients and families and may have benefits to society, even when no medical intervention is available.18 However, there is ongoing controversy about conflating the concepts of personal utility with clinical utility,Reference Terry and Wolf19 and not all commentators agree that personal meaning supplies a legitimate basis for returning results.20 An example of ongoing debate is whether supplying a definitive diagnosis through genomic testing provides clinical utility to patients and families even when the underlying disorder has no specific treatment.
In the twenty years since the Secretary's Advisory Committee on Genetic Testing (“SACGT”) introduced its initial framework discussing the concepts of analytical validity, clinical validity, and clinical utility as they related to a regulatory scheme for genetic tests,21 these concepts have maintained their vitality while undergoing various clarifications and refinements. For example, FDA often refers to a test's “analytical performance” and “clinical performance,” as opposed to its analytic validity and clinical validity.22 The concept of analytic performance arguably encompasses two concepts: analytic validity (whether the test is capable of accurately detecting the analyte it purports to analyze) and analytic verification (a more process-oriented assessment of whether the test was performed in accordance with applicable standards, instructions, and procedures, considering factors such as operator qualifications and equipment calibration).23 While recognizing the importance of these ongoing refinements to the terminology, we choose in this paper to use the traditional terms analytical validity, clinical validity, and clinical utility, which have served as the pillars of clinical genetic testing quality for years.
Ensuring that patients get accurate clinical genomic test results that are pertinent to their own care raises additional complexities and difficulties that fall outside the scope of federal regulation. The genomic testing process begins with preanalytical steps that occur before a laboratory even receives a patient specimen. The specimen must be obtained, labeled, preserved, and transported to the laboratory in a manner and time frame that preserves the specimen for testing. This paper does not address these important preanalytic components of quality.
This paper also largely avoids the question of how clinician decision-making affects quality and how that decision-making can be optimized to support quality in genomic testing. Healthcare providers make critical decisions about which patients to test and which tests to order. A genomic test with well-established analytical validity, clinical validity, and clinical utility for one intended use (e.g., diagnosis of symptomatic patients) may lack quality if inappropriately ordered for other uses (e.g., screening of healthy persons). But, as discussed below, even when diagnostic tests are validated, labeled, and promoted for a specific intended use consistent with regulatory requirements, healthcare providers have broad discretion to order genetic and other diagnostic tests as part of the practice of medicine. The independent role of the healthcare provider adds a further level of complexity to an already fragmented regulatory regime.
A. Defining the Process
Modern, high-throughput genomic testing comprises a series of steps performed within or at the direction of the laboratory. First, the biospecimen is collected, stabilized or preserved, and transported to the laboratory with its identifying information intact. DNA is then extracted from the patient's biospecimen, and the laboratory applies special-purpose chemical reagents to create a library, which is a collection of short DNA fragments that can be analyzed in parallel. Three analytical phases then follow, known as the “bioinformatics pipeline,” involving the use of instrumentation (sequencing analyzers and physical devices), consumables and supplies (e.g., chemical reagents), analytical software algorithms, and skilled human personnel.
Primary analysis, which involves raw data generation and base calling, uses instruments and software to read the sequence of nucleotides in the various DNA fragments and to assess the quality of the readings.Reference Celesti, Goldfeder and Moorthie24 The current output of this process is generally a FASTQ data file that records the line-up of nucleotides in the tested fragments.
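For a flavor of what primary analysis produces, the sketch below reads FASTQ's four-line records and decodes per-base quality scores, assuming the standard Phred+33 encoding. It is illustrative only: the file name is hypothetical, and production pipelines rely on established tools (e.g., FastQC, samtools) rather than hand-rolled parsers.

```python
# Minimal FASTQ reader: each record is four lines (id, sequence, '+', qualities).

def read_fastq(path):
    with open(path) as fh:
        while True:
            header = fh.readline().rstrip()
            if not header:
                return
            seq = fh.readline().rstrip()
            fh.readline()                      # '+' separator line
            quals = fh.readline().rstrip()
            # Phred+33: ASCII code minus 33 is the base-call quality score;
            # Q30 means an estimated 1-in-1,000 chance the base call is wrong.
            scores = [ord(c) - 33 for c in quals]
            yield header, seq, scores

for name, seq, scores in read_fastq("sample.fastq"):   # hypothetical file name
    mean_q = sum(scores) / len(scores)
    print(f"{name}\tlen={len(seq)}\tmeanQ={mean_q:.1f}")
```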
Secondary analysis uses software to map these fragmentary readings onto a human reference genome, probabilistically aligning the fragments back into order and using the software to filter and clean the data and call variants; that is, identify specific locations where the tested individual's genome differs from the human reference genome.25 The current outputs of this process are generally a binary alignment map (BAM) file or compressed alignment (CRAM) file,Reference Hsi-Yang26 in which the fragmentary readings are reassembled into a model of the person's tested DNA, and a variant call file (VCF) that summarizes the identified genetic variants.
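As an illustration of what a variant call file contains, this minimal sketch parses the core tab-delimited VCF columns. The file name is hypothetical; real workflows would use a dedicated library such as pysam or cyvcf2.

```python
# Minimal VCF reader: each data row records one place where the tested genome
# differs from the reference.

def read_vcf(path):
    with open(path) as fh:
        for line in fh:
            if line.startswith("#"):           # header and column-name lines
                continue
            fields = line.rstrip("\n").split("\t")
            chrom, pos, _id, ref, alt, qual = fields[:6]
            yield {
                "chrom": chrom,
                "pos": int(pos),               # 1-based reference coordinate
                "ref": ref,                    # reference allele
                "alt": alt,                    # alternate allele(s) observed
                "qual": float(qual) if qual != "." else None,
            }

for v in read_vcf("patient.vcf"):              # hypothetical file name
    print(f'{v["chrom"]}:{v["pos"]} {v["ref"]}>{v["alt"]} (QUAL={v["qual"]})')
```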
Tertiary analysis involves annotating the variants with available information about the potential clinical significance of those variants, prioritizing the variants through filtration using various annotation fields, and interpreting a subset of the variants to prepare a test report.27 The interpretive process typically requires a mix of interpretive software and expert human judgment.
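The toy sketch below gestures at this tertiary step: it annotates called variants against a small, entirely hypothetical knowledge base and keeps only known, non-benign variants for expert review. Real pipelines query curated resources such as ClinVar and use annotation tools such as VEP or ANNOVAR; the coordinates and classifications here are invented for illustration.

```python
# Toy tertiary analysis: annotate variant calls and prioritize those worth review.

KNOWLEDGE_BASE = {  # hypothetical annotations keyed by (chrom, pos, ref, alt)
    ("17", 43045712, "G", "A"): {"gene": "BRCA1", "classification": "pathogenic"},
    ("7", 117559593, "ATCT", "A"): {"gene": "CFTR", "classification": "likely pathogenic"},
}

def annotate_and_filter(variants):
    """Attach annotations and keep known variants that are not (likely) benign."""
    for v in variants:
        key = (v["chrom"], v["pos"], v["ref"], v["alt"])
        note = KNOWLEDGE_BASE.get(key)
        if note and note["classification"] not in ("benign", "likely benign"):
            yield {**v, **note}

calls = [{"chrom": "17", "pos": 43045712, "ref": "G", "alt": "A"}]
for hit in annotate_and_filter(calls):
    print(hit)
```

As the surrounding text emphasizes, software of this kind only narrows the field; expert human judgment remains central to the final interpretation.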
In addition to the software required for these three analytical steps, genomic testing laboratories also depend on process-related software — for example, laboratory information management systems that help orchestrate workflows, guide laboratory technicians and automated systems in using the right reagents in the right way, track specimens and results, and generate quality control data.28
Some of the software in the genomic testing bioinformatics pipeline is a component of instrumentation that the laboratories purchase (“embedded software”). However, some laboratories rely, at least in part, on software sold separately as an accessory to their instrumentation, software developed in-house, or software supplied by external software vendors or cloud-based service providers (“stand-alone software”). The level of regulatory oversight software receives may vary, depending on its provenance, as discussed later in this article. Whatever its origin, all of the analytical and process-related software requires appropriate verification and validation,29 and the complex algorithms embodied therein require validation and testing. Data scientists, clinicians, commentators, and regulators are only beginning to identify the requirements necessary for responsible clinical implementation of complex algorithms, which may, but do not always, include the machine learning algorithms and deep learning neural networks that are now involved in sequencing.Reference Vollmer, Wiens and Schrag30 Definitions of quality are not yet widely agreed on for medical algorithms.
B. Analytical Validity Challenges
In the context of genomic sequencing, analytical validity is particularly challenging because of the sheer scale of the genome (3 billion haploid base-pairs or 6 billion per individual). Current methodological limitations preclude analytic validation of every potentially meaningful variant, and certain regions of the genome are known to be extremely challenging to measure accurately with currently available laboratory methods.Reference Pant31 Most NGS produces short reads and then aligns them to a reference genome. Using ordinary methods and algorithms, however, NGS may not be able to detect differences in the copy number of a gene or gene segment because it might align all copies to the same place. This technology may also fail to detect major rearrangements because it simply aligns short segments to the reference and does not put them in the order in which they actually appear in the patient's genome. Sequencing algorithms may filter out short rearranged segments.Reference Wheeler32 Consequently, no extant analytical approach will reliably detect and map every variant in a single test. Moreover, whole genome or whole exome sequencing may be more accurate for some variants, while more focused sequencing (of a single gene or a few genes) will work better for others. Genomic sequencing may also result in the identification of many “novel” results, that is, variations that have not been previously observed. Novel findings need to be confirmed using a second (orthogonal) sequencing method to determine whether the variation is actually present or is instead an artifact resulting from technical error,33 whether inherent in the method or caused by the operator. These many limitations of the current sequencing technologies and processes make it critical to identify the strengths and weaknesses of different analytic approachesReference Scheuner34 and to specify areas of potential uncertainty for each, because clinicians and patients need to be able to assess the accuracy of the particular variants on which they are basing decisions about care.
At present, the best indicator of a laboratory's ability to analyze a DNA sequence accurately is comparison to a standardized reference sample developed based on sequencing and analysis of a significant number of other genomes. A number of organizations are working to develop suitable reference materials for clinical laboratories to use in method development, test validation, internal quality control, assay calibration, and proficiency testing,Reference Zook and Hardwick35 but many more reference materials are needed.
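A minimal sketch of how such a comparison might look, assuming the truth set and the laboratory's calls have already been normalized to a common (chrom, pos, ref, alt) representation (a nontrivial step in practice, handled by purpose-built comparison tools such as hap.py); the variants below are invented:

```python
# Sketch of checking variant calls against a reference-material truth set.

def concordance(truth, observed):
    truth, observed = set(truth), set(observed)
    tp = len(truth & observed)              # truth variants the lab detected
    fn = len(truth - observed)              # truth variants the lab missed
    fp = len(observed - truth)              # calls with no support in the truth set
    ppa = tp / (tp + fn) if truth else float("nan")  # positive percent agreement
    return {"TP": tp, "FN": fn, "FP": fp, "PPA": ppa}

truth_set = {("1", 1005, "A", "G"), ("2", 2040, "C", "T"), ("3", 3100, "G", "A")}
lab_calls = {("1", 1005, "A", "G"), ("3", 3100, "G", "A"), ("4", 4200, "T", "C")}
print(concordance(truth_set, lab_calls))    # PPA = 2/3 in this toy example
```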
Bioinformatic algorithms that call bases, map reads to reference sequences, and interpret sequence data are highly technical. While complex analytical problems can be solved correctly using different mathematical and computational methods, bioinformatic systems also need to be validated and tested extensively so their performance parameters are well understood and can be communicated to users.36
C. Clinical Validity Challenges
In the context of genomic sequencing, the number of variants that can be identified vastly exceeds the number for which clinical significance has been established,Reference Dewey37 a gap that will persist even as our understanding of the impact of variants individually and in combination continues to grow. Moreover, clinical significance resides along a continuum, so geneticists classify variants into one of five categories: pathogenic, likely pathogenic, uncertain significance, likely benign, and benign.Reference Nykamp38 If a variant with strong, reliable evidence of pathogenicity (e.g., co-segregation with disease status within large families and/or functional validation in a valid assay) is identified in a patient who already has cardinal signs and symptoms of the related disease, then the clinical significance of the variant is typically not in doubt.
Even where a variant has strong, reliable evidence of pathogenicity, the impact of such a variant in the context of a specific patient can be uncertain. Even single-gene diseases may manifest a variety of possible phenotypes; for example, patients with sickle cell disease, the prototypical disease caused by a single base-pair change, can have a range of possible symptoms, including painful crises, acute chest syndrome, overwhelming sepsis, and/or strokes.Reference Pecker, Little, Meier, Abraham, Fasano and Beutler39 Furthermore, single-gene disorders frequently vary in their “penetrance,” that is, the proportion of individuals with a pathogenic variant who exhibit clinical symptoms. For example, variants in many genes contribute to the development of cardiac arrhythmias, yet some individuals who carry one of these dominant variants do not exhibit any symptoms.Reference Miko40 Incomplete penetrance complicates the ability to predict disease in currently asymptomatic individuals, as some individuals who carry a pathogenic mutation may never become ill.41 Further complicating clinical interpretation of genetic information is the fact that our understanding of a variant's clinical significance can shift over time, as new scientific evidence accumulates. Consequently, a variant suspected to increase risk today may turn out to be benign, while a variant believed to be unrelated to disease risk may turn out to increase risk in certain contexts.Reference SoRelle, Taber and David42 At any one time, experts may have differing opinions about how best to analyze the genome or to interpret the probable phenotypic effect of particular variants.
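A toy calculation with invented numbers shows how incomplete penetrance limits prediction for an asymptomatic carrier:

```python
# Hypothetical figures: a dominant "pathogenic" variant with 40% penetrance.

penetrance = 0.40          # assumed: 40% of carriers ever develop symptoms
population_risk = 0.01     # assumed baseline lifetime risk without the variant

carrier_risk = penetrance                     # lifetime risk for a carrier
relative_risk = carrier_risk / population_risk
print(f"Carrier lifetime risk: {carrier_risk:.0%} ({relative_risk:.0f}x baseline); "
      f"still, {1 - penetrance:.0%} of carriers in this example never become ill.")
```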
Computational algorithms for interpreting genomic data are essential tools for establishing clinical validity, but as in the case of analytic validity, challenges abound. Each algorithm may have unique filters and cutoffs. For instance, many algorithms filter in variants that are rarely seen in a population, but the algorithms can use different cutoffs for what counts as rare, as the sketch below illustrates. Each algorithm also embodies its developers' views on how best to predict the pathogenicity or non-pathogenicity of a variant. Many users may not be able to check the output of these algorithms by manually analyzing the clinical significance of variants. In any event, as sequencing becomes more common, the sheer volume of data may preclude most manual assessments.
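The sketch below, using invented allele frequencies, shows how two pipelines that differ only in their rarity cutoff retain different variant sets; real pipelines draw frequencies from population databases such as gnomAD.

```python
# Two "rare variant" filters that differ only in their frequency cutoff.

variants = [                      # invented variant ids and population frequencies
    {"id": "var1", "pop_freq": 0.0002},
    {"id": "var2", "pop_freq": 0.004},
    {"id": "var3", "pop_freq": 0.02},
]

def keep_rare(variants, cutoff):
    return [v["id"] for v in variants if v["pop_freq"] < cutoff]

print("cutoff 0.1%:", keep_rare(variants, 0.001))   # ['var1']
print("cutoff 1%:  ", keep_rare(variants, 0.01))    # ['var1', 'var2']
```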
Analysis algorithms search databases for reports of disease associated with the variants of interest in a clinical sequence. Unfortunately, many genomic databases contain mistakes or misleadingly incomplete information.Reference Coovadia43 In addition, databases at present do not adequately represent population diversityReference Popejoy, Fullerton and Manrai44 and may or may not include unaffected as well as symptomatic individuals. These gaps can lead to interpretive error, and consequently to misjudgments regarding the clinical validity and significance of a variant.Reference Kohane45 As discussed in further detail below, FDA is working to address these technical gaps through guidance.
Finally, difficulties assessing clinical validity also arise at the level of the gene and not the variant. Whether a given gene is associated with a given disease can be perplexing. In a recent example, it was found that standard panels for testing of Brugada Syndrome included a high number of genes for which there was insufficient evidence that they were causally related to the condition.Reference Hosseini46
Thus, genomic approaches face all the challenges that can bedevil the interpretation of single genes. But genomics also permits the study of more complex diseases, which may be influenced by the effects of numerous genes often in a particular pathway and numerous potential variants of variable penetrance or expression,Reference Vihinen47 leading to a variety of symptoms. Genomic risk scores — which combine the individual, often small, effects of tens or hundreds of different genes — are currently being developedReference Natarajan48 and marketed.Reference Ray49 At present, however, genomic risk scores are rife with interpretive uncertainty, raising questions about their value for diagnosis, much less for prediction of future disease state in an asymptomatic person.Reference Hunter, Drazen, Rosenberg and Curtis50 Creating a massive matrix of computer-searchable medical phenotypes associated with genotypes, as well as a wealth of environmental and other data from large numbers of individuals (as initiatives such as NIH's All of Us Research Project seek to do) may ultimately help to provide more insights into the clinical impact of most variants.51
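For illustration, a genomic (polygenic) risk score is, at its core, a weighted sum of risk-allele counts. The sketch below uses invented SNP identifiers and weights; published scores derive their weights from genome-wide association studies and may combine thousands of sites.

```python
# Toy polygenic risk score: weighted sum of risk-allele counts (0, 1, or 2).

WEIGHTS = {"rs0001": 0.12, "rs0002": 0.05, "rs0003": -0.08}  # invented weights

def polygenic_score(genotypes):
    """genotypes maps SNP id -> number of risk alleles carried (0, 1, or 2)."""
    return sum(WEIGHTS[snp] * count for snp, count in genotypes.items())

patient = {"rs0001": 2, "rs0002": 1, "rs0003": 0}
print(f"raw score = {polygenic_score(patient):+.2f}")
# Interpreting the raw score typically means placing it on a reference
# distribution (e.g., a percentile), which is itself a source of uncertainty.
```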
D. Clinical Utility Challenges
Finally, clinical utility is more difficult to define and assess in the context of genomic sequencing than in interpreting the significance of single-gene variants. Genomic approaches clearly can be useful in guiding clinical management for individuals with a family history suggesting a dominant disorder that could be attributable to any one of several genesReference Ma52 and for people who have an undiagnosed disease that appears to have a genetic contribution.Reference Splinter53 While these uses of sequencing fall easily within a narrow definition of clinical utility, they raise a number of questions that are beyond the scope of this paper, including deciding whether to pursue genome-based versus more focused approaches as well as whether to search for and return secondary findings. In the absence of a clinical indication for sequencing, however, the likelihood that a person will receive results that will improve long-term health depends in part on how much of the genome is being analyzed, the nature of the diseases or conditions being assessed, whether the diseases would otherwise be detected and adequately treated, competing morbidities, and the strength of the evidence regarding the efficacy of prevention or intervention.
There have been some efforts to assess the utility of genomic screening when the individual does not have a pertinent family history or current symptoms. For example, investigators at Geisinger Health System sought to understand whether genomic screening could identify previously undiagnosed patients with familial hypercholesterolemia (FH). They analyzed genomic sequencing and electronic health record data from 50,726 individuals from the Geisinger Health System to understand the prevalence and clinical impact of FH variants in this clinical cohort. They identified 229 individuals carrying one of the FH variants. They determined that only 24% of these carriers would have met the criteria for probable or definite FH diagnosis in the absence of variant identification, although most of the carriers, whether or not identified as having FH, were receiving lipid-lowering treatment. The study concluded that genomic data can augment the detection of individuals with FH and lead to identifying patients who would have been missed by using only clinical criteria in electronic health record (EHR) screening.Reference Abul54 In the eMERGE network, which examines variants in approximately 100 genes (including the ACMG list of 59 genes as well as a number of other genes) in approximately 25,000 individuals, preliminary data show that between 3 and 5 percent of the participants, some of whom had previously been diagnosed, had returnable results.Reference Gibbs, Rehm and Reuter55
Notwithstanding the previous examples, few studies to date have offered genome-based screening to adults in the absence of a clear clinical indication. This makes it difficult to determine whether and under what circumstances routine genome screening may improve patient outcomes, especially in light of the potential for overdiagnosisReference Elmore and Welch56 and iatrogenic costs and harms from non-indicated interventions.Reference Khoury, Meagher, Berg, Narod, Sopik, Cybulski and Vassy57 The likelihood of net benefit will depend not only on which genes are interrogated and the strength of the genotype-phenotype correlation58 but also on the ability of providers, many of whom are not genetic specialists, correctly to interpret and act on the results.Reference Arora59 Some commentators have raised questions about whether primary care providers are willing and prepared to handle information from genomic screening, particularly in the absence of clinical decision support (CDS) tools, availability of genetics specialists for referral, and insurance coverage for the initial consultation and follow-up.Reference Christensen and Pet60 A 2017 pilot study sought to describe the effect on clinical care and outcomes of adding whole genome sequencing to a standardized family history assessment in primary care. The study concluded that adding whole genome sequencing (WGS) to primary care reveals new molecular findings of uncertain clinical utility, and that while non-geneticist providers may manage some genetic results appropriately, in other cases the information may prompt additional clinical actions of unclear value.61
Notwithstanding uncertainty around the clinical utility of genomic information, consumer interest in obtaining genetic testing outside the clinical setting as well as enrollment of individuals in longitudinal WGS research suggests that some people value receiving the information and perceive it as meaningful for themselves or their children.Reference Robinson and Genetti62 Recent studies have sought to assess the views of those who have taken part in WGS research and to evaluate whether their a priori expectations were met. One randomized study surveyed 202 primary care and cardiology patients in the MedSeq study before and up to six months after receiving either WGS and family health history (FHH) information or family history information alone. The study found that decisional regret overall was low in both groups, but that those who did not receive WGS information were more likely to report at least some regret about participating in the study. Participants who received both FHH and WGS information were much more likely to report that study results had provided new information with some level of personal or clinical utility. For example, they were over seven times more likely to report that study results had yielded accurate identification of disease risk and more than twice as likely to report that results had or would influence their medical treatment. Yet the majority of those who were found to have a pathogenic variant had no evidence of disease even after extensive clinical investigation. At the same time, WGS recipients expected a higher level of benefit from the study than was actually achieved and were more likely to report receiving too much information (particularly in the primary care group), which led the authors to recommend tempering patients' expectations.Reference Roberts63
Another study, involving 29 healthy participants in the HealthSeq study who completed six-month follow-up, sought to gauge the psychological and behavioral impact of receiving WGS results at various time points. The study found that most participants had positive emotional reactions to receiving their results, although a few expressed negative reactions.Reference Sanderson64 Of the seven participants who received pathogenic or likely pathogenic rare disease variant results, two were concerned about and acted on the information. Two participants who had APOE e4/e4 variants (indicating increased Alzheimer disease risk) reported being concerned, while one participant who had APOE e4/e3 variants (associated with more modestly increased risk) was confused about the result. The data also showed that among those who reported distress from the information, such distress had largely subsided by the six-month assessment. The study concluded that currently neither the benefits nor harms of personal genome sequencing are significant for most individuals, but that there may be important exceptions warranting further investigation and that the impact of returning WGS results on a larger scale remains to be seen.
III. Regulating Quality in the Clinical Context
As mentioned above, no single government agency regulates the entire spectrum of genetic and genomic test quality, and some aspects of quality are not subject to any direct federal governmental regulation, whether because of current statutory limitations, competing agency priorities, or inherent limitations of federal control over health care delivery. Both FDA and CMS currently regulate certain aspects of genomic test quality in certain situations. We begin by examining the role that federal regulatory agencies currently play in ensuring quality. CMS administers the CLIA program in partnership with CDC and FDA, which fulfill certain responsibilities in connection with CLIA.65 Other federal agencies, e.g., NIH, play supporting roles but do not exert direct oversight over the provision of clinical laboratory tests.
CMS is authorized by the CLIA statute to issue certificates to laboratories that meet standards designed to ensure consistent performance of valid and reliable laboratory examinations and other procedures by clinical laboratories. “By controlling the quality of laboratory practices, CLIA standards are designed to ensure the analytical validity of genetic tests.”66 CLIA does not address the clinical validity or utility of tests.67 FDA, on the other hand, regulates the safety and effectiveness of certain laboratory instruments, reagents, and test kits (collectively, in vitro diagnostic devices (IVDs)) sold to clinical laboratories. FDA's authority to regulate stems from the Food, Drug, and Cosmetic Act (FDCA), which tasks the agency with ensuring the safety and efficacy of such articles.68 The two agencies thus both regulate analytical validity, albeit with respect to different aspects of the testing process.
A. Regulation of Clinical Laboratories
(1) CLIA
The CLIA statute is a federal law that applies to all clinical laboratories operating in or testing specimens from patients in the United States.69 CLIA defines a clinical laboratory as a facility that examines materials collected from the human body for the purpose of providing information for the diagnosis, prevention, or treatment of disease or the assessment of health.70 CLIA requires clinical laboratories to hold one of five types of certificates, depending on the complexity of the tests the laboratory performs. Clinical laboratories that provide genomic testing can elect to obtain a certificate of compliance (in which case they undergo inspection by CMS or state health departments that act as CMS's agents71) or a certificate of accreditation (in which case they are inspected by one of several private accreditation bodies “deemed” (approved) by CMS, such as the Joint Commission or the College of American Pathologists (CAP)72). Finally, a laboratory is CLIA-exempt if it has been licensed by a state whose laboratory requirements CMS has determined are equal to or more stringent than CLIA's requirements, and the state licensure program has been approved by CMS.73 Two states — New York and Washington — currently meet these conditions.74 Consequently, CLIA regulations serve as a “baseline” and laboratories that are CAP-accredited or permitted by New York or Washington may be subject to slightly different or additional requirements depending on the type of testing. In particular, New York requires laboratories that perform tests using methods not cleared or approved by FDA (e.g., laboratory-developed tests (LDTs)) to obtain approval for such tests in addition to a laboratory permit. Furthermore, CAP requires laboratories performing molecular genetic testing to use the Molecular Pathology Checklist to prepare for inspection.
CLIA regulations address, among other things, personnel qualification and training, record keeping, quality control, and proficiency testing. CLIA also requires laboratories to “maintain a quality assurance and quality control program adequate and appropriate for the validity and reliability of the laboratory examinations.”75 In addition, CLIA requires that laboratories “qualify under a proficiency testing program” meeting the standards established by CMS.76
Compliance with CLIA is ascertained through periodic inspections (surveys) either by a state inspection agency or accreditation body. CMS's State Operations Manual (SOM) provides guidance to laboratories and inspectors alike on interpretation of CLIA requirements. The SOM describes the survey as an “outcome oriented” process that focuses on the effect of a laboratory's practices on patient test results and/or patient care and that focuses the surveyor/inspector on “those requirements that will most effectively and efficiently assess the laboratory's ability to provide accurate, reliable, and timely test results.”77 Emphasis is placed on the laboratory's quality system as well as the “structures and processes throughout the entire testing process that contribute to quality test results.”78
The CLIA regulatory framework places significant responsibility on the laboratory director to ensure test quality. The laboratory director is “responsible for the overall operation and administration of the laboratory, including the employment of personnel who are competent to perform test procedures, record and report test results promptly, accurately, and proficiently, and for assuring compliance with the applicable regulations.”79 Laboratory directors must, as part of their duties, ensure that:
testing systems developed and used for each of the tests performed in the laboratory provide quality laboratory services for all aspects of test performance, including the preanalytic, analytic, and postanalytic phases of testing;
the test methodologies selected have the capability of providing the quality of results required for the patient's care;
verification procedures used are adequate to determine the accuracy, precision, and other pertinent performance characteristics of the method;
laboratory personnel are performing test methods as required for accurate and reliable results;
quality control and quality assessment programs are established and maintained to ensure the quality of laboratory services provided and to identify failures in quality as they occur;
acceptable levels of analytical performance for each test system are established and maintained;
all necessary remedial actions are taken and documented whenever significant deviations from established performance characteristics are identified and that patient test results are reported only when the system is functioning properly;
reports of test results include pertinent information required for interpretation; and
consultation is available to the laboratory's clients on matters relating to the quality of the test results and their interpretation concerning specific patient conditions.80
Additionally, for high complexity testing the laboratory director must ensure that the laboratory is enrolled in a CMS-approved proficiency testing program for each specialty in which testing is performed, that samples are properly tested and reported, that proficiency testing reports are reviewed and, where necessary, corrective actions are taken. Where a test specialty has not been established or compatible proficiency testing samples are not offered by a CMS-approved proficiency testing program, the laboratory must, at least twice annually, verify the accuracy of the test, including the accuracy of calculated results, if applicable.81
CLIA requirements apply to laboratory tests performed using assays manufactured by third parties (IVD test systems) as well as assays developed in-house by the laboratory either from scratch or by modifying a manufacturer-developed test kit (so-called laboratory developed tests or LDTs). When a laboratory uses a proprietary test system developed in-house, the laboratory may not release test results prior to establishing performance specifications relating to analytical validity for the use of the test system in the laboratory's environment.82 Performance specifications must be established for:
accuracy;
precision;
analytical sensitivity;
analytical specificity (including interfering substances);
reportable range of test results for the test system;
reference intervals (normal values); and
any other performance characteristic required for test performance.Reference Aziz83

With respect to accuracy, the laboratory is responsible "for verifying that the method produces correct results," using testing reference materials, comparing results of tests performed by the laboratory against results of a reference method, or comparing split sample results with results obtained from another method that has already been shown to provide accurate results. For qualitative methods, the laboratory must verify that a method will identify the presence or absence of the analyte.84
Where the laboratory uses a third-party IVD test system, the laboratory is “responsible for verifying the performance specifications” of the test system “prior to reporting patient test results.” The “verification of method performance should provide evidence that the accuracy, precision, and reportable range of the procedures are adequate to meet the clients' needs.” The laboratory may use the manufacturer's performance specifications as a guideline, but “is responsible for verifying the manufacturer's analytical claims before initiating patient testing.”85
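One way to picture this verification duty is a qualitative split-sample check of the kind a laboratory might run, sketched below with hypothetical sample identifiers and an illustrative acceptance threshold. Actual verification protocols under CLIA are more extensive and often quantitative.

```python
# Split-sample verification sketch: the same specimens are run on the new test
# system and on a method already shown to produce accurate results, and any
# disagreement is flagged for investigation.

def verify_split_samples(new_results, reference_results, max_discordant=0):
    """Compare qualitative (detected / not detected) results pairwise."""
    discordant = [
        sample for sample in reference_results
        if new_results.get(sample) != reference_results[sample]
    ]
    passed = len(discordant) <= max_discordant
    return passed, discordant

reference = {"S1": "detected", "S2": "not detected", "S3": "detected"}
candidate = {"S1": "detected", "S2": "not detected", "S3": "not detected"}
ok, flagged = verify_split_samples(candidate, reference)
print("verification passed" if ok else f"investigate samples: {flagged}")
```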
Failure to comply with CLIA certification and/or state clinical laboratory licensure requirements may result in a range of enforcement actions, including certificate or license suspension, limitation, or revocation; directed plan of action; onsite monitoring; civil monetary penalties; criminal sanctions; and revocation of the laboratory's approval to receive Medicare and Medicaid payment for its services. In practice, these penalties are infrequently applied, as the stated goal of regulators in the first instance is to educate laboratories and work collaboratively to correct non-compliance.86
There has been ongoing concern that CLIA's concept of “high-complexity” testing fails to capture the true level of complexity that genetic and genomic testing actually requires. CMS has established specific requirements for certain CLIA specialty areas, such as microbiology and cytogenetics, but has persistently declined to recognize genetic and genomic testing as a specialty area.87 In 1997 — back in the days of single-gene tests — a joint task force of NIH and the Department of Energy called on the Clinical Laboratory Improvement Advisory Committee (CLIAC), which advises CMS on CLIA matters, to consider creating a genetic testing specialty.88 Other concerned groups later filed citizens' petitions calling on CMS to create a genetic testing specialty.Reference Hudson89 CMS did not create such a specialty. The advent of genomic testing has only added to the concerns that CMS is not updating CLIA regulations to address the added complexity of new genomic tests.
The goal of CLIA is to ensure the accuracy and reliability of test results (i.e., analytical verification and validity) as the test is performed in that specific laboratory. In the case of genomic testing, the current absence of a molecular genetics specialty or of a CMS-approved genetics-specific proficiency testing program poses a challenge to laboratories in demonstrating, and to surveyors in confirming, test accuracy. Although the CAP has made efforts to address the absence of standards and requires compliance as a condition of accreditation, not all genomic testing laboratories elect to be regulated under a CLIA certificate of accreditation, with CAP as their accrediting body. As previously noted, CLIA allows laboratories to pursue a CLIA certificate of compliance, in which case they would not be answerable to these standards.
CLIA does not directly address the clinical validity or clinical utility of laboratory tests. However, implicit in the responsibilities of the laboratory director is the requirement to determine whether there are sufficient data to support the inclusion of a test on the test menu, that is, to determine whether the test is clinically valid.90 A point of concern is that CLIA, even while specifying requirements for analytical validity, delegates responsibility for ensuring that a new test is clinically valid to the laboratory director; thus there is no external, data-driven regulatory review of clinical validity before a laboratory offers a new test.91 In the case of genomic sequencing, the laboratory director's responsibility necessarily includes determining whether there are sufficient data to report a specific variant as clinically significant. As noted above, however, because WGS necessarily will generate information about variants whose clinical significance is uncertain, genomic test results can include a significant amount of information for which the laboratory cannot provide “pertinent information required for interpretation”92 and for which the laboratory will not be able to assist clients in interpreting test results. Additionally, CLIA does not include an external review component (either before or after a laboratory begins to offer a test) to evaluate a laboratory's evidentiary basis for performing a test or for the interpretive conclusions included in the test report. Laboratories therefore have significant discretion as to what tests they include on their test menu and how they perform variant interpretation.
CLIA also does not specifically regulate the bioinformatics pipeline, that is, the software algorithms used to generate and interpret genomic sequence data. When the bioinformatics is performed in-house by the same clinical laboratory that performs the sequencing, there is an implicit obligation under CLIA for the bioinformatics to be validated, given its impact on test accuracy and reliability. However, CMS has not defined specific educational or training requirements for bioinformatics personnel even though that discipline requires different expertise than other aspects of laboratory testing, and CMS has not specified requirements for software validation. Furthermore, when the interpretive bioinformatics is performed by an entity separate from the laboratory that generated the sequencing data (i.e., the increasing use of a separate “dry lab” or “unbundled” interpretation services), CLIA arguably does not apply to that separate entity. Standing alone, bioinformatics does not involve direct examination of materials derived from the human body, but rather involves only the interpretation of digitally-stored data resulting from prior examination of a specimen by another entity.93
(2) FDA
The FDCA gives FDA authority to regulate medical devices, defined to include instruments, machines, reagents, in vitro diagnostic (IVD) devices, and similar or related articles or components, that are "intended for use in the diagnosis, prevention, cure, mitigation, or treatment of disease" or intended to affect the structure or function of the body.94 The FDCA prescribes a risk-based framework under which the regulatory requirements are stratified according to the device's risk. Low-risk devices generally may be marketed without prior FDA marketing authorization, as long as manufacturers comply with certain "general controls."95 High-risk devices are generally subject to pre-market approval and must submit, among other information, "[f]ull reports of all information, published or known to or which should be reasonably known to the applicant, concerning investigations which have been made to show whether or not the device is safe or effective."96 Moderate-risk devices are subject to general controls and may be — but often are not — subject to "special" controls to ensure safe and effective use; in many cases, the manufacturer must submit a premarket notification application and receive clearance before the device may be marketed (often referred to as a "510(k) clearance").97 To obtain 510(k) clearance, the manufacturer must demonstrate "substantial equivalence" to a previously marketed ("predicate") device, meaning that the device has the same intended use as the predicate and has technological characteristics that are either the same as, or at least as safe and effective as, those of the predicate. The manufacturer must also demonstrate compliance with general controls and any special controls, which may include guidance documents and post-market surveillance.98 The 510(k) process generally does not require the manufacturer to submit clinical evidence directly demonstrating the safety and effectiveness of its device.99
Test instruments and systems manufactured by third parties and sold to clinical laboratories for use in collection, preparation, or examination of specimens from the human body are regulated by FDA as IVD devices. Thus, for example, FDA regulates as a Class II medical device a “high throughput genomic sequence analyzer for clinical use,” which is defined as an “analytical instrument system intended to generate, measure and sort signals in order to analyze nucleic acid sequences in a clinical sample.”100 These devices, which include Illumina's MiSeqDx Platform,101 are subject to special controls that specify information that must be included in device labeling, as well as to 510(k) pre-market notification submission requirements. FDA's authority to regulate medical devices includes power to regulate embedded software that affects the safety and effectiveness of the overall device. This was seen, for example, when software limitations in the MiSeqDx sequencing system prompted a 2014 recall.102
Clinical laboratories that perform genomic testing generally use one or more test instruments, systems, or reagents regulated by FDA as part of their testing process, but the laboratories may make modifications to these products or may use them for purposes not specified in labeling or in ways not addressed in manufacturer instructions. A test system that is developed and validated by a clinical laboratory, even when it incorporates FDA-approved or cleared components, has traditionally been regulated as an LDT.103 FDA has historically taken the position that clinical laboratories using LDTs are “manufacturers” of IVD devices and that it has jurisdiction to regulate LDTs as IVDs.104 At the same time, FDA historically has exercised “enforcement discretion,”105 and thus generally has not required clinical laboratories performing LDTs to comply with FDA's IVD device regulatory requirements. A few developers have sought FDA pre-market authorization for genomic LDTs, primarily in the context of cancer diagnosis or prognosis but also for the use of next generation sequencing platforms for the diagnosis of specific conditions such as cystic fibrosis.106 Additionally, in a few instances FDA has declined to exercise enforcement discretion for certain types of genetic LDTs, including those offered by DTC companies. FDA recently issued a safety communication warning clinical laboratories against offering pharmacogenetic testing for certain drugs whose FDA-approved label does not describe how pharmacogenetic information can be used in determining therapeutic treatment.107 As a general matter, however, genomic LDTs offered by many clinical laboratories currently benefit from FDA's enforcement discretion policy and are not subject to FDA regulation.
In April 2018, the FDA issued two guidance documents intended to inform the development of NGS-based testing. The first guidance addresses the design, development, and analytical validation of NGS-based IVDs intended to aid in the diagnosis of suspected germline diseases.108 Germline diseases encompass “those genetic diseases or other conditions arising from inherited or de novo germline variants,” and FDA makes clear that the guidance does not address “tests intended for use in the sequencing of healthy individuals.”109 Although nonbinding, the guidance provides some insight as to the agency's current thinking about quality and the clinical implementation of NGS-based tests. The guidance, which FDA stated was intended to “spur development of standards” for NGS testing,110 discusses “performance characteristics.” The guidance describes analytical validation as “measuring a test's analytical performance over a set of predefined metrics to demonstrate whether the performance is adequate for its indications for use and meets predefined performance specifications.”111 This typically involves evaluating whether the test successfully identifies or measures, within defined statistical bounds, the presence or absence of a variant that will provide information on a disease or other condition in a patient.112 Per the guidance, “[t]he complete NGS-based test should be analytically validated in its entirety (i.e., validation experiments should be conducted starting with specimen processing and ending with variant calls, including documentation that performance meets predefined thresholds) prior to initiating use of the test.”113 The guidance lays out specific performance metrics to be assessed when analytically validating NGS-based tests, including accuracy (positive percent agreement, negative percent agreement, technical positive predictive value), precision (reproducibility and repeatability), limit of detection (establishing a minimum and maximum amount of DNA enabling the test to provide expected results in 95% of runs), and analytical specificity (interference, cross-reactivity, and cross-contamination).114
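To illustrate the limit-of-detection concept as the guidance frames it, the sketch below, with invented run counts, estimates the smallest DNA input at which the test returns the expected result in at least 95% of runs.

```python
# Toy limit-of-detection estimate from hit rates at several DNA input levels.

runs = {   # DNA input (ng) -> (runs with the expected result, total runs)
    1:  (12, 20),
    5:  (18, 20),
    10: (20, 20),
    25: (20, 20),
}

def limit_of_detection(runs, hit_rate=0.95):
    qualifying = [amount for amount, (hits, total) in runs.items()
                  if hits / total >= hit_rate]
    return min(qualifying) if qualifying else None   # assumes monotonic behavior

print(f"estimated LoD: {limit_of_detection(runs)} ng input DNA")   # 10 ng here
```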
Concurrently, FDA issued a second guidance on the use of public human genetic variant databases to support clinical validity of genetic and genomic-based IVDs.115 The guidance describes an approach in which test developers may rely on clinical evidence from FDA-recognized public databases to support clinical claims for their tests and help provide assurance of the accurate clinical evaluation of genomic test results. The guidance describes how product developers can use these databases to support the clinical validation of NGS tests they are developing and states that FDA-recognized databases will provide test developers with an efficient path for marketing clearance or approval of a new test. Subsequently, in December 2018 FDA recognized ClinGen's expert panel approved variants as the first to meet the level of an FDA-recognized database. The existence of one quality-controlled database to aid in variant interpretation is a step forward, but much work remains to be done. Indeed, recognizing that theirs is a work in progress, the organizers of ClinGen are continuing to expand their efforts and encourage other organizations and databases to seek FDA recognition.116
In the past, FDA has signaled an intent to modify its enforcement discretion policy with regard to regulation of LDTs. In 2014, the agency proposed a draft regulatory framework for LDTs,117 which it subsequently abandoned in 2016.118 While FDA currently does not seem inclined to implement an LDT regulatory framework in a systematic way under existing statutory authorities, it is possible that Congress may enact legislation directing FDA to regulate LDTs. Congress has been considering diagnostic test legislation for several years, including through the issuance of several discussion drafts setting forth possible legislative approaches. Most recently, a legislative discussion draft of the VALID Act (Verifying Accurate Leading-edge IVCT Development) was publicly released in December 2018. Although released too late for consideration in the 115th session of Congress, the draft legislation may be taken up at some point in the future.
FDA's role in the regulation of software also is potentially relevant to oversight of genomic testing. For many years, FDA has regulated “software in a medical device”119 — software embedded in traditional devices like pacemakers, drug infusion pumps, and IVD test kits, in which the software affects the safety and effectiveness of the device as a whole.120 Thus, software incorporated as a component of an FDA-regulated genomic test — for example, software embedded in an FDA-regulated sequencing analyzer — is reviewed by FDA as part of its premarket review of the safety and effectiveness of the overall device. FDA has reviewed such software in the context of the small number of NGS-based tests that have undergone FDA review.
This, however, leaves a vast amount of stand-alone genomic testing software unregulated at the current time. Software used to interpret genomic data generated using LDTs may escape FDA scrutiny. The same is true of cloud-based software services that laboratories incorporate into their bioinformatics pipeline. In response to this problem, FDA has asserted its authority to regulate “software as a medical device” (SaMD), defined as “software intended to be used for one or more medical purposes that perform these purposes without being part of a hardware medical device.”Reference Cortez121 Such software “utilizes an algorithm (logic, set of rules, or model) that operates on data input (digitized content) to produce an output for a medical use specified by the manufacturer.”122 Software used in the bioinformatics pipeline arguably could meet the definition of SaMD to the extent it is intended for use in clinical testing (i.e., as part of disease diagnosis or health assessment) and is not embedded in a device that is already subject to FDA regulation (such as a clinical sequencer). Unfortunately, FDA is still far from having a framework in place to support comprehensive regulation of SaMD.
FDA's regulation in this area is still very much a work in progress, and work to date has focused on medical software generally rather than genomic software specifically. In a 2017 guidance document,123 FDA adopted the principles of the International Medical Device Regulators Forum (IMDRF),124 of which the agency is a member, for “clinical evaluation” of SaMD, that is, a “set of ongoing activities conducted in the assessment and analysis of a SaMD's clinical safety, effectiveness and performance as intended by the manufacturer in the SaMD's definition statement.”125 The SaMD guidance builds on previous IMDRF documents that addressed SaMD terminology, risk categorization, and quality management system principles, respectively. The guidance identifies three pillars of clinical evaluation:
establishing a valid clinical association between the SaMD output and the targeted clinical condition;
demonstrating that the SaMD is analytically valid, meaning that it correctly processes input data to generate accurate, reliable, and precise output data; and
demonstrating that the SaMD is clinically valid, meaning that the output data achieves the intended purpose in the target population in the context of clinical care.
The SaMD guidance explains that clinical evaluation should be a systematic and planned process that continues through the device lifecycle as part of the quality management system. Further, the guidance states that the “level of evaluation and independent review” of a particular SaMD should be commensurate with its risk. The guidance encourages manufacturers to leverage the connectivity inherent in SaMD to modify software based on “real-world” performance.
In its 2017 Digital Innovation Action PlanReference Gottlieb126 and a pilot Digital Health Software Precertification (Pre-Cert) Program proposal,127 FDA acknowledged that its traditional premarket review process for devices is not well suited for software:
FDA's traditional approach to moderate and higher risk hardware-based medical devices is not well suited for the faster iterative design, development, and type of validation used for software-based medical technologies. Traditional implementation of the premarket requirements may impede or delay patient access to critical evolutions of software technology, particularly those presenting a lower risk to patients.128
FDA pledged to “reimagin[e] its approach to digital health medical devices.”129 The Pre-Cert Program, currently in its pilot phase, focuses FDA's scrutiny at the level of the firm that develops the software, rather than on the specific software product.130 FDA would verify that the firm “demonstrate[s] a culture of quality and organizational excellence based on objective criteria, for example, that they can and do excel in software design, development, and validation (testing).”131 If so, FDA may allow a pre-certified firm to move its lower-risk software to market without premarket review or may provide a more cursory or faster review of the firm's moderate- and higher-risk software,132 possibly relying on postmarket evidence to validate safety and effectiveness after software already is in use.133
FDA acknowledges that the Pre-Cert program is only in its developmental phase and will not provide a viable path to market in the near future.134 FDA also recognizes that there are unresolved questions about whether it has the statutory authority it needs to regulate software, which ultimately may require new legislation.135 FDA notes that embedded software is not currently eligible for pre-certification and would, at least in the near future, continue to go through FDA's traditional premarket review process. In short, many issues remain unresolved, and it is still not clear whether — and how — FDA will be able to regulate software in the genomic testing bioinformatics pipeline.
In tacit recognition of this reality, FDA's 2018 guidance on analytical validity of genomic tests for germline diseases136 simply called on laboratories to specify and document all the software they are using, “including the source (e.g., developed in-house, third party), and any modifications” and to “document whether the software will be run locally or remotely (e.g., cloud-based).”137 There was no assertion that this bioinformatics software would necessarily receive any regulatory oversight, but laboratories were exhorted to “document and validat[e] their bioinformatics software performance in the context of the end-to-end NGS-based test.”
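What such documentation might look like in practice can be sketched as a simple structured record. The following sketch is purely illustrative: the field names and pipeline components are our hypothetical choices, not a format specified by FDA or any other body.

```python
# Purely illustrative sketch of the software inventory the 2018
# guidance asks laboratories to keep. Field names and pipeline
# components are hypothetical, not an FDA-specified format.
from dataclasses import dataclass, asdict
import json

@dataclass
class PipelineSoftware:
    name: str           # software component in the bioinformatics pipeline
    version: str        # exact version used in the validated test
    source: str         # e.g., "developed in-house" or "third party"
    modifications: str  # any changes from the released version
    execution: str      # "local" or "remote (cloud-based)"

# A hypothetical end-to-end pipeline inventory.
pipeline = [
    PipelineSoftware("read-aligner", "2.1.0", "third party",
                     "none", "local"),
    PipelineSoftware("variant-caller", "4.0.2", "third party",
                     "custom filtering thresholds", "remote (cloud-based)"),
    PipelineSoftware("annotation-tool", "0.9", "developed in-house",
                     "n/a", "local"),
]

print(json.dumps([asdict(s) for s in pipeline], indent=2))
```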
More recently, however, FDA signaled a more assertive posture toward regulating bioinformatics software used in genomic testing. The agency's September 2019 draft guidance on clinical decision support software138 states that FDA views bioinformatics software used to process high-volume “omics” data as being subject to FDA's device regulations if the software produces patient-specific information, whether or not the software is clinical decision support (CDS) software.139 FDA also stated that “bioinformatics software products that query multiple genetic variants against reference databases or other information sources to make patient-specific recommendations” are medical devices.140 This suggests that FDA views all phases of the genomic testing bioinformatics pipeline — including variant interpretation — as subject to FDA medical device regulation. The more recent draft guidance ended its public comment period in December 2019 and is under consideration by the agency. A companion article in this issue reflects on some of the potential impacts of FDA's plans to regulate software used in genomic testing.Reference Evans141
With software regulation unsettled, other entities are seeking to develop voluntary standards for both NGS testing and bioinformatics. In particular, the CDC-led workgroup Nex-StoCTReference Heger142 has published three consensus recommendations since 2012 that address standards for NGS testing and bioinformatics. Collectively, these documents address sequence generation, analysis of raw sequence data, and standardization of variant files to facilitate meaningful inter-laboratory comparisons and provide a common format for data contained within the variant file.Reference Lubin, Gargis and Gargis143 Although not legally binding, these recommendations may provide useful guidance to entities developing, implementing, or selecting among NGS-based test systems.
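To illustrate what a common variant file format buys in practice, the sketch below parses a single record in the widely used Variant Call Format (VCF), one common format for the data contained within variant files. The record itself is hypothetical, and production pipelines generally rely on dedicated libraries (e.g., pysam) rather than hand-rolled parsing like this.

```python
# Illustrative sketch: parsing one record from a Variant Call Format
# (VCF) file. The record below is hypothetical.

VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]

# One tab-separated VCF data line: a hypothetical C-to-T substitution.
record_line = "chr7\t55242465\trs0000000\tC\tT\t99\tPASS\tDP=250;AF=0.48"

def parse_vcf_record(line: str) -> dict:
    """Split a VCF data line into named fields and parse its INFO column."""
    fields = dict(zip(VCF_COLUMNS, line.rstrip("\n").split("\t")))
    # INFO is a semicolon-separated list of KEY=VALUE annotations.
    fields["INFO"] = dict(
        item.split("=", 1) for item in fields["INFO"].split(";") if "=" in item
    )
    return fields

record = parse_vcf_record(record_line)
print(record["CHROM"], record["POS"], record["REF"], ">", record["ALT"])
print("read depth:", record["INFO"]["DP"])
```

Because every laboratory emitting this format labels the same information the same way, two laboratories' call sets for the same specimen can be compared field by field, which is precisely the inter-laboratory comparability the consensus recommendations aim to enable.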
Finally, it is important to note that even when FDA regulates a genetic or other type of diagnostic test as a medical device, FDA evaluates the test's analytic and clinical performance for a specific intended use.144 FDA has no authority to interfere with clinicians' off-label use of lawfully marketed devices, so clinical uses of a test may stray beyond the use for which FDA has reviewed evidence.145 An oft-cited example of this problem, relating to a non-genetic test, is the fact that FDA has cleared prostate-specific antigen testing for monitoring men who already have been diagnosed with prostate cancer, yet the test is widely prescribed off-label for screening healthy individuals — a use for which the test may not be safe and effective.146 Further, modern sequencing “technology allows broad and indication-blind testing”147 — in other words, genomic testing lacks a clearly enunciated intended use, because the data it generates can be put to a vast multiplicity of uses.148 Even if regulators ensure that genomic testing has analytical and clinical validity for one intended use — for example, diagnosing the cause of previously undiagnosed developmental delay in a child — the test generates data about thousands of other genetic variants, some of which are rare or never seen before, that lend themselves to many other uses for which analytical and clinical performance have not been assessed, which some have characterized as opportunistic screening.Reference Burke149 Finally, as mentioned previously in this section, FDA has followed an enforcement discretion policy for many — but not all — lab-developed tests,150 so there are many genomic tests for which FDA has never reviewed analytical and clinical performance for even one intended use.
B. Government Regulation of Clinical Utility
CMS does not regulate the clinical utility of tests under CLIA.151 The simplistic account of FDA regulation is that FDA also does not require evidence of clinical utility for genomic tests. The reality is more nuanced. First, FDA assesses the safety and effectiveness of tests relative to the manufacturer's intended use for the test. If the manufacturer states a clinical intended use (e.g., “This test is useful for diagnosing cystic fibrosis”), as opposed to stating a purely analytic use (e.g., “This test accurately detects the presence or absence of specific CFTR genetic variants”), then the test's intended use implicitly asserts clinical utility. When FDA confirms that the test is safe and effective for a clinical intended use, FDA in effect is confirming that the test has its stated clinical utility. Second, FDA has authority to oversee not just the analytical and clinical performance of tests — that is, their analytical and clinical validity — but also the labeling of tests.152 FDA “want[s] to make sure that what a manufacturer is saying about their test in their labeling, in their materials about the test is truthful and accurate.”153 There is no requirement for test manufacturers to make any claims of clinical utility in their labeling, and a test can be brought to market with purely analytical claims. But if a manufacturer does elect to make any claims about clinical utility in a test's labeling, FDA will require evidence to support those claims.154 This is part of FDA's statutory mandate to ensure that drug and device labeling shall not be “false or misleading in any particular.”155
FDA recognizes, however, that clinical utility is, to a large degree, a medical practice issue. Clinical utility addresses whether a test (and subsequent interventions taken based on the result) leads to better clinical management or an improved health outcome among people with a positive test result.156 No matter how accurately a test does its job (e.g., detecting cancer), the test will have no clinical utility if the clinician orders an inappropriate course of treatment after receiving the test result. For this reason, ensuring the clinical utility of tests is largely the province of state medical practice regulators and tort law. Professional societies also offer guidance.157 FDA's role is generally confined to ensuring that any assertions test manufacturers make about clinical utility are supported by sound evidence.Reference Jillson158
For many people, genomic tests are clinically available as a practical matter only if they are covered by insurance. Importantly, both government and commercial payers evaluate clinical utility, usually in terms of the impact of test results on the patient's health outcome, as part of determining whether to cover clinical genomic testing.Reference Deverka and Dreyfus159 In light of the rapidly evolving but still incomplete understanding of the clinical utility of particular variants, payers are more likely to pay for focused genetic tests with clearly documented clinical impact than for broad-based genomic tests in which the clinical implications of individual variants vary widely.160 The local contractors that administer Medicare payments in specific regions of the country have significant discretion in determining whether to cover diagnostic testing, including genomic sequencing. The same is true of private payers, which often, but not always, look to Medicare coverage policies in deciding whether to cover a particular test.Reference Phillips161 In 2018, CMS issued a national coverage decision stating that coverage of NGS for tumor profiling is required only where there is an FDA-approved companion diagnostic linked to the testing,162 but the agency is reconsidering its decision in light of significant stakeholder objections.163 Pressure is also growing for payers to provide coverage with evidence development, in which patients are enrolled in clinical trials or registries to inform future reimbursement policy.Reference Eisenberg and Varmus164 Because private insurance is a matter of contract between private parties, however, private payers are not bound by CMS coverage determinations. As a result, private payer coverage can be inconsistent from one payer to the next and can change over time.
To the extent a patient is willing and able to pay for testing directly, however, the ability to obtain genomic testing is limited only by a physician's willingness to order it, which may reflect at least in part the clinician's assessment of its value, and the laboratory's willingness to provide it. Some, but not all, states permit direct-access testing, in which patients can order their own laboratory tests without going through a physician. In those states, physicians are less able to serve as a check on the use of tests with uncertain clinical utility, and patients' access to tests is limited only by their pocketbooks.165
C. The Role of Non-Governmental Entities in Ensuring Genomic Testing Quality
Many non-governmental entities, such as the Association for Molecular Pathology (AMP),Reference Schrijver166 CAP, and the American College of Medical Genetics and Genomics (ACMG), have issued standards and guidance documents designed to promote the quality of clinical genomic tests. One major topic has been improving the interpretation of sequence variants,Reference Roy, Richards and Kearney167 particularly in the area of oncology.Reference Jennings and Li168 The CAP issued broad-based laboratory standards for next-generation sequencing169 and has a number of standing committees on issues related to molecular pathology and genomics.170 The ACMG has issued a statement about the need for genomic data sharing to promote quality.171 As befits an organization focused on clinical care, a number of its statements are directed more toward clinical utility and practice.Reference David172 These guidelines, and the steps that lead to their development,173 play a crucial role in shaping practice. While these documents are not directly enforceable, their impact can be increased if they are adopted by payers as a condition of coverage and reimbursement.
IV. Recommendations for Governing Genomic Testing to Advance Quality
Because genomics (and clinical laboratory testing generally) is a field where the science is advancing rapidly, regulatory flexibility is essential to ensure safety and efficacy while supporting innovation. Fortunately, complexity and rapid change are not new problems for regulators of medical or other products. Over the past three decades, there have been many areas of administrative law where the regulated industries and technologies grew more complex and product life cycles grew faster, and where opposition to regulation was fierce. In response to this fluid regulatory landscape, some agencies have turned to “new governance” styles of oversight.Reference Bamburger, Burris, Kempa, Shearing, Dorf, Freeman, Lobel, Trubek, Sabel and Solomon174
In theory, new governance embraces “the challenge and the promise of destabilization and social plasticity” and takes account of the polycentric world in which knowledge relevant for oversight is dispersed among many entities and among people with different types of expertise.Reference Burris, Drahos, Shearing and Ford175 It aims to be more transparent, flexible, and democratic than command-and-control (top down) style regulation and to employ “centrally coordinated local problem solving” processes.Reference Sturm, de Búrca and Scott176 New governance is characterized by collaborative interactions among regulators, the regulated industry, and other stakeholders; by regulatory flexibility and responsiveness; and by the use of “soft law” techniques for shaping behavior within the regulated industry. Soft law includes: benchmarking and information sharing to improve practices within an industry; incentives for voluntary adherence to industry standards (including naming and shaming of entities that do not abide by appropriate standards); incentives for developing an exemplary record of adherence to regulatory requirements; education of relevant individuals within regulated entities; and other creative methods for encouraging improvements and safety within industries.Reference Chang177 New governance does not replace regulation but expands the toolbox with which agencies seek to shape behavior. In some cases, new governance may deemphasize regulation in favor of other governance tools, perhaps because the processes for promulgating regulations are slow, regulations are difficult to revise once they are implemented, and in many industries regulatory compliance is disappointingly low.178 In the U.S., examples of new governance primarily come from environmental law and occupational health and safety law.179
Interestingly, several recent proposals by the FDA reflect new governance themes. For instance, the agency has proposed making governance of medical devices more transparent and inclusive through the use of collaborative communities — ongoing forums that bring together numerous stakeholders whose input is relevant for identifying or addressing a medical device governance issue.180 Members of a collaborative community for genomic tests might include patients, care-partners, health care providers, genome scientists, professional organizations, bioethicists, regulators from federal and state agencies, other legal experts, software designers, algorithm designers, clinical laboratory representatives, and device industry representatives. A collaborative community may help to define and specify governance challenges, and it may produce recommendations or other deliverables. The FDA has also proposed innovative, iterative, and adaptive approaches for governing FDA-regulated software.181 Congress has pushed the agency to adopt some aspects of new governance. For instance, the 21st Century Cures Act mandated that the agency obtain early patient input to inform some regulatory decisions.182
Many of this article's recommendations (below) have a distinctively new governance flavor. The authors are cognizant, however, that new governance has its critics, whose work often responds to problems that became apparent as agencies implemented new governance models. New governance may fail because its implementation costs are high. Agencies and other stakeholders must be willing to invest significant sums of money, time, and expertise to support robust participation of relevant stakeholders, to gather appropriate data for learning by doing, to provide appropriate feedback to regulated entities, and to carry out other new governance activities.Reference Lee183 And even with appropriate investments, critics argue that soft law favors people or entities with more power over people or entities with less power, and favors stakeholders with concentrated and easily articulated interests over stakeholders whose interests are more diffuse.Reference Alexander and Nejaime184 Others note that new governance undermines traditional notions of government accountability and that its proceduralism may compromise substantive norms of justice.Reference Sabel, Simon, de Búrca, Scott and Simon185
This article takes criticisms of new governance seriously, recognizing that some of its recommendations could be implemented in a suboptimal manner that likely would not improve the quality of genomic tests. For instance, members of a collaborative community or similar group contemplating quality standards for genomic tests would need to be transparent about their interests and biases, and the group would have to be composed in a manner that helped offset biases. The group's processes should be designed to diminish the effects of biases and power imbalances among the participants. This article does not purport to design an entirely new governance process for clinical genomic tests, but rather to make recommendations that are capable of implementation by agencies, professional organizations, or other governance nodes in the polycentric world of quality control for genomic testing. That our recommendations reflect, to a large extent, a new governance mindset means that past experience with the successes and failures of new governance could help guide their implementation. In any case, these agencies must act within the scope of their statutory authority and comply with the requirements of the Administrative Procedure Act.
A. Findings
1. Several barriers at times challenge laboratories' ability to deliver high quality genomic tests, including:
a. The clinical evidence base, while developing rapidly, is still incomplete.
b. Current testing methods are less accurate in some parts of the genome and for certain types of variants.
c. Interpretation of variants can be challenging, especially for individuals who are not of northern European ancestry due to a lack of diversity in individuals tested to date.
2. Quality is a joint effort involving laboratories, expert input (including from physicians and genetic counselors, laboratorians, and bioinformaticists), industry, professional societies, patient advocacy groups, patients themselves, and regulators. Cooperative and consensus-based approaches should be developed to advance quality (see, for example, the “Collaborative Community” concept being discussed by FDA).186
a. Professional and technical standards can advance quality when developed in a transparent, evidence-based, and rigorous fashion that includes all stakeholders and that includes and addresses the interests of patients.
3. Stakeholders, including patients, need certainty and clarity on:
a. The jurisdiction of each relevant oversight agency.
b. Requirements for demonstrating quality.
c. Regulatory processes.
4. The same activity should be regulated (or not regulated) in the same way.
a. Genetic and genomic tests can be developed either by the medical device industry (i.e., test kits) or by laboratories (i.e., LDTs), but, in the current scheme, are often regulated differently depending on this distinction.
b. Patients, physicians, and other stakeholders do not care whether a test is a laboratory developed test or a test kit or what type of entity developed or performed the test. Patients, physicians, and others do and should care about whether the test is reasonably safe and effective for its claimed indications, and they should be aware when a test is being used outside its appropriate indications in ways that may lead the results and their interpretations to be misleading or even inaccurate.
5. The Centers for Medicare & Medicaid Services (in its enforcement of CLIA), accreditation bodies such as the College of American Pathologists, and state regulators in CLIA-exempt states play a major role in overseeing laboratory operations to ensure analytic verification and validity but at times fail to update their regulations or guidance to take account of new laboratory methods and the complexity of modern testing technologies such as genomic testing.
6. The Food and Drug Administration regulates both analytic and clinical validity of tests in their intended uses, although the scope of its jurisdiction has been challenged by some.
a. With respect to test kits manufactured by third parties and sold to clinical laboratories, FDA's medical device requirements generally address analytical validity and clinical validity as set forth in the intended use of the test and may sometimes touch on clinical utility depending again on the intended use claimed by the manufacturer. The “indication-blind” nature of genomic testing — the fact that it generates vast amounts of data that could be put to any number of uses beyond the problem that led to testing in the first instance — poses a major challenge for FDA's traditional oversight processes.
b. The FDA traditionally has exercised enforcement discretion over LDTs, and its recent attempts to regulate these tests have been spotty and inconsistent. Its actions have been met with opposition from clinical laboratories and other stakeholders and have led to confusion and uncertainty on the part of stakeholders, including patients.
c. Regulation of software is essential to the quality of genome-scale tests and to interpreting the clinical significance of genomic test data, and it is a major challenge that FDA has only begun to address. It has been widely assumed — but is far from clear — that FDA is the appropriate regulator for all phases of the bioinformatics pipeline, some phases of which (e.g., variant interpretation) may be more in the nature of medical practice regulation than product regulation.
d. Clinical utility is generally addressed in the practice of medicine and in payer decision making. Although tests are not required to make claims about clinical utility as a condition for entering the market, FDA can regulate such claims if test developers choose to make them.
B. Recommendations
1. Relevant regulatory agencies need to develop and maintain expertise in genetic and genomic test development and implementation.
a. Cooperative and consensus-based standards should be developed to advance quality.
2. Regulators should have an efficient process for identifying, recognizing, and encouraging the adoption of such standards.
i. One of the FDA's current strategic priorities is to establish “collaborative communities” to work toward common objectives in device regulation.187 Such communities might serve as a nucleus or model for cooperative standards development as well as provide input to inform regulatory actions taken by the agencies.
3. CLIA should be modernized and harmonized with broader quality systems and modern terminology.
a. Relevant agencies and scientific stakeholders should work to generate more samples for proficiency testing.
b. Laboratories need to have robust preanalytical processes and standards, including robust purchasing controls, processes for collecting and processing samples, and corrective and preventive action processes.
c. Bioinformatics pipelines need to be reviewed appropriately.
d. CMS needs to develop a genetic and genomics testing specialty.
i. In the interim, CMS should designate which professional organization standards and guidelines, such as the CAP Molecular Pathology Checklist, are recognized as the most authoritative sources with respect to particular aspects of genomic testing.
4. FDA needs to continue to pursue improvements in several arenas.
a. Regulation needs to be risk-based, taking into account the characteristics of tests and the clinical context in which they will be used for their intended purpose.
b. FDA should work collaboratively with other oversight bodies, including state medical practice regulators and professional organizations, to discourage inappropriate uses of genomic tests for unintended purposes that may be unsubstantiated or unsafe.
c. Regulatory processes or standards need to be implemented to assess all phases of software development for use in genomic testing, and to ensure uniform, consistent oversight of this software regardless of whether it is embedded or stand-alone and whatever its provenance. These processes need to ensure that the strengths and weaknesses of software are transparent, both for regulators and user groups such as physicians and laboratorians who rely on software systems. This includes transparency about the limitations of software algorithms, transparent access to non-proprietary databases and useful descriptions of the limitations of proprietary databases on which the software relies, and transparent business practices that foster open discussion about software characteristics and performance.
i. Standards for developing and testing algorithms need to be developed.
ii. Several groups, including FDA, CDC, AMP, and ACMG have been working to develop standards or other regulatory and non-regulatory approaches to assess the validity of bioinformatics systems. Consensus approaches should be adopted, and software developers should be required to demonstrate their adherence.
iii. It would be helpful if scientists, regulatory professionals, and engineers involved in designing and marketing relevant algorithms and software encouraged their professional organizations to develop the expertise necessary to collaborate in proposing and implementing quality approaches.
d. To the extent that FDA chooses to rely on standards stated as special controls, the agency must develop strategies to ensure consideration of the interests of all stakeholders including patients in their development. FDA also needs to make sure that appropriate special controls are implemented in a timely fashion.
e. FDA needs to develop a pathway that relies more on post-market surveillance for rare diseases and breakthrough tests that meet an unmet clinical need or that represent a clinically significant advance over current technology. These tests should have analytical validity before being used clinically and while clinical validity is being confirmed in the post-market space. Using this approach requires application of sound methods for demonstrating clinical validity in the post-market context and a commitment by regulators to ensure that high quality post-market studies are conducted in a timely fashion.
5. The genomics community and other stakeholders need to continue to improve genetic and genomic variant databases and their use in variant interpretation.
6. Limitations of test accuracy and of interpretation of clinical validity need to be clearly communicated to clinicians and patients.
7. Payment systems should recognize and reward quality. Payers should use coverage with evidence development in areas where evidence is not yet sufficiently strong for optimal decision making.
8. CMS and FDA should work jointly to develop a public, searchable database for use by clinicians, patients, and other stakeholders that displays all information about the regulatory status of genomic tests and devices.
Conclusion
Ensuring that patients receive accurate results from genomic testing is challenging given the field's complexity and rapid evolution. Federal regulators are already actively working to make certain that laboratories take the steps necessary to deliver high quality results, but room for improvement remains at many points along the process. How best to oversee and measure the quality of laboratory diagnostic tests and how to meet the challenges of rare disease diagnostics are pressing questions. Another area of concern is clarifying informatics pipelines and developing strategies for validating algorithms critical to sequence assembly and analysis. Although more data are needed in many areas to inform decision making, regulators can require the collection of only some of these data and will have to rely on the actions of others.
Process will be critical. Regulators will need to be inclusive to ensure that the needs of all stakeholders, including patients, are addressed. Agencies will need to pursue an array of strategies beyond formal regulation, which will require a higher level of transparency. Delivering high quality test results is only the first step; if we are truly to reap the benefits of this knowledge, clinicians and patients need to be able to understand the limitations of current genomic knowledge and tests and to know how to use them.
Acknowledgments
Preparation of this article was supported by National Institutes of Health (NIH) grants R01 HG008605 as part of the project on “LawSeq: Building a Sound Legal Foundation for Translating Genomics into Clinical Application” and RM1 HG009034. The content is solely the responsibility of the authors and does not necessarily represent the views of the funders. We particularly thank our colleagues Kenny Beckman and Susan Berry for their helpful comments in discussion. Research assistance was provided by Emily Sachs, Hailey Verano, Jillian Heaviside, Margo Wilkinson, Ahsin Azim, and Kate Hanson. Coordination of the group's work was provided by the incomparable Audrey Boyle. All views expressed are those of the authors and not necessarily the funders or others who provided support and comments.