Oral fluids as a diagnostic specimen
The terms ‘saliva’ and ‘oral fluid’ are often used interchangeably to refer to fluid samples collected from the oral cavity (Kintz et al., Reference Kintz, Cirimele and Ludes2000; Wong, Reference Wong2006). More accurately, saliva is the fluid produced by salivary glands whereas oral fluid is a composite of saliva, serum transudate, mucosal cells and cellular debris, microorganisms, digestive enzymes, and food residues (Schramm et al., Reference Schramm, Smith and Craig1993; Crouch, Reference Crouch2005; Cone and Huestis, Reference Cone and Huestis2007). This review will use the term ‘oral fluid’ as defined by Atkinson et al. (Reference Atkinson, Dawes, Ericson, Fox, Gandara, Malamud, Mandel, Navazesh and Tabak1993): ‘The fluid obtained by insertion of absorptive collectors into the mouth’.
Although various sampling strategies are used for human beings, oral fluid samples in veterinary medicine are usually collected by introducing an absorbent material into the oral cavity (Palmer et al., Reference Palmer, Whipple and Waters2001; Shin et al., Reference Shin, Cho, Cho, Kang, Kim, Kim, Park and Park2004; Cavalcante et al., Reference Cavalcante, Muniz, Jia, Augusto, Troccoli, Medeiros, Dias, Switzer, Soares and Santos2018). Depending on the size of animals, oral fluid samples could be collected by allowing large animals and primates to chew on absorbent material, e.g. cotton rope, or swabbing oral and buccal cavities in small animals (Larghi et al., Reference Larghi, Nebel, Lazaro and Savy1975; Thomas et al., Reference Thomas, Champoux, Suomi and Gunnar1995; Lutz et al., Reference Lutz, Tiefenbacher, Jorgensen, Meyer and Novak2000; Shin et al., Reference Shin, Cho, Cho, Kang, Kim, Kim, Park and Park2004; Smith et al., Reference Smith, Gray, Moxley, Younts-Dahl, Blackford, Hinkley, Hungerford, Milton and Klopfenstein2004; Gomes-Keller et al., Reference Gomes-Keller, Gonczi, Tandon, Riondato, Hofmann-Lehmann, Meli and Lutz2006; Prickett et al., Reference Prickett, Kim, Simer, Yoon and Zimmerman2008; Dietze et al., Reference Dietze, Moritz, Alexandrov, Krstevski, Schlottau, Milovanovic, Hoffmann and Hoffmann2018; Cheng et al., Reference Cheng, Buckley, Van Geelen, Lager, Henao-Diaz, Poonsuk, Pineyro, Baum, Ji, Wang, Main, Zimmerman and Gimenez-Lirola2020).
The presence of viable viral pathogens, pathogen-specific antibody, and nucleic acids in oral fluids has been well-described (Sirisinha and Charupatana, Reference Sirisinha and Charupatana1970; Garrett, Reference Garrett1975; Archibald et al., Reference Archibald, Zon, Groopman, Mclane and Essex1986). In people, the presence of infectious viruses in oral fluid was first demonstrated by bioassay, e.g. clinical signs in cats and monkeys inoculated with oral fluids from humans with mumps (Wollstein, Reference Wollstein1918; Johnson and Goodpasture, Reference Johnson and Goodpasture1934; Henle et al., Reference Henle, Henle, Wendell and Rosenberg1948). Later, bioassay was used to confirm rabies infection in an infant by intracerebral inoculation of Swiss mouse pups with oral fluids from the child (Duffy et al., Reference Duffy, Woolley and Nolting1947). The detection of several viruses in oral fluids, including cytomegalovirus, human immunodeficiency virus (HIV) (Groopman et al., Reference Groopman, Salahuddin, Sarngadharan, Markham, Gonda, Sliski and Gallo1984), herpesviruses (Kaufman et al., Reference Kaufman, Brown and Ellison1967; Douglas and Couch, Reference Douglas and Couch1970), Zika virus (Bonaldo et al., Reference Bonaldo, Ribeiro, Lima, Dos Santos, Menezes, Da Cruz, De Mello, Furtado, De Moura, Damasceno, Da Silva, De Castro, Gerber, De Almeida, Lourenco-De-Oliveira, Vasconcelos and Brasil2016), and influenza virus (Vinagre et al., Reference Vinagre, Martinez, Avendano, Landaeta and Pinto2003), added further evidence for the role of oral fluids as a source of pathogens.
In animals, Coxsackie B-1 virus from rabbits (Madonia et al., Reference Madonia, Bahn and Calandra1966), rabies virus from dogs (Larghi et al., Reference Larghi, Nebel, Lazaro and Savy1975), foot-and-mouth disease virus (FMDV) from cattle (Sellers et al., Reference Sellers, Burrows, Mann and Dawe1968), and influenza A virus and porcine reproductive and respiratory syndrome virus (PRRSV) from pigs (Wills et al., Reference Wills, Zimmerman, Yoon, Swenson, Mcginley, Hill, Platt, Christopher-Hennings and Nelson1997; Detmer et al., Reference Detmer, Patnayak, Jiang, Gramer and Goyal2011) have been isolated from oral fluid specimens.
Statement of the problem
In both basic research and diagnostic medicine, the repeatability of quantitative polymerase chain reaction (qPCR) testing is affected by the variation introduced at any point between sample collection and the final test report (Heid et al., Reference Heid, Stevens, Livak and Williams1996; Klein, Reference Klein2002; Hoorfar et al., Reference Hoorfar, Malorny, Abdulmawjood, Cook, Wagner and Fach2004). Ideally, proper controls can be used to verify the integrity of the process accounting for variation. Internal controls that were extracted or amplified concurrently with test samples verify that the procedure was performed correctly and functioned within expected parameters. In addition, external positive amplification controls (template control) containing fixed quantities of purified PCR target nucleic acids may be used to identify run-to-run variation, e.g. concentration of reagents, qPCR profiles, instrument settings. In contrast, external negative amplification controls (non-template controls) are used to detect reagent contamination.
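The roles of the three control types described above can be sketched as a simple decision rule. This is an illustration only: the function names, the C q cutoff of 40, and the 1.5-cycle tolerance for the positive control are hypothetical placeholders, not values taken from any cited protocol.

```python
from typing import Optional

# All thresholds below are hypothetical placeholders, not validated cutoffs.
CQ_CUTOFF = 40.0          # assumed last cycle that still counts as a detection
POSITIVE_TOLERANCE = 1.5  # assumed allowed drift of the positive control, in cycles


def detected(cq: Optional[float]) -> bool:
    """A well counts as detected when it produced a Cq below the cutoff."""
    return cq is not None and cq < CQ_CUTOFF


def run_is_valid(ntc_cq: Optional[float],
                 positive_cq: Optional[float],
                 expected_positive_cq: float) -> bool:
    """Check the external controls of one PCR run."""
    if detected(ntc_cq):
        return False  # signal in the non-template control suggests contamination
    if not detected(positive_cq):
        return False  # the positive amplification control failed
    # Run-to-run drift: the positive control should stay near its expected Cq.
    return abs(positive_cq - expected_positive_cq) <= POSITIVE_TOLERANCE


def interpret_sample(target_cq: Optional[float],
                     internal_control_cq: Optional[float]) -> str:
    """Interpret one sample using its internal control."""
    if detected(target_cq):
        return "positive"
    if not detected(internal_control_cq):
        # No target and no internal control: extraction or amplification
        # may have failed, so a negative call cannot be supported.
        return "invalid"
    return "negative"


print(run_is_valid(None, 25.3, 25.0))   # True
print(interpret_sample(None, None))     # invalid
```

The point of the sketch is the ordering of the checks: a target-negative result is only reported as negative when the internal control shows that extraction and amplification worked for that sample.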
Internal controls are nucleic acids that are either inherent to the specimen matrix (endogenous reference genes) or added (‘spiked’) into test samples (exogenous reference genes) prior to nucleic acid extraction. Importantly, qPCR results can be ‘normalized’ in the context of internal control results to compensate for variation arising from the initial sample nucleic acid quantity and/or concentration, differences among reverse transcription and amplification efficiencies, assay protocols, and/or instrument settings (Vandesompele et al., Reference Vandesompele, De Preter, Pattyn, Poppe, Van Roy, De Paepe and Speleman2002; Bustin and Nolan, Reference Bustin and Nolan2004; Huggett et al., Reference Huggett, Dheda, Bustin and Zumla2005; Bustin et al., Reference Bustin, Benes, Garson, Hellemans, Huggett, Kubista, Mueller, Nolan, Pfaffl, Shipley, Vandesompele and Wittwer2009; Biassoni and Raso, Reference Biassoni and Raso2014). A number of qPCR normalization-compatible internal reference genes have been described for diagnostic matrices in human medicine, e.g. reticulocytes, keratinocytes, oral fluids, bronchoalveolar lavage fluids, tissue samples (Glare et al., Reference Glare, Divjak, Bailey and Walters2002; Silver et al., Reference Silver, Best, Jiang and Thein2006; Bar et al., Reference Bar, Bar and Lehmann2009; Chervoneva et al., Reference Chervoneva, Li, Schulz, Croker, Wilson, Waldman and Hyslop2010; Koppelkamm et al., Reference Koppelkamm, Vennemann, Fracasso, Lutz-Bonengel, Schmidt and Heinrich2010; Martin, Reference Martin2016). In contrast, the use of internal reference genes is less frequently reported in veterinary research, perhaps because of the diversity of specimens and animal species (McIntosh et al., Reference Mcintosh, Tumber, Harding, Krakowka, Ellis and Hill2009; Pol et al., Reference Pol, Deblanc, Oger, Le Dimna, Simon and Le Potier2013; Yan et al., Reference Yan, Toohey-Kurth, Crossley, Bai, Glaser, Tallmadge and Goodman2020). 
Therefore, the objective of this review is to compare the use of internal reference genes reported in recent human and veterinary qPCR research involving the oral fluid matrix.
Inherent variations in real-time PCR
Although real-time PCR has been used to precisely quantify molecular substances, the data should be interpreted with caution because of the introduction of variations throughout the process. PCR results are typically reported as quantitation cycles (C q), i.e. the number of cycles required for the cumulative fluorescent intensity to meet a pre-determined threshold (Schmittgen and Livak, Reference Schmittgen and Livak2008; Rao et al., Reference Rao, Huang, Zhou and Lin2013). In general terms, samples with a higher initial concentration of target DNA/RNA will require fewer PCR amplification cycles to reach the threshold than those with a lower initial concentration (Schmittgen and Livak, Reference Schmittgen and Livak2008). However, in the laboratory, the C q of any given sample may be affected by extraneous factors, e.g. technicians' proficiency, test protocols, reagents, PCR conditions, and instruments (Johnson et al., Reference Johnson, Nolan, Bustin and Wilks2013; Kralik and Ricchi, Reference Kralik and Ricchi2017). For example, a recent study concluded the process of collecting nasopharyngeal swabs was a significant source of variability and could produce false-negative results in a severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) real-time PCR assay (Basso et al., Reference Basso, Aita, Navaglia, Franchin, Fioretto, Moz, Bozzato, Zambon, Martin, Dal Pra, Crisanti and Plebani2020). To address the problem of variability introduced by extraneous factors, results can be expressed as the DNA/RNA copy number in the sample (absolute quantification) or expressed as the difference in target DNA/RNA (relative quantification) relative to known negative samples (Klein, Reference Klein2002; Schmittgen and Livak, Reference Schmittgen and Livak2008; Kralik and Ricchi, Reference Kralik and Ricchi2017).
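The inverse relationship between starting template and C q can be made concrete with a small calculation. The sketch below assumes an idealized reaction in which the amount of product is multiplied by a constant factor each cycle (2.0 for perfect doubling); the function name and the example efficiencies are illustrative assumptions.

```python
import math


def cq_shift(fold_difference: float, efficiency: float = 2.0) -> float:
    """Number of cycles by which Cq differs between two samples whose
    starting template differs by `fold_difference`.

    `efficiency` is the per-cycle amplification factor (2.0 = doubling).
    """
    if fold_difference <= 0 or efficiency <= 1:
        raise ValueError("fold_difference must be > 0 and efficiency > 1")
    return math.log(fold_difference) / math.log(efficiency)


# A 10-fold difference in template at perfect doubling:
print(round(cq_shift(10), 2))  # 3.32

# The same 10-fold difference spans more cycles when efficiency is lower,
# one reason raw Cq values are hard to compare across assays.
print(cq_shift(10, efficiency=1.9) > cq_shift(10))  # True
```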
Real-time PCR quantification
Absolute quantification converts a C q result to DNA/RNA copy number using either digital qPCR or absolute standard curves. Digital qPCR is done by partitioning a sample into subsamples and then performing qPCR separately on each subsample. Thereafter, the distribution and proportion of subsamples containing molecules of interest are used to estimate the number of DNA/RNA copies based on the Poisson distribution (Dube et al., Reference Dube, Qin and Ramakrishnan2008; Huggett et al., Reference Huggett, Foy, Benes, Emslie, Garson, Haynes, Hellemans, Kubista, Mueller and Nolan2013). Alternatively, absolute quantification based on standard curves uses the relationship between the sample C q and known concentrations of DNA/RNA to interpolate the concentration of target in the sample. Absolute standard curves are typically established by generating C q results of serially diluted standards with known copy numbers of target DNA/RNA.
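Both routes to absolute quantification reduce to short calculations. The sketch below uses hypothetical dilution-series data and hypothetical function names; it fits a standard curve by ordinary least squares and applies the usual Poisson correction for partitioned samples, and it is not the procedure of any specific instrument or software.

```python
import math


def fit_standard_curve(copies, cqs):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq, slope, intercept):
    """Interpolate the copy number of an unknown from its Cq."""
    return 10 ** ((cq - intercept) / slope)


def amplification_factor(slope):
    """Per-cycle amplification factor implied by the slope (2.0 = doubling)."""
    return 10 ** (-1.0 / slope)


def dpcr_copies(positive_partitions, total_partitions):
    """Estimate total copies from the fraction of positive partitions.

    Assumes copies are Poisson-distributed across partitions, so the mean
    copies per partition is -ln(1 - fraction_positive).
    """
    if not 0 <= positive_partitions < total_partitions:
        raise ValueError("need 0 <= positive < total partitions")
    mean_per_partition = -math.log(1 - positive_partitions / total_partitions)
    return mean_per_partition * total_partitions


# Hypothetical 10-fold dilution series (copies per reaction) and Cq values.
standards = [1e6, 1e5, 1e4, 1e3, 1e2]
standard_cqs = [15.0, 18.3, 21.6, 24.9, 28.2]
slope, intercept = fit_standard_curve(standards, standard_cqs)
print(round(slope, 2), round(intercept, 2))               # -3.3 34.8
print(round(copies_from_cq(21.6, slope, intercept)))      # 10000
```

The Poisson correction matters because a positive partition may have contained more than one copy; counting positive partitions directly would underestimate the total at high template concentrations.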
However, identifying the change of targets in unknown samples relative to negative calibrators may be sufficient for disease surveillance and diagnostic medicine (Livak and Schmittgen, Reference Livak and Schmittgen2001). Relative quantification of qPCR data may be achieved through two approaches: the relative standard curve and the comparative C q (Liu and Saint, Reference Liu and Saint2002). Relative standard curves use methods similar to absolute standard curves except the standards do not have known DNA/RNA copy numbers. Instead, relative standard curves describe the relationship between C q values and the mass of total DNA/RNA for each dilution. The sample C q result can then be interpreted in the context of the relative standard curve. Because both absolute and relative quantification require that standard curves for targets and references be generated in each PCR run to account for run-to-run variation, a comparative C q method (ΔΔC q, pronounced ‘double delta C q’), has been used in gene expression studies (Livak and Schmittgen, Reference Livak and Schmittgen2001; Pfaffl, Reference Pfaffl2001). This method quantifies the expression of a target gene in a treated sample relative to an untreated calibrator in terms of the fold change in gene expression (Rao et al., Reference Rao, Huang, Zhou and Lin2013). Conveniently, the treated sample and untreated calibrator can be collected at different time points, may be derived from different tissues, or obtained from individuals in different treatment groups (Rao et al., Reference Rao, Huang, Zhou and Lin2013). Unlike standard curve methods, the comparative C q method eliminates the need to generate standard curves in each PCR run and, therefore, may be used in high-throughput molecular laboratories performing routine disease diagnostic and surveillance testing.
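As a minimal sketch of the comparative C q calculation, the code below computes ΔΔC q and the corresponding fold change. It relies on the standard simplifying assumption of this method that target and reference both amplify at close to perfect doubling; the function names and C q values are hypothetical.

```python
def delta_delta_cq(target_sample, ref_sample, target_calibrator, ref_calibrator):
    """Return ddCq = (target - reference) in the sample
    minus (target - reference) in the calibrator."""
    d_cq_sample = target_sample - ref_sample
    d_cq_calibrator = target_calibrator - ref_calibrator
    return d_cq_sample - d_cq_calibrator


def fold_change(dd_cq):
    """Fold change of the sample relative to the calibrator, assuming
    the product doubles every cycle for both target and reference."""
    return 2.0 ** (-dd_cq)


# Hypothetical Cq values: the target appears 3 cycles earlier in the treated
# sample than in the calibrator, while the reference gene is unchanged.
dd = delta_delta_cq(target_sample=24.0, ref_sample=18.0,
                    target_calibrator=27.0, ref_calibrator=18.0)
print(dd)               # -3.0
print(fold_change(dd))  # 8.0
```

No standard curve appears anywhere in the calculation, which is what makes the method attractive for routine testing; the cost is the efficiency assumption built into the base of 2.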
Real-time PCR data normalization
Data normalization is a statistical procedure designed to control variations introduced in the sampling/testing process and to ensure that results are comparable within and between laboratories (Bylesjö et al., Reference Bylesjö, Cloarec, Rantalainen, Brown, Tauler and Walczak2009; Biassoni and Raso, Reference Biassoni and Raso2014; Filzmoser and Walczak, Reference Filzmoser and Walczak2014). For example, Dahdouh et al. (Reference Dahdouh, Lazaro-Perona, Romero-Gomez, Mingorance and Garcia-Rodriguez2020) contended that direct estimation of SARS-CoV-2 viral load based on raw C qs could neglect variation introduced during the sample collection process, e.g. patient tolerance to nasal swabbing, and concluded that the normalization of raw C qs against marker nucleic acid genes inherent to sampled cell masses or mucosal surfaces should be implemented to ensure the comparability of coronavirus disease 2019 (COVID-19) qPCRs (Dahdouh et al., Reference Dahdouh, Lazaro-Perona, Romero-Gomez, Mingorance and Garcia-Rodriguez2020; Walsh et al., Reference Walsh, Jordan, Clyne, Rohde, Drummond, Byrne, Ahern, Carty, O'brien, O'murchu, O'neill, Smith, Ryan and Harrington2020).
Three methods commonly used for PCR normalization include consistently testing the same amount (mass) of sample, measuring total RNA/DNA, or using endogenous/exogenous reference genes (Huggett et al., Reference Huggett, Dheda, Bustin and Zumla2005):
(1) Testing the same amount of sample is standard practice in molecular and diagnostic laboratories that use standardized protocols, although the concentration of detectable target in clinical samples is still affected by sample collection, storage, and handling and, therefore, may not fully represent the initial concentration.
(2) Normalizing qPCR results against the total RNA/DNA content in sample extracts, i.e. prior to PCR, is a more precise approach for controlling sample-to-sample variation (Wang et al., Reference Wang, Zhao, Li, Liu, Ernst, Liu, Liu, Xi and Lei2015). Quantification of total RNA/DNA can be achieved by spectrophotometrically measuring the optical absorbance (OD260) or the fluorescence of dyes that are randomly bound to nucleic acids of the extracted sample (Jones et al., Reference Jones, Yue, Cheung and Singer1998; Green and Sambrook, Reference Green, Sambrook, Green and Sambrook2018). However, using total DNA/RNA for data normalization assumes that the efficiency of reverse-transcription and PCR amplification is identical for each sample, i.e. does not take sample-to-sample variation into account (Bustin, Reference Bustin2002).
(3) The most common approach for qPCR data normalization is to express the C q of target DNA/RNA in the context of the C q of one or more reference genes (Wittwer et al., Reference Wittwer, Herrmann, Moss and Rasmussen1997). To serve this purpose, reference genes must have genetic sequences that differ from the target and be present at predictable concentrations in the sample (Vandesompele et al., Reference Vandesompele, De Preter, Pattyn, Poppe, Van Roy, De Paepe and Speleman2002; Huggett et al., Reference Huggett, Dheda, Bustin and Zumla2005; Bylesjö et al., Reference Bylesjö, Cloarec, Rantalainen, Brown, Tauler and Walczak2009; Guenin et al., Reference Guenin, Mauriat, Pelloux, Van Wuytswinkel, Bellini and Gutierrez2009). Pfaffl (Reference Pfaffl2001) proposed an approach that integrated data normalization and qPCR relative quantification using test sample and negative calibrator results (Equation 1). This method calculates the target-to-reference ratio (R) of the C q difference between a sample and a calibrator (ΔC q) while taking PCR amplification efficiencies for target (E target) and reference (E ref) sequences into account (Pfaffl, Reference Pfaffl2001):
$$R = \frac{(E_{\rm target})^{\Delta C_{{\rm q},\,{\rm target}}\,({\rm calibrator}-{\rm sample})}}{(E_{\rm ref})^{\Delta C_{{\rm q},\,{\rm ref}}\,({\rm calibrator}-{\rm sample})}} \qquad (1)$$
In gene expression studies, samples collected from individuals with no treatment, or prior to treatment, may be used as negative calibrators and/or as a baseline relative to the expression/detection level of target genes in samples from treated individuals. Therefore, the relative quantity of a target gene in a treated sample is expressed as the fold change relative to an untreated calibrator, using a reference gene as a normalizer (Rao et al., Reference Rao, Huang, Zhou and Lin2013).
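Equation 1 can be evaluated directly, as in the sketch below. The function name, argument names, and C q values are hypothetical, and efficiencies are expressed as per-cycle amplification factors (2.0 for perfect doubling).

```python
def pfaffl_ratio(e_target, e_ref,
                 cq_target_calibrator, cq_target_sample,
                 cq_ref_calibrator, cq_ref_sample):
    """Efficiency-corrected target-to-reference ratio (Equation 1).

    Each exponent is a Cq difference taken as calibrator minus sample.
    """
    d_cq_target = cq_target_calibrator - cq_target_sample
    d_cq_ref = cq_ref_calibrator - cq_ref_sample
    return (e_target ** d_cq_target) / (e_ref ** d_cq_ref)


# With both efficiencies equal to 2.0 and an unchanged reference gene,
# a 3-cycle earlier target Cq corresponds to an 8-fold increase.
print(pfaffl_ratio(2.0, 2.0, 27.0, 24.0, 18.0, 18.0))  # 8.0

# The same Cq values with a less efficient target assay (hypothetical 1.9)
# give a smaller ratio, which is the correction the method provides.
print(pfaffl_ratio(1.9, 2.0, 27.0, 24.0, 18.0, 18.0) < 8.0)  # True
```

When both efficiencies are set to 2.0, the ratio equals 2 raised to the power of minus ΔΔC q, so the comparative C q method is the special case of Equation 1 with perfect doubling.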
Both exogenous and endogenous reference genes have been used for data normalization at the individual sample level (Ke et al., Reference Ke, Chen and Yung2000). Exogenous reference genes are artificially synthesized nucleic acids with genetic sequences distinct from the target's (Huggett et al., Reference Huggett, Dheda, Bustin and Zumla2005). These heterologous genes may be spiked into test samples prior to the DNA/RNA extraction procedure at a fixed copy number or concentration (Yan et al., Reference Yan, Toohey-Kurth, Crossley, Bai, Glaser, Tallmadge and Goodman2020) to monitor the efficiency of DNA/RNA extraction and the integrity of reverse transcription and PCR amplification in test samples (Guenin et al., Reference Guenin, Mauriat, Pelloux, Van Wuytswinkel, Bellini and Gutierrez2009; Johnston et al., Reference Johnston, Gallaher and Czaja2012). In contrast, endogenous reference genes are host-specific nucleic acids inherent to the specimen (Yan et al., Reference Yan, Toohey-Kurth, Crossley, Bai, Glaser, Tallmadge and Goodman2020). Since endogenous reference genes are processed concurrently with target DNA/RNA, the detection of these genes reflects both the sample-to-sample variation in the quantity and quality of initial amplifiable DNA/RNA and the variation introduced by the extraction and amplification procedures (Radonic et al., Reference Radonic, Thulke, Mackay, Landt, Siegert and Nitsche2004).
Internal reference genes in oral fluids
Endogenous reference genes have been widely used in gene expression analyses for the purpose of representing sample nucleic acid concentration and as the gold standard for qPCR data normalization (Vandesompele et al., Reference Vandesompele, De Preter, Pattyn, Poppe, Van Roy, De Paepe and Speleman2002; Bustin et al., Reference Bustin, Benes, Nolan and Pfaffl2005; Huggett et al., Reference Huggett, Dheda, Bustin and Zumla2005). However, the expression of common reference genes depends on a variety of factors, e.g. cell/specimen types, sample quality and handling, age of subjects, animal species, and disease/treatment status (Zhong and Simons, Reference Zhong and Simons1999; Hamalainen et al., Reference Hamalainen, Tubman, Vikman, Kyrola, Ylikoski, Warrington and Lahesmaa2001; Selvey et al., Reference Selvey, Thompson, Matthaei, Lea, Irving and Griffiths2001; Deindl et al., Reference Deindl, Boengler, Van Royen and Schaper2002; Glare et al., Reference Glare, Divjak, Bailey and Walters2002). Thus, endogenous reference genes must be validated for their consistency of expression and/or detection in test specimens and under the conditions in which target genes will be evaluated (Mestdagh et al., Reference Mestdagh, Van Vlierberghe, De Weer, Muth, Westermann, Speleman and Vandesompele2009). Typically, this involves comparing the variation in endogenous gene C qs in samples from subjects with potentially impactful biological characteristics, e.g. age, gender, and disease status (Huggett et al., Reference Huggett, Dheda, Bustin and Zumla2005; Robinson et al., Reference Robinson, Sutherland and Sutherland2007; Becker et al., Reference Becker, Hammerle-Fickinger, Riedmaier and Pfaffl2010).
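One simple way to begin the validation described above is to summarize reference-gene C q values by subject group and look at how far the group means drift apart. The sketch below is illustrative only: the group labels and C q values are invented, the function names are hypothetical, and no acceptance threshold is implied.

```python
from statistics import mean, stdev


def summarize_reference_cq(cq_by_group):
    """Return {group: (mean Cq, standard deviation)} for a reference gene."""
    return {group: (mean(cqs), stdev(cqs)) for group, cqs in cq_by_group.items()}


def max_group_mean_shift(cq_by_group):
    """Largest difference between group mean Cq values, in cycles."""
    means = [mean(cqs) for cqs in cq_by_group.values()]
    return max(means) - min(means)


# Invented Cq values for one candidate reference gene in two groups.
reference_cq = {
    "healthy": [20.1, 19.8, 20.3],
    "infected": [22.0, 21.7, 22.4],
}

for group, (avg, sd) in summarize_reference_cq(reference_cq).items():
    print(f"{group}: mean Cq {avg:.2f}, SD {sd:.2f}")

shift = max_group_mean_shift(reference_cq)
print(f"shift between group means: {shift:.2f} cycles")
```

In this invented example the group means differ by about two cycles, which at perfect doubling would correspond to roughly a fourfold difference in apparent reference-gene abundance. A reference gene behaving this way would bias any target result normalized against it.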
In this review, qPCR-based gene expression and disease diagnostic studies were evaluated for the use of endogenous and/or exogenous reference standards in oral fluid specimens from non-human vertebrate and human subjects. Initially, the MEDLINE® database was searched (title and abstract fields) on 24 October 2020 for refereed scientific publications containing the following search terms: (‘saliva*’ or ‘oral fluid*’ or ‘oral swab*’) and (‘qpcr*’ or ‘quantitative pcr*’ or ‘real time pcr*’ or ‘real-time pcr*’ or ‘realtime pcr’ or ‘RT-qPCR’ or ‘qRT-PCR’ or ‘real time RT-PCR’ or ‘real-time RT-PCR’ or ‘realtime RT-PCR’) not (review[Publication Type]). Articles were excluded if not written in English, if not applicable to non-human vertebrate animals, if the oral fluid specimen was not collected by insertion of an absorptive collector into the mouth (Atkinson et al., Reference Atkinson, Dawes, Ericson, Fox, Gandara, Malamud, Mandel, Navazesh and Tabak1993), or if only components of oral fluids, e.g. microorganisms, biofilms, salivary extracellular vesicles, were evaluated. The remaining publications were evaluated for the use of internal endogenous and/or exogenous reference genes. A total of 1566 articles were retrieved from MEDLINE®. For the period 2003–2020, 136 met the language, research subject, and full-text criteria (Table 1). Among these, exogenous reference genes were used in 25.7% (35/136) of the studies, endogenous reference genes were used in 27.2% (37/136), and 52.9% (72/136) did not include sufficient information on the use of internal reference genes.
Notes to Table 1: ACTB, β-actin; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; rRNAs, 5S, 18S, 28S ribosomal RNAs.
a Buffalo, cattle, deer, goat, and sheep.
b Chipmunk, gerbil, guinea pig, mongoose, mouse, rabbit, rat, shrew, squirrel, and vole.
c Bat, elephant, horse, opossum, poultry, skunk, turtle, and weasel.
d Endogenous reference genes with frequency < 5 pooled.
A similar strategy was used to retrieve oral fluid-based qPCR studies on human subjects from the MEDLINE® database for the articles published between 2016 and 2020. Among the 772 articles retrieved, 184 met the language, species, and content criteria (Table 1).
Exogenous reference genes were used in 14.0% of reviewed studies (25/179), endogenous reference genes in 31.8% (57/179), and 57.0% (102/179) of the studies did not include sufficient information on the use of internal reference genes. As shown in Table 1, β-actin (ACTB) mRNA, ribosomal RNAs (18S and 28S rRNA), and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) mRNA were the most frequently used endogenous reference genes in published studies on non-human vertebrates. In human studies, ACTB mRNA, GAPDH mRNA, and U6 small nuclear RNA (snRNA) were the most commonly reported.
Ribosomal RNAs
In mammalian cells, gene expression begins by transcribing DNA into single-stranded messenger RNA (mRNA) in the cell nucleus. From the nucleus, mRNA migrates to the cytoplasm where it is translated into protein by ribosomes (Sergiev et al., Reference Sergiev, Aleksashin, Chugunova, Polikanov and Dontsova2018). Ribosomes comprise two subunits, the small subunit containing the 18S ribosomal RNA (rRNA) and ribosomal proteins, and the large subunit containing 5S, 5.8S, 28S rRNA, and ribosomal proteins (Lafontaine and Tollervey, Reference Lafontaine and Tollervey2001). Since ribosomal genetic material is highly conserved and nearly universal in cell-rich specimens, e.g. cell culture, peripheral blood mononuclear cells, and tissue samples, rRNA is one of the most commonly used internal reference genes for qPCR normalization in gene-expression research (Kozera and Rapacz, Reference Kozera and Rapacz2013; Ban et al., Reference Ban, Beckmann, Cate, Dinman, Dragon, Ellis, Lafontaine, Lindahl, Liljas, Lipton, Mcalear, Moore, Noller, Ortega, Panse, Ramakrishnan, Spahn, Steitz, Tchorzewski, Tollervey, Warren, Williamson, Wilson, Yonath and Yusupov2014). Because the 18S and 28S rRNAs are cleaved from the same single-stranded RNA transcript, the 28S:18S rRNA ratio has been used as an index of the integrity and quality of extracted RNA for electrophoresis-based PCR (Schroeder et al., Reference Schroeder, Mueller, Stocker, Salowsky, Leiber, Gassmann, Lightfoot, Menzel, Granzow and Ragg2006; Becker et al., Reference Becker, Hammerle-Fickinger, Riedmaier and Pfaffl2010). For example, De Ketelaere et al. (Reference De Ketelaere, Goossens, Peelman and Burvenich2006) and Zhao et al. (Reference Zhao, Liu, Li, Yang, Zhao, Liu, Liu, Liu, Yin, Guan and Luo2016) reported 18S rRNA as one of the most consistently expressed genes in bovine polymorphonuclear leukocytes and peripheral blood mononuclear cells (De Ketelaere et al., Reference De Ketelaere, Goossens, Peelman and Burvenich2006; Zhao et al., Reference Zhao, Liu, Li, Yang, Zhao, Liu, Liu, Liu, Yin, Guan and Luo2016). Zhong and Simons (Reference Zhong and Simons1999) concluded that the expression level of 28S rRNA was more consistent in hypoxia-cultured cells than ACTB mRNA, GAPDH mRNA, and cyclophilin mRNA (Zhong and Simons, Reference Zhong and Simons1999; Wang and Heitman, Reference Wang and Heitman2005).
However, the use of rRNAs as internal reference genes in qPCR has several shortcomings. First, their quantity and concentration can vary within specimens from the same species (Ingerslev et al., Reference Ingerslev, Pettersen, Jakobsen, Petersen and Wergeland2006; Rekawiecki et al., Reference Rekawiecki, Kowalik and Kotwica2013). Second, ribosomes are absent from red blood cells, so rRNA detection can be inconsistent in specimens in which blood is a significant component. Finally, rRNAs may be overabundant in cell-rich specimens, e.g. peripheral blood mononuclear cells, tissue, and laboratory-cultured cells (Tong et al., Reference Tong, Gao, Wang, Zhou and Zhang2009). As a consequence, when reverse-transcribed and/or amplified simultaneously with the target, they may compete with the target for PCR components, e.g. polymerase, magnesium ions, and dNTPs. Furthermore, an overabundance of rRNA increases the risk of cross-contamination during sample handling and testing (Yan et al., Reference Yan, Toohey-Kurth, Crossley, Bai, Glaser, Tallmadge and Goodman2020).
ACTB mRNA
β-Actin, encoded by ACTB mRNA, is an isoform of non-muscle actin protein that primarily serves as a component of the cytoskeleton of eukaryotic cells (Bunnell et al., Reference Bunnell, Burbach, Shimizu and Ervasti2011). ACTB has been used for sample quality assessment and qPCR normalization because of its ubiquitous expression in cells (Hunter and Garrels, Reference Hunter and Garrels1977; Biederman et al., Reference Biederman, Yee and Cortes2004; Johansson et al., Reference Johansson, Fuchs, Okvist, Karimi, Harper, Garrick, Sheedy, Hurd, Bakalkin and Ekstrom2007; Robinson et al., Reference Robinson, Sutherland and Sutherland2007; Ruan and Lai, Reference Ruan and Lai2007; Bar et al., Reference Bar, Bar and Lehmann2009; Die et al., Reference Die, Baldwin, Rowland, Li, Oh, Li, Connor and Ranilla2017), but recent studies have found the expression level of ACTB to vary by animal species, cell and/or specimen type, sample storage time, growth stage, medical treatment, and disease state (Gutala and Reddy, Reference Gutala and Reddy2004; Nishimura et al., Reference Nishimura, Nikawa, Kawano, Nakayama and Ikeda2008; Spalenza et al., Reference Spalenza, Girolami, Bevilacqua, Riondato, Rasero, Nebbia, Sacchi and Martin2011; Panahi et al., Reference Panahi, Salasar Moghaddam, Ghasemi, Hadi Jafari, Shervin Badv, Eskandari and Pedram2016; Khanna et al., Reference Khanna, Johnson and Maron2017; Alshehhi and Haddrill, Reference Alshehhi and Haddrill2019). For example, in human beings, lower expression of ACTB was reported in bronchoalveolar lavage fluid cells and airway endobronchial biopsy samples from asthmatic patients versus clinically normal subjects or subjects treated with inhaled corticosteroids (Glare et al., Reference Glare, Divjak, Bailey and Walters2002). Hamalainen et al. (Reference Hamalainen, Tubman, Vikman, Kyrola, Ylikoski, Warrington and Lahesmaa2001) reported up to 11-fold down-regulation of ACTB expression in T-cells over a 14-day course of T-cell differentiation (Hamalainen et al., Reference Hamalainen, Tubman, Vikman, Kyrola, Ylikoski, Warrington and Lahesmaa2001). In a qPCR reference gene validation study using peripheral blood mononuclear cells and whole blood from healthy and tuberculosis-positive subjects, ACTB showed >30-fold variability in both specimens and was determined to be unsuitable for data normalization (Dheda et al., Reference Dheda, Huggett, Bustin, Johnson, Rook and Zumla2004).
In veterinary studies, the expression of ACTB depends on a number of factors and its use as an internal reference standard requires assessment on a case-by-case basis. Stable expression of ACTB has been reported in feline tissue samples and bovine peripheral blood mononuclear cells (Ingerslev et al., Reference Ingerslev, Pettersen, Jakobsen, Petersen and Wergeland2006; Robinson et al., Reference Robinson, Sutherland and Sutherland2007; Kessler et al., Reference Kessler, Helfer-Hungerbuehler, Cattori, Meli, Zellweger, Ossent, Riond, Reusch, Lutz and Hofmann-Lehmann2009; Jursza et al., Reference Jursza, Skarzynski and Siemieniuch2014), but the expression of ACTB has been reported as low in bovine polymorphonuclear leukocytes (De Ketelaere et al., Reference De Ketelaere, Goossens, Peelman and Burvenich2006). In a study evaluating the expression of 11 housekeeping genes in canine tissue specimens, including bone marrow, various enteric tissues, heart, muscle, pancreas, and spleen, ACTB was found to be the least consistently expressed (Peters et al., Reference Peters, Peeters, Helps and Day2007). Thus, although ACTB has been widely used for data normalization in qPCR studies, care should be taken to validate its consistency of expression in the target species and specimen.
GAPDH mRNA
Encoded by GAPDH mRNA, GAPDH is a cytoplasmic enzyme that facilitates glycolysis, a metabolic pathway to release energy, by converting glyceraldehyde-3-phosphate to 1,3-bisphosphoglycerate (Tristan et al., Reference Tristan, Shahani, Sedlak and Sawa2011; Nicholls et al., Reference Nicholls, Li and Liu2012; Alfarouk et al., Reference Alfarouk, Verduzco, Rauch, Muddathir, Adil, Elhassan, Ibrahim, David Polo Orozco, Cardone, Reshkin and Harguindey2014). The ubiquitous expression of GAPDH mRNA in living cells has led to its common use as an endogenous reference control for qPCR normalization in gene expression and disease diagnostic studies (Rebouças et al., Reference Rebouças, Costa, Passos, Passos, Hurk and Silva2013). However, like rRNAs and ACTB mRNAs, the expression of GAPDH mRNA may vary among subjects and treatments. Consistent GAPDH mRNA expression has been reported in oral fluid specimens from premature human neonates, human cervical tissues, and neonatal cardiac ventricular myocytes (Winer et al., Reference Winer, Jung, Shackel and Williams1999; Shen et al., Reference Shen, Li, Ye, Wang, Lu and Xie2010; Maron et al., Reference Maron, Johnson, Dietz, Chen and Bianchi2012). However, inconsistent expression of GAPDH mRNA has been reported under a number of experimental conditions, e.g. growing collateral arteries of rabbits, asthmatic human subjects with/without corticosteroid treatment, cells cultured under hypoxic conditions, and whole blood from tuberculosis patients (Zhong and Simons, Reference Zhong and Simons1999; Deindl et al., Reference Deindl, Boengler, Van Royen and Schaper2002; Glare et al., Reference Glare, Divjak, Bailey and Walters2002; Dheda et al., Reference Dheda, Huggett, Bustin, Johnson, Rook and Zumla2004). Barber et al. (Reference Barber, Harmer, Coleman and Clark2005) reported up to a 15-fold difference in the expression level of GAPDH mRNA across 72 human tissues (Barber et al., Reference Barber, Harmer, Coleman and Clark2005).
Therefore, GAPDH mRNA may not be the appropriate endogenous reference control for the comparison of qPCR results across specimen matrices.
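To put a fold difference of this size into qPCR terms, the following sketch converts a fold difference in template abundance into the corresponding shift in quantification cycle (Cq). It assumes an idealized amplification efficiency of 100% (one doubling per cycle); the function name and the efficiency parameter are illustrative and are not taken from the cited studies.

```python
import math


def fold_to_delta_cq(fold_difference, efficiency=1.0):
    """Return the Cq shift that corresponds to a fold difference in template.

    efficiency is the per-cycle amplification efficiency (1.0 = perfect
    doubling, an idealized assumption; real assays are usually lower).
    """
    if fold_difference <= 0:
        raise ValueError("fold_difference must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")
    amplification_base = 1.0 + efficiency  # template multiplies by this each cycle
    return math.log(fold_difference) / math.log(amplification_base)


# A 15-fold difference in reference-gene abundance, under perfect doubling
delta_cq = fold_to_delta_cq(15)
print(round(delta_cq, 2))  # prints 3.91
```

Under these assumptions, a 15-fold difference in the reference gene alone would shift its Cq by nearly four cycles, a shift that would carry over directly into any target result normalized against it.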
U6 snRNA
After DNA transcription, RNA transcripts undergo modification to become functional mRNAs able to direct protein synthesis (Moore and Proudfoot, Reference Moore and Proudfoot2009). This pre-mRNA processing involves (1) removing introns from pre-mRNAs (splicing); (2) adding a modified guanine nucleotide at the 5′ end (5′ capping); and (3) adding a long chain of adenine nucleotides at the 3′ end (3′ poly-A tailing). In mammalian cells, U1, U2, U4, U5, and U6 snRNAs complex with RNA-binding proteins to form small nuclear ribonucleoproteins able to perform the splicing activity required to functionalize mRNA (Maniatis and Reed, Reference Maniatis and Reed1987; Brow and Guthrie, Reference Brow and Guthrie1988; Stefl et al., Reference Stefl, Skrisovska and Allain2005). Among these five snRNAs, U6 snRNA was the most conserved in size, sequence, and structure across yeast, bean, fly, and mammalian cells (Brow and Guthrie, Reference Brow and Guthrie1988). Because of its small size (~100 nucleotides), U6 snRNA has been used to research the expression of microRNAs, a group of small single-stranded RNAs known for silencing and interfering with mRNA expression in plants, animals, and viruses (Bushati and Cohen, Reference Bushati and Cohen2007; Mase et al., Reference Mase, Grasso, Avogaro, D'amato, Tessarolo, Graffigna, Denti and Ravelli2017; Didychuk et al., Reference Didychuk, Butcher and Brow2018). For human samples, U6 snRNA has been used as an internal reference gene for the study of microRNA expression in human urinary sediment and serum samples from colorectal adenoma, colorectal adenocarcinoma, and healthy human subjects (Zheng et al., Reference Zheng, Wang, Zhang, Yang, Wang, Du, Li, Li, Qu, Liu and Wang2013; Duan et al., Reference Duan, Cai, Li, Bu, Wang, Yin and Chen2018). However, as observed for other endogenous reference genes, the expression level of U6 snRNA varies among specimens and treatments. For example, variation in U6 snRNA expression has been reported in 13 normal and 5 tumorous tissues including colon, esophagus, lung, lymphoid, and prostate (Peltier and Latham, Reference Peltier and Latham2008). Lou et al. (Reference Lou, Ma, Xu, Jiang, Yang, Wang, Jiao and Gao2015) evaluated the expression of U6 snRNA in normal and carcinomatous tissues and showed higher levels of U6 snRNA in carcinoma tissues of human breast, liver, and intrahepatic bile ducts compared to normal adjacent tissues. Therefore, the constancy of U6 snRNA expression should be ascertained prior to implementing its use as an endogenous reference control for qPCR normalization.
Exogenous reference genes
The use of exogenous mRNAs or DNAs added (‘spiked’) to specimens is well-described for qPCR normalization (Johnston et al., Reference Johnston, Gallaher and Czaja2012). Exogenous genes are often artificially synthesized and simultaneously detected by primers and probes distinct from those designed for the target genes. Unlike endogenous reference genes, they reflect variation in nucleic acid extraction and qPCR amplification procedures, but not in sample collection and handling. For diagnostic qPCRs, exogenous reference genes provide the advantage of consistency, i.e. they avoid the variation reported for endogenous genes and, therefore, may be more reliable normalizers than endogenous genes. However, their use in gene expression research is limited because they do not provide a baseline for the comparison of treated and untreated subjects. Among the animal qPCR publications reviewed, internal positive controls included in commercial qPCR assays were the most frequently used, while heterologous genes, e.g. algal and enhanced green fluorescent protein genes, were described as well (Hoffmann et al., Reference Hoffmann, Depner, Schirrmeier and Beer2006; Henderson et al., Reference Henderson, Perkins, Havens, Kelly, Francis, Dole and Shek2013).
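Because a spiked exogenous gene passes through extraction and amplification with every sample, its Cq can serve as a per-sample process check. The sketch below is one illustrative way to do this, not a method described in the cited publications: it flags samples in which the spike-in was not detected or was detected markedly later than the run median. The function name, the sample identifiers, the Cq values, and the 3-cycle threshold are all hypothetical; a laboratory would need to establish its own acceptance limits through validation.

```python
from statistics import median


def flag_spike_in_failures(spike_cq_by_sample, max_delay=3.0):
    """Return sorted sample IDs whose exogenous-control Cq is missing or late.

    spike_cq_by_sample maps sample ID -> spike-in Cq, or None if undetected.
    max_delay is a hypothetical limit (in cycles) above the run median.
    """
    detected = [cq for cq in spike_cq_by_sample.values() if cq is not None]
    if not detected:
        # Spike-in failed everywhere: every sample in the run is suspect
        return sorted(spike_cq_by_sample)
    run_median = median(detected)
    flagged = [
        sample_id
        for sample_id, cq in spike_cq_by_sample.items()
        if cq is None or cq - run_median > max_delay
    ]
    return sorted(flagged)


# Hypothetical run: OF-03 is delayed (possible inhibition or poor extraction)
# and OF-04 shows no spike-in signal at all
run = {"OF-01": 28.1, "OF-02": 27.9, "OF-03": 33.5, "OF-04": None, "OF-05": 28.3}
print(flag_spike_in_failures(run))  # prints ['OF-03', 'OF-04']
```

A negative target result from a flagged sample would then be treated as inconclusive rather than negative, since the extraction or amplification step may have failed.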
Use of endogenous and exogenous reference genes in routine oral fluid diagnostics
Exploration of the diagnostic use of PCR technologies for the detection of pathogen-specific nucleic acids in human oral fluids began in the 1990s (Mandel, Reference Mandel1993; Streckfus and Bigler, Reference Streckfus and Bigler2002) and early successes included Epstein–Barr virus, human herpesvirus type 6, HIV, human cytomegalovirus, and human papillomavirus (Goto et al., Reference Goto, Yeh, Notkins and Prabhakar1991; Saito et al., Reference Saito, Nishimura, Kudo, Fox and Moro1991; Garweg et al., Reference Garweg, Fenner, Bohnke and Schmitz1993; Tominaga et al., Reference Tominaga, FUKUSHIMA, Nishizaki, Watanabe, Masuda and Ogura1996). This developmental work led to PCR testing of oral fluid samples for the surveillance of human papillomavirus, HIV, measles, and others (Johnson et al., Reference Johnson, Parry, Best, Smith, De Silva and Mortimer1988; Frerichs et al., Reference Frerichs, Htoon, Eskes and Lwin1992; Ramsay et al., Reference Ramsay, Brugha and Brown1997; Ahn et al., Reference Ahn, Chan, Zhang, Wang, Khan, Bishop, Westra, Koch and Califano2014). More recently, SARS-CoV-2 has been detected in oral fluids, suggesting that oral fluid could facilitate the efficient surveillance of the ongoing worldwide coronavirus pandemic (COVID-19) (Azzi et al., Reference Azzi, Carcano, Gianfagna, Grossi, Dalla Gasperina, Genoni, Fasano, Sessa, Tettamanti, Carinci, Maurino, Rossi, Tagliabue and Baj2020; Pasomsub et al., Reference Pasomsub, Watcharananan, Boonyawat, Janchompoo, Wongtabtim, Suksuwan, Sungkanuparph and Phuphuakrat2020; To et al., Reference To, Tsang, Yip, Chan, Wu, Chan, Leung, Chik, Choi, Kandamby, Lung, Tam, Poon, Fung, Hung, Cheng, Chan and Yuen2020).
As for human beings, PCR technology has been applied to the detection of viral pathogens in animal oral fluid specimens, including feline herpesvirus 1 in oral swabs from experimentally inoculated cats (Reubel et al., Reference Reubel, Ramos, Hickman, Rimstad, Hoffmann and Pedersen1993), canine distemper virus in dogs (Shin et al., Reference Shin, Cho, Cho, Kang, Kim, Kim, Park and Park2004), Borna disease virus in rodents (Sierra-Honigmann et al., Reference Sierra-Honigmann, Rubin, Estafanous, Yolken and Carbone1993), FMDV in sheep (Callens et al., Reference Callens, De Clercq, Gruia and Danes1998), and PRRSV in swine (Wills et al., Reference Wills, Zimmerman, Yoon, Swenson, Mcginley, Hill, Platt, Christopher-Hennings and Nelson1997). As in human diagnostic medicine, PCR testing has been used in oral fluid-based surveillance and herd-level detection of various swine viral diseases, e.g. porcine circovirus type 2, PRRSV, porcine epidemic diarrhea virus, influenza A virus (Ramirez et al., Reference Ramirez, Wang, Prickett, Pogranichniy, Yoon, Main, Johnson, Rademacher, Hoogland, Hoffmann, Kurtz, Kurtz and Zimmerman2012; Bjustrom-Kraft et al., Reference Bjustrom-Kraft, Christopher-Hennings, Daly, Main, Torrison, Thurn and Zimmerman2018), and others (Henao-Diaz et al., Reference Henao-Diaz, Gimenez-Lirola, Baum and Zimmerman2020).
Several fundamental concerns arise when considering the routine use of endogenous reference genes in oral fluid specimens. First, oral fluid is not a cell-rich specimen and the quantity/concentration of target genes, e.g. viral DNA/RNA, may not be biologically associated with the concentration of endogenous reference genes, as it would be in specimens with a cellular context (Nybo, Reference Nybo2012). For that reason, endogenous reference genes commonly used with cell-rich specimens may not be valid for qPCR normalization in oral fluids. Second, the quality of oral fluid specimens can be affected by sample collection methods. Rogers et al. (Reference Rogers, Cole, Lan, Crossa and Demerath2007) reported that oral fluid specimens collected via spitting or oral rinse yielded DNA of higher concentration and quality than oral brush and swab samples. Third, few studies have evaluated the expression of common endogenous reference genes in oral fluid specimens.
The ideal endogenous reference gene for the normalization of diagnostic qPCRs would be abundant and consistent across specimen types, stable in diagnostic specimens over time, and independent from the effect of the pathogen (or the treatment) on the host (Thellin et al., Reference Thellin, Zorzi, Lakaye, De Borman, Coumans, Hennen, Grisar, Igout and Heinen1999; Dheda et al., Reference Dheda, Huggett, Bustin, Johnson, Rook and Zumla2004; Radonic et al., Reference Radonic, Thulke, Mackay, Landt, Siegert and Nitsche2004; Mestdagh et al., Reference Mestdagh, Van Vlierberghe, De Weer, Muth, Westermann, Speleman and Vandesompele2009; Chervoneva et al., Reference Chervoneva, Li, Schulz, Croker, Wilson, Waldman and Hyslop2010). Such a reference gene has not been identified (Peltier and Latham, Reference Peltier and Latham2008); however, other genes inherent to oral fluid specimens merit consideration.
Ubiquitous in epithelial tissues throughout the body, mucins are a family of high molecular weight glycoproteins that protect and lubricate mucosal surfaces (Gendler and Spicer, Reference Gendler and Spicer1995; Debailleul et al., Reference Debailleul, Laine, Huet, Mathon, D'hooghe, Aubert and Porchet1998; Moniaux et al., Reference Moniaux, Escande, Porchet, Aubert and Batra2001). The 21 types of mucin identified to date may be divided into gel-forming mucins, soluble mucins, and transmembrane mucins (Kumar et al., Reference Kumar, Cruz, Joshi, Patel, Jahan, Batra and Jain2017). Among these, MUC1, MUC4, MUC5B, MUC7, and MUC19 are secreted by salivary glands (Nielsen et al., Reference Nielsen, Bennett, Wandall, Therkildsen, Hannibal and Clausen1997; Thornton et al., Reference Thornton, Khan, Mehrotra, Howard, Veerman, Packer and Sheehan1999; Sengupta et al., Reference Sengupta, Valdramidou, Huntley, Hicks, Carrington and Corfield2001; Liu et al., Reference Liu, Lague, Nunes, Toselli, Oppenheim, Soares, Troxler and Offner2002; Alos et al., Reference Alos, Lujan, Castillo, Nadal, Carreras, Caballero, De Bolos and Cardesa2005; Linden et al., Reference Linden, Sutton, Karlsson, Korolik and Mcguckin2008), with MUC5B and MUC7, the two major mucins in saliva, constituting ~20% of the total salivary protein (Takehara et al., Reference Takehara, Yanagishita, Podyma-Inoue and Kawaguchi2013). Data are lacking at present, but future research should determine whether mRNAs that encode critical mucin domains might serve as endogenous reference standards for oral fluid specimens (Debailleul et al., Reference Debailleul, Laine, Huet, Mathon, D'hooghe, Aubert and Porchet1998).
As an alternative to a single endogenous reference gene, normalizing qPCR data against the geometric mean of multiple endogenous reference genes has been used in gene expression research. Compared with using a single reference gene, this strategy lowers the risk of introducing additional variation into research data (Vandesompele et al., Reference Vandesompele, De Preter, Pattyn, Poppe, Van Roy, De Paepe and Speleman2002; Bustin et al., Reference Bustin, Benes, Garson, Hellemans, Huggett, Kubista, Mueller, Nolan, Pfaffl, Shipley, Vandesompele and Wittwer2009). For example, in a study comparing the mRNA levels of eight common endogenous reference genes in oral fluid specimens between healthy (n = 9) and autistic (n = 9) males (~4 years of age), the most consistent detection was observed for GAPDH mRNA, but the combination of GAPDH and YWHAZ (tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta polypeptide) mRNAs provided the best qPCR normalization (Panahi et al., Reference Panahi, Salasar Moghaddam, Ghasemi, Hadi Jafari, Shervin Badv, Eskandari and Pedram2016). Regardless, data normalization using multiple endogenous reference genes is impractical in high-throughput testing laboratories performing diagnostic qPCRs. For that reason, exogenously synthesized genetic sequences spiked into oral fluid samples have been utilized to monitor the DNA/RNA extraction and qPCR testing processes (Howson et al., Reference Howson, Armson, Lyons, Chepkwony, Kasanga, Kandusi, Ndusilo, Yamazaki, Gizaw, Cleaveland, Lembo, Rauh, Nelson, Wood, Mioulet, King and Fowler2018; Weiser et al., Reference Weiser, Poonsuk, Bade, Gauger, Rotolo, Harmon, Gonzalez, Wang, Main and Zimmerman2018; Nagel et al., Reference Nagel, Dimitrakopoulou, Teig, Kern, Lücke, Michna, Korn, Steininger, Shahada, Neumann and Überla2020; Nagura-Ikeda et al., Reference Nagura-Ikeda, Imai, Tabata, Miyoshi, Murahara, Mizuno, Horiuchi, Kato, Imoto, Iwata, Mimura, Ito, Tamura and Kato2020). Although they cannot reflect sample quality, exogenous reference genes can be used for qPCR normalization to provide consistent comparisons across clinical samples (Johnston et al., Reference Johnston, Gallaher and Czaja2012; O'Connell et al., Reference O'connell, Chantler and Barr2017).
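The geometric-mean normalization described above can be sketched as follows. This is a minimal illustration that assumes the same amplification efficiency for all assays (100% by default) and uses hypothetical Cq values. The function names are invented for the example and do not correspond to any published software.

```python
import math


def relative_quantity(cq, efficiency=1.0):
    """Convert a Cq value to a relative quantity (arbitrary units).

    Assumes template multiplies by (1 + efficiency) each cycle.
    """
    return (1.0 + efficiency) ** (-cq)


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_target(target_cq, reference_cqs, efficiency=1.0):
    """Target quantity divided by the geometric mean of reference quantities."""
    if not reference_cqs:
        raise ValueError("at least one reference Cq is required")
    normalization_factor = geometric_mean(
        [relative_quantity(cq, efficiency) for cq in reference_cqs]
    )
    return relative_quantity(target_cq, efficiency) / normalization_factor


# Hypothetical sample: target Cq 30.0; two reference genes
# (e.g. GAPDH and YWHAZ) with Cq 24.0 and 26.0
normalized = normalize_target(30.0, [24.0, 26.0])
print(round(normalized, 5))  # prints 0.03125
```

When all assays share one efficiency, the geometric mean of the reference quantities is equivalent to the arithmetic mean of the reference Cq values, so the example reduces to 2^-(30 - 25) = 1/32. The two references must be measured for every sample, which is the added workload that makes the approach impractical for high-throughput diagnostic testing.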
Conclusion
Endogenous and exogenous reference genes are used in gene-expression studies to control for variation inherent in the qPCR testing process and achieve qPCR normalization using well-described mathematical approaches, e.g. the ΔCq method proposed by Pfaffl (Reference Pfaffl2001). Although qPCR normalization is recommended to ensure the comparability of results, the majority of oral fluid-based qPCR publications evaluated for this review (52.9% of animal studies; 57.0% of human studies) did not describe the use of internal controls (Table 1). As oral fluid-based PCRs become more widely implemented in human and veterinary diagnostic settings, this shortcoming should be addressed through the routine use of validated endogenous and/or exogenous reference genes in qPCR testing. The problems inherent in the use of endogenous reference genes include variation in their concentration introduced by specimen matrices, sample quality and handling, subject age, animal species, and/or disease status (Bustin, Reference Bustin2002; Glare et al., Reference Glare, Divjak, Bailey and Walters2002; Bustin and Nolan, Reference Bustin and Nolan2004; Silver et al., Reference Silver, Best, Jiang and Thein2006; Nishimura et al., Reference Nishimura, Nikawa, Kawano, Nakayama and Ikeda2008; Kozera and Rapacz, Reference Kozera and Rapacz2013). One possible solution is to normalize qPCR data using two or more validated endogenous reference genes (Vandesompele et al., Reference Vandesompele, De Preter, Pattyn, Poppe, Van Roy, De Paepe and Speleman2002), but in the high-throughput diagnostic setting, a more efficient and practical approach would be spiking samples with a universally synthesized exogenous gene. Notably, this approach does not control for sample quality (Kavlick, Reference Kavlick2018). Finally, because of their robust and consistent expression in oral fluids, specific mucin genes should be evaluated for the potential to serve as endogenous reference genes for qPCR normalization.
Conflict of interest
The authors declare no conflicts of interest with respect to their authorship and/or the publication of this manuscript.