Introduction
Human bocaviruses (HBoVs) are members of the family Parvoviridae and are classified within the subfamily Parvovirinae. HBoVs are small non-enveloped viruses with icosahedral symmetry and a single-stranded DNA genome of approximately 5 kb in length. The genome encodes two non-structural proteins, NS1 and nuclear phosphoprotein 1 (NP1), and two major structural proteins, VP1 and VP2 [Reference Chen1–Reference Dijkman5]. Currently, HBoVs are classified into four species, HBoV1, HBoV2, HBoV3 and HBoV4, based on the nucleotide (nt) divergence of the VP1 capsid region [Reference Zhao6]. The HBoV genome contains three open-reading frames (ORFs); the first two sequential ORFs (ORF1 and ORF2) encode NS1 and NP1, respectively, and the third downstream ORF (ORF3) encodes VP1 and VP2 [Reference Chen1, Reference Babkin7]. NS1 is known to have regulatory functions, including transactivation or induction of apoptosis [Reference Moffatt8–Reference Schildgen10]. Although NP1 is a nuclear phosphoprotein, its specific functions are currently unknown; VP1 and VP2 are the viral capsid proteins [Reference Schildgen10–Reference Lindner and Modrow12]. NS1 shows the greatest nt sequence conservation in the genome, with the lowest genetic diversity among all HBoV subtypes; thus, this gene has been preferentially used as a target for the detection of HBoV [Reference Chieochansin13, Reference Maggi14]. By contrast, VP1/VP2 forms a variable region exhibiting high genetic diversity and has thus been mostly used for the phylogenetic analysis of HBoV [Reference Blinkova15, Reference Alam16]. ORF1 and ORF3 are present in all parvovirus genomes, whereas the presence of ORF2 in the bocavirus genome is a specific feature distinguishing this group from the other parvoviruses [Reference Fields, Knipe and Howley17]. HBoV NS1 is essential for replication of the viral single-stranded DNA genome and for DNA packaging, and may play several roles in virus–host interactions [Reference Tewary18].
HBoV is a parvovirus that was first identified in 2005 using a protocol based on DNase treatment, random polymerase chain reaction (PCR) amplification, high-throughput sequencing and bioinformatics analysis. When this virus-screening technique was initially applied to nasopharyngeal swabs and washings from children with unresolved respiratory tract infections, a positive result rate of 3.1% was obtained; hence, it was proposed that HBoV is a causative pathogen of respiratory tract diseases [Reference Allander19, Reference Guido20]. Although HBoV1 and HBoV2 have been mainly detected in stool samples, HBoV3 and HBoV4 are only occasionally detected in these samples [Reference Khamrin3, Reference Allander21–Reference Jin23]. Since its discovery, HBoV has been found to be circulating globally, and is predominantly detected in urine, serum and stool specimens of children with respiratory infections [Reference Allander21, Reference Kesebir24–Reference Xiang26]. In addition, recent research has raised concerns over its association with transfusion medicine [Reference Li27]. HBoV mainly causes respiratory tract infection symptoms in infants and toddlers, with manifestations ranging from no symptoms to fever, coughing and runny nose [Reference Schildgen10, Reference Lindner and Modrow12, Reference Arnold28]. In many cases, patients with HBoV detected in their faecal specimens present with symptoms of viral gastroenteritis, including diarrhoea, vomiting and fever [Reference Chieochansin13, Reference Kang29]. HBoV1 has been most frequently detected in respiratory specimens, whereas HBoV2, HBoV3 and HBoV4 are most commonly found in faecal specimens, indicating a predisposition to causing gastrointestinal diseases [Reference Kapoor2, Reference Kapoor30–Reference Arthur33].
In 2009, Kapoor et al. [Reference Kapoor30] identified a new parvovirus, HBoV2, in stool samples from 98 Pakistani children and 699 British individuals of a mixed-age population, which was suggested to be an aetiologic agent in acute gastroenteritis [Reference Han22]. The high detection rate and high degree of genetic diversity among these enteric viruses from stool specimens (especially for HBoV2) suggest that they may be pathogenic viruses in acute gastroenteritis, although current data show contradictory conclusions [Reference Jin23, Reference Kapoor30, Reference Arthur33]. Thus, the aim of the present study was to analyse and present the first full-length genome sequence of an HBoV strain from South Korea. Phylogenetic analysis was performed to compare the strain with reported genotypes. We expect that these data will prove useful not only for advancing research in the molecular biology of HBoVs but also for basic epidemiologic analyses such as tracking the international spread of the virus.
Materials and methods
Ethics statement
An HBoV-positive stool sample from a male infant who presented with fever and diarrhoea was provided by the Waterborne Virus Bank (WAVA). Most stool samples were obtained from infants (⩽3 years old) who were hospitalised, and the HBoV2 sample used in this study was also from an infant patient (2 years old). Because of difficulties in tracking the exact records of the patient from the donor hospital, informed consent from the parent of the patient could not be acquired. The Institutional Review Board reviewed and approved the use of this sample for research purposes, as this study did not directly affect the patient. All experimental work and sample collections were supervised by the Catholic Medical Center Office of Human Research Protection Program (CMC OHRP) of South Korea (approval no. MC14SISI0096).
Sample preparation and viral DNA extraction
The stool sample was stored at −20 °C until further analysis. Viral DNA was extracted manually from 140 µl of a 10% faecal suspension prepared in phosphate-buffered saline using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany), according to the manufacturer's protocol, and eluted in 50 µl of elution buffer. Isolated DNA was stored at −20 °C until further use.
Polymerase chain reaction
To amplify the complete HBoV genome, PCR was performed with the 2× EmeraldAmp PCR Master Mix (TaKaRa, Shiga, Japan) on an S1000 thermal cycler (Bio-Rad, Hercules, CA, USA), using primers designed based on the full genome sequence of the detected HBoV (Table 1). The PCR steps comprised initial activation (94 °C for 5 min), 25 cycles of three-step cycling (94 °C for 30 s, 53.8–60.1 °C for 30 s and 72 °C for 1 min) and final extension (72 °C for 7 min). All PCR products were examined by electrophoresis on ethidium bromide-stained 2% agarose gels.
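As a quick worked check of the cycling programme described above, the following minimal sketch sums the programmed hold times. It assumes that ramp times between temperatures are ignored; the constant and function names are illustrative only and are not part of the original protocol.

```python
# Cycling parameters as stated in the text; ramp times are ignored (assumption).
INITIAL_ACTIVATION_S = 5 * 60   # 94 °C for 5 min
CYCLE_STEPS_S = (30, 30, 60)    # denaturation, annealing, extension per cycle
N_CYCLES = 25
FINAL_EXTENSION_S = 7 * 60      # 72 °C for 7 min


def total_hold_time_s(initial_s, cycle_steps_s, n_cycles, final_s):
    """Return the sum of programmed hold times in seconds."""
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s


total_s = total_hold_time_s(
    INITIAL_ACTIVATION_S, CYCLE_STEPS_S, N_CYCLES, FINAL_EXTENSION_S
)
print(f"{total_s} s = {total_s / 60:.0f} min")  # 3720 s = 62 min
```

The actual run time on the thermal cycler would be somewhat longer because of heating and cooling between steps.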
Table 1 footnote: a According to GenBank accession number GU048663.
Determination of the 5′ and 3′-ends of the HBoV genomic DNA
To determine the 5′-end of the HBoV genomic DNA, rapid amplification of cDNA ends (RACE) was performed with the 5′ RACE System for Rapid Amplification of cDNA Ends, Version 2.0 kit according to the manufacturer's recommendations (Invitrogen, Carlsbad, CA, USA). Three primers (GSP1, GSP2 and nested GSP) were designed based on the NS1 sequence for 5′-end RACE PCR (Table 1). To obtain the exact sequence of the 3′-end of the HBoV genomic DNA, cDNA was synthesised using reverse transcription with 3′-oligo (dT)-anchor-R (Table 1). The second PCR was conducted using the VP1/VP2-F and 3′-anchor-R primers (Table 1) under the following conditions: 30 cycles of three-step cycling (94 °C for 30 s, 55 °C for 30 s and 72 °C for 1 min) followed by 72 °C for 10 min.
Cloning and sequencing of the complete genome
All PCR products obtained were extracted using the HiYield Gel/PCR DNA Fragments Extraction Kit (RBC, Taipei, Taiwan) and were cloned into pGEM-T Easy Vectors (Promega, Madison, WI, USA). The recombinant vectors were transformed into Escherichia coli DH5α competent cells (RBC) according to the manufacturer's instructions, and transformants were selected on Luria–Bertani agar plates (Duchefa, Haarlem, the Netherlands) containing 40 mg/ml X-gal, 0.1 mM isopropyl-β-D-thiogalactoside and 50 mg/ml ampicillin at 37 °C for 16–18 h. Selected clones were inoculated in Luria–Bertani broth (Duchefa) and incubated overnight in a shaking incubator (IS-971R, Jeiotech, Daejeon, South Korea) at 37 °C and 200 rpm. Plasmid DNA was purified using the HiYield Plasmid Mini Kit (RBC) and sequenced (Macrogen, Seoul, South Korea). The sequencing results were analysed using BLAST (National Center for Biotechnology Information, NCBI).
Phylogenetic analysis
Comparative sequence analysis, including sequence alignments and estimation of genetic distances, was performed with Clustal W as implemented in the Molecular Evolutionary Genetics Analysis (MEGA) software version 7.0 [Reference Tamura34]. Phylogenetic trees were constructed using the neighbour-joining method with a Kimura two-parameter model in MEGA [Reference Saitou and Nei35] and branch support was calculated based on 1000 bootstrap replicates. The complete genome sequences and partial genome sequences were obtained from the NCBI database.
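For readers unfamiliar with the Kimura two-parameter model, the pairwise distance is d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The following is a minimal illustrative sketch of that formula, not the MEGA implementation; the function name and the choice to skip gapped or ambiguous columns (pairwise deletion) are assumptions of this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Columns with a gap or ambiguous base in either sequence are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term)


# Toy example (not study data): one transition in ten aligned sites.
d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
```

A matrix of such pairwise distances is the input from which a neighbour-joining tree is built.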
Results
Nucleotide sequence identities
The complete nt coding sequences and deduced amino acid (aa) sequences of the two non-structural proteins, NS1 and NP1, as well as the two major structural proteins, VP1 and VP2, of the newly obtained South Korean strain CUK-BC20 (GenBank no. MF680549) were compared with those of established reference strains of HBoV1–4. The coding sequence from NS1 through NP1 to VP1/VP2 (excluding the 5′- and 3′-untranslated sequences) is 4787 nt long. Analysis of the full-length sequence of CUK-BC20 revealed that this strain is most closely related to the Russian HBoV2 strain Rus-Nsc10-N386 (GenBank accession no. JQ964116) with 98.77% similarity (Table 2). Moreover, NS1 and NP1 of strain CUK-BC20 showed nt sequence identity with the HBoV4 strains KC461233 (CMH-S011-11) and FJ973561 (NI-385) at 90.64% and 92.28%, respectively. The NS1 gene of strain CUK-BC20 is 1923 nt long, encoding a polypeptide of 640 aa residues. The NS1 nt sequence comparison indicated that CUK-BC20 is most closely related to the HBoV2 reference strains at 92.46–98.13% (Table 2). The highest sequence identity (98.13%) was found with the reference strain GU048662 (CU47TH) detected in Thailand. Analysis of the complete NP1 gene revealed a sequence 648 nt long encoding a polypeptide of 215 aa residues, similar to the NP1 genes of the reference strains GU048662 (CU47TH) and GU048663 (CU54TH). The complete VP1 gene coding sequence of strain CUK-BC20 is 2004 nt long, encoding 667 aa residues. In addition, within this VP1 sequence, the start codon of VP2 (1617 nt long, encoding 538 aa residues) was found at nt position 388 from the VP1 starting point.
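The gene and protein lengths reported above can be cross-checked with simple arithmetic. The sketch below assumes that each reported nt length includes the terminal stop codon (an assumption of this example, although it is the reading under which all four reported pairs agree); the function and variable names are illustrative.

```python
def orf_protein_length(orf_nt):
    """Residues encoded by an ORF whose nt length includes the stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_nt // 3 - 1


# (nt length, aa length) pairs as reported in the text.
reported = {
    "NS1": (1923, 640),
    "NP1": (648, 215),
    "VP1": (2004, 667),
    "VP2": (1617, 538),
}
for gene, (nt_len, aa_len) in reported.items():
    assert orf_protein_length(nt_len) == aa_len, gene

# VP2 starts at nt position 388 (1-based) of VP1 and shares its stop codon.
VP1_NT = 2004
VP2_START_IN_VP1 = 388
vp2_nt = VP1_NT - VP2_START_IN_VP1 + 1      # 1617 nt
vp1_unique_aa = (VP2_START_IN_VP1 - 1) // 3  # 129 aa unique to VP1
```

The 387 nt preceding the VP2 start codon are a multiple of three, so VP2 is in frame with VP1 and VP1 carries 129 additional N-terminal residues (667 − 538).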
Comparative analysis of the nt sequences of the full-length VP1/VP2 gene with those of the HBoV1–4 reference sequences indicated that the VP1/VP2 gene of CUK-BC20 is most closely related to those of the reference strains JQ964116 (Rus-Nsc10-N386) and EU082214 (W208) detected in Russia and Australia, respectively, with 99.5% identity (Table 2).
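Percent identity values of the kind reported in Table 2 are derived from pairwise alignments. The following is a minimal sketch of one common definition (matches divided by aligned, ungapped columns); it is not necessarily the definition used by BLAST or MEGA, and the function name and toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns, skipping columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example (not study data): 9 of 10 aligned positions match.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
print(f"{identity:.1f}%")  # 90.0%
```

Different tools treat gaps and terminal overhangs differently, so values computed this way may differ slightly from those reported by alignment software.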
Nucleotide and amino acid polymorphisms
Based on the three HBoV ORFs (NS1, NP1 and VP1/VP2) and the complete genomes of 17 HBoV2 strains, 10 of 1923 nts analysed for the NS1 gene were found to be variable, with five (50%) transitions and five (50%) transversions; four of 648 nts analysed for the NP1 gene were found to be variable, with three (75%) transitions and one (25%) transversion; and seven of 2004 nts analysed for the VP1 gene were found to be variable, with six (86%) transitions and one (14%) transversion (Fig. 1). VP1 amino acid sequences were compared among HBoV1, HBoV2A, HBoV2B, HBoV3 and HBoV4 genotypes using representative strains for each genotype (Fig. 3). In particular, the amino acid sequences from aa 212 to 213 and aa 454, corresponding to variable region (VR) 1 and VR5, respectively, showed genotype-specific substitutions that distinguished the four HBoV genotypes. Among the 640 aa residues of NS1, three substitutions were detected: aa178 (Ser→Gly), aa180 (His→Val) and aa612 (Gly→Arg). Among the 215 aa residues of NP1, one substitution was detected (data not shown): aa9 (Arg→Lys). Among the 667 aa residues of VP1/VP2, four substitutions were detected (data not shown): aa9 (Gly→Arg), aa68 (Asp→Asn), aa244 (Leu→Arg) and aa282 (Thr→Ile).
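The distinction between transitions (purine to purine or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or vice versa) used above can be sketched as follows. The substitution list is hypothetical and merely mirrors the three-to-one ratio reported for NP1; the actual base changes are not given in the text, and the function names are illustrative.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(ref_base, alt_base):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref_base, alt_base = ref_base.upper(), alt_base.upper()
    pair = {ref_base, alt_base}
    if ref_base == alt_base or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid substitution: {ref_base}>{alt_base}")
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def summarise(substitutions):
    """Count each class and report (count, rounded percentage)."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in substitutions:
        counts[classify_substitution(ref, alt)] += 1
    total = sum(counts.values())
    return {kind: (n, round(100 * n / total)) for kind, n in counts.items()}


# Hypothetical base changes: three transitions and one transversion.
example = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
summary = summarise(example)
print(summary)  # {'transition': (3, 75), 'transversion': (1, 25)}
```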
Phylogenetic analysis
The strains used in the phylogenetic analysis included the target Korean strain CUK-BC20, as well as representative strains of HBoV1–4, human parvovirus B19, bovine parvovirus and canine minute virus. Based on the complete genome sequences, the phylogenetic analysis showed that CUK-BC20 is genetically closest to HBoV2 (Fig. 2), which was consistent with the results of the sequence comparison analysis. The phylogenetic tree based on the complete genome further showed that the different strains of HBoV2 could be clearly divided into three groups (Fig. 3a). No apparent genotypic differences were evident among the phylogenetic trees based on the three HBoV ORFs (NS1, NP1 and VP1/VP2) and the complete genomes of 17 HBoV2 strains (Figs. 3b–d). As previously reported, NS1 appeared to be the most highly conserved gene, whereas VP1/VP2 had the greatest number of nt polymorphisms. Moreover, the phylogenetic trees based on the VP1/VP2 gene and complete genome were nearly identical, which indicated that VP1/VP2 can be used instead of the complete genome to analyse the genetic relationships among HBoVs.
Discussion
HBoV is a relatively recently discovered member of the Parvoviridae. The pathogenicity of HBoV is still uncertain because of its high co-infection rate with other pathogens; thus, it remains unclear whether HBoVs are the sole aetiologic agent or simply a concomitant bystander virus in these cases. Therefore, to gain a better understanding of the prevailing status and pathogenicity of HBoV, more strains need to be examined. HBoV is considered a major agent of several respiratory tract diseases, and according to several reports, HBoV is associated with gastrointestinal disorders, commonly in coinfection with rotavirus, norovirus and adenovirus. The proportion of multiple viral infections in HBoV-infected patients has been reported to be between 35% and 91% [Reference Allander21, Reference Choi36–Reference Foulongne39]. In this study, the HBoV-infected patient was shown to be coinfected with human astrovirus 5. Human astroviruses are recognised as an important cause of infantile gastroenteritis around the world. The conflicting views on the pathogenic role of HBoVs are mainly due to the fact that Koch's revised postulates cannot be applied to HBoV, because neither an effective method for virus culture nor an animal model of infection is currently available [Reference Guido20]. Moreover, little is known about the epidemiology and genetic characteristics of HBoV circulating in Korea. This is the first study to determine the whole genome sequence of HBoV in South Korea, based on a strain isolated from a patient with acute gastroenteritis. The identified HBoV strain, CUK-BC20, belongs to genotype 2 and showed no intra- or inter-genogroup recombination of the non-structural protein-encoding region and the VP1/VP2-encoding region. CUK-BC20 is very similar to the Russian HBoV isolate Rus-Nsc10-N386 (JQ964116), the prototype of which was reported from a patient with acute gastroenteritis at Novosibirsk Child Hospital, Russia in 2010–2011 [Reference Babkin40].
Analysis of the full-length and VP1/VP2 sequences of the CUK-BC20 strain revealed high similarity (98.77–99.5%) to the Rus-Nsc10-N386 reference strain (HBoV2A), whereas the NS1 sequence showed relatively lower similarity (97.82%) to the reference. Both the NS1 and NP1 sequences of the CUK-BC20 strain showed high similarity (97.3–99.38%) to the CU47TH and CU54TH reference strains (HBoV2). The CUK-BC20 strain showed the lowest similarity (91.98–95.46%) to the PK-5510 strain and showed high (98.2%) amino acid similarity with the CU47TH/Thailand strain isolated in 2007, although higher similarity levels were detected for the full-length and VP1/VP2 sequences (98.77% and 99.5%, respectively). This result suggests that the common ancestor of HBoV2 may have been co-circulating in both Russia and Thailand in 2007–2016. Amino acid substitutions were also detected in NS1, NP1 and VP1/VP2 for the South Korean strain.
This is the first study reporting the full-length sequence of an HBoV2 strain isolated in South Korea from a clinical sample. This sequence will be useful for comparisons with the full-length HBoV2 sequences of other strains identified globally. Moreover, the information acquired from the whole-genome sequence of strain CUK-BC20 may prove useful for obtaining more accurate diagnoses of HBoV, as well as for advancing basic research toward the elucidation of genetic functions, the prediction of newly appearing pandemic variants via comparison with HBoVs in neighbouring countries, and vaccine development. Overall, broadening the information and genetic resources on HBoVs circulating globally will have important benefits for public health and help to identify newly emerging strains of HBoV.
Acknowledgements
This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number, HI16C0443), as well as by the Korea Ministry of Environment (MOE) as a Public Technology Program based on Environmental Policy (grant number, 2016000210002).
Declaration of Interest
None.