Importance
Viruses occasionally jump into new hosts to cause epidemics and may spread widely due to the movement of humans or animals, or their viruses, with profound consequences for global health. The emergence and epidemiology of new epidemic viruses in companion animals provide a model for understanding disease dynamics and evolution. The H3N2 CIV arose from an avian virus, and infected dogs provide many opportunities for human exposure. H3N2 CIV transmission is dominated by fast-moving outbreaks within dense populations in and between animal shelters or kennels, while sustaining the epidemic likely requires the movement of the virus to more distant dog populations. Viral spread within North America appears to be sustained for a few years at a time, after which the virus dies out. The epidemiological and evolutionary dynamics of this virus in this structured host population show how an acute respiratory pathogen can emerge and spread in a new host and population.
Introduction
Influenza A viruses (IAVs) are extremely successful viral pathogens of vertebrates that are maintained within many natural bird reservoir populations that cause multiple spillovers and outbreaks in mammalian hosts [Reference Yoon, Webby and Webster1]. IAV is an enveloped virus of the family Orthomyxoviridae, with a negative-sense genome arranged into eight units: PB2, PB1, PA, HA, NP, NA, M, and NS [Reference Krammer2]. Some IAVs have overcome a variety of different host barriers to emerge as epidemic or pandemic pathogens in humans and domesticated or wild animals, including swine, horses, dogs, seals, cats, and mink [Reference Yoon, Webby and Webster1, Reference Parrish, Murcia and Holmes3–Reference Wasik, Voorhees and Parrish5]. Spillovers of avian-origin IAV also frequently result in acute disease in other mammals – including humans - or domestic poultry with little or no onward transmission [Reference Reperant, Rimmelzwaan and Kuiken6]. Within the human population, the IAV subtype H1N1 emerged around 1918 and that later underwent reassortment with additional avian strains to create the H2N2 and H3N2 subtypes [Reference Krammer2]. The emergence of IAVs to cause epidemics in new animal hosts provides insights into the processes and principles of viral host-switching, allowing us to better understand future potential human pandemic viral emergences.
Carnivore mammals (members of the Order Carnivora), including dogs, have long been identified as being susceptible to infection by IAVs [Reference Parrish, Murcia and Holmes3, Reference Wasik, Voorhees and Parrish5]. This has included the observation of canine infection with seasonal human influenza strains [Reference Kilbourne and Kehoe7–Reference Yin9], in addition to the successful experimental infection of dogs with a human H3N2 subtype [Reference Paniker and Nair10]. The first sustained outbreak of an emergent canine influenza virus (CIV) occurred around 1999 after the transfer of an equine influenza subtype H3N8 to dogs. That outbreak in the USA was only recognized in 2004 [Reference Crawford11], with the virus initially spread through much of the USA, and one lineage persisted until 2016 [Reference Wasik12]. The epidemiology of H3N8 CIV was largely driven by transmission within and among shelters and kennels with high population turnover [Reference Dalziel13], so that viral lineages were strongly geographically clustered with outbreaks in major metropolitan areas, where the virus mostly died out in those areas in only a few months [Reference Hayward14]. The story of H3N8 CIV suggested that the dog population structure (at least in the USA) was not ideal for influenza to sustain epidemic or endemic infections [Reference Wasik12].
A lineage of avian-derived H3N2 emerged in dogs in eastern Asia (mainland China or Korean Peninsula) around 2004, resulting in sustained circulation that persists to the present day [Reference Li15–Reference Zhu, Hughes and Murcia17]. Viral sequence analysis revealed that the population quickly became separated into several geographic subclades in Asia, showing that the virus lineages circulated separately in a number of different geographic regions [Reference Zhu, Hughes and Murcia17]. In early 2015, the H3N2 CIV was first identified in the North American continent as the cause of an outbreak in the USA around Chicago, Illinois [Reference Voorhees18]. That virus was initially introduced from Korea, and additional outbreaks occurred in many parts of the USA (including around Chicago and in Georgia, Alabama, and North Carolina), but those were largely controlled during early 2017 [Reference Voorhees19]. Subsequent H3N2 CIV outbreaks occurring later in 2017 among dogs in the USA were identified with a temporal gap from earlier cases and impacting independent locations, including among states in the Midwest (Minnesota, Ohio, Indiana, Kentucky) and Southeast (Florida, Georgia, North Carolina, South Carolina) [Reference Voorhees19]. In 2018, further unique outbreaks of H3N2 CIV were identified in Ontario (Canada), without clear sourcing from the neighbouring USA [Reference Weese20, Reference Xu21]. Based on the geographic and temporal spread of these latter cases, as well as on the analysis of viral sequences, it was apparent that multiple international introductions of the H3N2 virus were involved in sparking the North American outbreaks that were seen between mid-2017 and 2018.
Viral emergence events in new hosts start with single infections resulting from spillovers, but those rarely go on to cause outbreaks or epidemics [Reference Wasik22]. The emergence of viruses in epidemic forms is generally thought to be associated with the acquisition of host-adaptive mutations that allow better replication of the virus in the cells and tissues of the new host animal, as well as causing increased transmission [Reference Long23]. These host-adaptive mutations are expected to arise quickly after transfer to the new host – likely during the first rounds of replication after spillover and the first animal-to-animal transfers [Reference Antia24–Reference Rochman, Wolf and Koonin26]. While early selection of such mutations can be obvious shortly after the bottleneck that occurs upon transfer to the new host, later virus populations will diverge by sequence within the new host, and key adaptive changes are harder to define in the background of mutations arising from genetic drift and/or other selection pressures, including the increasing levels of host immunity [Reference Carabelli27–Reference Tamuri29].
Other factors associated with the emergence of respiratory viruses include their epidemiology and host ecology. These include the virological properties of incubation times and shedding patterns, as well as host population structures – such as heterogeneity in the host density and distribution, as well as movement or other connections between separated populations [Reference Anderson and May30–Reference Plowright32]. The human population is dense, well-connected, and globally mobile, resulting in few ecological barriers to global transmission of endemic and newly emerged respiratory pathogens, as shown by seasonal influenza dynamics and during the rapid global spread of the H1N1 pandemic in 2009 or SARS CoV-2 in 2020 [Reference Bedford33–Reference Tegally36]. In contrast, many other animals live in varying size groups, including smaller and isolated populations in the wild where pathogen transmission is self-limiting, large flocks or herds that allow efficient transfer at least within that population, and some animals have long-range movements, including globe-spanning migrations [Reference Yoon, Webby and Webster1, Reference Fourment, Darling and Holmes37, Reference Olsen38]. Domestication and farming of animals can also result in animals being gathered into novel population structures: large well-connected populations on farms, dense animal markets that allow many different animals to come in close contact, and other human-directed movements that allow new modes of spread and maintenance of pathogens compared to those seen in the wild populations of the same hosts [Reference Nelson39, Reference Sievers40].
Companion animals such as dogs and cats have structured populations with additional features compared to those of domestic livestock. Those populations include many households with one or a few animals, as well as large and well-connected populations within kennels or animal shelters. Additional populations in some regions of the world include street animals that likely have variable levels of density and connectivity. In Africa, these complex population structures of dogs and anthropogenic effects of nearby human populations impact rabies virus (RABV) dispersal, transmission, and evolutionary dynamics [Reference Mancy41, Reference Talbi42]. In the USA, dogs and cats live in close proximity to humans, so infected animals would result in high levels of exposure to humans of all ages and health status [Reference Schmidt43].
Here, we continue to examine the emergence, evolution, and epidemiology of the H3N2 canine IAV that has been circulating in its new host for 20 years and provide a detailed understanding of the processes that allowed its emergence and sustained transmission.
Results
Following emergence, H3N2 CIV has transferred repeatedly around Asia and North America and formed multiple clades
We assembled a complete phylogenetic analysis of full H3N2 CIV genomes since 2006 (Figure 1). Our dataset combines obtained sequences for this study with others in public repositories to generate a large-scale H3N2 genome data (n = 297, Supplementary Table S1) that revealed the global virus evolution in dogs. Initial phylogenetic analysis of each individual gene segment gave concordant results (Supplementary Figure S1), indicating that lineages did not result from extensive reassortment of the viral gene segments or exchange with other viral strains – which was further confirmed by reassortment analysis (Supplementary Figure S2). Full-genome alignments and phylogenetic analysis of H3N2 CIV reveal that virus population lineages can be classified into a series of distinct clades, which allow us to track circulation, population variants, and interconnectedness during different stages of the overall epidemic (Figure 1).

Figure 1. Overall global clade structure of H3N2 CIV genetic diversity. ML tree of the full genome data set (n = 297) revealing distinct early lineages and corresponding clades (shaded and labelled). Tips denote geographic sampling source (green = China, blue = Korea, red = USA, purple = Canada). The phylogeny is rooted in the sequence Guangdong/1/2006. Scale denotes nucleotide divergence. White diamonds at nodes represent bootstrap support >99%, grey is support from 90% to 98%. A general timeline of observed sampling of each clade is presented. The final Korean isolate linking Clades 2 and 4, SouthKorea/20170110-1F1/2016, is noted.
Emergent H3N2 CIV prior to 2010 showed extensive early transmission of viruses within and between China and South Korea [Reference Li15, Reference Song16], along with a reported limited outbreak in Thailand [Reference Bunpapong44], with more geographically distinct lineages being established in China (Clade 1) and South Korea (Clade 2). Surveys in South Korea reported fewer cases of CIV between 2016 and 2018, with the Clade 2 virus lineage likely dying out in South Korea around that time [Reference Lee45, Reference Lim46]. Clade 1 viruses were infrequently reported in China after 2015, while CIV surveillance samples from dogs collected from 2017 to 2019 generated viral sequences that appeared to more resemble a Clade 2-derived virus that had first been circulating in South Korea. As such, a new clade (Clade 5) became established in China across several provinces, likely displacing the earlier viral lineages (Clade 1) [Reference Chen47–Reference Lyu49].
The clades and lineages of North America were not derived until 2015 and show connections to several clades originating in Asia at that time (Figure 1). The first recorded introduction of H3N2 CIV into North America in early 2015 caused a series of outbreaks in the vicinity of Chicago, Illinois, and in several southeastern US states, causing a scattered set of outbreaks among dogs that continued through early 2017 [Reference Voorhees19]. Phylogenetic analysis of the viral sequences showed that the virus in US dogs was derived from a Clade 2 virus circulating in South Korea and that US lineage is herein now referred to as Clade 3 (Figure 1). Our previous study also noted that later in 2017, US outbreaks appeared in the Midwest and Southeastern states being infected by a CIV lineage distinct from Clade 3 [Reference Voorhees19]. Phylogenetic analysis of those viruses, at the time, noted a relationship to a single Clade 2 isolate in 2016 (the last CIV sample sequenced in South Korea, SouthKorea/20170110-F1/2016) but with a sufficient branch length that made inferences of source and relatedness difficult. These secondary outbreaks of CIV in the USA formed a distinct Clade 4 (Figure 1). Our newer data also revealed multiple genetically diverse virus lineages in Clade 4 that appear to map to unique and epidemiologically disconnected outbreaks. Here, we extend on these Clade 4 viruses in the USA by analyzing a series of additional outbreaks that occurred from mid-2017 through 2018, including case clusters of genetically similar viruses in California, Florida, Ohio, and Kentucky (Figure 1). Another lineage involved infected dogs first seen in 2018 in California (near San Jose) and subsequently in New York (New York City). These Clade 4 viruses include several of the introductions seen in Ontario (Canada) [Reference Xu21]. In 2019, we identified a short-lived outbreak among imported dogs in quarantine in California, which did not escape into the general dog population. Those viral sequences were not of Clade 4 but showed a relationship to Clade 5 viruses observed in China (reported to be from near Shanghai) and clustered with one group of the outbreak dogs in Ontario in 2018. Further into 2019 and 2020, few cases of H3N2 CIV were reported in the USA, and the virus appeared to have died out in North America (Figure 1).
The interrelatedness of these clades and the structure of H3N2 CIV phylogenetics since roughly 2017 (Clades 4–6) suggest that the movement of viruses between Asia and North America is a hallmark of the overall epidemic.
Recent North American outbreaks of H3N2 CIV show paired epidemiology and phylogenetics
No H3N2 CIV outbreaks were reported in North America from late 2019 through 2020. There have been few reports of outbreaks in Asia, and only a single full genome H3N2 CIV sequence has been reported from China since 2019, for a virus sample collected in 2021 (with two additional HA sequences) [Reference Ou50]. Beginning in early 2021, an additional series of H3N2 CIV cases occurred among diverse regional outbreaks in the USA, and those appeared to be more sustained and on a larger scale relative to the outbreaks occurring in the immediately preceding years. We obtained a variety of new sequences from viral infections between 2019 and 2023 by opportunistic sampling. These covered several outbreak events, including some that appear to be ongoing at the time of preparation of this report (Supplementary Table S1). The sequence analysis showed the outbreaks to be both widely dispersed geographically and to be temporally spread out, as some viral sequence clusters were separated by long branch lengths (Figure 1). This suggested either that we were missing key samples from our analysis or that there were gaps in the viral transmission chains with viruses being reintroduced from outside our network.
We were able to obtain USA-wide diagnostic testing case data between 2021 and early 2024, and that confirmed that our sequencing and phylogenetic analysis covered the main outbreaks that had occurred during that period (Figure 2a). Gaps in our genomic data (Figure 2b, seen as long branch lengths in the phylogeny) mostly aligned with periods of low or absent diagnostic testing positivity, suggesting those gaps did not represent major sampling omissions. The levels of testing for the different outbreaks may differ, but there appeared to be a rough concordance between the number of positive tests and the size of the outbreaks – for example, a large 2021 outbreak in Los Angeles country was represented by a large number of positive results, consistent with the County Department of Health confirmation of 1344 cases and estimates of tens of thousands of dogs being infected (case data at scale in Supplementary Figure S4) [51]. More recent US outbreaks in late 2023 and early 2024 centred around Las Vegas, Nevada, and we do not yet have those virus samples to compare by phylogenomic analysis. Overall, these data confirm that we have captured most of the major outbreaks of the virus, as well as its natural and evolutionary history since 2021 (Figure 2).

Figure 2. Recent circulation of H3N2 CIV in the USA, 2021 to present. (a) Diagnostic positive H3N2 CIV cases (n = 993). (b) Phylogenetics of full genomes among recent US outbreaks, with colour highlight corresponding to timeline match with diagnostic data set and geography. (c) The geography of major US outbreaks is demonstrated with colour circles corresponding to clusters in diagnostic data and phylogeny. Circles are not to scale of cases or genome count. Case data at sample geography and set to logarithmic scale are available in Figure S4.
We modelled the epidemiology of the virus over these major outbreaks using the different data available, including calculating estimates of the effective reproductive number (R) over time [Reference Cori52]. We used the diagnostic testing dataset obtained to infer the scope of each local outbreak (generally within a geographical area), and to make the temporal and sequence connections between the different outbreaks (Figure 3). That analysis revealed that the distinct geographical outbreaks examined experienced fluctuations in R, which may be driven in part by localized depletion of susceptible hosts – but which is also expected for a host population that shows extensive contact heterogeneity – i.e., some animals being in dense and connected populations within animal shelters and kennels, while others are dispersed and disconnected as they are living in households and have many fewer contacts with other dogs. We were able to measure three predominant geographical outbreak centres: Los Angeles, California; Dallas/Fort Worth, Texas; and Las Vegas, Nevada. In each outbreak, periods of R > 1 were measured, including periods of high transmission where the R was between 2.5 and 4. Troughs of transmission with R < 1 were also observed. The largest outbreak (Los Angeles) with the greatest data volume resulted in the best resolution with mean R measures having low error variance across the timescale, where dramatic epidemiological changes to the outbreak (September 2021) can be identified. In contrast, smaller outbreaks (in Dallas and Las Vegas) reported lower case data volume. We calculated a larger degree of error and noise in the data for those outbreaks, making distinct temporal changes in mean R values difficult to identify with confidence. Overall, the epidemiological dynamics measured were similar to the “boom and bust” patterns seen between 2015 and 2017 in the Chicago and Atlanta areas [Reference Voorhees19].

Figure 3. Epidemic dynamics of H3N2 CIV in the USA. (a) Proportion of daily cases attributed to three major outbreaks in Los Angeles, California (CA), Dallas/Fort Worth, Texas (TX), and Nevada (NV). The areas under the red (CA), blue (TX), and yellow (NV) curves represent the proportion of total cases associated with each state on a given day. (b) Daily number of new cases over time. (c) Effective reproductive number (R) estimates for each outbreak during the periods indicated by horizontal bars. Solid lines represent mean estimates, while dashed lines show the 95% credible interval of the posterior distribution.
Directional transfer of H3N2 CIV between Asia and North America in recent years is a result of an increased virus population in Asia
A key question for understanding the epidemiology of the viruses in North America and Asia is the movement of viruses between those two regions, and in particular, the directionality and frequency of any transfers that have occurred. To reveal the most likely routes of transfer, we tested a recent subset of the phylogenetic data (see Methods and materials, Clades 4–6) using two models: discrete trait diffusion and structured coalescent. The implementation of a discrete trait diffusion model with a Bayesian stochastic search variable selection (BSSVS) allows us to infer statistically supported rates of transition between regions across a phylogeny – i.e., showing which region is most likely seeding infections to other regions. We also used a Structured Coalescent Model to estimate the size of virus populations in each region based on their genetic diversity, and thereby inferring the region with a larger infected population. The results of both models generally agreed, inferring similar tree topologies and transmission rates between the two regions (Figure 4a and Supplementary Figure S5). Both analyses suggest five independent introductions from Asia into North America before 2021. Using the discrete trait model, we infer a higher mean rate of transitions from Asia to North America (1.4628 transitions/year, 95% HPD: [0.0332, 3.6722]) than from North America to Asia (0.4968 transitions/year, 95% HPD: [6.035E-6, 1.567]), though both transition rates had strong posterior support (posterior probability of 0.99) and a Bayes Factor support (9000).

Figure 4. H3N2 CIV circulating in Asia act as viral sources for introductions into North America. (a) MCC tree from a structured coalescent analysis of recent subset H3N2 CIV genomes from North America and Asia. Nodes are coloured by their inferred geographic region; the thickness of the branches corresponds to the number of taxa that descend from the given branch. (b) Effective population size estimates for each geographic deme were estimated using MultiTypeTree v8.1.0.
Although discrete trait approaches are computationally tractable, they can perform poorly in the presence of biased sampling. We therefore used a structured coalescent model to independently infer virus population sizes and rates of migrations between North America and Asia. These approaches appear to more clearly model source-sink dynamics in the presence of biased sampling [Reference De Maio53, Reference Dudas54], and also estimating viral effective population size (N e) for each region, a proxy for infected population size [Reference Frost and Volz55]. The mean N e in Asia was 3.4438 (95% HPD: [2.6039, 4.389]), which was significantly greater than the Ne for North America of 0.1571 (95% HPD: [0.106, 0.2162]) (Figure 4b). The higher inferred Ne matches our expectation that there is a larger virus population in Asia which acts as a source for introductions of novel lineages introduced into North America. We also estimated a backwards-in-time migration rate, which represents the rate of viruses moving to a given region A from a given region B. We estimated a higher backwards in time migration rate to North America from Asia (1.260 migrations/year, 95% HPD: [0.4474, 2.1399]) compared to the rate to Asia from North America (0.0894 migrations/year, 95% HPD [2.9079E-3, 0.2134]), consistent with the results of the discrete trait diffusion.
The rate of H3N2 CIV sequence evolution is consistent over host geography and time
To examine the temporal nature of the full genome dataset, we used well-dated samples (YYYY-MM-DD) to analyse the maximum likelihood genome phylogeny, plotting the root-to-tip length data against the sampling date (using the TempEst algorithm). That showed a consistent clock-like rate that averaged 1.76 × 10−3 substitutions/site/year (R 2 = 0.95) over all outbreaks across two continents, and since the emergence of H3N2 CIV in dogs around 2004 (Figure 5a). Rates of individual segments fell in a range of 1.25–2.00 × 10−3 (Supplementary Figure S3). That consistent clock-like evolution revealed no evolutionary behaviours that might indicate pronounced selections (e.g., bursts of adaptive evolution) on the virus in either its apparent reservoir among dogs in Asia or during the outbreak spread within North America. Attempts to analyse the clock by distinct geography or timescale windows showed no noteworthy variation from the full dataset mean (data not shown).

Figure 5. Temporal evolution of H3N2 CIV during continuous dog-to-dog circulation. (a) A root-to-tip analysis of H3N2 CIV full genomes, shows the divergence since the first common ancestor of the virus represented by the basal node of the phylogeny. This shows a consistent evolutionary rate of 1.76 × 10−3 substitutions/site/year. (b) Individual segment ORF substitution rates were calculated in BEAST and compared to H3N8 CIV and human seasonal H3N2. (c) Mean segment ORF dN/dS ratios were calculated using SLAC and compared to H3N8 CIV.
The nucleotide substitution rate for each genome segment was further determined using a Bayesian coalescent method in BEAST. Among the H3N2 CIV segment ORFs, substitution rates were largely similar – with a higher rate for the HA gene segment (rising to significance against 5 of 7 other segments, save NA and NS1) (Figure 5b). These rates are largely similar to those seen for another influenza virus previously seen in dogs (H3N8 equine-origin CIV, analysis from Wasik et al. [Reference Wasik12]), though higher rates were seen in H3N2 CIV among the PB1, PA and HA segments (1.782 vs. 0.87, 1.917 vs. 1.237, and 2.519 vs. 1.102, all 10−3 substitutions/site/year, respectively). The rates seen in both the H3N2 and H3N8 CIVs were lower in nearly all segments than those observed in seasonal H3N2 human influenza virus, with greatest pronouncement and significance relative to the HA and NA glycoproteins [Reference Westgeest56]. To generate a rough proxy of natural selection in the H3N2 CIV during the 20 years of dog-to-dog transmission, we used the model statistical method SLAC (Single-Likelihood Ancestor Counting) [Reference Weaver57] to determine the mean d N/d S ratio for each segment (Figure 5c). The greatest potential signals of positive selection were seen in the HA, NA, and NS segment ORFs. The d N/d S ratios observed in CIV were generally higher than previous measures of H3N8 CIV (2003–2016) evolution, apart from HA and NS.
Global H3N2 CIV evolution has resulted in the cumulative fixation of non-synonymous mutations
To understand the likely functional effects of the viral evolution, we identified non-synonymous mutations that became fixed during H3N2 CIV evolution (Figures 1 and 6). Most clustered at key nodes within the phylogenetic tree, appearing following likely bottlenecks during transfers to new geographical regions. The overall evolution of the virus is re-reviewed in Figure 6, and key non-synonymous mutations that became fixed during the evolutionary pathway are highlighted. For example, 13 coding mutations became fixed within 7 of the 8 gene segments of the virus during the emergence of Clade 2 in South Korea, and additional groups of non-synonymous mutations (ranging in numbers from 3 to 15 over the entire genome) became fixed during the subsequent clade-defining events (Figure 6 and Supplementary Tables S3 and S4).

Figure 6. Fixed mutations at key transitional nodes and international transfer events. (a) An ML tree of H3N2 CIV genomes with geographic sampling sources coded on tips (green = China, blue = Korea, red = USA, purple = Canada) and on the corresponding bar to the right of the tree. (b) Noted start of Clade 2, circulating in South Korea. (c) Thr transition of Clade 2 to Clade 3, international introduction of US outbreaks, 2015–2017. (d) Start of Clade 4, after last known South Korean isolate. (e) A major early genetic bottleneck of Clade 4 by NS1 truncation. (f) Early convergence in Clade 5, ~2018 circulation in China. (g) Later convergence in Clade 5, ~2019 circulation in China. (h) Start of Clade 6, with US outbreaks from 2021 to present.
Mutations span the genome and appear in multiple functional domains of the influenza gene products. A notable fixation was the truncation of the NS1 protein at the C-terminus during Clade 4 evolution. A truncation of NS1 has previously been observed in the H3N8 subtype in horses and dogs and was related to host-specific adaptation of innate immunity [Reference Chauché58, Reference Jackson59]. Several changes in PB2 (S714I) and PA (E327K and S388G) have previously been identified during mammalian adaptation of the polymerase subunits, related to ANP32A co-factor utilization [Reference Long23, Reference Li60, Reference Staller61].
H3N2 CIV evolution has included HA-specific mutations in both the head and tail domains
The HA protein controls receptor binding and cell membrane fusion, and is the major antigenic determinant of influenza, so that mutations in the HA may play key roles in controlling CIV biology and epidemiology. We identified the major fixed HA mutations of H3N2 CIV following the establishment of Clade 4 (Figure 7a and Supplementary Table S3) and mapped their position onto the HA monomer structure (Figure 7b). Following the last reported South Korean-origin genome, several nonsynonymous mutations have fixed in the HA gene of CIV lineages and clades. First, upon the establishment of Clade 4, there fixed an asparagine to aspartic acid mutation (N188D, in H3 numbering) at the top of the HA1 head near the sialic acid receptor binding site (RBS). This mutation was discussed in [Reference Chen47] as potentially playing a role in clade-specific phenotypes. That Clade 4 transition also fixed a stalk mutation (HA2-V89I). During Clade 5 evolution during circulation in China, two major nodes were observed in the phylogenetic analysis. Upon circulation in 2018, there fixed another HA1 mutation from valine to isoleucine (V112I). Later in 2019, there fixed both an HA1 mutation (V78I) as well as a glycine to serine change in the final residue of the signal peptide (Gsig16S). Clade 6 represents all US outbreaks after 2021 where lineages likely resulted from multiple independent introductions, therefore unique HA fixed mutations can be seen with temporal and geographic patterns (Supplementary Table S4). An HA2 mutation was seen fixed in the earliest US samples in 2021 (R82K), while an HA1 asparagine to threonine change was seen starting in 2022 viruses (N171T, in the RBS). In the most recent Clade 6 viruses in the US (2023), we observe regional subclade lineages with unique HA fixations (Figure 7). In the Mid-Atlantic region (around Washington DC and Philadelphia, Pennsylvania) viruses have an HA2 mutation (S113L) in addition to an isoleucine to methionine (I245M) change in the HA1 head near the RBS. In contrast, viruses found in the subclade around Dallas, Texas, contain a HA1-V223I mutation (previously observed in Clade 5) shown to increase virion thermal stability [Reference Liu62] and is located in a region that contains antigenic site D in human H3 viruses.

Figure 7. Recent molecular evolution of H3N2 CIV hemagglutinin. (a) Schematic of the HA ORF showing with nonsynonymous mutations fixed since the last Clade 2 isolate, SouthKorea/20170110-1F1/2016. HA1 and HA2 positions follow H3 numbering. Signal peptide sequence residues are noted as sigX. (b) The location of key HA1 and HA2 mutations on the monomer of HA, occurring during key transitions in the recent evolution of H3N2 CIV. Mutations specific to the most recent Clade 6 US subclades (Mid-Atlantic, yellow and Texas, purple) are highlighted.
Discussion
Here, we further reveal how the H3N2 CIV emerged, evolved, and spread around the world during the 20 years that it has been circulating in dogs, providing an example of an influenza virus switching hosts to cause a sustained epidemic in a mammalian host population. Our results emphasize the interconnections between viruses circulating in different regions of Asia and those in North America. We show that around 2017, the H3N2 CIV was reduced to a single lineage that has since expanded to include viruses in mainland China, the USA and Canada. The data also strongly suggest that there were two introductions into the USA in 2021 and 2022. The first led to outbreaks during 2021 that started in Florida and then caused a large outbreak in the Los Angeles, California area that continued for several months, but which died out by the end of that year. The second introduction occurred around June or July of 2022 that resulted in a sustained outbreak in Dallas/Fort Worth, Texas, which then spread to cause a series of outbreaks in many areas of the USA, which have continued to the present (Figure 2).
Overall and recent epidemiology
The results again confirm that the epidemiology of this virus in dogs in North America is outbreak-driven and constrained by the host population structure. Consistent observations and analysis have shown that CIV (both H3N8 and H3N2) spread is characterized by rapid transmission within geographic areas among group-housed dogs in kennels, animal shelters, and day-care settings [Reference Dalziel13, Reference Hayward14, Reference Voorhees19, Reference Newbury63]. Among recent US outbreaks of CIV, the largest and longest duration have occurred within larger metropolitan areas, where circulation has been sustained for up to several months, although all observed outbreaks ultimately died out after between 1 and about 6 months (Figure 2). The more recent CIV epidemiology was revealed by the analysis of nationwide diagnostic testing data, which showed the duration, intensity, and locations of the major outbreaks that occurred (Figures 2 and 3). That analysis was also compared to parallel phylogenetic and phylogeographic studies based on full genome sequencing data. Although both sets of data resulted from opportunistic sampling, the patterns revealed were similar, indicating that they would explain the virus spread and evolution in North America during this period.
Little recent data is available about the viruses in Asia and their circulation patterns, with only a single full H3N2 CIV sequence being available since 2019, so it is not possible to compare the details of the recent epidemiology or evolution of the viruses in Asia and North America. However, it appears that during the last several years, the viral population has been distributed between China and North America (with no virus being reported from South Korea since around 2017). Our analysis suggests a significantly greater likelihood of transmission from Asia to North America than the reverse (Figure 4). The Centers for Disease Control and Prevention has estimated that as of 2019, roughly 1.06 million dogs are imported into the USA annually [64]. This is a trend that has increased since the 2020 COVID pandemic [Reference Kavin65, Reference Pieracci66], while screening for infectious respiratory diseases in imported dogs is minimal [67].
The effective reproduction number (R) for the H3N2 CIV was estimated from the data and varied widely depending on the structure of the dog population. Within the large metropolitan centres, the R was between 1 and 4, mainly due to the rapid and sustained spread of the virus within more densely housed dogs. It is assumed that within each population outbreak, recovered animals would become resistant due to the development of protective immunity, so without the addition of large numbers of new susceptible animals, the Susceptible-Infected-Recovered (SIR) dynamic would apply, and the virus would die out naturally. While there is a H3N2-specific canine vaccine which can protect dogs from severe disease, we lack data on the use of vaccines, or on the vaccine status of dogs in case and outbreak settings. While administration of vaccination well in advance of virus introduction would likely reduce or stop transmission, vaccination in the face of ongoing outbreaks is unlikely to affect the dynamics of viral spread as incubation and transmission is likely more rapid than the development of vaccine-induced immunity [Reference Newbury63]. Having the vaccination status of dogs in outbreak settings known in future studies would allow the role of prophylactic vaccination to be better understood.
A common prediction is that transmission of a virus in a new host will select for host-adaptive mutations which would result in more efficient infection, replication, and transmission – often referred to as ‘gain-of-function’ mutations. While the H3N2 CIV has acquired many changes with the potential to be host-adaptive during its thousands of exclusively dog-to-dog infections and transmissions, we do not yet see clear evidence of CIV being able to spread more effectively in dogs within dense populations or gaining an increased ability to transfer between geographically separate population centres.
H3N2 CIV evolution and possible host adaptation
The H3N2 CIV has likely undergone over 1000 exclusively dog-to-dog infections and transfers since it emerged, and there has been a linear acquisition of genetic change since 2005. While some adaptive changes likely arose during the first series of transfers of the H3N2 CIVs in dogs, those are difficult to define as the sequence of the directly ancestral avian virus is not known [Reference Zhu, Hughes and Murcia17]. The evolution of the virus included several clades that appear to have been established through single bottleneck events which may be due to epidemiological events. We therefore cannot determine which mutations were under positive selection which resulted in their fixation, versus neutral or even deleterious mutations that were fixed during the bottlenecks.
While several mutations in H3N2 CIV are in viral genes and proteins associated with host adaptation in other influenza spillovers (Supplementary Tables S3 and S4), we do not have clear evidence that those have resulted in gain of function for the virus [Reference Chen47, Reference Martinez-Sobrido68]. In other studies small numbers of experimental passages in ferrets were sufficient to make high pathogenicity H5N1 influenza viruses more transmissible in that new host [Reference Herfst69, Reference Imai70], while for SARS CoV-2 several variants that show increased transmission in humans have arisen during the first years of spread in humans in the face of rising immunity [Reference Hill71–Reference Viana74]. A possible explanation for the apparent lack of gain of function for H3N2 CIV is that the complex environmental pressures of the canine population may be constraining the viral evolution, since most spread and infections have occurred in dense populations of animals in close contact, but which result in SIR-mediated die outs. Mutations selected in those host populations would therefore differ from those that would favour the long-distance transfers between population centres required to sustain long-term transmission. Properties that may favour the second type of spread might include persistent infections, prolonged shedding, virus stability, and fomite-mediated transfer, as suggested by the life-history trade-off evolutionary theory [Reference Goldhill and Turner75].
Are H3N2 CIVs a risk to humans and might those risks be changing?
The risk of emergence of the H3N2 CIV into additional mammalian hosts (including humans) is a concern, but currently, the threat is unknown. While swine have often been proposed to be intermediate host that allows influenza viruses to jump onward into humans to cause sustained epidemics on at least a few occasions (“mixing vessels”), other mammalian-adapted influenza viruses (in horses, seals, cows, mink) have so far not been the origin of human epidemics [Reference Parrish, Murcia and Holmes3, Reference Kareinen76, Reference Uyeki77]. The changes in the H3N2 CIV have included previously identified mammalian-adaptive sites [Reference Long23], some experimentally confirmed as having biological effects [Reference Chen47], and the H3N2 CIV appears to have lost the host range for at least some avian hosts [Reference Zhou78]. Other concerning adaptive changes that underlie known human adaptation and pandemic risk (including NP changes to evade MxA/BTN3A3 restriction factors, and HA RBS changes to the 226/228 residues) are absent, as the H3N2 CIV remains ‘avian-like’ [Reference Mänz79–Reference Shi81]. Changes have occurred in NA domains related to ligand binding, but the constellation of changes does not include any known inhibitor-resistance markers, suggesting antiviral use in zoonotic exposure would be efficacious [82].
Cross-species transmission of H3N2 CIV into cats has occurred in Korea and the USA [Reference Jeoung83, Reference Zhao84]. In a serosurvey of racehorses in China, some riding club horses were positive for H3N2 CIV, positivity being strongly associated with exposure to dogs [Reference Zhou85]. To date, no natural human infections with either H3N8 or H3N2 CIV have been reported. Following the 2015 US outbreak of H3N2 CIV, a human spillover risk assessment was performed on a Clade 3 isolate, examining for receptor utilization, replication kinetics in human cells, and antibody cross-reactivity, and concluded that H3N2 CIV posed a low risk for human populations [Reference Martinez-Sobrido68].
Besides point mutations, reassortments of influenza viruses allow the mixing of viruses to find favourable genetic combinations [Reference Lowen86, Reference Ma87]. No reassortant influenza viruses were seen in dogs in the USA (Supplementary Figures S1 and S2). H3N2 CIV reassortants with human, swine, and avian influenza viruses have been reported in Korea and China [Reference Chen88–Reference Song91], but none have generated sustained transmission chain lineages and the fitness effects for either dogs or humans are unknown. While the H3N8 and H3N2 CIVs both circulated in dogs in the USA during 2015 and 2016, the two strains were not found within the same population so no recombinant opportunities likely occurred [Reference Wasik12, Reference Voorhees19]. The lack of reassortant viruses in North American dogs suggests differences in host population ecology or epidemiology of the CIV and the other viruses leading to fewer mixed infections. It is unknown whether additional viral strains, including high pathogenicity avian influenza (HPAIV) H5N1 clade 2.3.4.4b may create reassortant events, but additional screening of domestic animals for influenza viruses is likely warranted [Reference Brown92–Reference Sillman95].
Overall conclusions
Respiratory viruses emerging to cause human pandemics readily spread through host populations, crossing oceans within days and overwhelming control measures [Reference Bedford33, Reference Tsui96, Reference Worobey97]. Here, we show that emerging epidemic diseases in animal populations may differ significantly from those seen in humans, even though the transmission potential for each virus is similar. The different outcomes are mostly due to the different population structures and inter-connectedness of different wild or domesticated animals [Reference Dalziel13]. There are around 900 million domestic dogs worldwide, and many live in very close proximity to their human owners, allowing for high levels of human exposure [Reference Schmidt43]. While the H3N2 CIV has circulated widely in Asia and North America, and undergone several transfers between those regions, it has not been reported from Europe, Australia, Africa, or other regions with large dog populations and veterinary disease testing. The relatively unrestricted import of large numbers of rescued dogs from Korea and China to the USA may have facilitated the repeated introduction of the virus. The movement of CIV-infected dogs from the USA or Canada (or from Asia) to Europe, Australia, or other places around the globe seems less likely. The H3N2 CIV has repeatedly died out in North America, and the H3N8 CIV also died out in North American dogs in 2017 with little directed intervention [Reference Wasik12, Reference Voorhees19]. This indicates that modest interventions such as animal symptom screening, quarantine, and slowing the movement of infected dogs would allow the H3N2 CIV to be eradicated from the dog population, while increased testing or quarantine of imported dogs would likely prevent the virus from being re-introduced. The epidemiological and evolutionary patterns seen here may also prove a useful comparison with influenza viruses in other hosts – horses, swine, mink, seals, and cows – where interventions at key points could also result in them being controlled or eradicated.
Methods and materials
H3N2 sample collection
Virus samples were obtained from various diagnostic centres, including the Animal Health Diagnostic Center (Cornell University College of Veterinary Medicine), Wisconsin Veterinary Diagnostic Laboratory (University of Wisconsin School of Veterinary Medicine), and Antech Diagnostics. These samples followed from routine passive diagnostic testing for respiratory disease in dogs (listed in Supplementary Table S1, red). Samples were received as either nasal or pharyngeal swab material or extracted total nucleic acid (tNA) first utilized for quantitative reverse transcription-PCR (qRT-PCR) positivity. Two additional samples were first isolated in embryonated chicken eggs or MDCK cells for virus isolation (noted in Supplementary Table S1). Viral RNA (vRNA) was extracted from clinical swab samples and virus isolation using the QIAamp Viral RNA Mini kit. Purified vRNA or tNA was then either directly used for influenza multi-segment RT-PCR or stored at −80°C.
Generation of influenza viral sequences
Virus genomes from samples were generated as cDNAs using a whole genome multi-segment RT-PCR modified from the Zhou et al. CDC protocol, described previously and repeated here in full [Reference Wasik12, Reference Zhou98]. A common set of primers (5′ to 3′, uni12a, GTTACGCGCCAGCAAAAGCAGG; uni12b, GTTACGCGCCAGCGAAAGCAGG; uni13, GTTACGCGCCAGTAGAAACAAGG) that recognize the terminal sequences of the influenza A segments were used in a single reaction with SuperScript III OneStep RT-PCR with Platinum Taq DNA polymerase (Invitrogen). Following confirmation by gel electrophoresis, viral cDNA was purified either by standard PCR reaction desalting columns or with a 0.45× volume of AMPure XP beads (Beckman Coulter). Libraries were generated either by Nextera XT with 1 ng of cDNA material or Nextera FLEX with 150–200 ng (Invitrogen). Libraries were multiplexed, pooled, and sequenced using Illumina MiSeq 2 × 250 sequencing. As most samples were isolated from direct nasal swabs and may contain intra-host viral diversity at sequencing depth, raw reads were deposited in the Sequence Read Archive (SRA) at NCBI.
Consensus sequence editing was performed using Geneious Prime. Paired reads were trimmed using BBduk script (https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbduk-guide/) and merged. Each sequence was assembled by mapping to a reference sequence of a previously annotated H3N2 isolate (A/canine/Illinois/41915/2015(H3N2)). Consensus positions had read depth > 300 and > 75% identity.
Diagnostic testing data and R estimation
Diagnostic qRT-PCR tests positive for H3N2 CIV (n = 993) were obtained, spanning from 2021 July 27 to 2024 January 12, identifying cases in 17 US states by zip code. R estimation from daily case data followed the standard methodology described in [Reference Cori52]. The generation time was assumed to follow a discretized version of a gamma distribution with a mean of 3.5 days and a variance of 2 days. The temporal window over which R was assumed constant was 7 days.
Phylogenetic analysis
H3N2 CIV nucleotide sequences were downloaded from the NCBI Influenza Virus Database and were compiled with generated consensus genomes for this study and organized in Geneious Prime. We examined the larger database collection of H3N2 present in dog hosts for both inter- and intra-subtype reassortants using RDP4 (seven methods: RDP, GENECONV, Bootscan, MaxChi, Chimaera, SiScan, 3seq) and excluded those found to be a statistically significant outlier in two or more methods [Reference Martin99]. The dataset with full genome coverage (n = 297) is provided in Supplementary Table S1. Sequences were manually trimmed to their major open reading frames (PB2: 2280 nt, PB1: 2274 nt, PA: 2151 nt, HA: 1701 nt, NP: 1497 nt, NA: 1410 nt, M1: 759 nt, and NS1: 654-693 nt) and either analyzed separately or concatenated with all other genome segments from the same virus sample. Nucleotide sequences were aligned by MUSCLE [Reference Edgar100] in the Geneious Prime platform. Maximum likelihood (ML) phylogenetic analysis was performed by either PhyML or IQ Tree [Reference Guindon101, Reference Trifinopoulos102], employing a general time-reversable (GTR) substitution model, gamma-distributed (Γ4) rate variation among sites, and bootstrap resampling (1000 replications). ML trees were visualized and annotated using FigTree v1.4.4 (tree.bio.ed.ac.uk/software/figtree/). An additional reassortment analysis for all 297 aligned H3N2 CIV sequences was performed using Treesort [Reference Markin103]. Reassortment events were inferred on each branch of the phylogeny using HA as the fixed reference tree.
The temporal signal was assessed by a regression root-to-tip genetic distance against the date of sampling using our ML tree and the TempEst v.1.5.3 software [Reference Rambaut104]. Accurate collection dating to the day (YYYY-MM-DD) was utilized for all samples where this information was available. Sampling dates used for all isolates are listed in Supplementary Table S1.
Bayesian phylodynamic analysis
We used a Bayesian coalescent approach to better estimate our phylogenetic relationships, divergence times, and population dynamics. Analyses were performed in BEAST v.1.10.4 [Reference Suchard105], where Markov chain Monte Carlo (MCMC) sampling was performed with a strict clock, a GTR substitution rate with gamma distribution in four categories (Γ4), set for temporal normalization by sample collection date (YYYY-MM-DD to best accuracy), and assuming a Bayesian Skyline Plot (BSP) prior demographic model. Analyses were performed for a minimum of 100 M events, with replicates combined in Log Combiner v1.10.4. Outputs were examined for statistical convergence in Tracer v1.7.2 (effective sample size [ESS] ≥200, consistent traces, removed burn-in at 10%–15%) [Reference Rambaut106].
For the discrete trait diffusion model [Reference Lemey107] and the structured coalescent model [Reference Vaughan108], we used a subset of diverged lineage genomes (n = 169) starting with the last Korean isolate (SouthKorea/20170110-1F1/2016) in Clade 2 and related lineages in Clades 4, 5, and 6. This dataset was annotated for two geographic locations (Asia or North America) based on sample collection. These annotations for the geographic region were used as discrete traits for discrete trait diffusion modelling and as demes for structured coalescent modelling. We first performed a discrete trait diffusion model for the sequences using BEAST v1.10.4. We used an HKY nucleotide substitution model with gamma-distributed rate variation among sites, a lognormal relaxed clock model, and a GMRF skyride model [Reference Drummond109–Reference Shapiro, Rambaut and Drummond111]. We ran six independent MCMC chains of 50 million steps, logging every 5000 steps. Outputs were examined for statistical convergence in Tracer v1.7.2 (ESS ≥200, consistent traces, removed burn-in at 10%). The three runs with the highest ESS values were selected and combined using LogCombiner v1.10.4. The last 500 posterior-sampled trees from the combined runs were used as an empirical tree set to perform discrete trait diffusion analysis. We performed an ancestral state reconstruction to determine the discrete trait states across all branches of the phylogeny. We used the BSSVS method to estimate the most parsimonious asymmetric rate matrix between discrete traits [Reference Lemey107]. We used a Poisson mean prior to 1 non-zero rate. The Bayes factor support for the rates was calculated using SpreaD3 v0.9.7.1 [Reference Bielejec112]. The maximum clade credibility (MCC) tree was summarized with the program TreeAnnotator v1.10.4 using the combined tree files for the runs with the highest ESS value using a posterior sample of 10000 trees. The structured coalescent analysis was implemented using the MultiTypeTree v 8.1.0 package available in BEAST v2.7.6 [Reference Bouckaert113]. These models estimate an effective population size (Ne) in each deme, and asymmetric migration rates between demes [Reference Vaughan108]. We calculate the asymmetric migration rates as ‘backwards-in-time’ rates, which parameterize the rate of movement of ancestral tree lineages between demes backwards in time. This represents the probabilities per unit time of individual members of a particular deme having just transitioned from some other deme. We also calculate the ‘forwards-in-time’ rates, which parameterize the (constant) probability per unit time that an individual in some deme will transition to a new deme. The relationship between the backwards and forwards migration rates is given by:

where B ij is the backwards-in-time rate from deme i to deme j, F ij is the forwards-in-time rate from deme i to deme j, and where N i and N j are the effective population sizes of these two demes [Reference De Maio53]. We used an asymmetric model of migration with a lognormal prior distribution for the rate with a mean of 1 and upper limit of 20 as well as a uniform prior distribution for the Log Population size tau (Effective population size) between 0.001 and 10000. We ran six independent MCMC chains of 100 million steps, logging every 10000 steps. Run convergence was assessed using Tracer 1.7.2 (ESS ≥200, consistent traces, removed burn-in 10%). The three runs with the highest ESS values across parameters were combined using LogCombiner v1.10.4. The MCC tree was summarized for the tree files of the three runs with the highest ESS using the program TreeAnnotator v1.10.4 for a posterior sample of 10000 trees. Visualization for MCC trees was achieved using the BALTIC python package (https://github.com/evogytis/baltic). Branches of the phylogenies were coloured based on annotations for the most probable ancestral state or deme for the MCC tree of the discrete trait diffusion model analysis and structured coalescent analysis respectively.
Analysis of selection pressures
The relative numbers of synonymous (d S) and nonsynonymous (d N) nucleotide substitutions per site in each ORF of each segment were analysed for the signature of positive selection (i.e., adaptive evolution, p < 0.05 or > 0.95 posterior) using SLAC (Single-Likelihood Ancestor Counting) within the Datamonkey package (datamonkey.org/; [Reference Weaver57]).
Hemagglutinin structural modelling
HA structures and positions of CIV mutations were visualized in Mol* from the RCSB Protein Database (PDB). The structure of A/HongKong/1/1968 (4FNK) was used as an H3 model.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S0950268825000251.
Data availability statement
All generated full genome sequence data have been submitted to NCBI under BioProject PRJNA971216. Raw sequence reads from amplified direct swab samples were deposited in the Sequence Read Archive (SRA). Consensus genome sequences were submitted to Genbank, with accession numbers listed in Supplementary Table S1, in addition to public sequences retrieved and employed. All BioSample and SRA accession numbers of NGS of direct nasal swabs generated in this study are listed in Supplementary Table S2.
Acknowledgements
We wish to thank multiple sources for narratives of observed outbreaks, in addition to the collection and sharing of diagnostic samples. We thank the University of Wisconsin Veterinary Diagnostic Lab (WVDL) for its support in diagnostic testing for H3N2 CIV. We thank the Cornell Animal Health Diagnostic Center (AHDC), and particularly Diego Diel and Lina Covaleda, for support in diagnostic testing for H3N2 CIV and virus isolation. We thank Joseph Flint, Guillaume Claud Reboul, and Kelly Sams of the Goodman Lab for technical support in library preparation and sequencing. Significant lab support was provided by Wendy Weichert. We thank Timothy Vaughan, ETH Zurich, for helpful discussions on structured coalescent models. We thank Eddie Holmes, University of Sydney, for key feedback and insight during the drafting of this article. We thank the anonymous peer reviewers for their significant insights and suggestions to improve the manuscript. CML is currently an employee of Antech Diagnostics, Mars Petcare Science and Diagnostics; however, this organization did not participate in funding this work. This work was partially funded by the Influenza Division of the Centers for Disease Control (CDC), Animal-Human Interface program, under contract #75D30121P12812 to CRP and LBG. The content and commentary in this article are solely the responsibility of the authors and do not necessarily represent the official views of the CDC.
Author contribution
Formal analysis: B.D.D., L.D., M.A.M., B.R.W.; Visualization: B.D.D., L.D., M.A.M., B.R.W.; Writing – review & editing: B.D.D., C.R.P., L.D., L.B.G., L.H.M., M.A.M., B.R.W.; Resources: C.M.L., S.N.; Conceptualization: C.R.P., I.E.V., L.B.G., L.H.M., B.R.W.; Funding acquisition: C.R.P., L.B.G.; Investigation: C.R.P., I.E.V., S.N., B.R.W.; Project administration: C.R.P., L.B.G., B.R.W.; Supervision: C.R.P., L.B.G., L.H.M.; Writing – original draft: C.R.P., B.R.W.; Data curation: I.E.V., L.D., M.A.M., B.R.W.; Methodology: I.E.V., L.D., B.R.W.; Validation: L.D., B.R.W.