INTRODUCTION
Cryptosporidium is a protozoan parasite that causes the diarrhoeal disease, cryptosporidiosis. Although many species have been identified, 96% of human disease in England is caused by the two species C. parvum and C. hominis [Reference Chalmers1]. First recognized as a human pathogen in 1976, Cryptosporidium has been screened for routinely in faecal specimens from patients with diarrhoea in the UK and some other countries since the 1990s with UK guidance recommending routine testing [Reference Nichols, Hunter, Wait and Ronchi2–Reference Chalmers4] and reporting of confirmed infections to the local and national public health surveillance systems. Between 1983, when surveillance of cryptosporidiosis in England and Wales began, and 2005 the Health Protection Agency's Centre for Infection received notification of 151 outbreaks involving 9893 cases, giving a mean outbreak size of 66 [Reference Chalmers1]. Identified outbreaks comprised about 10% of reported cases [Reference Nichols5]. Three quarters (113/151) of identified outbreaks were related to water. Of these, 62 were related to drinking water supplies, 44 to swimming pool use, and seven to other recreational water exposures. Risk assessment and targeted risk reduction measures, such as drinking water filtration, have been associated with a reduction in public water supply-associated outbreaks and in the mean size of identified outbreaks [Reference Nichols5]. Most incident cases of cryptosporidiosis are not diagnosed as being due to Cryptosporidium [Reference Chalmers1, Reference Smith6, Reference Tam7]. The apparent size of outbreaks, estimated from laboratory-confirmed cases, will therefore usually substantially underestimate the true number of cases involved in the outbreak. 
Recreational water contact, such as at swimming pools; animal contact, such as at petting farms; and person-to-person spread, such as among daycare nursery children and staff, form an increasing proportion of identified outbreaks [Reference Nichols5].
Early detection of outbreaks has the potential to significantly reduce their scale [Reference Chalmers1]. Recognition of outbreaks at any stage of their evolution may support identification of preventable risk factors for future outbreaks or for sporadic cases. The proportion of cryptosporidiosis outbreaks that are detected is uncertain [Reference Chalmers1]. Although temporal clustering alone has detected large, typically waterborne, outbreaks [Reference Proctor, Blair and Davis8], it is unlikely to provide an efficient outbreak detection signal for small outbreaks against a background of endemic disease. Furthermore, it is limited by the marked temporal clustering of apparently sporadic cases into spring and autumn peaks [Reference Chalmers1]. Although very large and widespread outbreaks of cryptosporidiosis can still occur, detected outbreaks are often spatially restricted, in line with the geographical patterning of their sources in water supply zones and in recreational water or childcare facilities with localized catchment areas [Reference Nichols5]. The combination of spatial and temporal clustering may be able to identify small localized outbreaks when temporal analysis alone cannot. Enhanced surveillance involving systematic recording and analysis of exposure information from individual cases may also identify small outbreaks with a shared exposure that do not create a detectable signal over sporadic background cases [Reference Chalmers9].
Since spatiotemporal analysis and systematic risk factor surveillance require resources, their application for public health purposes must rest on an evidence base for effectiveness. The conditions needed to justify such enhanced surveillance could be summarized as: first, that there are enough additional missed, but detectable, outbreaks to constitute a significant public health burden; second, that these methods can identify additional outbreaks efficiently; and third, that investigation of these types of outbreaks provides information to inform effective primary preventative and outbreak control measures. We hypothesized that apparently sporadic cases may include undetected small local outbreaks, and that many of these outbreaks would produce spatiotemporal clustering signals or identifiable shared risk factors that would allow their detection if this information were collected and analysed. To test this hypothesis we applied risk factor surveillance prospectively for a period of 2 years, and used spatiotemporal analysis to identify clusters retrospectively on the same dataset.
MATERIALS AND METHODS
Study population
The English counties of Berkshire, Buckinghamshire and Oxfordshire, with a combined population of 2 226 600 (2009 estimate), formed a single administrative unit for public health reporting purposes, with reports made to the Thames Valley Health Protection Unit (TVHPU). Reported cases of laboratory-confirmed cryptosporidiosis during the calendar years 2009–2010 were included. One hospital laboratory routinely tested faecal samples from patients with diarrhoea aged <12 years and the others followed national guidance in testing all samples. Positive results were reported at least weekly by each laboratory.
Case and outbreak definitions
Cases were defined as laboratory-confirmed Cryptosporidium infection. Local laboratories used microscopy for detection and reported positive results to TVHPU through an automated electronic system in line with routine national UK systems for reporting of infectious diseases. An outbreak was defined as either ⩾2 cases linked to a common source, or ⩾2 cases in a statistically significant spatiotemporal cluster which had reported an exposure to a shared risk factor but where no outbreak investigation had been undertaken at the time. Where the shared exposure was either a shared household or shared international travel this was not counted as an outbreak. Secondary cases, defined as those occurring in a household sufficiently later than the primary case (3 days) to have possibly been caused by household transmission, were excluded from outbreaks when assessing size.
Enhanced surveillance
A risk factor questionnaire (see Supplementary online material) was developed and administered to cases with laboratory-confirmed Cryptosporidium infection. The questionnaire included questions on travel, water supply, water consumption, animal contacts, and recreational water exposure. The standard method of administration was by telephone interview by a health protection practitioner or public health trainee at TVHPU. When it was not possible to make contact by telephone because no number was available, or no reply obtained on available numbers after at least three attempts, or when the respondent was not willing to undertake the questionnaire by phone, either postal questionnaire or face-to-face interview by a local government environmental health officer was attempted. Postal or face-to-face questionnaires used the same instrument. The choice of alternative methods did not follow a standard protocol, varying between the different local government authorities in the study area.
Questionnaire responses were entered onto an Excel spreadsheet by administrative staff. Risk factors were reviewed by clinicians by checking each new questionnaire against recent cases for shared exposures. Where ⩾2 cases were identified with shared risk factors these were assessed by the unit outbreak assessment protocol.
Spatiotemporal modelling using SaTScan
All cases of cryptosporidiosis in the study population between 1 January 2009 and 31 December 2010 were analysed retrospectively using the scan statistic in the software SaTScan v. 9.1.1 [Reference Kulldorff10] to seek spatiotemporal clusters. This retrospective analysis simulated monthly prospective investigation of all data that would have been available on the last working day of each month (see Fig. 1). The SaTScan scan statistic evaluates space–time clusters by gradually scanning a cylindrical window, where the base represents space and the height represents time, across recently notified cases of Cryptosporidium. For each window a likelihood ratio statistic is computed based on the number of observed and expected cases within and outside the window. The window with the highest likelihood ratio is the most likely cluster and is assigned a P value through 999 Monte Carlo simulations [Reference Kulldorff11]. The longest duration of a cluster was set to 90 days, apart from in the first 6 months of the study when it was set to 50% of the study period. The maximum spatial cluster size was set to 10% of the population at risk. These choices were pragmatic: a 90-day maximum window is short enough to support the detection of clustering in time during the 2 years of the study, while also allowing for relatively prolonged outbreaks, such as may occur if a swimming pool and its swimmers become contaminated, leading to a prolonged common-source outbreak. Results are presented for the cylindrical model assuming a Poisson distribution of cases. Other models are considered in the Discussion section.
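The likelihood ratio and Monte Carlo ranking described above can be sketched as follows. This is a minimal illustration of the Poisson scan statistic for a single candidate cylinder, not SaTScan's own implementation. The function names and the example numbers are hypothetical, and expected counts are assumed to be scaled so that they sum to the total number of cases.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for one candidate window under the Poisson model.

    cases_in    -- observed cases inside the cylinder
    expected_in -- expected cases inside the cylinder (scaled to total_cases)
    total_cases -- all cases in the study region and period
    """
    c, e, C = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0  # only windows with an excess of cases are of interest
    llr = c * math.log(c / e)
    if C > c:  # the outside term vanishes when every case is inside the window
        llr += (C - c) * math.log((C - c) / (C - e))
    return llr

def monte_carlo_p(observed_max, simulated_maxima):
    """Rank-based P value: (1 + replicates >= observed) / (1 + replicates)."""
    exceed = sum(1 for s in simulated_maxima if s >= observed_max)
    return (1 + exceed) / (1 + len(simulated_maxima))

# Hypothetical example: 10 cases observed where 2 were expected, out of 100 in total.
example_llr = poisson_llr(10, 2.0, 100)
# With 999 replicates, the smallest attainable P value is 1/1000.
example_p = monte_carlo_p(example_llr, [0.5] * 999)
```

In a full analysis the statistic would be maximized over all cylinders within the spatial and temporal limits, and each of the 999 simulated datasets would contribute its own maximum to `simulated_maxima`.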
Validating identified clusters and comparing the results of each approach
Risk factor data for cases belonging to the clusters identified by SaTScan were examined to verify whether the cases had a common exposure. The earliest time at which the clusters could be identified by SaTScan (given the simulated ‘last day of the month’ run of the model) was compared with the time at which known clusters had been identified prospectively by TVHPU using the enhanced surveillance questionnaire.
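The timing comparison can be illustrated with a short sketch. It assumes that the last working day is the last Monday to Friday date in the month and ignores public holidays, which is a simplification not stated in the text; all function names are hypothetical.

```python
import calendar
from datetime import date, timedelta

def last_working_day(year, month):
    """Last Monday-Friday date in the month (public holidays ignored: an assumption)."""
    d = date(year, month, calendar.monthrange(year, month)[1])
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d -= timedelta(days=1)
    return d

def run_dates(first_year, last_year):
    """Simulated monthly run dates across the study period, in date order."""
    return [last_working_day(y, m)
            for y in range(first_year, last_year + 1)
            for m in range(1, 13)]

def first_run_on_or_after(day, runs):
    """Earliest simulated run whose data could include a case notified on `day`."""
    return next((r for r in runs if r >= day), None)

def questionnaire_lead_days(questionnaire_date, scan_run_date):
    """Positive when the questionnaire detected the outbreak before the scan run."""
    return (scan_run_date - questionnaire_date).days

runs = run_dates(2009, 2010)
```

A positive value from `questionnaire_lead_days` corresponds to earlier detection by enhanced surveillance, and a negative value to earlier detection by the simulated monthly run.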
RESULTS
Prospective enhanced surveillance
Four hundred and six cases of laboratory-confirmed cryptosporidiosis were reported, 216 in 2009 (53·2%) and 190 in 2010 (46·8%). The weekly numerical peaks of cases were seen in week 48 in 2009 (10 cases) and week 46 in 2010 (13 cases). Enhanced surveillance questionnaires were available for 366 (90%) of these cases. Five outbreaks were identified through the combination of enhanced surveillance questionnaires and routine public health activities [outbreak (OB)1–OB5, Table 1]. The outbreaks ranged in size from 3 to 14 (median 3·5) laboratory-confirmed cases within the study population. Overall they comprised 36 cases (Fig. 1).
SS, SaTScan; OB, outbreak.
* Date of notification of the second linked case.
† The number of SaTScan clusters that included ⩾2 cases from this outbreak. These numbers add up to more than the total number of SaTScan clusters because some clusters contained cases from more than one outbreak.
‡ Nineteen primary laboratory-confirmed cases were identified in OB5 and six in OB1 but five and two, respectively, were outside the study area.
§ There were five pairs of linked cases that either lived in the same house or had travelled to the same destination simultaneously.
Spatiotemporal cluster detection
SaTScan (SS) analysis identified 29 clusters comprising 146 cases. All five of the outbreaks recognized by prospective enhanced surveillance were detected within these clusters. Following review of enhanced surveillance questionnaire data for the cases in each of the 29 clusters detected, three additional outbreaks were identified where clustered isolates shared exposure to a risk factor (SS1–SS3, n = 2–3, Table 1). In line with expectations that an outbreak would generate more than one cluster given that clusters were sought monthly looking back over the preceding 90 days, cases from these eight outbreaks contributed to 19 of the 29 clusters identified by SaTScan, while no outbreak could be detected within the other ten clusters. Two of the 19 clusters associated with outbreaks contained cases from two outbreaks. Eight clusters contained cases from an outbreak as well as others linked by sharing household or travel-related exposures (i.e. non-outbreak). Of the ten clusters not associated with an outbreak, seven did not include any cases with identifiable shared risk factor exposures, shared household or shared travel links. The remaining three clusters contained cases with shared household or travel-related exposures but not cases with links that met our outbreak definition. Overall, 53 (36%) of the 146 individual cases identified in one or more clusters were epidemiologically linked, reducing to 43 (29%) when excluding household and travel-related clusters.
Characteristics of outbreaks detected and detection method performance
The eight identified outbreaks involved between two and 14 laboratory-confirmed cases within the study area (Table 1). Six outbreaks were linked to swimming pools or leisure centres with swimming pools, and two to petting farm or open farm exposures. OB2 and OB5 were each identified as outbreaks after three cases had reported exposure to a common setting. OB1 was notified as an outbreak prior to the onset date of the second laboratory-confirmed case due to an additional confirmed case notified outside of the study area. The notification of local clinicians and, where appropriate, provision of information to groups sharing the same exposure in response to outbreaks detected during 2009–2010 may have led to additional case detection for some of these promptly identified outbreaks.
Five outbreaks (OB1–OB5) were detected both by enhanced surveillance questionnaires at the time of receipt and by SaTScan models. Two of these were detected earlier by SaTScan when it was run monthly on the last working day of the month (1 day and 7 days earlier) and three by prospective enhanced surveillance questionnaires (9, 34, and 63 days earlier). In the outbreak detected 63 days earlier by enhanced surveillance questionnaires, early cases reported a shared swimming pool exposure but did not live close to each other and so were not identified by SaTScan as a spatiotemporal cluster. The three additional outbreaks identified retrospectively following review of SaTScan clusters (SS1–SS3) were small outbreaks of two, three, and two cases. Their shared exposures, which had been missed during prospective review, were identified when the enhanced surveillance questionnaires were re-examined cluster by cluster.
DISCUSSION
The systematic application of surveillance questionnaires identified five outbreaks of cryptosporidiosis. The addition of retrospective spatiotemporal cluster detection using SaTScan identified these and a further three outbreaks. This detection of eight outbreaks identified over 2 years contrasts with preceding national data of 151 outbreaks reported to national surveillance over a 23-year period [Reference Chalmers1], on which basis our study population, assuming a similar risk to elsewhere in the country, would be expected to have one detected outbreak every 4 years. The outbreaks observed were mostly too small to have been detected in the absence of either systematic collation of risk factor data or formal assessment for spatial as well as temporal clustering. The study area does not have particular or unique features and the identified outbreaks were geographically widespread within the study area. This suggests that similar undetected outbreaks may exist in apparently sporadic cases of cryptosporidiosis in other areas and at other times, which is supported by the relatively common reporting of swimming pool exposure among sporadic cases in other work [Reference Chalmers9]. Although individually small in terms of confirmed cases, the frequency of these outbreaks means that any processes contributing to them may contribute substantially to overall disease burden.
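The comparison with national data can be reproduced as a rough calculation. The sketch below takes the outbreak counts and the study population from the text, but the population of England and Wales (about 54 million) is an assumption that is not given in the paper. It also assumes that the risk of a detected outbreak is uniform across the country.

```python
# Figures from the text: 151 outbreaks over 23 years nationally,
# and a study population of 2 226 600.
NATIONAL_OUTBREAKS = 151
NATIONAL_YEARS = 23
STUDY_POPULATION = 2_226_600
# Assumption (not from the paper): approximate population of England and Wales.
NATIONAL_POPULATION = 54_000_000

def expected_local_rate(outbreaks, years, national_pop, local_pop):
    """Expected detected outbreaks per year locally, assuming uniform national risk."""
    return (outbreaks / years) * (local_pop / national_pop)

expected = expected_local_rate(
    NATIONAL_OUTBREAKS, NATIONAL_YEARS, NATIONAL_POPULATION, STUDY_POPULATION
)
observed = 8 / 2  # eight outbreaks detected over the 2-year study

print(f"expected: one outbreak every {1 / expected:.1f} years")
print(f"observed/expected rate ratio: {observed / expected:.1f}")
```

Under these assumptions the expected interval is a little under 4 years, and the observed rate of four outbreaks per year is more than ten times the expected rate.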
This work also shows that SaTScan is a sensitive and specific approach for detecting even very small spatiotemporal clusters created by common-source cryptosporidiosis outbreaks. Our application of SaTScan identified 29 clusters, 22 of which mapped to these eight small outbreaks, or onto cases which shared a household or had travelled together. Repeated identification of outbreaks in subsequent clusters is not a limitation of the approach but an advantage since it is a desirable feature of an outbreak detection system that it consistently identifies epidemiological signals present in the data [Reference Jiang and Cooper12]. The remaining seven clusters may have been false-positive signals; however, it is also possible that some of these seven clusters shared a common source not identified in our questionnaire data. Overall, we consider the performance of the approach excellent in detecting small cryptosporidiosis outbreaks. Tight spatiotemporal clustering may occur due to limited catchment areas of facilities such as swimming pools, and transient contamination and transmission in these settings. This may mean that the SaTScan methods are particularly well matched to our application and to this disease. Nonetheless, the findings of this study identifying a set of missed ‘mini-outbreaks’ should encourage consideration of evaluated implementation of these enhanced surveillance approaches to other pathogens which might share similar epidemiological factors.
SaTScan [Reference Kulldorff10] applies a scan statistic, the most usual approach in this area [Reference Kulldorff13–Reference Takahashi21] although Bayesian approaches have also been developed [Reference Jiang and Cooper12]. Recent software development in this area has focused on improving performance to detect large scale disease outbreaks and on work with syndromic data. The better performance of newer models in these scenarios contrasts with a recurrent finding of equivalent or superior performance of a simple scan statistic as implemented in SaTScan, and in particular the Poisson fixed circular model to detect small geographically restricted outbreaks [Reference Kulldorff14, Reference Yao, Tang and Zhan17, Reference Neill20, Reference Takahashi21]. This, and its relative ease of use, support the application of SaTScan in detecting small local outbreaks [Reference Robertson and Nelson22]. SaTScan allows some of the flexibility found in newer approaches [Reference Robertson and Nelson22], such as allowing clusters to be identified in non-circular geographical areas and using population denominators or area as a denominator. The results presented here were for the circular fixed Poisson model which was the most sensitive of the six options tested. In coming to this choice circular and elliptical model options were also applied when assessing clustering of cases and this was done using individual case postcode for location, Office for National Statistics Lower Super Output Area centroid for location, and finally Lower Super Output Area both for location and to define the underlying population size to support a Poisson model of cases per member of the underlying population. An elliptical Poisson model had slightly higher specificity in this dataset but detected some clusters later than the circular model. 
The agreement of our experience with previous literature on the Poisson fixed circular model would direct future work to consider different variations in the spatial and temporal characteristics of the scanning window rather than further exploration of different models. We used a maximum duration of 90 days and a maximum spatial cluster size of 10% of the study population. Varying these values using the Poisson circular model may allow improved performance. The optimal values will depend on the nature of the outbreaks that are occurring, in terms of outbreak duration and extent of geographical clustering. In outbreaks of Cryptosporidium, this may vary between urban and rural areas where the catchment populations for particular leisure facilities may differ. We also adopted a pragmatic approach of seeking clusters monthly. This may also be further optimized in future research. In the absence of such further evidence to guide changes to temporal and spatial cluster windows, we consider that settings allowing clusters to last for up to 90 days and to extend across a population of ∼220 000 are likely to be efficient for the detection of small and local outbreaks, such as would be expected when transmission is among children at a swimming pool or at other facilities with a local catchment area.
The systematically recorded and analysed questionnaire was also effective in identifying outbreaks. Five outbreaks had been detected before SaTScan was applied, far in excess of what would be expected based on national data [Reference Chalmers1]. Had the questionnaire data been reviewed correctly prospectively, all eight confirmed and suspected outbreaks could have been identified as evidenced by the shared exposures identified on review of the questionnaire data for cases in SaTScan clusters. Human error and differing names for the same facility can contribute to shared exposures being missed. Systematic coding and searching of exposures for all cases with exposure questionnaires could improve performance. Without this, variations in how exposures are named or clerical error in data entry may result in missed outbreaks. We are now developing more structured approaches to recording and identifying shared exposures, including the recording and automated searching of postcodes for venues.
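Systematic coding and searching of exposures could take a form like the sketch below. It is a hypothetical illustration and not the unit's actual procedure: venue names are normalized so that spelling variants match, and any venue named by two or more cases within a chosen window is flagged. The 90-day window, the data layout and the function names are all assumptions. Matching on venue postcodes, as mentioned above, would be a more robust key than free-text names.

```python
import re
from collections import defaultdict
from datetime import date

def normalise_venue(name):
    """Lower-case, strip punctuation and collapse whitespace so variants match."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())

def shared_exposures(cases, window_days=90):
    """Return {venue: [case ids]} for venues named by >=2 cases within window_days.

    cases -- iterable of (case_id, onset_date, list_of_venue_names)
    """
    by_venue = defaultdict(list)
    for case_id, onset, venues in cases:
        # A set avoids counting one case twice for two spellings of the same venue.
        for venue in {normalise_venue(v) for v in venues}:
            by_venue[venue].append((onset, case_id))
    flagged = {}
    for venue, hits in by_venue.items():
        hits.sort()  # order by onset date
        linked = set()
        # Consecutive cases in date order that fall within the window are linked.
        for (d1, id1), (d2, id2) in zip(hits, hits[1:]):
            if (d2 - d1).days <= window_days:
                linked.update((id1, id2))
        if linked:
            flagged[venue] = sorted(linked)
    return flagged

# Hypothetical cases: A and B name the same pool with different spellings.
example_cases = [
    ("A", date(2009, 6, 1), ["Riverside Pool"]),
    ("B", date(2009, 6, 20), ["riverside  pool."]),
    ("C", date(2010, 3, 1), ["Hill Farm"]),
]
example_flags = shared_exposures(example_cases)
```

Flagged venues would still need review against the outbreak definition, for example to exclude links explained by a shared household or shared travel.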
In this dataset, 146 cases (36%) belonged to the 29 clusters identified by SaTScan. An approach of restricting enhanced surveillance questionnaires to cases identified as belonging to a SaTScan-detected cluster would thus have reduced the number of cases followed up with questionnaires by 64%. It would also allow the use of bespoke questionnaires for later cases in some clusters where initial cases appear to suggest a possible source of infection for the cluster. However, this would need to be balanced against the questionnaires being delayed until after a cluster was identified (Fig. 1), with the consequent possible reduction from the 90% response rate achieved by prompt telephone questionnaire and follow up, and a possible increase in recall bias due to the greater delay between exposure and response. Applying questionnaires to cases identified as part of a cluster by SaTScan, but not to other cases, also restricts outbreak detection to cases that are clustered in space and time. The substantially earlier identification of two outbreaks (34 and 63 days earlier) by enhanced surveillance questionnaires was due either to linked exposures of cases in the study population which were not geographically clustered, or to one laboratory-confirmed case in the area with a completed questionnaire being considered part of an outbreak in the light of information on other cases outside the study area. We are currently evaluating an approach of using enhanced surveillance questionnaires after cluster detection in other populations where surveillance questionnaires are not used routinely, to assess the utility and feasibility of this approach in practice.
A further widely applied approach to detecting hidden outbreaks, particularly where there is no epidemiological evidence to suggest that cases are linked, is microbial subtyping. This has been particularly effective for bacterial infections, and the advent of genome sequence data is supporting increased application of subtyping, including to organisms where phenotyping has not proved useful for outbreak detection [Reference Cody23]. Most Cryptosporidium-positive samples in England are discarded and few are referred to the national reference laboratory. The lack of routine referral of samples for typing limits the application of this approach to outbreak detection, as do technical limitations to the typing of parasites compared with bacteria and viruses. In England and Wales, the usual level of typing done for apparently sporadic cases is to confirm the species [Reference Chalmers9]. Nonetheless, with the advent of genome sequencing, and upstream Cryptosporidium DNA preparation techniques to enable application to routinely submitted clinical samples, this may be a further tool to identify outbreaks when validated.
Our application of two approaches to Cryptosporidium surveillance was associated with a substantially greater than expected level of outbreak detection in the study population and showed that recreational water facilities may be particularly important sources of small outbreaks. The additional resource needed to apply these systematic approaches to identifying outbreak signals within the surveillance data was small in the context of a pre-existing UK policy of testing for, and reporting of, this pathogen in cases of gastroenteritis. Our findings fit with the literature on outbreak size decreasing and these types of exposures becoming relatively more important following the application of control measures to public drinking water systems [Reference Nichols5]. However, the finding is novel in that the number of outbreaks is an order of magnitude higher than expected and their size an order of magnitude lower [Reference Chalmers1]. Given the small numbers of cases per outbreak with a median of just 3·5 one could question whether their detection has any public health utility. Given that most remain small their detection will not allow outbreak control interventions to decrease future cases, except in the unusual cases where either the small outbreak represented the start of what would have been a larger outbreak, or where it leads to investigation of underlying remediable risk factors at a facility which pose a high risk for further transmission if not remedied. However, identification and quantification of shared risk factors across these outbreaks, which together comprise 5–10% of the overall disease burden in our population, may offer a basis for proactive intervention across the settings producing these cases. In addition, many apparently sporadic cases may be caused by the same risk factors that are driving these small detected outbreaks. 
Since most incident cryptosporidiosis cases are not diagnosed or reported [Reference Chalmers1] many apparently sporadic cases may be part of small outbreaks where the other cases went undiagnosed. This would make risk factors identified relevant to some apparently sporadic disease as well as the 5–10% of cases that we have been able to link within outbreaks. If our findings are replicated in a larger population, the identification of these outbreaks could identify swimming pools and other recreational facilities (such as animal petting attractions) associated with outbreaks in sufficient numbers to allow formal comparison with similar facilities lacking such outbreaks through analytical epidemiological studies with, for example, swimming pools as the unit of analysis. The study of fixed and transient risk factors associated with such facilities being involved in an outbreak may identify remediable factors contributing to both outbreaks and apparently sporadic disease. Evidence of factors associated with transmission could inform guidance to swimming pools to prevent such small outbreaks. Such evidence could also support the control of larger incidents, such as the substantial outbreaks in the Western United States in 2007. A review of this incident recommended that when case numbers go up enhanced control measures should be applied to swimming pools [24]. This recommendation is based on evidence and assumptions that swimming pool exposure makes a major contribution to sustaining Cryptosporidium transmission in these events. Identification of leisure facility features associated with transmission of this infection should contribute to better targeted preventative and outbreak control interventions.
SUPPLEMENTARY MATERIAL
For supplementary material accompanying this paper visit http://dx.doi.org/10.1017/S0950268814000673.
ACKNOWLEDGEMENTS
This research received no specific grant from any funding agency, commercial or not-for-profit sectors. A.D.M.B. is funded by the National Institute for Health Research.
DECLARATION OF INTEREST
None.