INTRODUCTION
Hepatitis A is an acute liver disease, usually self-limiting, caused by hepatitis A virus (HAV). Transmission is person to person by the faecal–oral route or through contaminated food, especially raw or undercooked shellfish, and water [Reference Heyman1]. Humans are the only significant reservoir. A reservoir is any person, animal, arthropod, plant, soil, or substance, or a combination of these, in which an infectious agent normally lives and multiplies, on which it depends primarily for survival, and where it reproduces itself in such a manner that it can be transmitted to a susceptible host [Reference Heyman1, Reference Last2]. The incubation period is between 15 and 50 days [Reference Desenclos3]. The risk of symptomatic infection, as well as its severity, is directly related to age. In children aged <6 years HAV infection is usually asymptomatic, causing jaundice in only 10% of cases. The clinical picture varies from a mild form lasting 1–2 weeks to a severe and disabling form of several months' duration. Sudden hepatic failure is rare and usually occurs in the elderly or in persons with chronic liver disease [Reference Heyman1].
Hepatitis A occurs in both sporadic and epidemic form and formerly had a tendency to cyclic recurrences [Reference Heyman1]. The disease is distributed worldwide; however, there are large differences between regions. For practical purposes the world can be divided into areas with very low, intermediate, and high endemicity and it may vary from one region to another within a country [Reference Heyman1]. In areas of low endemicity (Western Europe, North America, Australia), hepatitis A usually appears as sporadic cases in high-risk groups or as outbreaks affecting a small number of people [Reference Bell, Anderson, Feinstone, Mandell, Douglas and Bennet4].
Seroepidemiological studies carried out in Spain show an increase in susceptibility to hepatitis A in the population born after 1966; nevertheless, Spain is classified as a low endemicity region [Reference Besag, York and Molliè5]. In Spain hepatitis A is a compulsorily notifiable disease and individual cases are reported to the National Epidemiological Surveillance Network (Red Nacional de Vigilancia Epidemiologica, RENAVE). Incidence rates vary between regions, with a minimum of 0·18 cases/100 000 inhabitants per year in north-east provinces and a maximum of 18·5 cases/100 000 inhabitants per year in the African regions (Ceuta and Melilla) (see Supplementary material, available online). The time trend shows an unsteady decrease in rates: in 1997 the national rate was 4·54 cases/100 000 inhabitants, and in 2007, the last year of this study, it was 2·27 cases/100 000 inhabitants. The analysis of these spatial and temporal variations in the incidence of hepatitis A in Spain motivated the present study.
The aim of our study is to analyse the space–time pattern of hepatitis A risk at the municipal level in Spain during the period 1997–2007.
MATERIALS AND METHODS
This is a retrospective study that analysed the space–time risk of hepatitis A at global and local levels. At the global level we used two estimates of risk: the standardized incidence ratio (SIR) and the posterior probability that the smoothed relative risk is >1 (PP). At the local level we used the scan statistic method to analyse the space–time clusters and we compared the detected clusters with the outbreaks officially notified during the same period. According to Real Decreto 2210/1995 of 28 December [6] an outbreak is defined as: ‘A significant increase in the proportion of cases in relation to the expected values’.
Study area
According to the National Statistics Institute (INE) Spain had a population of 45 200 737 in 2007, distributed in 8112 municipalities. The population varies from municipalities with six inhabitants to municipalities with 3 132 463.
The average population density was 91·4 inhabitants/km2 varying from municipalities with a density of 781·8 inhabitants/km2 to municipalities with 9·1 inhabitants/km2.
These municipalities are aggregated into 52 provinces which in turn are aggregated into 17 autonomous regions and two autonomous cities. The area of Spain is 504 750 km2. Figure 1 shows the Spanish autonomous regions, the provinces and municipalities.
Data sources
We collected cases of hepatitis A reported to RENAVE for the period 1997–2007. Each notified case includes information on age, sex, municipality of case assignment, reporting week and case classification (suspected, probable, confirmed). We excluded from our analysis cases with incomplete or missing information on age, sex, municipality or notification week. Outbreak information was obtained from the outbreaks reported to RENAVE during the study period. For the total population we used the 2001 population census from INE, stratified by age and sex. Stratification was into 5-year age groups for both men and women. For the purpose of space–time modelling we assigned to each case, as its spatial component, the UTM coordinates, datum 50 (x, y), of the centroid of the municipality. For the time component, the day of onset of symptoms was calculated as the median day of the notification week of each case.
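The assignment of a single day to each notification week can be sketched as follows. This is only an illustration: the source does not state which week convention RENAVE uses, so the sketch assumes ISO 8601 weeks (Monday to Sunday), in which the median day is the fourth day (Thursday). The function name `week_median_day` is hypothetical.

```python
from datetime import date


def week_median_day(year: int, week: int) -> date:
    """Return the median (4th) day of a notification week.

    Assumption: weeks follow the ISO 8601 convention (Monday = day 1),
    so the median of the seven days is Thursday (day 4).
    """
    return date.fromisocalendar(year, week, 4)


# Example: week 42 of 1999 under the ISO convention.
example_day = week_median_day(1999, 42)
```

If the surveillance system used a different week definition (for example, weeks starting on Sunday), only the day offset would need to change.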
Space–time analysis
Global risk estimate
As stated above, we used two estimates of risk at municipal level to analyse the spatial pattern of hepatitis A in Spain: SIR and PP.
The expected number of cases was estimated for each geographical unit (i.e. municipality in our study), by the indirect method using the national rate as reference. Thus we assumed that the municipal unit had the same incidence rate as the standard region, Spain. Rates were standardized by age and sex.
After obtaining the expected number of cases we computed the SIR for each municipality taking the observed number of cases as the numerator and the expected number of cases as the denominator for each unit. SIR measures the relative risk of a municipality with respect to Spain. The municipalities in which the SIR was significant (P<0·05) were taken into account.
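A minimal sketch of the indirect standardization and SIR calculation described above is given here. The stratum keys, rates and counts are invented for illustration. The source does not say which test was used to judge significance, so the one-sided exact Poisson test shown is an assumption, not the authors' stated method.

```python
from math import exp, lgamma, log


def expected_cases(reference_rates, local_population):
    """Indirect standardization: sum over age-sex strata of
    (national rate in stratum) x (municipal population in stratum)."""
    return sum(reference_rates[s] * n for s, n in local_population.items())


def sir(observed, expected):
    """Standardized incidence ratio: observed / expected cases."""
    return observed / expected


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected).

    Assumed significance test (one-sided, exact); not taken from the source.
    """
    if observed <= 0:
        return 1.0
    cdf = sum(exp(-expected + k * log(expected) - lgamma(k + 1))
              for k in range(observed))
    return max(0.0, 1.0 - cdf)


# Hypothetical strata: (sex, age group) -> national rate per person per period.
rates = {("M", "0-4"): 5e-5, ("F", "0-4"): 4e-5,
         ("M", "5-9"): 1e-4, ("F", "5-9"): 8e-5}
# Hypothetical municipal population in the same strata.
population = {("M", "0-4"): 10000, ("F", "0-4"): 10000,
              ("M", "5-9"): 20000, ("F", "5-9"): 25000}

e = expected_cases(rates, population)   # 0.5 + 0.4 + 2.0 + 2.0 expected cases
ratio = sir(10, e)                      # 10 observed cases in this example
p_value = poisson_upper_tail(10, e)
```

An SIR above 1 means the municipality recorded more cases than expected if it had the national age- and sex-specific rates.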
We computed the PP for the smoothed relative risk using the model of Besag et al. (BYM) [Reference Besag, York and Molliè5]. This is a spatial smoothing method based on a Poisson regression model with two random effects, heterogeneity and spatial contiguity. The spatial term uses a conditional autoregressive distribution (CAR). The following formula shows the form of the model:
$$O_i \sim \mathrm{Poisson}(E_i \lambda_i), \qquad \log \lambda_i = \alpha + h_i + b_i$$

where λ_i is the relative risk in area i, O_i is the number of cases in area i, α is a constant, E_i is the expected number of cases, h_i is the municipal heterogeneity term and b_i the spatial term.
To calculate the spatial contiguity effect we built the neighbourhood weights matrix according to the Rook criterion. This means that the municipalities that share a border with municipality i are its neighbours [Reference Moreno and Vayá7]. Bayesian estimation of the model was carried out by Markov chain Monte Carlo (MCMC) simulation. From the simulated values we calculated the PP that the smoothed relative risk is >1. We followed the suggestion of Richardson and considered as statistically significant those PPs >0·8 [Reference Richardson8].
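Once MCMC draws of the smoothed relative risk are available for a municipality, the PP is simply the share of draws above 1. The sketch below illustrates that step only; it does not fit the BYM model, and the function names and sample values are hypothetical.

```python
def posterior_prob_rr_gt_one(rr_samples):
    """Estimate P(relative risk > 1 | data) as the fraction of
    posterior (MCMC) draws of the relative risk that exceed 1."""
    samples = list(rr_samples)
    if not samples:
        raise ValueError("at least one posterior sample is required")
    return sum(1 for rr in samples if rr > 1.0) / len(samples)


def is_significant(pp, threshold=0.8):
    """Apply the PP > 0.8 rule described in the text."""
    return pp > threshold


# Hypothetical posterior draws for one municipality.
draws = [0.9, 1.1, 1.2, 1.5, 2.0, 1.3, 1.4, 1.05, 1.6, 1.7]
pp = posterior_prob_rr_gt_one(draws)
flagged = is_significant(pp)
```

In practice the draws would come from the fitted model after discarding burn-in iterations, and there would be one vector of draws per municipality.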
Local space–time clusters analysis
In the analysis of space–time clusters cases were adjusted for age and sex. The detection of local space–time clusters was carried out by the space–time scan statistic (developed and fully explained by Kulldorff et al. [Reference Kulldorff and Nagarwalla9–Reference Kulldorff11]) assuming a Poisson distribution.
A cylindrical window that continuously changed in centre, radius and height scanned the geographical area for potential clusters. More precisely, the window moved from the centroid of one municipality to the centroids of the other municipalities. For each location the radius varied continuously from zero to a maximum of 50 km, and the height of the cylinder, representing the time dimension, varied from zero to a maximum of 40 days, considering that the incubation period is quite variable (15–50 days). Therefore, for each centroid the cylindrical windows included different sets of neighbouring municipalities and time periods. For each location and size of the scanning window the null hypothesis was that the risk was constant in space and time, and that it was the same for each municipality. The alternative hypothesis was that the risk was higher inside than outside the window. During the scanning process many different cylinder sizes were evaluated in order to find the most likely cluster. Likelihood functions were calculated and maximized. The most likely space–time cluster was the one with the maximum likelihood corresponding to a given location, radius and time-frame. Its P value was obtained through Monte Carlo hypothesis testing (9999 replications), with a 95% confidence interval.
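Under the Poisson assumption, the likelihood function for a given window is proportional to:

$$\left(\frac{c}{E[c]}\right)^{c}\left(\frac{C-c}{C-E[c]}\right)^{C-c} I()$$

where C is the total number of cases, c the number of observed cases within the window and E[c] the covariate-adjusted expected number of cases within the window under the null hypothesis. I( ) is an indicator function that equals 1 when the window has more cases than expected under the null hypothesis and 0 otherwise.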
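The quantities c, E[c] and C defined above are enough to compute the log likelihood ratio of a single window, and the Monte Carlo P value then follows from the rank of the observed maximum among the simulated maxima. The sketch below illustrates these two steps under the Poisson model. It is not SaTScan's implementation, the function names are hypothetical, and it does not perform the scan over cylinders.

```python
from math import log


def poisson_llr(c, expected, total):
    """Log likelihood ratio for one window under the Poisson model.

    c        -- observed cases inside the window
    expected -- expected cases inside the window under the null (E[c])
    total    -- total number of cases in the study (C)
    Returns 0 when the window has no excess of cases (indicator I() = 0).
    """
    if not 0 <= c <= total:
        raise ValueError("c must lie between 0 and total")
    if not 0 < expected < total:
        raise ValueError("expected must lie strictly between 0 and total")
    if c <= expected:
        return 0.0
    inside = c * log(c / expected)
    # When every case falls inside the window the outside term vanishes.
    outside = (total - c) * log((total - c) / (total - expected)) if c < total else 0.0
    return inside + outside


def monte_carlo_p_value(observed_llr, simulated_max_llrs):
    """Rank-based P value: rank of the observed statistic among the
    simulated maxima, divided by (number of replications + 1)."""
    rank = 1 + sum(1 for s in simulated_max_llrs if s >= observed_llr)
    return rank / (len(simulated_max_llrs) + 1)
```

With 9999 replications the smallest attainable P value is 1/10 000, which is one reason a replication count of this form is convenient.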
Finally, the detected space–time clusters were grouped by year of occurrence and compared with the notified outbreaks using the tetrachoric correlation test.
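For readers unfamiliar with the tetrachoric correlation, the sketch below shows a closed-form approximation for a 2×2 table (for example, cluster detected yes/no against outbreak notified yes/no). This cosine formula is a classical approximation and is a swapped-in technique, not the estimator used by the statistical package in the study. The cell layout and function name are assumptions made for illustration.

```python
from math import cos, pi, sqrt


def tetrachoric_approx(a, b, c, d):
    """Cosine approximation to the tetrachoric correlation.

    2x2 table layout (assumed):
                      outbreak yes   outbreak no
        cluster yes        a              b
        cluster no         c              d
    r ~ cos(pi / (1 + sqrt(a*d / (b*c))))
    """
    concordant = a * d
    discordant = b * c
    if concordant == 0 and discordant == 0:
        raise ValueError("correlation is undefined for this table")
    if discordant == 0:
        return 1.0   # no discordant pairs: perfect positive association
    if concordant == 0:
        return -1.0  # no concordant pairs: perfect negative association
    return cos(pi / (1 + sqrt(concordant / discordant)))
```

The approximation is most reliable when the marginal proportions are not extreme; a maximum-likelihood estimate would be preferable for sparse tables.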
Statistical analysis was performed with Stata 8, WinBugs 14 and SaTScan 9.0.1 [12] developed by Kulldorff [Reference Kulldorff10]. The maps were created using ArcGIS 9.3.
RESULTS
Cases and incidence rates
During the study period, 1997–2007, 8144 cases of hepatitis A were reported to RENAVE. We excluded 2394 (29%) cases because information was missing for at least one of the variables analysed. Of the remaining cases (n=5750), 58% were men and 42% women. More than 70% of these cases were confirmed. We observed great variability in the incidence of hepatitis A across the whole of Spain with a SIR that varies from one municipality to another (from 0 to 1149 cases/100 000 inhabitants).
Space–time analysis
Global risk estimate
Figure 2 shows the spatial distribution of the risk of hepatitis A in Spain (SIR). High risk is concentrated in areas of the Mediterranean coast and in the north of the country. Moreover, high risk appears in the south and some municipalities appear scattered in the central area. Figure 3 shows the municipalities with statistically significant SIR.
Figure 4 shows the spatial distribution of PP. This map has a pattern similar to the patterns described in Figures 2 and 3, thus regions with statistically significant PP (>0·8) are concentrated on the Mediterranean coast in the direction of east towards the southern mainland.
Local space–time clusters analysis
The space–time cluster analysis, adjusted for age and sex, showed 44 non-overlapping and statistically significant space–time clusters (Fig. 5, Table 1). Table 1 shows that the most likely cluster includes 138 municipalities located in two provinces on the Mediterranean coast. In this area 146 cases were detected between 15 October 1999 and 23 November 1999 and the expected number of cases was 2·8, which gives a relative risk of 53·530. This cluster coincided with an outbreak due to consumption of contaminated coquina clams imported from Peru [Reference Sanchez13]. Secondary clusters covering a smaller number of municipalities were concentrated mostly in the south; however, four of these were in the north. These clusters appeared at different times throughout the 11 years of the study period. Table 1 also shows the date of cluster detection by the scan method and the duration of each cluster, together with the dates of onset of symptoms of the first and last case of each outbreak reported to the outbreak surveillance system.
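The relative risk of the most likely cluster can be checked approximately by hand. The calculation below assumes that the relative risk is the ratio of the observed-to-expected ratio inside the window to that outside it, and that the total number of cases C is the 5750 cases retained for analysis; neither assumption is stated explicitly in the text.

```latex
\mathrm{RR} = \frac{c / E[c]}{(C - c) / (C - E[c])}
            = \frac{146 / 2.8}{(5750 - 146) / (5750 - 2.8)}
            \approx \frac{52.14}{0.975}
            \approx 53.5
```

The result is close to the reported value; the small difference is plausibly due to the expected count being rounded to one decimal place.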
[Table 1 notes: RR, relative risk; LLR, log likelihood ratio; * no date; ** no outbreak notification.]
Table 2 shows the correlations between clusters and RENAVE outbreaks for the years 1997–2007, except for 2002 and 2003, which were not analysed because no space–time clusters were found in those years. The tetrachoric correlation is highest in 2004, with a value of 0·6410, and lowest in 2001, with a value of −0·2735.
DISCUSSION
This is the first study to analyse the space–time risk of hepatitis A in Spain at the municipal level. There are some spatial studies of hepatitis A but these are limited to a specific province or region [Reference Gutierrez14, Reference Oviedo15]. Similar studies have been conducted in other countries to analyse the space–time clusters of hepatitis A with the same methods [Reference Guis16, Reference Sowmyanarayanan17]. Further, these methods have been used in other countries to analyse the space–time clusters of other infectious diseases such as tuberculosis, influenza, salmonellosis, giardiasis, malaria or West Nile virus infection [Reference Braga18–Reference Perez21]. Thus it is important to discuss some conceptual and methodological issues.
Cases and incidence rates
Some limitations of the case notification source should be noted. First, a case should be attributed to the municipality where the infection took place; however, in some cases the notified municipality was the municipality of residence. Second, the exact date of onset of symptoms was not available; only the week of notification was recorded. Nevertheless, we do not expect the use of the median day of the notification week to introduce a significant bias. Finally, there was the problem of under-reporting of cases, which can differ by region and could produce an underestimation of the risk that would affect its spatial distribution. However, we consider that these discrepancies have not introduced an important bias in the estimated spatial pattern and temporal clusters.
In high endemicity regions the age at first infection is lower, and so symptomatic disease may be less frequent, as the risk of symptomatic disease increases with age. Even though there are regions of high and low prevalence within Spain, Spain as a whole is classified as a low endemicity country.
Different seroprevalence levels, induced by different disease prevalence or by different vaccination programmes, result in different notification rates from region to region. This poses a limitation to our study, as it is not possible to include the seroprevalence for each municipality, or even region, and there are different vaccination strategies in the Spanish regions.
According to the seroprevalence survey of 1996, in the >40 years age group 70% living in cities and 84·3% living in rural areas had suffered from the disease [22]. Unfortunately apart from the national seroepidemiological survey in 1996, we do not have other information about the seroepidemiology of hepatitis A in Spain. Small studies exist but the populations studied vary considerably (a town or a whole region) and are from different times [Reference Dal Re, Garcia-Corbeira and Garcia-de-Lomas23–Reference Soriano28].
With respect to the vaccination schedules, only three Autonomous Regions included HAV vaccination universally: Catalonia started in 1998 with a pilot programme against HAV+HBV aimed at 11-year-old children. In 2000 Ceuta and Melilla started vaccination of children aged between 1 and 2 years against HAV and vaccination of 13-year-olds against HAV and HBV. However, all regions vaccinated the population in the risk groups.
The main limitation of this study is that the notification rate at national level varies widely between regions. Although all of these notifications refer to hepatitis A, cases that did not include the variables used in this study (municipality, age, sex) were excluded from the analysis.
As notification of hepatitis A at the national level in Spain is annual, a real-time analysis is not feasible. However, the notification protocols are currently under review and it is proposed to have weekly notification for hepatitis A.
Global risk estimate
The results from the SIR, PP and space–time cluster analyses suggest a similar space–time pattern in the distribution of hepatitis A risk in Spain. Furthermore, these outcomes are consistent with the outbreaks reported to RENAVE for the study period.
The fact that risk is located around the same areas may suggest a possible environmental origin. One potential hypothesis is a relationship between the cases and the quality of drinking water, which differs between regions in Spain. There are studies that related cloudy drinking water with gastrointestinal illness in the USA [Reference Nunes29].
Local space–time clusters analysis
We chose Kulldorff's method for the space–time cluster analysis because it has several advantages: it adjusts for population density and confounding variables (e.g. age, gender); there is no pre-selection bias since the clusters are searched with no prior hypothesis on their location, size or time period; the test statistic takes into account multiple testing and delivers a single P value; and if a cluster is detected, its location and time-frame are specified.
The correlation between outbreaks and clusters is higher (0·6) for some years and much lower for others. This may be because some outbreaks have not been detected or notified to RENAVE, but it may also be related to the method for detecting space–time clusters. In some cases the start date of the cluster does not coincide with the declaration of the outbreak; this may be due to the difference in the sources analysed (outbreaks and cases). The scan method is able to detect as clusters those outbreaks that have been notified and also to identify other clusters that have not been declared and should be investigated. This technique has been applied to other diseases not analysed in this study. For example, Nunes et al. used these techniques to analyse the incidence of tuberculosis in Portugal [Reference Nunes29]. Odoi et al. investigated space–time clusters of giardiasis in Canada [Reference Odoi20]. Sugumaran et al. conducted a space–time study for West Nile virus using this methodology [Reference Sugumaran, Larson and Degroote30]. Other authors such as Mammen et al. analysed space–time clusters of dengue in Thailand [Reference Mammen31]. Nordin et al. used these methods for prospective analysis of spatial clusters for a possible bioterrorist attack [Reference Nordin32].
Regarding the reported outbreaks, only four of those that included municipalities forming part of detected clusters were foodborne (such as the outbreak due to consumption of coquina clams). The transmission in most of these outbreaks was unknown, or person to person (which does not rule out an environmental origin). Therefore, future studies should include in subsequent models environmental variables such as quality of drinking water, given that hepatitis A can be waterborne; meteorological data such as temperature and humidity; changes in agriculture; and other socioeconomic variables that may affect the spatial distribution of risk, such as overcrowding or rurality index. There are studies linking the incidence of hepatitis A with environmental and socioeconomic variables. Sowmyanarayanan et al. analysed an outbreak of hepatitis A in India associated with coliforms in drinking water [Reference Sowmyanarayanan17]. Guis et al. analysed space–time clusters in an area of France, linking them to environmental factors [Reference Guis16], and Braga et al. estimated risk areas of hepatitis A in Brazil, adjusting for socioeconomic status and other hygiene variables [Reference Braga18].
We conclude that the estimated spatiotemporal pattern suggests hypotheses about possible links between environmental or socioeconomic factors and the spatiotemporal distribution of hepatitis A risk. Furthermore, we have shown how these spatial statistics methodologies can be complementary tools in epidemiological surveillance of infectious diseases.
NOTE
Supplementary material accompanies this paper on the Journal's website (http://journals.cambridge.org/hyg).
DECLARATION OF INTEREST
None.