INTRODUCTION
Crimean-Congo haemorrhagic fever (CCHF) is a viral zoonotic disease with a high mortality rate in humans. CCHF is a public health problem in many regions of the world such as Eastern Europe, Asia, the Middle East and Africa. The ecological complexity of vector-borne diseases, therapeutic controversy, and human-to-human transmission of a zoonotic infection make CCHF an interesting topic for research [Reference Ergönül1].
There has been a substantial increase in reports of CCHF virus (CCHFV) over the past 5 years and the geographical range of CCHFV is the most extensive of the tick-borne viruses that affect human health [Reference Ergönül1–Reference Chinikar3]. CCHFV can spread from endemic to non-endemic areas through the carriage of large numbers of infected ticks by migrating birds and livestock [Reference Vorou4]. Although ticks of other genera also act as vectors of CCHF [Reference Shepherd5], the genus Hyalomma has the main role in the transmission cycle of CCHF, especially in the southeast of Iran [Reference Chinikar3, Reference Mehravaran6, Reference Yaser7].
Several studies have discussed the relationship between climatic factors and the vector's life-cycle, ecological conditions, and consequent occurrence of CCHF in human populations [Reference Ergönül1, Reference Gale8–Reference Vescio10]. Climatic factors may also lead to livestock diseases through their effects on a number of factors including the range and abundance of vectors and wildlife reservoirs, survival of the pathogen in the environment and farming practice [Reference Gale8, Reference Gray11]. Changes in climatic conditions have been suggested to facilitate reproduction of the tick population and the consequent increase in the incidence of tick-borne infectious diseases [Reference Ergönül1, Reference Gray11]. Moreover, legal and potentially illegal animal transportation from neighbouring countries poses a risk of international spread from endemic countries to others [Reference Vorou4, Reference Gray11, Reference Alam12]. There is both legal and illegal livestock transportation, as well as uncontrolled population movement, between Iran and its neighbouring countries, Pakistan and Afghanistan [Reference Alam12]. Therefore, the occurrence of CCHF appears to be affected by the import of livestock from neighbouring countries [Reference Mostafavi9] and this should be considered in early warning of CCHF outbreaks.
So far the occurrence of CCHF has been reported in 23–26 out of 30 provinces of Iran. From 1999 (when registration of CCHF began in Iran) to 2012, the Sistan-va-Baluchistan province, which is located in the southeast of Iran, has been recognized as the most important endemic site of the disease in the country [Reference Chinikar3, Reference Mardani13, Reference Keshtkar-Jahromi14]. In Iran the most affected professions were those involving the handling of blood and organs from infected livestock [Reference Chinikar3]. Therefore, infected livestock play a major role in the transmission cycle of this infection.
It is clear that an early warning of CCHF outbreaks based on related factors could prevent outbreaks, decrease mortality rates, and help to target preventive actions. An early warning system is a competent tool for working as a surveillance system for early diagnosis of the disease in humans and animals and for monitoring suspected outbreaks [Reference Vorou4].
This study reports the surveillance data collected during the past 13 years in Sistan-va-Baluchistan province, and uses climate data and details of legal livestock importation from Pakistan (LIP) to create a basis for early warning of CCHF outbreaks. Specifically, this study has two goals. First, it explores the potential impact of weather variability and LIP on CCHF incidence using time-series models. Second, it compares the predictive ability of the Markov switching model (MSM) and the seasonal autoregressive integrated moving average (SARIMA) model.
MATERIALS AND METHODS
Study area
Sistan-va-Baluchistan province is located in the southeast of Iran, and has common borders with Pakistan and Afghanistan. Its capital is Zahedan, which is located between 45° 32′ and 48° E and 34° 47′ and 35° 1′ N. The economy is mainly based on agriculture and livestock. As a result, a large proportion of the population comes into close contact with livestock, which creates a high risk of exposure to CCHF virus. Sistan-va-Baluchistan province is today one of the driest regions of Iran, with a slight increase in rainfall from east to west and an obvious rise in humidity in winter. The climate is tropical with two distinct seasons: a dry season from April to October and a season of relatively low rainfall from November to March. The climate of this province is similar to that of the border provinces of Pakistan and Afghanistan [Reference Raziei15]. About 10% of the population of this province consists of immigrants from Afghanistan and Pakistan. There is also a nomadic population in sparse and scattered villages, whose usual occupation is tending and trading livestock [Reference Alavi-Naini16].
Case definition
A confirmed CCHF case was defined as one with a positive IgM or IgG serological test (ELISA method) and/or positive by RT–PCR detection of viral RNA in the serum sample sent to the Laboratory of Arboviruses and Viral Haemorrhagic Fevers, Pasteur Institute of Iran, Tehran, Iran [Reference Chinikar17]. All cases had been reported and registered according to the surveillance and control programme of CCHF in Iran. An individual was considered to be a suspected case when exhibiting sudden onset of fever, myalgia, and different haemorrhagic manifestations with an epidemiological background such as a history of tick bite or of handling animal or human blood or tissue. Suspected cases were classed as probable cases when their findings included thrombocytopenia (platelets <150 000/mm³), leucopenia [white blood cell count (WBC) <3000/mm³] or leucocytosis (WBC >9000/mm³). Finally, any probable CCHF case whose serum was positive for immunoglobulin M (IgM) antibodies and/or who was positive by RT–PCR detection of viral RNA was considered a confirmed CCHF case. Samples from probable CCHF cases are sent to the National Reference Laboratory [Reference Chinikar18].
Data collection
The data of all confirmed CCHF cases were registered from 2000 to 2012 by the surveillance system of the Province Health Centre of Zahedan University of Medical Sciences (ZUMS). This databank is under the supervision of the Centre for Management of Communicable Diseases in Iran. Data from 2013 were used to check the models' validity.
The monthly temperature data (°C), the monthly accumulated rainfall (in mm3), and the monthly relative humidity (percentage) were collected from the meteorological organization of Sistan-va-Baluchistan province. We used the records of two synoptic centres located in the east (Saravan station) and northeast (Zabol station) of the province. As the occurrence of CCHF in Sistan-va-Baluchistan province appeared to be affected by the import of livestock from neighbouring countries, the data for LIP were collected from the veterinary organization of Sistan-va-Baluchistan province and also via searching the documents related to quarantine during these years. As the importation of legal livestock requires the authorization of the Veterinary Organization of Sistan-va-Baluchistan province, this organization supervised all legal livestock importation from Pakistan and registered the number of imported livestock each day.
Statistical analyses
Simple (unadjusted) analyses were conducted for response and each explanatory variable as univariate analysis. Cross-correlation coefficients were used to compute a series of correlations between explanatory variables (climate factors and LIP) and the incidence of CCHF over a range of time lags. A time lag was defined as the time span between explanatory-variable observation and the incidence of CCHF (e.g. in this study the correlation between CCHF incidence and mean temperature at lag 1 is the correlation between number of CCHF cases in this month and mean temperature in the previous month).
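The lagged correlation described above can be illustrated with a minimal sketch. It assumes a plain Pearson correlation on the overlapping pairs of observations, which may differ slightly from the cross-correlation estimator used in the statistical software; the function names and the toy series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def lagged_correlation(cases, explanatory, lag):
    """Correlate cases in month t with the explanatory variable in month t - lag."""
    if len(cases) != len(explanatory):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(cases) - 1:
        raise ValueError("lag must be >= 0 and leave at least two pairs")
    if lag == 0:
        return pearson(explanatory, cases)
    # Drop the last `lag` explanatory values and the first `lag` case counts
    # so that each pair is (explanatory[t - lag], cases[t]).
    return pearson(explanatory[:-lag], cases[lag:])


# Hypothetical toy data: cases follow temperature with a one-month delay.
temperature = [3, 1, 4, 1, 5, 9, 2, 6]
cases = [0] + [2 * value for value in temperature[:-1]]
lag1 = lagged_correlation(cases, temperature, 1)
```

With these toy series the lag-1 correlation is essentially 1, because each case count is an exact multiple of the previous month's temperature.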
SARIMA model
As both CCHF incidence and weather variables exhibited strong seasonal variation and fluctuations in their annual means, seasonality was adjusted for by seasonally differencing the series, i.e. replacing each observation by the difference between itself and the observation a year earlier: $Z_t - Z_{t-s}$, where $Z_t$ is the value of the time series at time $t$ and $Z_{t-s}$ is its value $s = 12$ months earlier. Four steps were undertaken in the modelling as follows.
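As a minimal sketch of the seasonal differencing step, the function below subtracts from each monthly value the value observed 12 months earlier. The function name and the toy series are hypothetical.

```python
def seasonal_difference(series, s=12):
    """Return Z_t - Z_{t-s} for every t that has an observation s steps earlier."""
    if s <= 0:
        raise ValueError("seasonal period must be positive")
    return [series[t] - series[t - s] for t in range(s, len(series))]


# Hypothetical toy series: a repeating monthly pattern plus a rise of 10 per year.
toy_series = [month + 10 * year for year in range(3) for month in range(12)]
differenced = seasonal_difference(toy_series)
```

The first 12 observations are lost, and in this toy series the repeating monthly pattern cancels, leaving only the constant year-on-year change.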
First, the variance of the series was stabilized by natural logarithm transformation. The dependent variable, LIP and accumulated rainfall were log transformed. Then, prior to modelling, both CCHF incidence and the explanatory variables were transformed into stationary input series [Reference Helfenstein19]. Second, SARIMA models were developed using the log-transformed monthly incidence of CCHF. The log transformation facilitated the assumption of normally distributed responses. Seven main parameters were selected when fitting the SARIMA $(p,d,q)(P,D,Q)_s$ model: the order of autoregression ($p$) and seasonal autoregression ($P$), the order of integration or regular differencing ($d$) and seasonal integration or seasonal differencing ($D$), the order of moving average ($q$) and seasonal moving average ($Q$), and the length of the seasonal period ($s$). To identify the order of the moving average and autoregressive parameters, the structure of temporal dependence of the stationary time series was assessed by analysis of the autocorrelation (ACF) and partial autocorrelation (PACF) functions, respectively. The selection of SARIMA processes was conducted using Akaike's Information Criterion (AIC). Of all the models tested, a SARIMA $(1,0,1)(0,1,1)_{12}$ model was found to best fit the data [Reference Helfenstein19].
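Model selection by AIC can be sketched as follows, using the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood. The candidate log-likelihoods and parameter counts below are hypothetical and are not taken from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(candidates):
    """Return the label of the candidate with the smallest AIC.

    `candidates` maps a model label to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda label: aic(*candidates[label]))


# Hypothetical fitted candidates (log-likelihood, number of parameters).
candidates = {
    "SARIMA(1,0,1)(0,1,1)12": (-120.0, 4),
    "SARIMA(2,0,1)(0,1,1)12": (-119.6, 5),
    "SARIMA(1,0,0)(0,1,1)12": (-125.3, 3),
}
selected = best_by_aic(candidates)
```

In this made-up example the extra autoregressive term in the second candidate improves the log-likelihood too little to offset its penalty, so the first candidate is selected.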
The explanatory variables with different lags (delayed effects) were found to correlate with CCHF cases using cross-correlation coefficients as univariate analysis. The regression SARIMA model was fitted with climate variables and LIP as external regressors to CCHF incidence.
Third, the ACF of residuals and the Ljung–Box test were used to check whether the model assumptions were met [Reference Helfenstein19]. Finally, the best final model was selected using AIC, which measures how well the model fits the series, and the validity of the models was checked by fitting with data from 2013. The root mean square error (RMSE) of the models was also assessed. The RMSE is equal to the square root of the mean of the squared differences between true and predicted values [Reference Helfenstein19].
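A minimal sketch of the RMSE, taken here as the square root of the mean squared difference between observed and predicted values. The function name is hypothetical.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.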
Markov switching model (MSM)
Although we can control for seasonality and trend, when the occurrence of outcome is not linear (such as CCHF occurrence in this study) or the research is analysing surveillance data from small geographical areas with health conditions that are rare, the Box–Jenkins SARIMA model is not recommended [Reference Nobre20]. On the other hand, MSM is one of the time-series models that is used for modelling nonlinear outcomes.
The MSM [Reference Lu, Zeng and Chen21] belongs to the family of state-space models. There are two types of equations in this model: measurement equations and transition equations. The measurement equation defines how hidden states affect the observable random variables. The transition equation defines how the state variables evolve over time. The observable random variables in the MSM depend on their historical values as well as the hidden state variables. This setting makes the MSM more suitable for time-series-related problems. A simple MSM can be written as:
$$y_t = a_{0,0} + a_{0,1} S_t + (a_{1,0} + a_{1,1} S_t)\, y_{t-1} + e_t \quad (1)$$
$$P(S_t = j \mid S_{t-1} = i) = P_{ij}, \quad i, j \in \{0, 1\} \quad (2)$$
Equation (1) defines how the hidden state variable $S_t$ controls the dynamics of the observable random variable $y_t$. In a non-outbreak period ($S_t = 0$), $y_t$ is determined by a drift term $a_{0,0}$ and the autoregressive parameter $a_{1,0}$. If an outbreak occurs ($S_t = 1$), the drift term increases to $a_{0,0} + a_{0,1}$ and the autoregressive parameter increases to $a_{1,0} + a_{1,1}$ (assuming $a_{0,1} \geq 0$ and $a_{1,1} \geq 0$). Equation (2) indicates that the hidden states evolve following a Markov process with transition probability $P_{ij}$, the probability of state $j$ at time $t$ conditional on state $i$ at time $t - 1$.
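To make the two-state switching dynamics concrete, the following sketch simulates a series in which the drift and autoregressive parameter depend on a hidden outbreak state that follows a Markov chain. It is an illustration only, not the estimation procedure used in the study: the function name, the Gaussian error term, its standard deviation and the transition probabilities in the example are assumptions.

```python
import random


def simulate_msm(n, a00, a01, a10, a11, p01, p10, sigma=1.0, seed=0):
    """Simulate y_t = a00 + a01*S_t + (a10 + a11*S_t)*y_{t-1} + e_t.

    S_t is a two-state Markov chain (0 = non-outbreak, 1 = outbreak) with
    p01 = P(S_t = 1 | S_{t-1} = 0) and p10 = P(S_t = 0 | S_{t-1} = 1).
    The Gaussian noise and starting values (state 0, y = 0) are assumptions.
    """
    rng = random.Random(seed)
    state, y = 0, 0.0
    states, values = [], []
    for _ in range(n):
        # Transition step: move between regimes with the given probabilities.
        if state == 0:
            state = 1 if rng.random() < p01 else 0
        else:
            state = 0 if rng.random() < p10 else 1
        # Measurement step: drift and AR coefficient depend on the regime.
        drift = a00 + a01 * state
        ar_coefficient = a10 + a11 * state
        y = drift + ar_coefficient * y + rng.gauss(0.0, sigma)
        states.append(state)
        values.append(y)
    return states, values


# Hypothetical coefficients and transition probabilities.
states, values = simulate_msm(
    120, a00=0.2, a01=1.0, a10=0.3, a11=0.15, p01=0.2, p10=0.4, seed=1
)
```

With `p01` set to 0 and `sigma` set to 0 the chain never leaves the non-outbreak state and the series settles at `a00 / (1 - a10)`, which gives a quick check that the recursion is wired as intended.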
A MSM with exogenous variables and modelling of seasonal effects can be written as:
where $b_1 \sum_{i=1}^{12} \sin\left(\frac{2\pi i}{12}\right)$ and $b_2 \sum_{i=1}^{12} \cos\left(\frac{2\pi i}{12}\right)$ are seasonality control terms and $\sum_{i=1}^{k} c_i v_{t,i}$ are exogenous and potential controlling factors. If necessary, lagged independent variables can also be included. For example, we can set $v_{t,1} = x_{t-1}$, $v_{t,2} = x_{t-2}$, …, $v_{t,n} = x_{t-n}$. The expectation maximization (EM) algorithm [Reference McLachlan and Krishnan22] was used for model estimation.
The probability of an outbreak at period $t + 1$ can be estimated as follows:
The validity of the MSM was also checked by fitting with data from 2013. The final SARIMA and MSM models were compared on AIC, RMSE and the mean absolute difference in the number of cases (number of predicted cases minus number of observed cases in 2013, regardless of sign). In the MSM, seasonality was modelled using a sinusoidal transformation of time, including both $\sin(2\pi i/12)$ and $\cos(2\pi i/12)$ in the regression models, where $i$ represents the number of the month (e.g. January = 1, etc.).
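The sinusoidal seasonality terms can be sketched as a pair of regressors computed from the month number. The function name is hypothetical.

```python
from math import cos, pi, sin


def seasonal_terms(month):
    """Return (sin(2*pi*i/12), cos(2*pi*i/12)) for month number i (January = 1)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    angle = 2 * pi * month / 12
    return sin(angle), cos(angle)


# One (sine, cosine) pair per calendar month, usable as two regression columns.
harmonics = [seasonal_terms(month) for month in range(1, 13)]
```

Using both terms lets the regression place the seasonal peak in any month, since a weighted sum of a sine and a cosine of the same period is a single shifted sinusoid.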
The statistical software Stata v. 10 (StataCorp., USA) and OxMetrics 6.01 (oxmetrics.net/) were used for all analyses.
RESULTS
Descriptive analysis
Between January 2000 and December 2012, 647 confirmed CCHF cases were reported from Sistan-va-Baluchistan province, in the southeast of Iran. The disease was most common in the months of May, June, July and August and there was no clear pattern of decline during these years. The years 2002, 2008 and 2010 were the worst with regard to the number of cases and occurrence of outbreaks. The trend of number of cases from 2000 to 2013 is displayed in Figure 1. The trend of significant climate data and LIP during the study period (the data for climate and LIP from 2000 to 2012 was used for analysing and modelling) are displayed in Figure 2.
Cross-correlation function
The results of the cross-correlations adjusted for seasonality show that the incidence of CCHF was significantly associated with LIP, mean temperature and maximum monthly relative humidity at lags of up to 2 months. Moreover, significant correlations were found for monthly mean temperature at a lag of 5 months (inverse) and for maximum monthly relative humidity at a lag of 3 months. We also found that the monthly accumulated rainfall was inversely correlated with CCHF incidence at a lag of 1 month and directly correlated with disease incidence at a lag of 5 months. However, there was no significant correlation between CCHF incidence and the other variables at any lag (Table 1).
* Statistically significant at the 5% level.
SARIMA model
The best-fit final SARIMA model (based on AIC and the results of the likelihood-ratio test) shows that first-order autoregression, first-order moving average, first-order seasonal moving average, monthly mean temperature at lags of 2 and 5 months, maximum monthly relative humidity at a lag of 2 months, monthly accumulated rainfall at a lag of 5 months and LIP without delay (lag-0) were significantly associated with CCHF incidence. The model estimated with the explanatory variables was a better fit than the model without these variables, in terms of smaller values of AIC and RMSE (Table 2). Because of collinearity, the significant climate variables at lag-0 were not included in the multiple SARIMA model. The Ljung–Box test indicated that the time-series residuals were not statistically dependent (P = 0·6). Moreover, the autocorrelation functions and histogram of residuals supported the independence and normality of residuals, respectively.
CCHF, Crimean-Congo haemorrhagic fever; s.e., standard error; CI, confidence interval; LIP, legal livestock importation from Pakistan; AIC, Akaike's Information Criterion; RMSE, root mean square error.
The selected SARIMA model fitted observed data from January 2000 to December 2012. To predict for 2013, the 12-month-step approach showed the smallest RMSE, and the predictions and their confidence intervals improved when the explanatory variables were retained in the final model (Fig. 3).
MSM
We normalized the dependent variable using log transformation. Based on the results of the fitted simple MSM, the CCHF incidence at a lag of one period helps to predict the disease at a later time (Table 3). As stated earlier, the simple MSM is $y_t = a_{0,0} + a_{0,1} S_t + (a_{1,0} + a_{1,1} S_t)\, y_{t-1} + e_t$. Therefore, we found that $a_{0,0}$ = 0·207, $a_{0,1}$ = 1·24 − 0·207 = 1·033, $a_{1,0}$ = 0·315, $a_{1,1}$ = 0·454 − 0·315 = 0·139. Furthermore, based on the simple model, the transition probabilities can be estimated as follows:
where $P_{00}$ is the probability of the non-outbreak state at both periods $t$ and $t + 1$, $P_{11}$ is the probability of the outbreak state at both periods $t$ and $t + 1$, $P_{01}$ is the probability of the series changing from the non-outbreak state at period $t$ to the outbreak state at period $t + 1$, and $P_{10}$ is the probability of the series changing from the outbreak state at period $t$ to the non-outbreak state at period $t + 1$. Based on equation (6), $P(S_{t+1} = 1)$ is as follows:
As our data was up to December 2012, this means that the probability of an outbreak is relatively low in January 2013 (1 month later).
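The one-step-ahead outbreak probability discussed above can be illustrated with the standard prediction step for a two-state Markov chain: the chance of being in the outbreak state next month is the chance of entering it from the non-outbreak state plus the chance of remaining in it. This is a sketch under that assumption and may not match the authors' equation term for term; the function name and the numerical inputs are hypothetical.

```python
def next_outbreak_probability(p_outbreak_now, p01, p11):
    """P(S_{t+1} = 1) = P(S_t = 0) * P01 + P(S_t = 1) * P11."""
    for value in (p_outbreak_now, p01, p11):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return (1.0 - p_outbreak_now) * p01 + p_outbreak_now * p11


# Hypothetical inputs: current outbreak probability 0.5, P01 = 0.1, P11 = 0.8.
forecast = next_outbreak_probability(0.5, p01=0.1, p11=0.8)
```

When the current month is almost certainly a non-outbreak month, the forecast is close to `p01`, which is why a small `P01` implies a low outbreak probability one month ahead.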
CCHF, Crimean-Congo haemorrhagic fever; MSM, Markov switching model; s.e., standard error; CI, confidence interval; LIP, legal livestock importation from Pakistan; AIC, Akaike's Information Criterion; RMSE, root mean square error.
The best-fit final MSM with explanatory variables, based on the regime change pattern, shows that first-order autoregression ($a_{1,0} + a_{1,1} S_t$), monthly mean temperature at lags of 1 and 2 months, maximum monthly relative humidity at a lag of 2 months, and LIP at lags of up to 1 month were significantly associated with CCHF outbreaks. The MSM estimated with the explanatory variables was a better fit than the model without these variables, in terms of smaller values of AIC and RMSE (Table 3). As the largest difference between the coefficients of the explanatory variables in the outbreak and non-outbreak functions is due to the LIP variable, this is therefore the most important factor in causing outbreaks in the southeast of Iran. The transition probabilities in the multiple MSM differ from those in the simple MSM, and the explanatory variables changed the probability of outbreaks and their transitions. As for the simple MSM (mentioned above), $P(S_{t+1} = 1)$ can be calculated using the estimated coefficients of the multiple MSM in Table 3.
The selected MSM fitted observed data from January 2000 to December 2012. We also present the 12-month step approach and its confidence intervals (Fig. 4).
Comparison of MSM and SARIMA models
The nature and magnitude of the effect estimates do not differ greatly between the two methods used in this study. The selection of variables was based on the cross-correlation function. Both MSM and SARIMA models show that LIP without delay (lag-0), and monthly mean temperature and maximum monthly relative humidity at a lag of 2 months, were significantly associated with CCHF disease. Moreover, monthly mean temperature and LIP at a lag of 1 month were significantly associated with CCHF disease only in the MSM.
In both SARIMA and MSM models, the validity of the models was checked by fitting with data from 2013. When fitted to the 2013 data, the models predicted infections in that year appropriately (Figs 3, 4); the peaks and dips of the prediction and infection curves are in the same direction in both SARIMA and MSM models. The mean absolute difference in the number of cases (number of predicted cases minus number of observed cases in 2013, regardless of sign) was 0·61 and 0·82 for the MSM and SARIMA models, respectively. Although both models had reasonable accuracy over the predictive period, based on AIC and RMSE the MSM, in both simple and multiple forms, fitted slightly better than the SARIMA model (Tables 2 and 3).
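The sign-free comparison of predicted and observed case counts can be sketched as a mean absolute difference over the validation months. The function name and the monthly counts are hypothetical.

```python
def mean_absolute_difference(observed, predicted):
    """Average of |predicted - observed| over paired monthly counts."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical counts for three validation months.
example = mean_absolute_difference([2, 0, 5], [1, 1, 5])
```

Unlike the RMSE, this measure weights every case of error equally, so the two can rank models differently when a few months are badly missed.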
DISCUSSION
Based on our findings in this study there was no clear pattern of decline in the reported number of CCHF cases during the past 13 years, and the occurrence of this disease fluctuated, with three peaks in the years 2002, 2008 and 2010. This result is somewhat similar to that of a previous study conducted in Bulgaria that described the trends of CCHF between 1997 and 2009 [Reference Vescio10]. However, in a Turkish study the number of cases increased markedly from 2004 to 2007 [Reference Yilmaz23].
The results of this study (based on two reliable models) suggest that climate variability (particularly mean temperature and maximum humidity) and LIP may have played a significant role in the incidence of CCHF in the southeast of Iran either directly or through other unmeasured variables. This result is relatively similar to previous studies conducted in Iran and Europe [Reference Mostafavi9, Reference Vescio10]. The predominant effect of climate variables was observed after a lag of 1 and 2 or 3 months for some climate variables (accumulated rainfall, mean temperature, maximum relative humidity). However, the predominant effect of LIP was after a 1-month lag or without delay (lag-0).
In the southeast of Iran, the main route of transmission for CCHF is contact with blood or tissue of infected livestock [Reference Chinikar3]. However, it should be mentioned that an increase in reservoir (tick) activity and population causes an increase in the source (infected livestock) of the virus. Therefore, it might be concluded that climatic variables do not influence the incidence of the disease directly, but only indirectly, through their effect on the life-cycle dynamics of both vector and virus and the consequent infection in livestock. In addition, the several successive phases from tick hatching to the appearance of human cases led to global cumulative lags in our study (especially in univariate analysis using cross-correlations).
As stated earlier, although the genus Hyalomma is not the only vector of CCHF, it has the main role, especially in the southeast of Iran [Reference Mehravaran6, Reference Yaser7]. As the activity of Hyalomma is not limited to one specific month and the ticks are found throughout the year [Reference Rahbari, Nabian and Shayan24], there is a positive relationship of maximum humidity and temperature with CCHF incidence at different lags. It could be concluded that increases in heat and moisture in a specific month could not predict the number of cases in a later month.
Moreover, this study has demonstrated an inverse relationship between CCHF incidence and average temperature at a lag of 5 months in both univariate and multiple regression SARIMA models. This means that the increasing average temperature in October–November decreased the likelihood of CCHF cases being reported in the following year. A similar finding has previously been observed in Turkey and Bulgaria [Reference Vescio10, Reference Estrada-Peña25]. The warmer temperatures in winter and shorter annual cold periods limit development and increase mortality of tick stages [Reference Estrada-Peña26].
Some studies have observed that a decrease in rainfall might produce a more suitable condition for increased tick activity [Reference Estrada-Peña26, Reference Estrada-Peña27]. We also observed such a finding in the univariate analysis of the relationship between CCHF incidence and rainfall with a lag of 1 month. The numbers of cases in the months (even from March to September where rainfall is negligible) with low rainfall were more than in the months with high rainfall.
In the southeast of Iran, most of the patients were reported from March to August and there are few cases in the rest of the year. Accordingly, the MSM showed that in both simple and multiple models, $P_{00}$ and $P_{11}$ were greater than $P_{10}$ and $P_{01}$, respectively. This means that the regime (i.e. outbreak or non-outbreak period) tends to persist. In addition, in both simple and multiple models, $P_{10}$ was greater than $P_{01}$. This means that prolongation of non-outbreak periods leads to a decrease in outbreak probability during the year. This study showed that LIP is the most important factor in outbreaks in the southeast of Iran. Moreover, a previous study showed that the pattern of CCHF distribution on the other side of the border, i.e. in the Baluchistan region of Pakistan, is somewhat different from the results of our study, with two annual surges in April and August [Reference Sheikh28]. The yearly peak in the number of cases in Sistan-va-Baluchistan province of Iran follows the first surge of the disease in Baluchistan province, Pakistan. This is probably due to transmission of infected immature ticks by rodents, small animals and birds [Reference Ergönül1, Reference Nateghpour29] and of infected mature ticks by livestock imported from Pakistan to Iran [Reference Mostafavi9]. Therefore, we conclude that, if livestock importation from Pakistan ceased from March to July, the probability of CCHF incidence and outbreaks would decrease and consequently the outbreak period would be shorter.
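The interpretation of the transition probabilities above can be illustrated with two standard properties of a two-state Markov chain: the expected length of a stay in a regime is 1/(1 − P_ii), and the long-run share of time spent in the outbreak state is P01/(P01 + P10). The sketch below uses hypothetical probabilities, not the estimates from Table 3, and the function names are assumptions.

```python
def expected_duration(p_stay):
    """Expected number of consecutive periods spent in a regime with P_ii = p_stay."""
    if not 0.0 <= p_stay < 1.0:
        raise ValueError("p_stay must lie in [0, 1)")
    return 1.0 / (1.0 - p_stay)


def stationary_outbreak_probability(p01, p10):
    """Long-run probability of the outbreak state for a two-state chain."""
    if not (0.0 <= p01 <= 1.0 and 0.0 <= p10 <= 1.0) or p01 + p10 == 0:
        raise ValueError("need probabilities in [0, 1] with p01 + p10 > 0")
    return p01 / (p01 + p10)


# Hypothetical values with P10 > P01, as reported qualitatively in the text.
p01, p10 = 0.1, 0.3
non_outbreak_months = expected_duration(1.0 - p01)  # stay probability P00
outbreak_months = expected_duration(1.0 - p10)      # stay probability P11
long_run_outbreak_share = stationary_outbreak_probability(p01, p10)
```

With `P10` greater than `P01`, non-outbreak spells last longer on average than outbreak spells and the long-run outbreak share falls below one half, which matches the qualitative reading given in the text.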
Climate factors do not affect the incidence of CCHF in humans directly; rather, their effect on CCHF outbreaks is through their effect on the life-cycle of the vector (Hyalomma ticks). On the other hand, in the southeast of Iran, CCHF is transmitted mainly through contact with blood and body fluids of infected animals during the viraemic phase of disease [Reference Izadi30, Reference Naieni31]. There is illegal animal trade and uncontrolled population movement between Pakistan and Afghanistan (through Quetta city) [Reference Alam12] and consequently between Iran and Afghanistan (through Nimroz province); therefore, LIP could influence CCHF incidence in human populations directly in this region. Since the health system is not able to intervene in weather variability, ceasing LIP in spring and summer, or more suitable quarantine in the border areas, could help to prevent outbreaks. It should be noted that, in accord with the results of previous studies, the CCHFV genome isolated from Iranian patients is similar to that from Afghanistan and Pakistan, which has a close relationship with the CCHF Matin strain [Reference Chinikar2, Reference Chinikar3, Reference Alam12, Reference Chinikar32, Reference Ölschläger33]; therefore the migration of CCHFV is free and unrestricted between Iran and Pakistan [Reference Mild34].
We obtained relatively similar results from two different time-series models: SARIMA and MSM. Both methods show a clear association of weather variables and LIP with CCHF disease in the southeast of Iran. However, we found that the MSM allowed for more information about the series and outbreak detection regarding transition probabilities. On the other hand, with respect to goodness of fit and predictive accuracy, the MSM was better than SARIMA. Therefore, the MSM has several advantages compared to the SARIMA model, in particular, its forecasting capability and its richer information on time-related changes; but generally, both MSM and SARIMA modelling are useful for interpreting and applying surveillance data in disease control and prevention.
This study has two major strengths. First, to our knowledge, this is the first time-series study to examine the relationship of major weather variables and LIP with CCHF incidence in the most affected area of Iran. Our data demonstrate that, in addition to LIP, of all climate variables only mean temperature, maximum relative humidity and rainfall are associated with CCHF. Second, we compared two common time-series analysis methods and found that a MSM forecasting model appeared to be more suitable than a SARIMA model in the assessment of the relationship between CCHF and some explanatory variables.
Some limitations of this study should also be acknowledged. The occurrence of CCHF is complex; CCHF is not only influenced by weather, but by many other biological, social, and environmental factors such as change in agricultural activities, ranching, nomadic population and illegal livestock trading that might also lead to bias in causal inference in this study.
However, near the borders, the weather conditions in neighbouring countries are the same as in Sistan-va-Baluchistan province in Iran (data not shown) and the results of this study could be generalized to Pakistan (Baluchistan province) and Afghanistan (Nimroz province).
The findings of this study may assist local public health authorities to utilize the model developed in this study to identify the communities that require particular attention and to mobilize limited resources to effectively control and prevent outbreaks of CCHF during epidemic seasons. These findings may have applications as a decision support tool in planning disease control and risk-management programmes.
ACKNOWLEDGEMENTS
The authors thank Mr Hasanzehi, Zahedan University of Medical Sciences, Iran, and Mr Mohammadzadeh, Zabol University of Medical Sciences, Iran, for their help in data registering. This work was supported by the research deputy of Tehran University of Medical Sciences (grant no. 12/234).
DECLARATION OF INTEREST
None.