INTRODUCTION
Social media refers to Internet applications and platforms that allow users to create and share content in a computer-mediated environment [Reference Newkirk, Bender and Hedberg1]. There is growing research potential in the data-driven study of social media use and its impact at the individual, organizational and social levels [Reference Ahmed, Scheepers and Stockdale2]. The derivation and analysis of data from social media is known as social media analytics [Reference Zeng3]. This is of interest to researchers in the public and private sectors, particularly in disaster, crisis and emergency management. Social media analytics can be leveraged to develop timely, locally situated warnings at the community level that inform the hierarchical system (e.g. public health authorities) and so improve preparedness and response strategies for emerging outbreaks [Reference Zeng3]. In the field of emergency knowledge management, social media has been identified as a serious knowledge (information) management platform for disaster response, acting as an intermediary between responding authorities and the public during a crisis [Reference Yates and Paquette4]. Compared to traditional Internet technologies and communication methods (e.g. face to face), social media facilitates information sharing and direct user interaction through the content communicated on the Internet. For example, wikis were used by the United States Department of State, USAID and the US military to disseminate immediate information after the Haiti earthquake [Reference Yates and Paquette4]. The use of social media analytics is not without its challenges. In particular, information accuracy needs to be constantly checked and validated, especially when large volumes of data are retrieved from the source. This matters because social media may factor into the decision-making models and processes of organizations, especially in emergency crisis environments where conditions can change drastically and rapidly [Reference Zeng3, Reference Yates and Paquette4].
In emergency crisis management, social media sources and tools such as Google Trends, Twitter and Facebook have been used to support situational awareness during crisis events and have been found essential for decision making [Reference Zeng3–Reference Gaspar6] in many different situations, from infectious disease monitoring to regional planning and multipurpose campaigning. For example, data from tweets were extracted and analysed at a user-centric level to visualize and understand the place, time and theme components of evolving situations for decision-making purposes, including crisis situations such as natural disasters (e.g. the Haiti earthquake) and environmental disasters (e.g. the impact of oil spills on birds) [Reference MacEachren5]. Tweets were also analysed in the 2011 Escherichia coli outbreak in Spain to assess psychosocial factors in individuals, with the intent of using Twitter as a source to disseminate immediate information from authorities, monitor the population response and implement food crisis communication strategies [Reference Gaspar6]. Based on these examples, social media analytics may provide immediate data to optimize public health and/or crisis surveillance and the response of the authorities when communicating and interacting with the population at risk.
In the case of the West Africa Ebola outbreak, rapid assessment, contact tracing, isolation of infected individuals, safe cremation/burial of the deceased and access to laboratory services were initially hampered by a lack of community engagement [7–11]. Control of outbreaks often requires coordinated medical services with community engagement. We propose that bringing together the formal and community-based ad hoc networks could facilitate the transmission of both strong signals (i.e. infections, confirmed cases, deaths in hospital or clinic settings) and weak signals from the community where there are isolated symptoms and a small number of suspected cases, thereby making the overall surveillance and intervention strategy far more effective. This paper provides an overview of disease surveillance through social media, with a particular focus on Ebola and on events of interest in the timeline of the 2014 West African epidemic as reported in the Program for Monitoring Emerging Diseases (ProMED) and the Factiva database.
Disease surveillance through social media
Social media analytics has been used in disease outbreak surveillance such as influenza, dengue and zoonotic illness, where symptoms of a particular disease are monitored and aggregated for the early detection of an outbreak [Reference Zeng3, Reference Bernardo12]. The most common disease which utilizes social media analytics for early detection and surveillance is influenza [Reference Bernardo12].
The Centers for Disease Control and Prevention (CDC) relies on outpatient reporting and laboratory test results nationwide as its primary surveillance for influenza. Confirmation of outbreaks by the CDC takes about 2 weeks after they occur, whereas research on social media-based disease surveillance suggests that detection is possible earlier than with traditional methods [Reference Schmidt13]. To identify flu trends, CDC collaborated with Google Inc. to launch Google Flu Trends [14], which used Google's search queries to monitor flu-related searches against reported illness, displayed graphically on a map. Initially, Google Flu Trends was considered a potential source for early detection owing to its capability to detect trends, but it started to overpredict flu cases [Reference Lazer15]. Google Flu Trends was vulnerable to changes in the search query algorithm [Reference Lazer15], and online search behaviour may not reflect when and where an outbreak is occurring. Google Flu Trends was discontinued in 2014.
Compared with Google Flu Trends, other social media may be more open because users are not restricted to search terms. For example, on Twitter (a rapidly growing microblogging platform), each tweet is up to 140 characters long and carries more contextual information than the search terms underlying Google Flu Trends. A study based on social media surveillance suggested the possibility of mining blog data for the identification of influenza trends [Reference Corley16]. Blogs were classified into a ranked tier structure and text-mined to identify blogs with the terms ‘influenza’ and/or ‘flu’ at the first stage; at the final stage, bloggers who displayed direct knowledge of symptoms corresponding to influenza-like illness (ILI) were analysed [Reference Corley16]. The results extracted from blog posts were found to be significantly correlated with the CDC ILI-Net data [Reference Corley16].
A preliminary study was carried out on the use of Twitter to detect increasing influenza trends and outbreaks by de Quincey & Kostkova [Reference de Quincey and Kostkova17]. The duration of that study was 7 days observation starting from 14:00 hours on 7 May 2009 (Thursday) to 14:00 hours on 14 May 2009 (Thursday) [Reference de Quincey and Kostkova17]. During this 7-day period, there were a total of 135 438 tweets posted by 70 756 unique users that contained the term ‘flu’ [Reference de Quincey and Kostkova17]. The researchers conducted a preliminary analysis of that batch which included measuring word frequencies (i.e. flu, swine, H1N1, flu-bird, virus, outbreaks) and found that it may be useful for capturing data about influenza [Reference de Quincey and Kostkova17]. Many researchers [Reference Corley16, Reference Paul and Dredze18, Reference Chunara, Andrews and Brownstein19] have suggested the use of social media analytics against traditional health reports. All found strong correlations between the use of social media analytics and traditional laboratory data to detect abnormal disease trends that may indicate a potential outbreak, such as for influenza [Reference Corley16–Reference Paul and Dredze18] and cholera [Reference Chunara, Andrews and Brownstein19].
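The word-frequency step described above can be illustrated with a minimal Python sketch. The sample tweets, the `count_terms` helper and the tokenization rule are assumptions made for illustration only; they are not the data or the code used by de Quincey & Kostkova.

```python
import re
from collections import Counter

# Single-word terms tracked in the preliminary analysis, as listed in the text.
# The hyphenated term 'flu-bird' is left out because this simple tokenizer
# splits on hyphens (an assumption of this sketch).
TERMS = ["flu", "swine", "h1n1", "virus", "outbreaks"]


def count_terms(tweets, terms=TERMS):
    """Count how often each tracked term appears across a batch of tweets."""
    counts = Counter()
    wanted = set(terms)
    for tweet in tweets:
        # Lowercase the text and split it into alphanumeric tokens.
        for token in re.findall(r"[a-z0-9]+", tweet.lower()):
            if token in wanted:
                counts[token] += 1
    return counts


# Hypothetical tweets, invented for this example only.
sample_tweets = [
    "Swine flu reported near my town, is the H1N1 virus spreading?",
    "Stuck at home with the flu again",
    "News says more flu outbreaks are expected",
]

term_counts = count_terms(sample_tweets)
print(term_counts.most_common())
```

A real analysis would also need to handle retweets, spelling variants and non-English text, none of which this sketch attempts.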
Successful collaboration between public and private health agencies could potentially facilitate the development of a surveillance system that mines social media data for detecting signals of disease outbreak [Reference Newkirk, Bender and Hedberg1]. Another significant initiative that public health agencies could take is to host secure online surveys or web forums where individuals with confirmed pathogen-specific illness would anonymously provide information related to foodborne disease exposures. It would be a cost-effective approach as Internet-based outbreak investigation requires less from limited public health funding [Reference Newkirk, Bender and Hedberg1].
The Ebola outbreak
ProMED, an Internet-based reporting system that aims to rapidly disseminate information globally on disease outbreaks in humans and animals, was alerted to a report in Standard Media Kenya in March [20]. The report referred to a localized outbreak of unknown viral haemorrhagic fever which had occurred in a border village in Guéckédou Prefecture, Guinea [Reference Gatherer21, Reference Baize22]. A ProMED reporter who read the Standard Media Kenya article reported the problem by filing a request for information (RFI) to the other medical and healthcare professionals who are part of the ProMED international community. Within a few months, the outbreak had reached epidemic status and affected the neighbouring countries of Liberia and Sierra Leone [Reference Gatherer21]. Small isolated outbreaks have been known to occur in sub-Saharan Africa but no cases had ever been reported in Guinea [Reference Meltzer23–Reference Towers26]. Over the months from March to October 2014, a ProMED RFI on an isolated outbreak in a Guinean border village turned into the West African Ebola epidemic, which spread to larger populations around the region and exported cases to other parts of the world, including the USA. The scale of the outbreak led to the declaration of an International Health Emergency by the World Health Organization (WHO) under the International Health Regulations (2005) on 8 August 2014 [27].
The WHO Ebola Response Roadmap Situation Report of 15 October 2014 reported 8997 confirmed, probable and suspected cases of Ebola virus disease in seven affected countries (Guinea, Liberia, Nigeria, Senegal, Sierra Leone, Spain, USA), with 4493 deaths (almost a 50% mortality rate) [28]. The spread also affected the healthcare workforce, with 427 confirmed cases and 236 deaths (a 55% mortality rate). There is further evidence that countries such as Nigeria, Senegal, Spain and the USA had localized transmission arising from cases imported from a country with wider and more intense transmission [28]. The U.S. CDC predicted that Ebola cases would reach 20 000 per week by December 2014. Moreover, the CDC released a report on 16 September 2014 predicting as many as 550 000 to 1·4 million cases of Ebola in Liberia and Sierra Leone alone by 20 January 2015, according to two worst-case scenarios from scientists studying the outbreak [Reference Meltzer23].
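The mortality percentages quoted above are crude ratios of reported deaths to reported cases. The short Python sketch below reproduces that arithmetic; the function name `crude_fatality_ratio` is an illustrative choice, and the calculation makes no adjustment for cases whose outcome was still unknown at the time of reporting.

```python
def crude_fatality_ratio(deaths, cases):
    """Return reported deaths as a percentage of reported cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * deaths / cases


# Figures quoted in the text from the WHO situation report of 15 October 2014.
overall = crude_fatality_ratio(4493, 8997)
healthcare_workers = crude_fatality_ratio(236, 427)

print(f"All reported cases: {overall:.1f}%")
print(f"Healthcare workers: {healthcare_workers:.1f}%")
```

The two values round to 49.9% and 55.3%, in line with the "almost 50%" and "55%" figures given in the text.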
METHODS
The term ‘Ebola’ was used as an initial search in Google News with the date range of 13 April 2014 to 18 September 2014. Whenever possible, official sources (e.g. WHO) and major news media [e.g. Cable News Network (CNN)] were preferred over unofficial sources and lesser known media; nevertheless, details provided by all retrieved sources were considered for credibility and cogency. The following three sources were used to extract media data with regard to Ebola.
(a) The ProMED emailing system is an Internet-based global reporting system widely used by those working with human, animal and plant infectious diseases. Information and reports are screened by professional moderators. We searched ProMED for notifications containing the key word ‘Ebola’, with dates ranging from 18 March 2014 to 18 September 2014. A further search was performed in ProMED with the term ‘Ebola’ in the two categories of Post and Subject key words for the period from 19 March 2014 (first RFI) [20] to 15 October 2014. The data were aggregated into weekly counts and a total of 31 weeks of ProMED reports were analysed. Of the 272 ProMED reports reviewed, 240 were relevant; the others were repeated summaries of earlier ProMED reports. The first epidemic week is based on the initial RFI report.
(b) Two online news databases were queried: ProQuest Newsstand (1500 newspaper sources) and Dow Jones Factiva (top news sources such as AFP and Reuters), using the key word ‘Ebola’ for the period of 13 April 2014 to 18 September 2014. Data were systematically collected from media reports, the WHO and International SOS:
• Because Ebola had been declared an International Health Emergency, international news agencies such as CNN, BBC, Reuters, Wall Street Journal and the Voice of America were used to extract details of the locations where cases were confirmed, with links to the original destinations highlighting details of individuals known to be carrying the virus.
• Some local news agencies, such as the US-based ABC News and The Washington Post, the Nigerian-based Vanguard, the German-based Deutsche Welle, the Norwegian-based The Norway Post and the Liberian-based Daily Observer, also provided additional information about Ebola cases.
• News and information gathering platforms, such as Google News, Yahoo News and Wikipedia, provided further information for indexing.
(c) Searches were performed in Google Trends, a public web facility provided by Google that returns the relative search frequencies of search terms and phrases. Weekly data for the key word ‘Ebola’ (including search phrases containing it) were downloaded for the period from 13 April 2014 to 18 September 2014 for five countries: Guinea, Liberia, Sierra Leone, USA and UK.
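The aggregation of dated reports into weekly counts, with the first epidemic week anchored on the initial RFI, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names `epidemic_week` and `weekly_counts` and the sample dates are hypothetical, and the sketch is not the procedure actually used to build the ProMED series.

```python
from collections import Counter
from datetime import date

# Date of the first ProMED request for information, taken from the text.
FIRST_RFI = date(2014, 3, 19)


def epidemic_week(report_date, start=FIRST_RFI):
    """Return the 1-based epidemic week that contains report_date."""
    offset = (report_date - start).days
    if offset < 0:
        raise ValueError("report_date precedes the first epidemic week")
    return offset // 7 + 1


def weekly_counts(report_dates, start=FIRST_RFI):
    """Aggregate a list of report dates into counts per epidemic week."""
    return Counter(epidemic_week(d, start) for d in report_dates)


# Hypothetical report dates, invented for this example only.
sample_dates = [
    date(2014, 3, 19),
    date(2014, 3, 22),
    date(2014, 3, 26),
    date(2014, 10, 15),
]

counts = weekly_counts(sample_dates)
print(sorted(counts.items()))
```

With week 1 starting on 19 March 2014, a report dated 15 October 2014 falls in week 31, which is consistent with the 31 weeks of ProMED reports analysed.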
RESULTS
Based on both media and ProMED reports, the first sign of the problem was signalled as an RFI by ProMED in early March 2014 as an undiagnosed viral haemorrhagic disease [20]. The initial responders were Médecins Sans Frontières (MSF), CDC, and the WHO [7, Reference Williams24].
The search of ProMED reports with the key word ‘Ebola’ in the subject heading (Fig. 1) shows an awareness of the spread of Ebola early in April (Supplementary Table S1). By contrast, news in the mass media was largely responsive to two significant events: (i) when the WHO declared Ebola an international health emergency on 8 August 2014 [25], and (ii) when Ebola reached Texas, USA on 30 September, as shown in Figure 2. In particular, the response to the latter was much more intense than that to the former, with roughly three times more news headlines containing the key word ‘Ebola’ published during the latter period. Public worldwide attention, as captured by Internet search query statistics (Fig. 3), spiked and then decreased after the WHO declaration, but rose steeply when Ebola reached Texas. However, results for the three affected West African countries illustrate that local attention spiked very early in April, with a second steep spike observed slightly before the WHO declaration.
Figure 1 shows the ProMED reports with ‘Ebola’ in the subject and post heading, 18 March 2014 to 16 October 2014. The two vertical lines identify the week when the WHO declared Ebola an international health emergency and when Ebola reached Texas. For ProMED mail (all languages), there was an early increase starting in April and a steeper increase in October. For ProMED mail (English), the reporting interest was steadier.
Figure 2(a, b) provides search results for mass media news articles with headlines containing ‘Ebola’: (a) ProQuest Newsstand, (b) Dow Jones Factiva (Supplementary Table S2). The former contains news in all areas, whereas the latter is more tailored to the economics and finance fields. ‘Wire feeds’ are news distributed through the Internet to websites and subscribers; ‘newspapers’ are printed or online versions of newspapers. The two vertical lines identify the week when the WHO declared Ebola an international health emergency and when Ebola reached Texas. In both Fig. 2a and Fig. 2b, a sharp increase after the second vertical line is visible for all curves except ProQuest Newsstand wire feeds, possibly delayed by the heterogeneity of wire feed focuses. Fig. 2b, which is oriented towards economics and finance, shows much less attention around the first vertical line than around the second.
Figure 3 presents the results from Google Trends for the key word ‘Ebola’, which reflect relative Google search query volumes, showing a comparison (a) with the key word ‘flu’, and (b) among five countries. While Google does not disclose absolute numbers, the vertical axis shows the relative frequencies within the comparison groups. The two vertical lines identify the week when the WHO declared Ebola an international health emergency and when Ebola reached Texas. (a) Public reaction to the two events is highly visible. The gradual increase in ‘flu’ searches likely corresponds to the start of the flu season near autumn. (b) Searches in Liberia and Sierra Leone spiked slightly prior to the first vertical line, suggesting that the epidemic had captured local public attention slightly before the WHO (i.e. the international health authority) took emergency action. Attention in the USA and UK only started to rise after Ebola cases reached Texas, USA.
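To illustrate what a relative frequency scale means, the Python sketch below rescales a series so that its peak equals 100, which is how Google Trends presents its values. The input numbers and the `to_relative_scale` helper are hypothetical; Google's own processing, including normalization by total search volume, is not reproduced here.

```python
def to_relative_scale(values):
    """Rescale a series so that its largest value becomes 100."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("series must contain a positive value")
    return [round(100 * v / peak) for v in values]


# Hypothetical weekly search shares, invented for this example only.
weekly_share = [4, 8, 40, 20, 80]

relative = to_relative_scale(weekly_share)
print(relative)
```

Because each series is scaled against the peak of its own comparison group, values from separate downloads cannot be compared directly unless they share a common reference term.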
CONCLUSION
There were visible time gaps between the international responses to Ebola based on the ProMED timeline compared to the MSF timeline. On 22 March 2014, the Guinean government declared an Ebola outbreak [Reference Williams24]. On 31 March 2014, MSF publicly declared that an Ebola outbreak was out of control, and by 21 June 2014 MSF had again appealed for international action [Reference Williams24]. Despite MSF's repeated international appeals, there was no declaration of an international health emergency until 8 August 2014 [25]. By July, international media interest in Ebola had started to increase. In comparing the responses of two news agencies, CNN had more reports focused on Ebola than the BBC. Two plausible reasons are that the USA initiated medical evacuations before the UK did, and that CNN has a stronger domestic presence in the USA. This suggests that a local population is more inclined to take an interest in (or to panic about) an unusual event affecting its own country, which could reflect the extensive coverage of Ebola in the US domestic news media [Reference Towers26]. Little is yet known about how social media discussions of a novel outbreak compare with discussions of other public health issues, a comparison that could indicate the public's perception of its susceptibility to, and the severity of, a potential health threat [Reference Gaspar6, Reference Guidry29]. Further social media research should be performed, as public perception is core to planning crisis communication strategies.
In an interconnected and interdependent world, we need to view the world as a single system in which localized problems, if not contained well, can propagate quickly to more locations and affect the globe. The Ebola outbreak has highlighted glaring gaps in responses to potential global health threats. This has significant implications for improving the detection of, preparedness for and response to future disease outbreaks by the disaster medicine and public health preparedness community.
We are increasingly seeing delay and disconnection in the transmission of locally situated information to the hierarchical system, which keeps the overall preparedness and response reactive rather than proactive when dealing with complex emergencies such as Ebola. One approach we suggest is to develop open infrastructure for data sharing and access (like the Development Gateway Project initiated by the World Bank) [Reference Kramarz and Momani30] and ad hoc locally situated support systems (e.g. a formal and informal education and learning platform for local organizations involved in preparedness and response to Ebola). This offers an opportunity to bring together the formal and community-based ad hoc networks required to facilitate the transmission of both strong signals (i.e. infections, confirmed cases, deaths in hospital or clinical settings) and weak signals from the community where there are isolated symptoms and a small number of suspected cases. This would make the overall surveillance and intervention strategy far more effective. We suggest social media and news as a possible complementary tool to traditional surveillance systems. Such tools offer the community an opportunity to participate in social surveillance, more readily in developed nations than in developing nations. The use of Google Trends with a local linguistic focus could be a useful and cheaper option in developing nations, given the costs of mining news media in lower-income countries (where Internet and mobile technologies may still be beyond the reach of the majority) [Reference Khan31]. While some advantages can be gained in terms of a command, control and coordination system, there are still issues, especially those of cost, that need to be followed up in developing nations to improve the overall preparedness and response strategy.
SUPPLEMENTARY MATERIAL
For supplementary material accompanying this paper visit http://dx.doi.org/10.1017/S095026881600039X.
DECLARATION OF INTEREST
None.