Policy Significance Statement
Unlocking the potential of data for social good in Africa requires fresh thinking on institutional mechanisms for fostering data access, action to build skills and capabilities across organizations involved in data sharing, and platforms bringing together different actors across the data ecosystem in a shared space. Drawing from real-world case studies of the use of data science in the COVID-19 response, this study investigates successful data access strategies, effective intervention processes and procedures, and policy interventions that can inform future policy development to increase data sharing for societal benefit in the longer term. Several audiences aiming to enable or participate in data-sharing collaborations and support data-enabled innovations for policymaking can draw valuable lessons from this report. More specifically, the findings in this report provide a blueprint for more effective data sharing for several groups, including policymakers and organizations that curate public interest data, National Statistical Offices, data-driven research, and policy organizations.
Introduction
As COVID-19 cases spread across Africa, many governments were concerned about the ability to access data to rapidly inform policy responses, which increased focus on data collaborations and sharing initiatives across organizations and sectors. As a result, a variety of data-driven initiatives emerged, especially in three areas: (i) tracking and prediction, (ii) situation data dashboards, and (iii) managing lockdowns and social distancing efforts. To derive lessons from the experiences of those who implemented data science solutions for the COVID-19 response, we studied data-driven solution providers, enablers, data producers and curators, coordinators, and decision-makers from government, academia, NGO, and private sectors to identify practices that can increase data sharing for societal benefit in the longer term.
Research questions
With the above in mind, we gathered data (about different data-driven COVID-19 response solutions developed on the continent) to answer the following research questions.
• What data access strategies have been successful in different countries or projects during the COVID-19 response, and how could these be adopted in the longer term?
• What intervention processes and procedures have successfully formed data collaborations for the COVID-19 response?
• What policy interventions would help address or lessen the impact of the current barriers to the use of data science in policy?
Data collection
Data were collected between April and October 2021, focusing on data-driven COVID-19 solutions in Africa. We obtained these data through a survey questionnaire distributed to the African data science community and an extensive online search for data-driven projects across African nations.
A significant number of solutions came from university websites, primarily via research groups and science, technology, and computing faculties; National Statistical Offices (NSOs) and national COVID-19 data dashboards; private and public organization websites, including Flowminder, Data for SDGs, Dataminr, and GIZ; and regional health bodies such as the WHO Africa website and the Africa Centres for Disease Control and Prevention (Africa CDC).
Together, the desk research and the survey yielded 97 unique initiatives of data access and use in the COVID-19 response. These were filtered on four criteria: whether we could determine the data types used, the data access strategies employed, and the level of solution implementation (i.e., “in use,” “prototype ready to be deployed,” or “prototype not ready to be deployed”), and whether we could verify the countries of implementation. This process reduced the data to 72 data-driven COVID-19 solutions from 42 African countries.
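The screening step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure the authors actually ran: the `Initiative` record, its field names, and the functions `meets_criteria` and `screen` are all hypothetical, and only the three implementation levels are taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Implementation levels named in the text.
LEVELS = {
    "in use",
    "prototype ready to be deployed",
    "prototype not ready to be deployed",
}


@dataclass
class Initiative:
    # Hypothetical field names; the study's dataset may be organized differently.
    name: str
    data_types: Optional[str]        # None or "" if the data types could not be determined
    access_strategy: Optional[str]   # None or "" if the access strategy could not be determined
    level: Optional[str]             # expected to be one of LEVELS
    countries: Tuple[str, ...]       # empty if no country could be verified


def meets_criteria(item: Initiative) -> bool:
    """Return True only if all four screening criteria are satisfied."""
    return (
        bool(item.data_types)
        and bool(item.access_strategy)
        and item.level in LEVELS
        and len(item.countries) > 0
    )


def screen(items: List[Initiative]) -> List[Initiative]:
    """Keep the initiatives that pass every criterion, preserving order."""
    return [item for item in items if meets_criteria(item)]
```

Applied to a full list of collected initiatives, `screen` returns the subset retained for analysis. An initiative that fails any single criterion is dropped, which is the kind of filtering the text describes when 97 initiatives were reduced to 72.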
Limitations: We acknowledge that these results do not represent all COVID-19 solutions conducted on the continent. Due to time constraints and the need to engage practitioners, we had to limit the data collection period to ensure progress in the project.
Practitioner engagements
Subsequently, we identified solutions where we could engage with contact persons (practitioners) willing to share their experiences in depth. This phase led to interactions with 22 practitioners who provided insights into their on-ground experiences. From their experiences, we aimed to understand the organizational, policy, and legal challenges encountered during their data science projects and the interventions to overcome them. A “focus workshop” convened these 22 practitioners, representing government institutions, national and international NGOs, academia, the private sector, and the data science community from whom experiences were gathered, including challenges and barriers they encountered in utilizing data for policy responses.
The data collected from the focus workshop included lessons from the lightning talks by practitioners, a mapping of challenges, and narratives of the strategies and interventions used to overcome the difficulties. These were studied to identify three case studies that presented precise but also replicable approaches to data access.
Case study experiences
The final stage of our approach involved a dialogue with three data-driven solution practitioners to discuss their experiences in detail, understand their case study, and derive lessons that can inform future data sharing and use initiatives. We explored specific answers to questions such as “What would you say were the enablers for your initiative, and how can policymakers use and benefit from the data-driven initiative?” These and more helped us gain insights into real-world experiences, which was a key requirement for this study.
Africa’s data-driven COVID-19 response: findings
Figure 1 shows the landscape of the data-driven COVID-19 solutions developed on the continent, according to the data collected.
The data-driven solutions analyzed comprised initiatives by the private sector, government, academia, and NGOs. Figure 2 shows the categories these initiatives fall into and the percentage of solutions per category.
The data-driven solutions analyzed impacted several areas of the COVID-19 response, including tracking and prediction of the disease, contact tracing of the cases and infected people, population mapping to aid resource allocation for interventions, and capability mapping of health facilities, among other things.
Factors for successful implementation of data-driven projects: case studies
An essential aspect of this study was establishing the approaches that led to the successful implementation of data-driven projects. This section highlights these approaches, supplemented with critical case studies that provide insights into how the data-driven initiatives were initiated and implemented. We aimed to understand the actors’ experiences launching data-driven engagements, defining data access strategies, and establishing data safety protocols, governance, and incentive models. Additionally, we offer case study-specific recommendations on how policymakers can best leverage data-driven solutions, among other insights.
Case Study 1: A strategic data access initiative: Data for Good Project in Ghana (Ghana leaders, 2022).
The Data for Good project, a collaboration between the Ghana Statistical Service (GSS), Vodafone Ghana, and Flowminder Foundation, exemplifies effective data access strategies. Initially established in 2018 to integrate mobile network data for public health and urban planning, this partnership was quickly leveraged during the COVID-19 pandemic. The preexisting agreement allowed GSS to monitor mobility patterns through anonymized and aggregated telecommunications data, providing crucial insights for managing travel restrictions and social distancing efforts.
The success of this initiative hinged on several factors:
1. Pre-tested privacy measures: Vodafone Ghana preprocessed personal data to ensure privacy, allowing Flowminder to use aggregated data for analysis without transferring non-aggregated data.
2. Established analysis tools: Flowminder’s FlowKit, an open-source toolkit, facilitated Call Detail Records (CDR) data analysis.
3. Clear data flow, workflow, and ownership guidelines: The agreement between the three organizations outlines that pseudonymized data can never leave Vodafone’s servers. Flowminder can only process data on Vodafone’s internal system without transferring non-aggregated data. The agreement also outlines the responsibilities of the different parties for analyzing data.
4. Existing data governance mechanisms: The Data for Good Project has a steering committee consisting of high-level individuals in government and academia who give direction to the project. This committee devised a rigorous approval process for external data requests for COVID-19-related research studies.
5. Other enablers included the Statistical Service Act 2019, which mandated that the GSS use big data to produce official statistics, and Ghana’s Data Protection Commission, which played an oversight role during the drafting and signing of the original partnership agreement, including the data-sharing protocols for the project. They facilitated the process and commented on all drafts of the initial agreement. Additionally, the support of the Global Partnership for Sustainable Development Data (GPSDD) was crucial, as their studies provided a framework for exploring non-traditional data sources like mobile phone data.
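The principle behind the privacy measures listed above, that only aggregated data leaves the operator's environment, can be illustrated with a short Python sketch. It is a hedged illustration only: it does not use or reproduce FlowKit's API, the record layout and the function `aggregate_flows` are hypothetical, and the small-count threshold `min_count` (with its default value) is an assumption rather than a term of the Data for Good agreement.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical record: (pseudonymized subscriber id, origin region, destination region).
Trip = Tuple[str, str, str]


def aggregate_flows(
    trips: Iterable[Trip], min_count: int = 15
) -> Dict[Tuple[str, str], int]:
    """Count distinct subscribers per origin-destination pair.

    Pairs with fewer than `min_count` subscribers are suppressed so that
    small groups cannot be singled out in the shared output. The threshold
    is an illustrative assumption.
    """
    # Deduplicate so that a subscriber making the same trip twice counts once.
    unique_trips = {(subscriber, origin, dest) for subscriber, origin, dest in trips}
    counts = Counter((origin, dest) for _, origin, dest in unique_trips)
    # Only the aggregated, thresholded counts are returned; no identifiers leave.
    return {pair: n for pair, n in counts.items() if n >= min_count}
```

In this arrangement the function would run on the operator's side, and only its return value, a table of region-to-region counts with no subscriber identifiers, would be passed on for analysis.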
Rachel Bowers (Impact Evaluation and Data for Development Researcher, GSS) reflects on how policymakers can use data-driven initiatives to enable a responsive data policy for Africa. “After sharing the first report (on mobility) with the Ministry of Health and the then Ministry of Planning, GSS was invited to sit on the Research Committee convened under the Presidential task team on COVID-19, and this ability to share evidence quickly with the relevant decision-makers. Policymakers can benefit from the data by making a seat at the table/giving platforms for those producing relevant evidence, particularly when it is a government partner.”
More recommendations from this project can be found in the report Analysis of CDR to inform the COVID-19 response in Ghana: opportunities and challenges (Li et al., 2021).
Case Study 2: A productive intervention process: GPSDD and the UN Economic Commission for Africa (UNECA) Initiative (Data for a resilient Africa, 2022).
The GPSDD and UNECA demonstrated productive intervention processes by rapidly assessing data needs and forming partnerships across 50 countries. Their approach included:
1. Assessing data needs and challenges in different countries by contacting UN institutions and national government bodies, including NSOs, Ministries of Health, and national presidential panels on COVID-19 and asking for their data requirements and current data-sharing approaches.
2. Identifying available and accessible data resources in different countries that could be used to develop policy. This included, for example, identifying available health infrastructure and its locations, gathering information to track the virus, and forecasting case numbers.
3. Connecting key stakeholders and decision-makers in each country by identifying relevant institutions and bringing them together to clarify areas of need and address trust issues.
4. Identifying skills needed to use that data and finding technical partners to assist in upskilling relevant national stakeholders.
5. Establishing partnerships with potential suppliers of data, technology, analytics, and capacity/skill-building.
GPSDD’s long-standing relationships as a trusted broker in the data space helped accelerate progress for COVID-19. They were able to identify relevant agencies or government departments by country, find the right focal point(s) within these institutions, and uncover the required information while navigating issues of trust, institutional politics, and confidentiality.
Lessons from GPSDD and UNECA initiative
• Institutions developing data science collaborations need to deliver certain core functions, including the ability to broker large and complex partnerships across sectors, share knowledge and technology across organizations, define needs, priorities, and pathways for bridging the data gaps, leverage existing relationships/partnerships, and establish some mutual value proposition for all partners in a data-sharing partnership. They should also be able to guide companies through the process of creating data-sharing contracts.
• Sharing lessons across institutions about what works in data science collaborations can help rapidly scale success. For example, in this case study, success in one country helped GPSDD drive success in several others, which resulted in initiatives in about 50 African countries, as we learned from Mr. Victor Ohuruogu (Senior Africa Regional Manager for the GPSDD).
Recommendations for long-term adoption: in conversation with Mr. Victor Ohuruogu, we gathered the following recommendations to enable responsive data policy for Africa.
1. Governments and their partners should start focusing on building more secure, resilient, and ethically sound data systems.
2. Local data ecosystems, such as the national statistical system, must develop better data governance capabilities (for end-to-end management of all data transactions).
3. Governments need a stronger commitment to, and practice of, inclusive and equitable policy decision-making, which can only thrive when data and its proactive use throughout the development lifecycle are a political priority.
4. Governments should implement laws, mechanisms, and policies that provide institutional and technology infrastructures to unlock access, enable effective data sharing and policies, and deepen a data use culture.
Case Study 3: Stakeholder engagements in project development: The Nigeria Data Hub (National Coronavirus Geospatial Data Hub, 2022).
Early in the pandemic, Nigerian decision-makers recognized the need for reliable and frequent data from which insights could be derived to support and be the basis for COVID-19 response decision-making. The policymakers wanted a one-stop center for information dissemination to the public because of fears of low public trust in healthcare systems and widespread misinformation about the virus that added to the challenges of effectively controlling COVID-19.
With the help of GPSDD and the UNECA, in partnership with ESRI, GRID3, and FRAYM, the Nigerian NSO put together a centralized National Coronavirus Geospatial Data Hub. The hub combined data analysis and visualizations from various national and state (regional) datasets, including data from the Nigeria CDC, mobility data from Google (2022), Nigeria’s COVID-19 Risk Profiles (2022), COVID-19 Community Vulnerability Index (CCVI) data (2022), and more. The dashboard aided government bodies working on COVID-19 prevention and response by providing data about the current state of the pandemic, and it helped the public find the latest status of the pandemic via various resources. This initiative demonstrates the role that multiparty collaborations can play in enabling the use of data for COVID-19 response, creating pathways for data access and sharing that can be further exploited for social good.
Lessons learned from Nigeria Data Hub initiative.
1. Operationalizing data policy projects requires convening partners to a common agenda. Data intermediaries (i.e., broker organizations) can facilitate this, for example GPSDD, which initiated and established grounds for data partnerships in this case. GPSDD helped NSOs assess the data needs, explore data access possibilities, and bring other partners on board for the analysis.
2. Forming local (national) expert committees to foster the data agenda drives progress. Expert committees are technical committees with the knowledge or ability to identify what data are available and missing and then contact relevant institutions to negotiate reliable and frequent data access.
3. Forming collaborations and developing solutions with already established organizations in a country is faster. For example, in this case study, partnering with GRID3, which already had geospatial and socioeconomic data for Nigeria, made it easier and quicker to get some solutions running.
Pathways from data-driven COVID-19 response solutions that can increase data sharing for social good beyond the pandemic (Table 1).
Challenges and lessons from Africa’s data-sharing practices
Survey responses and workshop discussions demonstrated a range of challenges and barriers to data sharing that practitioners and decision-makers experienced during data-driven projects. These challenges and the subsequent lessons learned can be categorized into three broad areas: data literacy, data governance capabilities, and partnership building.
1. Data literacy
The awareness of the value of data and how to achieve that value could be higher. We learned through the study that government institutions require further capability building to understand the availability, potential, and use of the data they hold. We also learned that there are few case studies to learn from and refine roles and practices in this area. Data controllers (mainly passive data producers) require further guidance on standards and procedures to make the data more valuable for data curators and decision-makers.
The Mo Ibrahim Foundation (MIF) report (COVID-19 in Africa: A Call for Coordinated Governance, Improved Health Structures and Better Data, 2022) also highlighted the need for capacity strengthening. The report states that “The capacity of African countries to address healthcare challenges remains hindered by a lack of data coverage, stemming from weak statistical capacity.”
Data science deployment experiences in the COVID-19 response showed that the data literacy challenge was evident where:
I. Key stakeholders (especially in public institutions) needed help understanding what data sources exist, the potential of these data, and who has the power to share it. This lack of understanding mainly affected the use of new data types like CDR.
II. Stakeholders found it challenging to understand their role in data-sharing arrangements (who does what in enabling data access and use) and reported difficulties finding existing solutions or initiatives.
III. Finally, stakeholders did not collect COVID-19 data in a “forward-looking way” (in formats that can be further processed and analyzed), resulting in data analysts finding difficulties using the data or having to do extra work to get data in analyzable formats.
These, coupled with the need for more clarity about data rights, what they are, and who defines them, translated into the inability to set up mechanisms to exploit these data fully.
2. Data governance capabilities
The COVID-19 pandemic exemplified the need for trustworthy data governance that addresses privacy, accountability, and related issues in handling data, especially data from new sources. These issues surfaced as concerns about whether sharing particular data types is legally or ethically acceptable (including publicly generated data in public offices). Questions about what data should be made available, and with whom, contributed to a lack of confidence among data controllers about sharing data with external parties and about the specific terms under which it should be shared. This made data producers shy away from engaging in data-sharing initiatives. A relative absence of legal frameworks (i.e., policies and procedures) for data sharing, as experienced by actors in the COVID-19 response, also hindered the establishment of data-sharing structures.
An overview of data governance in Uganda, Kenya, Ghana, and South Africa during COVID-19 response
The Collaboration on International ICT Policy for East and Southern Africa (CIPESA) has conducted several studies assessing data governance in the COVID-19 response. Their report (COVID-19 Data Governance in Kenya: Lessons for the Future, 2022) reviewed the data governance frameworks (i.e., rules governing data utilization) that were in place in Kenya during the pandemic. It highlighted practices such as the issuance of “a guidance note” on access to personal data, which provided direction on how developers of technological innovations could request access to personal data from government institutions or private entities to build solutions for the response. They say that “the guidance note served to re-emphasize the key principles of data processing as provided under the Data Protection Act, 2019”. However, the report states that there was a lack of clarity and limited public awareness of the data governance framework that the government and its partners employed in collecting, processing, and managing COVID-19 data.
Reports also show that some countries that already had regulations on data governance experienced similar challenges, because the regulations were not comprehensive enough to cover new data-driven solutions such as contact tracing. For example, a temporary regulation amendment (Disaster Management Act, 2002) created a contact tracing database in South Africa. This move was criticized, however, as citizens thought it enabled the government to conduct population surveillance (South Africans are worried the government, 2022).
Similar to the conditions in most other countries in Africa, in their report (COVID-19 and Data Rights in Uganda Report, 2022), CIPESA states that much opaqueness surrounded data-driven COVID-19 solutions in Uganda and that regulations around the use of data for COVID-19 focused on addressing the health emergency and paid lip service to the protection of data rights.
Ghana has been one of the countries that exemplified successful data-driven approaches to COVID-19 response, having had prior experience in data collection and private sector data access arrangements. Some of this success is attributed to the fact that Ghana enacted several laws and policies on how data are collected, processed, shared, and utilized, including data privacy and protection provisions. However, in their report (Data Governance and Public Trust, 2022), CIPESA found that Ghana’s efforts have been undermined by low levels of public trust in institutions that collect and control data. Their study found a prevalence of mistrust in public data controllers in Ghana. According to their research, the main reasons for the mistrust are the lack of public education and awareness among data subjects and personal experiences of data breaches by data controllers.
The Centre of Studies of Economies of Africa (CSEA) report (Strengthening the Data Governance in Africa, 2021) states that there needs to be more uniformity in the policy approaches adopted by member states. The report also observed the need for more implementation strategies, noting a general deficit in institutional capacity to support a well-functioning data governance environment in Africa. It identified four focus areas that comprise the features of a good data governance framework: technology, economy, security, and law and human rights. In their stakeholder consultations, two-thirds of respondents thought COVID-19 impacted the perception of the importance of data governance.
3. Partnership building
Finally, we learned that there is a divide between solution providers, decision-makers, and data controllers, which hinders the development of potentially impactful data collaborations. Furthermore, we learned that where there were no strong multi-stakeholder partnerships before the pandemic, data access and sharing initiatives took a long time or never happened. This demonstrated a need for intentionality in creating grounds for partnerships to engage with stakeholders constantly, explore opportunities to unlock access, and enable effective data sharing and policies across public and private institutions. Creating spaces where the different parties in the data ecosystem can come together to discuss data access and use can help bridge the gap between solution providers, decision-makers, and data controllers.
One common factor among successful case studies for COVID-19 data-driven solutions was the presence of a convening partner who brought stakeholders playing different roles together in making data access and use possible for the COVID-19 response. Well-established partnerships build trust and mutual understanding between solution providers and policymakers, increasing the chances of solution uptake. Several studies have recommended a multi-stakeholder approach to foster data sharing and use and as an avenue for data use and protection accountability. This suggests the need to maximize partnerships while working toward effectively implementing a strong data protection landscape for Africa (Ilori and Adeboye, 2020).
The report (Bridging the Data-Policy Gap in Africa) by the Partnership in Statistics for Development in the 21st Century (PARIS21) and the MIF provides a six-point roadmap for bridging the data-policy gap. Among other things, it calls for action to build and strengthen partnerships, improve communications, establish data governance, and strengthen data literacy.
Recommendations and conclusion
Data science deployment experiences in the COVID-19 response highlight three areas that can benefit from the lessons learned during the pandemic: institutions, people, and platforms.
Institutions in the public sector (including central government and NSOs) and their partners in the private sector and NGOs provide the organizational, legal, and technical infrastructure to enable effective data sharing and implement data access policies across public and private organizations. Experiences of data science implementation in the context of COVID-19 show that making data use a political priority for national governments is vital to successful collaborations between these institutions. Notably, NSOs remain cornerstones for data ecosystems in Africa. Building their role requires mechanisms that strengthen data literacy and create data governance capabilities, such as defining clear and accessible laws and guidelines that govern data protection and sharing in the public and private sectors. Such mechanisms can support NSOs in prioritizing and promoting the data agenda across other public sectors, and building long-term institutional partnerships.
Effective use of data in policymaking typically requires a multi-stakeholder approach, making people an essential ingredient for successful data-driven policymaking. For example, those developing and deploying data-science interventions must understand the priorities, incentives, and operational decision-making practices among actors, including data curators, controllers, producers, and users. Fundamentally, data-driven solutions need people with domain knowledge and technical skills who can work across often complex organizational configurations. Reviewing the roles and capabilities required in COVID-19 response solutions can help policymakers understand what roles and practices are needed, build skills across different corporate actors, and create avenues for the public to learn, understand, and engage with data-driven solutions.
Finally, governments, the private sector, NGO solution providers, data producers and controllers, researchers, and other data ecosystem stakeholders need platforms to bring them together to initiate collaborations that support data-driven policymaking. Experiences of COVID-19 response show how trusted third-party organizations can help create these platforms by supporting policymakers in the complexities of brokering (identifying, convening, and coordinating) data-sharing initiatives. COVID-19 data collaborations testify that these practices can translate into evidence for policymaking. There is an opportunity now to build on the foundation of existing, successful platforms from the COVID-19 response to refine such structures. This will help build trust among stakeholders in the data ecosystem and secure long-term partnerships.
The lessons from the deployment of data science for policy development and implementation and the recommended actions from the experiences of data-driven initiatives can guide government institutions and their counterparts on how to build on these initiatives’ successes to increase data sharing for societal benefit in the longer term. In addition, the lessons from the different case studies demonstrate that unlocking the potential of data sharing for social good in Africa requires fresh thinking like the one demonstrated in the COVID-19 response.
Data availability statement
The data collected and analyzed for this study are openly available via Zenodo: https://zenodo.org/records/13365282?token=eyJhbGciOiJIUzUxMiJ9.eyJpZCI6IjdhZjVhZDE2LTc2ZmQtNGJjOS05NDViLTFkYTZkYTJlOWFjMSIsImRhdGEiOnt9LCJyYW5kb20iOiJkMzY5MDdiZTY3NWM1NzczZGFhMDAzNzY2ZjZjNDM2NSJ9.oOnvFU0d-_Z8wdUwoDXAy98hVJeAbtpxRYsrAEPD9Xv8nvrjT_YkcAU5BAB3JWmyJWlqSlZNY6BMulbUCoG4Jw (Africa’s data-driven COVID-19, 2022).
Acknowledgments
The author is grateful to Data Science Africa (research fellowship program) and to Prof Neil Lawrence (DeepMind Professor of Machine Learning at the University of Cambridge) for the opportunity to carry out this research under his guidance.
The author would also like to thank the following for supporting the study.
1. Victor Ohuruogu, Global Partnership for Sustainable Development Data
2. Rachel Bowers, Ghana Statistical Service (GSS)
3. Lola Talabi-Oni, National Bureau of Statistics, Nigeria
Author contribution
Conceptualization, M.A, N.L, and J.M. Data collection and analysis, M.A. Writing, M.A, and J.M. Review and editing, J.M, and N.L.
Funding statement
This work received no specific grant from any funding agency, commercial, or not-for-profit sectors.
Competing interest
The authors declare none.
Data contributors
Individuals that provided their project data or helped collect data about data-driven COVID-19 solutions in Africa.
Wuraola Oyewusi, Ayomide Owoyemi, Ernest Mwebaze, Jessica Oyugi, Aurelle Tchagna Kouanou, Tamaro Green, Ezekiel Adebayo Ogundepo, Vasileios Lampos, and Duncan Sembunya.
Finally, the authors declare that an abstract (Increasing data sharing for social good, 2023) and a discussion paper (Increasing data sharing for social good, 2023) of this manuscript were submitted on Zenodo for the Data for Policy 2022 conference participants to access.
Workshop participants
Domain experts from academia, government, private and NGO sectors that participated in the workshop organized for this study and shared their experiences (as enablers, solution providers, data producers and curators, facilitators/coordinators and decision-makers) in using data science in COVID-19 response.
Definition of data sharing as used in this study
Data sharing in this study covers a range of data types, including data from nonconventional sources such as mobile telephones accessed via agreements between organizations and verified data (COVID-19 statistics) published by government authorities during the response period. This study did not examine the types of sharing licenses for different data-driven COVID-19 solutions.
Statement of bias
The study considers Africa a homogenous unit because of African nations’ similarities and challenges in the data ecosystem (Data for a resilient Africa, 2022). These similarities make it ideal for governments and their stakeholders to learn from each other to inform their data strategies. The study also recognizes the African Union’s efforts for a unified Africa for inclusive and sustainable development and collective prosperity pursued under Pan-Africanism and African Renaissance (Agenda, 2022).