Policy Significance Statement
The findings will inform policymakers on the key areas to address and support by dedicated measures and interventions. The evidence collected reveals the need to enhance the capacity of local governments to deal with data, not only at the technical level, but also at the legal, organizational, and cultural ones, which tend to be overlooked by cities as well. Such improved capacity will be a prerequisite for setting up wider and sustainable data ecosystems that enable the creation of better public services and data-driven policies.
1. Introduction
The exponential growth of data collection, combined with the rapid development of technologies, opens possibilities for analyzing data to address political and societal challenges. Governments, research institutions, and individuals are producing and making available large amounts of data on a variety of platforms (Chen et al., Reference Chen and Zhang2014). This new availability of data provides the potential for creating, managing, and sustaining different data-sharing initiatives, for example, within smart cities (Abu-Matar, Reference Abu-Matar2016), open data initiatives (Lee, Reference Lee2014), and scientific data communities (Lindman, Reference Lindman, Kinnari and Rossi2016).
According to policymakers, data promise significant benefits to political and social life, such as the support of decision-making processes and the enhancement of citizen services (Dawes, Reference Dawes, Vidiasova and Parkhimovich2016). Data are also expected to enable citizens’ participation, helping to achieve a high level of citizen-centricity, and to support data-driven decision-making that improves citizens’ quality of life (Pereira et al., Reference Pereira, Eibl, Stylianou, Martínez, Neophytou and Parycek2018). Yet, data-driven innovation is rarely created by a single organization or in traditional value chains. Instead, various data sources from different organizations are combined and enriched in cross-industry, socio-technical networks, so-called data ecosystems. At the same time, even if societal challenges are increasingly complex and require a global response, the concrete answers to the different pressing issues are very often to be found at the regional and local levels, where they have concrete, manageable dimensions and a clear context (Moulaert et al., Reference Morozov and Bria2007; Monge et al., Reference Monge, Barns, Kattel and Bria2022). Given those bounding conditions, the subnational territorial levels provide a fruitful environment for testing and adopting solutions that, when scaled and spread across cities and regions, would have a pan-European impact.
According to the European Commission, Europe is at present not utilizing the full potential of data that are generated by its citizens, industries, academia, and public authorities. The reasons are complex and relate to an intertwined set of organizational, technological, and legal barriers. In response, the Commission has put forward the European strategy for data. The overarching ambition of the strategy is to establish a single European market for data through data spaces in specific sectors (agriculture, mobility, health, banking, etc.). The legal instruments established in order to make this vision a reality include the Data Governance Act, the Data Act, the Implementing Act on High-Value Datasets under the Open Data Directive, as well as the Digital Markets Act and the Digital Services Act. Furthermore, on the initiative of the Council of the European Union, the Berlin Declaration (of 2020) called for a value-based digital government and presented seven principles for any related policy action, while the Lisbon Declaration (of 2021) added the concept of digital democracy and promoted multi-stakeholder cooperation, also when it comes to data flows. Here, regional and local digital innovation is playing an increasingly important role in the implementation of the political agenda. There are already multiple good practices in different cities and regions that can be scaled, sustained, and further developed, for example, under the living-in.eu (Join, Boost, and Sustain) initiative.
For these reasons, the present article focuses on local data ecosystems, understood as complex socio-technical systems of people, organizations, technology, and policies that interact with one another and their surroundings. Such ecosystems evolve and adapt through a sustainable cycle of data gathering and sharing, data analytics, and value creation in the form of new products, services, or knowledge, which, when used, often produce new data feeding back into the ecosystem.
The study presented in this article examined the local data ecosystems of seven European cities and compared them to widespread expectations in policy and public discourses. The article presents the findings of a qualitative study that investigated the experiences and perspectives of local administrations regarding the establishment of data ecosystems. The study contributes to research on data-driven innovation in the public sector by comparing high-level expectations associated with data ecosystems with actual practices of data sharing and innovation at the local and regional level. In this study, we examine practices through accounts given by city representatives, who explain the practicalities they have gone through in establishing innovative data partnerships and projects, such as the technical, legal, social, and organizational facets they have addressed. These practices, in our view, lead to the real-world implementation of the abstract notion of data ecosystems.
The study examines experiences of and obstacles to data sharing, investigating what currently prevents the establishment of data ecosystems at local and regional levels. The empirical qualitative research on the practices of data innovation, including the obstacles and limitations, offers a lens through which to critically interrogate expectations on the free flow of data and its use to build better public services in Europe.
2. Local Data Ecosystems in the Literature
2.1. Defining data ecosystems
While the increased collection of data is a fact, their use for value creation is still just a possibility that depends on the interaction between the different organizations and individuals related to data in a given domain or sector with different roles. According to policymakers, the successful coordination among different actors and their willingness to share data are crucial to generating value from them. A widely cited market report by IDC (cited, for instance, in Kambies et al., Reference Kambies, Mittal, Roma and Sharma2017) claims that almost 90% of unstructured data are never used or analyzed, and a more recent report estimates that 68% of business data goes unleveraged (reported in Harris, Reference Hardinges2020). Furthermore, many analysts claim that unused data not only bring no value, but also generate additional costs (e.g., Experian, Reference Giest2017). The valuable use of data requires simultaneously attracting participants, through lower barriers to entry, and generating benefits and dependencies for all involved actors, who are often heterogeneous. The understanding of value creation, in the context of local data ecosystems, is twofold. It includes both monetary value, which could be generated from data exchanges that involve private sector actors, and especially social value, intended broadly as any kind of virtuous outcome that could be produced in the social world thanks to data (Muniesa, Reference Muniesa2017). In any case, we consider the creation of value as a possible consequence of data ecosystems and as a widespread assumption among policymakers, who often adhere to the “data as an asset” discourse (Fussell, Reference Fussell2023). Yet, we are aware that there is no mechanistic, linear logic linking data ecosystems and value creation.
Despite the growing interest in data ecosystems at the policy level, research on this topic is still in its infancy and under construction (Harrison et al., Reference Harrison, Pardo and Cook2012; Zuiderwijk et al., Reference Zuiderwijk, Janssen, van de Kaa and Poulis2014, Reference Zuiderwijk, Janssen and Davis2016; Oliveira and Farias Lóscio, Reference Oliveira, Barros Lima and Farias Lóscio2018; Oliveira et al., Reference Osterwalder and Pigneur2019). Some significant work has been done on the notion of “data collaboratives,” an umbrella term that refers to a wide range of agreements and partnerships for data sharing among a wide range of actors from private sector, public sector, and civic society (Verhulst and Sangokoya, Reference Verhulst and Sangokoya2015). Although a key element of a data ecosystem, data collaboratives more narrowly refer to the various coordination mechanisms established between private and public organizations to leverage data to address a societal challenge (Susha et al., Reference Susha, Janssen and Verhulst2017). They allow matching data supply and demand, integrating data from different sectors, sources, and institutions, for the implementation of innovative solutions to social problems (Susha et al., Reference Susha, Grönlund and Van Tulder2019). For instance, data collaboratives in urban contexts could help improve air quality in cities by increasing governments’ access to key information, enabling informed research and forecasting to tackle air pollution as well as the monitoring and evaluation of policies (Verhulst, Reference Verhulst2021).
The problem with defining data ecosystems starts with the ecosystem concept itself, which according to some authors (Suominen et al., Reference Zubcoff, Vaquer Gregori, Mazón, Maciá Pérez, Garrigós, Fuster-Guilló and Cárcel Alcover2016; Hyrynsalmi and Hyrynsalmi, Reference Helderop, Grubesic and Alizadeh2019) runs the risk of becoming a “Zombie Category” in the sense specified by sociologist Ulrich Beck (Beck and Beck-Gernsheim, Reference Bass, Sutherland and Symons2002). A “Zombie Category,” in Beck’s definition, is a concept or term which is already dead in content, but still alive and in active use. The term ecosystem was first used in biology (Tansley, Reference Tansley1935), where it came to denote complex biological systems in which all the organisms found in a given environment interact and co-evolve with each other and with the environment. Subsequently, the specific characteristics of the biological ecosystem concept have been transferred to other research contexts (Jacobides et al., Reference Latour and Lemonnier2018). The first use of the term outside biology is attributed to Moore and his concept of the “business ecosystem” (Moore, Reference Moore1993, Reference Moore1996). Later, Adner (Reference Adner2017) claimed that (a) the lack of boundaries and (b) the process of co-evolution and coopetition are the two characteristics that distinguish ecosystems from more traditional definitions of value chains, sectors, and industrial structures. The lack of clear boundaries of the ecosystem leads to different degrees of dependency and relationships between the actors, forming a heterogeneous and alternating member base. As a result, classic competition is supplanted by co-evolution and coopetition. Ecosystems are characterized by processes of continuous, interdependent development of multiple actors, given their cooperative and competitive relationships (Nalebuff and Brandenburger, Reference Nalebuff and Brandenburger1997; Moore, Reference Moore2006).
These characteristics are also valid for data ecosystems (see, e.g., Harrison et al., Reference Harrison, Pardo and Cook2012; Zuiderwijk et al., Reference Zuiderwijk, Janssen, van de Kaa and Poulis2014, Reference Zuiderwijk, Janssen and Davis2016; Oliveira and Farias Lóscio, Reference Oliveira, Barros Lima and Farias Lóscio2018; Oliveira et al., Reference Osterwalder and Pigneur2019). As argued by Oliveira et al. (Reference Osterwalder and Pigneur2019), however, there is little agreement about nomenclature and definition of data ecosystems. By combining different definitions, Oliveira et al. (Reference Osterwalder and Pigneur2019) claim that data ecosystems may be defined as a complex socio-technical network that enables collaboration between autonomous actors to explore data (Pollock, Reference Pollock2011; Ubaldi, Reference Ubaldi2013; Lee, Reference Lee2014; Zuiderwijk et al., Reference Zuiderwijk, Janssen, van de Kaa and Poulis2014). Such ecosystems provide an environment for creating, managing, and sustaining data-sharing initiatives (Harrison et al., Reference Harrison, Pardo and Cook2012; Ubaldi, Reference Ubaldi2013; Lee, Reference Lee2014; Zuiderwijk et al., Reference Zuiderwijk, Janssen, van de Kaa and Poulis2014). Similarly, Zubcoff et al. (Reference Zubcoff, Vaquer Gregori, Mazón, Maciá Pérez, Garrigós, Fuster-Guilló and Cárcel Alcover2016, p. 251) state that a data ecosystem “is made up of many actors and small organizational structures that should recognize data like the raw material that is in a cycle and is capable of feeding the ecosystem, providing benefits to all parties.”
2.2. Understanding data ecosystems in local and regional settings
The regional and local levels are often the places where concrete answers to complex challenges are provided. The subnational territorial level provides a fruitful environment for developing and analyzing data ecosystems, which then can be scaled and spread across different cities and regions with similar characteristics. Several studies have already shown how data can generate benefits in regional and local settings (Appio et al., Reference Appio, Lima and Paroutis2019), which have often been described as “smart cities,” as the collection and use of data are expected to result in more effective and efficient services (Del Bo and Nijkamp, Reference Del Bo and Nijkamp2011). For instance, governments might access sensor data on air or noise pollution to develop new policies, or data collected by ride-sharing companies and mobile phone operators to better plan urban mobility and related infrastructures (European Commission, 2020b; Verhulst, Reference Verhulst2021). Yet, the technocratic notion of a smart city has also been contested and reworked, as the various actors and stakeholders involved in managing urban issues have different goals, resources, and practices, which might lead to very different outcomes (Kitchin, Reference Kitchin2018; Löfgren and Webster, Reference Micheli2020).
Although the rhetoric “wants” smart cities and big data together, a significant portion of data produced within cities is still underused or not used at all. Many public and private organizations are hesitant to share their data. Some authors point to the lack of knowledge about the actual benefits of inter-organizational data sharing as one of the main obstacles, because it leaves organizations unmotivated to engage in data ecosystems (Oliveira et al., Reference Osterwalder and Pigneur2019; Gelhaar et al., Reference Gelhaar, Tan, Michael and Otto2021). Other factors include the lack of governance frameworks, which explains private companies’ reluctance to share their data with public sector organizations on the grounds of commercial confidentiality, privacy, and security concerns (Helderop et al., Reference Edwards2019; European Commission, 2020b; Micheli, Reference Micheli2022), and the lack of data culture or capacity, especially among public sector organizations (Giest, Reference Shelton, Zook and Wiig2017; European Commission, 2020b). More generally, the challenges result from the interaction between motivations and the structure of incentives, which appropriate coordination and collaboration strategies and mechanisms could steer in the desired direction (Susha et al., Reference Susha, Janssen and Verhulst2017).
To understand what is at stake in the establishment of local and regional data ecosystems, it is necessary to adopt a holistic view, which includes a discussion of the contextual conditions in which regions and cities operate (Meijer, Reference Mazzucato2018). Such a broader view makes it possible to consider the management, social, and institutional challenges related to data innovation. The main component of a data ecosystem is not the technological infrastructure, but the interaction and collaboration between the different actors. Government authorities, industry players, service delivery providers, and intermediaries are all involved in city data projects. Together, these organizations and their interactions are what has been conceptualized as a city data ecosystem (Gupta et al., Reference Gupta, Panagiotopoulos and Bowen2020).
Coordination of different actors (stakeholders) and relational processes is strategic (Susha et al., Reference Susha, Janssen and Verhulst2017). Actors’ relationships range from a relatively simple supply–demand chain to a more complex network of multilateral relationships. The former follows a “one-way street” model, in which data producers, such as governments, publish data to be processed by intermediaries such as app creators or analysts, before finally being consumed by end users (Pollock, Reference Pollock2011). In a data ecosystem, each actor is connected to multiple actors by a set of interests or business models. The ecosystem organizational structure entails the way actors are connected and the properties of their relationships.
Governments’ access to private sector data is a particularly problematic issue. The European Commission’s Data Strategy (European Commission, 2020a) considers the limited access by public bodies to private sector data as one of the main barriers to improving evidence-driven policy-making and public service provision. There are often conflicts of interest between private and public actors concerning data sharing, because the latter primarily aim to create public value, whereas the former tend to seek mere financial gains (Mercille, Reference Mercille2021).
Different kinds of relationships can be established between private and public actors for sharing data, which not only influence data quality and availability, but are also informative of specific governance models and power balances (Micheli, Reference Micheli2022). In the current data landscape, the private sector’s position is increasingly dominant and hard to pin down (Taylor and Broeders, Reference Löfgren and Webster2015; Mejias and Couldry, Reference Mejias and Couldry2019). Yet, local governments and public bodies could have a key role in fostering a more balanced data economy, for instance, by accessing private sector data and using it for socially relevant purposes, thereby redistributing its value across society (Bass et al., Reference Bass, Sutherland and Symons2018; Morozov and Bria, Reference Gelhaar, Tan, Michael and Otto2018; Mazzucato, Reference Mazzucato2018; Verhulst, Reference Verhulst2021). Thus, the establishment of local or regional data ecosystems can not only foster economic growth, but might also support the public sector’s mission and the promotion of a fairer data economy.
Stemming from the above considerations, the article increases understanding of the “actually existing” (Shelton et al., Reference Shelton, Zook and Wiig2015) local data ecosystems, moving beyond the promises and expectations that surround this concept (Meijer, Reference Mazzucato2018). Drawing from the findings of a project that analyzed data ecosystems in seven cities, the article inquires how the idealized notion of a socio-technical network of actors, which develops from data-sharing relations, is brought into practice. In other words, it asks how local actors have operationalized data ecosystems through data sharing, management, and innovation, taking into account technical, legal, social, and organizational aspects. To do so, the article addresses the obstacles and complexities that underpin data sharing across actors and sectors within local and regional settings, considering the specific organizational contexts, as well as the perspectives of those directly involved.
The rationale of this research is inspired by science and technology studies (STS), which have been exploring the social construction of technologies occurring through negotiations among “relevant social groups,” and the consequences of technologies’ affordances (for a discussion on how materiality has been accounted for in STS and media studies, see Lievrouw, Reference Lievrouw, Gillespie, Boczkowski and Foot2014). The local data ecosystem is a relatively new concept, embedded in narratives and imaginaries, that is still “seeking to become real” (Edwards, 1997). As our brief literature review highlights, there is not yet a fully articulated consensus on what the concept means and how it is implemented. Heterogeneous actors might be involved in setting up a different range of data-sharing agreements and infrastructures. Practitioners are concerned with bringing progress to the development phase, producing tangible results, understanding enablers, and overcoming obstacles. Yet, the local data ecosystem is still a new “technological artifact” in the making, whose “closure” has not yet occurred, and little is known about the socio-technical configurations it can take and their outcomes (Pinch and Bijker, Reference Pinch and Bijker1984; Latour, Reference Latour and Lemonnier1993). Our research is an attempt to empirically investigate the ongoing process of technological change concerning the establishment of data ecosystems in local and regional contexts, considering the intertwined involvement of social agents and artifacts (Lievrouw, Reference Lievrouw, Gillespie, Boczkowski and Foot2014).
3. Methodology
3.1. Selection of cities
The selection of cities for the study followed a pragmatic approach, without the aim of selecting a representative sample for comparison. As a first step, we identified 10 European cities which could provide significant qualitative insights on the current status of local data ecosystems. Cities were identified following five criteria: (a) Convenience: the availability of pre-existing relationships with cities’ representatives on the topic to be addressed. (b) Relevance: we considered cities that could ensure a successful collection of information, because it was more feasible to collect documents and organize interviews about recent initiatives on the use of data undertaken at the local level. (c) Geographical distribution: we aimed to identify a group of cities located in different European geographical areas. (d) Population: we aimed to identify a group of cities of different sizes, in order to investigate case studies from both small cities and more heavily populated ones. (e) Engagement: we defined an initial list of 10 cities, and representatives from each city were contacted in order to assess their engagement. All the cities were afterward invited to join a dedicated workshop aimed at understanding their interest, motivation, and readiness to get involved in the data ecosystem analysis. At least one policy officer and one technology specialist from each of the 10 cities attended the workshop. Design thinking was used as the primary engagement strategy to understand how each city might contribute to the project, and it led to the selection of the final list of seven cities in which we performed the data ecosystem analysis. As mentioned, the cities selected are not meant to be a representative sample for comparison at EU level. For each city, we carried out the data ecosystem analysis through different case studies, which were based on preferences expressed by cities’ representatives.
By using these criteria, the following cities have been included in the study (Table 1).
a Source: Eurostat—URB_CPOP1, latest data available at https://ec.europa.eu/eurostat/databrowser/view/urb_cpop1/default/table?lang=en.
b It should be noted that for Barcelona the case selected was an instrument of supra-municipal scope (the Housing Observatory of Barcelona) led by various administrations, namely the City Council of Barcelona, the Province, and the Region of Catalunya.
For each city, specific projects and initiatives were reviewed and discussed with city representatives. Examples include Milan’s Digital Transformation Plan (https://www.comune.milano.it/documents/20126/128206432/Piano+di+trasformazione+digitale.pdf/dd03211d-1a95-b528-778b-5a84dc3519f4?t=1595496672278), the Barcelona Metropolitan Housing Observatory (O-HB) (https://www.ohb.cat/?lang=en), the Smart Poznań Mobile App (https://www.poznan.pl/mim/smartcity/en/application-smart-city,p,25877,58168.html?wo_id=684), and Helsinki’s overall Data Strategy (https://digi.hel.fi/english/helsinki-city-data-strategy/), which includes the project “Helsinki 3D” that uses data collected by sensors and private repositories to support decision-making. These and other ongoing projects particularly influenced the authors’ choice of cities under the relevance criterion.
3.2. Semi-structured interviews
The in-depth interviews conducted with 19 representatives of the seven selected European local administrations were complemented with documentation obtained from the cities. The interviewees were mainly policy officers, technology specialists, or heads of unit of the digital or smart city departments. No private actors were interviewed: some were invited by the authors but declined to participate in the study. The interviews, which lasted around 45–60 min each, were conducted online by the authors. We adopted a semi-structured approach. A general interview guide outlining the main topics was developed, instead of a detailed questionnaire, and it was used for all the interviews (Adams, Reference Adams2015). The interview guide covered different components of data ecosystems found in the literature (ecosystem actors, data governance models, i.e., ecosystem’s rationale, value proposition, incentives and business model, legal context, ecosystem maturity, technology used, coordination mechanisms, and future developments), which were incorporated in a framework by Martin et al. (Reference Monge, Barns, Kattel and Bria2021) derived from a business model canvas (Osterwalder and Pigneur, Reference Susha, Grönlund and Van Tulder2010). The interviews illustrate how cities’ managers and other representatives experience the creation, the development, and the future opportunities of data ecosystems; we then compare these findings with the literature and with the expectations about how data ecosystems should be.
3.3. Analysis
We adopted an original conceptual framework to systematize the empirical insights gathered through the interviews. This approach is useful to examine complex socio-technical systems, in which knowledge spread across different bodies of literature needs to be pulled together to provide a holistic understanding of a given phenomenon. The process to analyze the evidence collected followed three steps. First, the full interview transcripts were structured using the five dimensions of the framework described below. Second, we conducted a cross-comparison of the cases for each of the dimensions to highlight similarities and differences among the cities analyzed. Third, we extracted three main cross-dimensional topics to draw the main conclusions illustrated in Section 4.
Our conceptual framework includes five dimensions, some of which are derived from the literature, that allowed a cross-comparison of the cases despite their differences. It is informed by recent literature on data governance and by the broader field of STS. Moreover, the interview guide used already includes these dimensions, so there is a direct link from the data collection to the analytical framework developed. Overall, the framework was a conceptual tool, adopted to systematize information related to the components of the data ecosystems collected through interviews and document analysis. The framework is composed of the following five pillars:
1. Actors of the ecosystem. Actors play different roles and hold different responsibilities in enabling the exchange of data. Our assessment of actors in each local context was based on different categories, such as data stewards, intermediaries, and beneficiaries, which were derived from the Open Data Institute (ODI) methodology for mapping data ecosystems (https://www.theodi.org/wp-content/uploads/2022/04/2022_ODI_Mapping-data-ecosystems-2022-update.pdf).
2. Incentives. A data ecosystem can be developed and expanded if the appropriate political, economic, and organizational conditions are present. Such conditions enable the ecosystem and its proper functioning. As Zubcoff et al. (Reference Zubcoff, Vaquer Gregori, Mazón, Maciá Pérez, Garrigós, Fuster-Guilló and Cárcel Alcover2016) state, data ecosystems should provide benefits to all parties. Therefore, fair and reasonable incentive and revenue distribution mechanisms are important for reliable cooperation and sustainable ecosystem development.
3. Data governance models. In a data ecosystem, actors base their relationships on principles and coordination mechanisms that may follow different data governance models identified in the literature (Susha et al., Reference Susha, Janssen and Verhulst2017; European Commission, 2020b; Micheli et al., Reference Micheli, Ponti, Craglia and Berti Suman2020). We draw on the social science-informed definition of data governance advanced by Micheli et al. (Reference Micheli, Ponti, Craglia and Berti Suman2020), which emphasizes the power relations between actors that are affected by, or influence, the way data is accessed, shared, and used.
4. Perspectives of social actors. Actors of an ecosystem have different opinions, motivations, and concerns regarding their relationships in the ecosystem and the different approaches used for data sharing and data governance. We include this aspect in the pillars because examining actors’ perspectives is a means to assess practices, as well as drivers and challenges for establishing data ecosystems in local contexts. As emerged in the literature, ecosystems are made of autonomous actors seeking collaboration (Oliveira et al., Reference Osterwalder and Pigneur2019), but actors involved in data sharing might also have diverging or conflicting interests (Micheli, Reference Micheli2022). Therefore, their perspectives may be conflicting and need to be investigated to understand how social agents currently shape the establishing of data ecosystems (Pinch and Bijker, Reference Pinch and Bijker1984; Gelhaar et al., Reference Gelhaar, Tan, Michael and Otto2021).
5. Technologies and interoperability mechanisms. Data ecosystems are socio-technical networks that rely on a technological infrastructure to work and to practically enable data sharing and interoperability among different actors. We include this pillar to examine the “materiality” of current data ecosystems at the city level (Lievrouw, Reference Lievrouw, Gillespie, Boczkowski and Foot2014). On the one hand, technical infrastructures originate from specific (power) relations between actors in certain contexts, which is a crucial aspect already included in the business canvas used for the interviews (Martin et al., Reference Monge, Barns, Kattel and Bria2021). On the other hand, they might have significant consequences in terms of ecosystems sustainability and of urban digital service delivery.
4. Findings: Expectations Versus Actual Practices
The experiences collected through the fieldwork described above can be summarized along three main lines. First, the limited involvement of private sector organizations as actors in local data ecosystems through emerging forms of data sharing (primarily relating to pillars 1 and 3 outlined above, i.e., actors and data governance models). Second, the concern over technological aspects and the lack of attention to social or organizational issues (above all relating to pillars 2, 4, and 5, i.e., incentives, perspectives, and technologies). And third, a widespread decision to apply a centralized rather than a federated digital infrastructure (addressing pillars 3 and 5, i.e., data governance models, and technologies and interoperability). We elaborate on each of the three themes below.
4.1. Heterogeneity of actors and B2G data sharing
The literature posits that ecosystems lack clear boundaries, which leads to a heterogeneous, alternating, and fluid membership base, with different degrees of dependency and relationships between the actors (Adner, Reference Adner2017). Similarly, the interviewees expected that different stakeholders are structurally involved in the data ecosystems and plan to develop “cooperation mechanisms” with external actors. They all expect to incorporate private sector data to further develop the data ecosystems. However, the heterogeneity of actors is not yet so evident and wide in the seven case studies analyzed, and local administrations are rarely establishing data governance models that include actors other than public bodies. Among the cases, there is one local data ecosystem in which private companies do not participate in data sharing, but only act as intermediaries providing technical assistance; three with partial private sector membership, mostly limited to public utilities or chambers of commerce; and one regional data ecosystem with limited private sector participation (i.e., the involvement of private actors in the Barcelona Housing Observatory only concerns two companies, one of which does not even provide data to the ecosystem). Only the two local ecosystems of Helsinki and Bordeaux have a much wider and more heterogeneous membership, as both cases show a coordination of various public sector organizations, or different departments. Given the large number of actors that populate any urban environment and multilevel governance, a greater heterogeneity is usually expected by policymakers within local and regional data ecosystems.
The findings show that building collaborations only within public sector boundaries is also a very challenging task. Especially where the legacy of organizational and information silos is strong, managing to link different departments is an important achievement, as testified by the name given to the City of Rome’s initiative during one interview (“ecosystems of ecosystems”). Silos have been described as “isolated databases that do not communicate and exchange data and information between departments.” Considering this, a cross-departmental joining up of data is understood by the interviewees as a considerable result. Indeed, as highlighted by interviewees in Barcelona, bringing together data and insights from different departments and/or administrative levels is an important and strategic first step toward evidence-based policymaking and public service provision. This enables a more holistic picture of the problems of the city as a whole and of the needs of citizens, businesses, research institutions, and other societal actors. Furthermore, the cases of Helsinki and Bordeaux show that initiatives for the coordination of public sector organizations and different departments, launched to eliminate barriers for recipients of public services, were instrumental for the expansion of the ecosystem membership to external actors. In both cases, different departments of the city administration worked together and provided data to, and used data from, the data ecosystem.
The seven cases analyzed confirm that business-to-government (B2G) data sharing is considered by most of the participants in the interviews an obstacle to overcome for further development of city-level data ecosystems, capable of producing the expected benefits in terms of financial and public value and of citizens’ well-being. The lack of public–private partnerships suggests that public procurements remain the main instrument and that another embedded framework of incentives to attract private companies is yet to be developed. Often the contractual solutions for public–private agreements are not optimal, as they create transaction costs and do not bring with them the trust of less formal, but stronger, forms of collaboration. This is the case of the city of Poznan, where the concern is that the private data platform provider owns the code of the urban data platform, which creates limitations to the public administration which would like to scale up the project at the regional level.
To overcome the obstacles in B2G data sharing, the City of Milan is evaluating which policies and terms of services could be adopted to obtain data collected by private companies. The city managers are exploring different possibilities, both in the case of companies providing a public service on behalf of the Municipality—thus making a somewhat stronger case for the Municipality to get hold of the data—and also in case companies are collecting and managing data for purely commercial purposes. Collaborations attempted so far are limited to pilot projects in the mobility sector, where the Municipality of Milan has signed contracts with private companies with ad hoc tender clauses to maintain sovereignty over data having public value.
In the case of the Barcelona Housing Observatory (OH-B), social and political motivations (corporate social responsibility and the objective of improving institutional relations) led a large private company, Airbnb, and other private rental portals to sign important agreements to establish a formal public–private collaboration. The rental portals provide data on the supply and demand of rental properties to the OH-B, while they obtain relevant analysis on the housing sector carried out by the OH-B. The agreement is not of an economic nature; rather, it is approached as a “win–win” data partnership (Micheli, Reference Micheli2022), as both organizations benefit from the collaboration. Airbnb, instead, signed an agreement, which only requires the company to take down illegal notices and does not include the provision of data to the city. Despite these isolated cases, the OH-B, like other cities, considers collaboration with private companies very difficult, as it was successful in only three instances. Interviewees explained that it is often difficult to align public and private objectives. The objective of the OH-B is to reduce information asymmetry vis-à-vis the private sector, by creating an extended data ecosystem joining together public bodies at different administrative levels. Small businesses can benefit in the same way as private individuals with more information and access to affordable rents. However, the main beneficiary behind a B2G data sharing would be the public administration, and the benefits for the private actor seem less immediately tangible and appealing.
It is precisely to overcome the skepticism toward data sharing that the City of Rome has experimented with a service-oriented approach in a few pilot interactions with private actors. Instead of simply asking for data, the city preferred to first enquire about the sort of insights that these private actors would have liked to extract from them. The city then developed on-demand solutions to address these needs, making the benefit of B2G data sharing for private actors more evident. While Rome has admittedly adopted this approach also due to budgetary constraints—since these pilot, on-demand initiatives certainly require lower investments when compared with fully fledged infrastructural developments—these pilots have certainly also been successful in increasing private actors’ willingness to share data.
4.2. Technological focus more than social and organizational
From the interviews conducted in the seven cities, it emerged that the technological perspective of the data ecosystems captures the attention of cities’ representatives more than the socio-technological one and the related relationships’ aspect, which are instead the most discussed in the academic literature. This was not a surprising finding, as the technical aspects associated with data sharing, while highly important for enabling data sharing in cities, are facilitated by the abundance of software tools, open standards, architectures, and other IT solutions. Among the case studies, only the Helsinki Data Strategy is multidimensional and focuses equally on technology, the culture of sharing, and governance (including managing multi-actors’ relationships). The most frequently cited barriers for establishing data ecosystems across the seven cases are technological, related to the General Data Protection Regulation (GDPR) and other legal aspects, and to lack of in-house capacity and skills. Cultural barriers and relational issues are cited much less frequently. Among the social and organizational challenges considered by the study participants, the internal coordination among different departments was mentioned as a common barrier in the development of data ecosystems. Cooperation between different departments entails barriers in terms of different cultures and skills, but also internal political reasons. In some cases, there are also technological problems in terms of interoperability, availability, or quality of the data provided. In other cases, as in the City of Poznan, different levels of technological literacy among departments and public institutions are an important barrier for the development of the ecosystem. According to the representatives of the municipality of Santander, silos and working procedures hamper the implementation and improvement of the Open Data Portal and related services.
Moreover, different departments have different levels of maturity regarding the use of datasets and the concept of open data. Therefore, despite the guidance provided by the Innovation Area of the city, which is a special unit within the larger ICT department, not all departments are able to collect data and integrate them into the platform in the same way and with the same efficiency.
The Observatory of Barcelona (OH-B) has a similar problem because, according to the interviewees, the collaboration between different administrations creates inefficiencies and data incompatibilities. While the City Council of Barcelona is a very advanced data ecosystem, with a large amount of high-quality data, the other administrations involved in the Observatory are at a lower maturity level, with less and lower quality data. This asymmetry is problematic as it reduces the opportunity for advanced and homogenous data analysis. This challenge, which was raised directly by interviewees, has also been explored in-depth in the literature by Kitchin et al. (Reference Kitchin and Moore-Cherry2021), as the authors analyzed the complex case of Metro Boston (US), where they concluded that a fragmented administration has profound negative effects on the urban data ecosystem.
Participating cities often reported an unfavorable data culture based on a siloed approach. This is, for instance, the case of Bordeaux-Métropole, where initially they had different databases for every application and a very traditional culture of IT infrastructure in the administration, which was hampering the development of a data warehouse (i.e., containing structured, processed data that is part of specific solutions). The development of a data ecosystem with different departments was launched to overcome these challenges. In that case, however, the problem is not only technological, as was initially reported by the interviewees. In fact, there are also organizational challenges, as the IT department requires alignment and coordination with several stakeholders and gatekeepers (internal and external), which leads to very slow development and changes to the data warehouse.
4.3. Centralized versus decentralized data infrastructures
Given the multitude of European initiatives (ref. IDSA, GAIA-X, and European strategy for data) that promote federated data initiatives, we would have expected existing city infrastructures to be federated. In fact, the diversity of contexts and data-sharing practices in cities would be appropriate for the interconnection of several different components, each of which reflects the legacy and particularities of specific actors of the local context. However, despite the specific differences, most of the cities reported a similar, and opposite, path in the urban digital transformation, which starts from the integration of data from different sources in a centralized rather than decentralized infrastructure.
Generally, most interviewees reported that the first step is the development of a data repository that pulls together data of different types and from different sources in a centralized infrastructure that represents the starting point for future analytical work. This is certainly beneficial for public authorities, but it can also benefit external stakeholders in case they gain access to those databases, as in the case of Barcelona. The initial goal of many data initiatives is the collection and storage of data in a single repository, overcoming data silos and resistance to sharing from other departments or third parties. As a second step, most of the cities are planning to carry out more advanced analytical work to “extract value from data,” as suggested by the interviewees in Barcelona. In many cases, representatives of the administrations revealed that the analytical work on the data has not started yet, or is only in its infancy.
For instance, the Municipality of Milan is advanced in the development of a data lake infrastructure (i.e., containing unstructured and raw data with no specific use identified), which will make data available to all departments for processes of data analytics, data description, and data visualization. Similarly, in the case of Helsinki, data centralization is a defining feature of the data infrastructure. Interestingly, during its development, the city created various separate data lakes, one for each municipal department, in which the collected datasets were stored and not related to the others, without common metadata or cleaning and harmonization processes. Over time, however, this structure in silos was dismantled and a single data lake was created. The resulting benefits are evident, in terms of centralization of data for policymakers and in terms of transparency toward citizens, who can access a wide variety of datasets in numerous fields. Only the data collected via sensors (e.g., traffic sensors, security cameras, digital thermometers, etc.) are temporarily stored in a separate data lake, where they remain until the most relevant are selected, then cleaned and harmonized. Once these steps have been completed, the created datasets are integrated into the data lake.
The Bordeaux-Métropole data warehouse is bringing together approximately 600 datasets from various departments of the 14 municipalities in the region, and from several private/external service providers. Currently, the focus of the administration is building this data warehouse and developing an associated data governance strategy. The goal for the data warehouse is to store anonymized data that can be reused by different departments or ICT application developers. However, as in the other cases, the development of the data warehouse is not the ultimate goal. The interviewee reported that in the future the focus will be on identifying ways to leverage the data warehouse and provide analytic tools to assist decision makers. Similarly, the Poznań Smart City team is currently gathering data from different use cases and departments. The future interest will be in analyzing this data and turning it into actionable information.
Another example is the City of Rome, where the City Data Platform represents a common repository for different departments, where databases are shared. The overarching goal of the data platform is to centrally administer and use data “trapped” in siloed vertical ecosystems, by creating a common space, which can enable services based on data analytics. As mentioned above, the ultimate goal, according to city representatives, is to develop a service-oriented platform capable of offering on-demand data analytics or visualization services to both internal departments and private stakeholders. In other cases, where a single data repository with multiple databases for a specific initiative has been developed, such as the Housing Observatory of Barcelona, the analytical work is conducted by third parties, such as research institutions that collaborate with the administrations. The results are then used by policymakers to improve housing policies. Another relevant case is the City of Santander, which has developed an Open Data Platform where municipal departments and public utility companies have been integrating datasets, publicly available free of charge. As a second step, city representatives aim at transforming this platform into a Data Marketplace, where private businesses, citizens, and authorities can provide and exchange data.
5. Discussion and Conclusions
The article presents the findings of a qualitative study on local and regional data ecosystems involving seven European cities. From the analysis of the perspectives of city practitioners, directly involved in setting up new data-sharing approaches and infrastructures, three main themes have emerged that highlight key critical areas in which actual practices for establishing data ecosystems to a certain extent diverge from widespread expectations in this field. These issues deal with (a) the actors involved in data-sharing relations within city data ecosystems and the relations established with private sector entities, (b) the kind of concerns of public sector organizations vis-à-vis the establishment of data ecosystems, and (c) the type of infrastructures implemented to support data sharing and storing.
One of the key findings concerns the lack of heterogeneity among the actors involved in the cities’ data ecosystem. An overarching result from the current study, in fact, has been the prevalence of a one-sided perspective in local and regional data ecosystems. Although the research project had the goal to map all actors within the ecosystems and the relationships between them, the seven cases highlighted that the most active stakeholders were local governments. The study participants shared the view that with evolution and maturing of such ecosystems, the centrality of a single actor is expected to change, as more stakeholders will be involved. At the present time, however, city governments stand out as key stakeholders, who might act as promoters of innovative approaches for data sharing and use for the public interest. The result confirms the challenges local administrations face in establishing data partnerships with private sector entities (European Commission, 2020b), as well as the limited involvement of citizens and civic society in the data innovation practices of local administrations. The participants addressed in particular the first issue: they understood the challenges in forming business-to-government data-sharing relations as an obstacle, which they had to overcome for the development of city-level data ecosystems capable of producing economic and social benefits.
City representatives addressed their experiences concerning data sharing with private sector organizations in two divergent ways, confirming and expanding previous knowledge on the topic (European Commission, 2020b; Mercille, Reference Mercille2021; Micheli, Reference Micheli2022). For most cases, these relations were based on public procurement and were seen as “not optimal” due to transaction costs, lack of frameworks, and budgetary constraints. By contrast, cities described positively their experiences of “win–win” data partnerships based on mutual interests and collaboration with companies. These were less common but were praised for being more successful by aligning the interests of both partners and not weighing on the budget of local administrations. Therefore, cities aim at establishing win–win data partnerships with private sector partners, such as by experimenting with service-oriented approaches providing “on-demand” solutions.
The evidence collected suggests that cities also struggle with building relations within public sector organizations and departments, which is understood as a challenging task by representatives of local administrations. Contrary to expectations, projected toward advanced and extended networks of actors from different sectors, the reality of most cities is instead a day-to-day struggle to break silos and to build bridges and relations around data among departments and public offices. The only two cities involved in the current study that managed to establish a wider and more heterogeneous network were those in which local administrations were also able to build an extended intra-municipal network, coordinating data sharing with departments and public offices. Although the results cannot be generalized, the association between a better internal data ecosystem (only among public sector actors) and the increased capability of establishing B2G data-sharing relations (between public and private sector actors) is worth noting. This finding suggests that increased coordination and cooperation, through data sharing, among a city’s public sector departments and organizations could be a prerequisite and a strategic “first step” toward establishing wider data ecosystems.
Another gap between expectations and realities concerns the approaches adopted for data management and integration. While federated data-sharing initiatives are suggested as preferable by different prominent European initiatives (ref. IDSA, GAIA-X, and European strategy for data), our analysis of the seven cities’ data ecosystems shows a different picture. Local administrations use different technologies and architectures for sharing the data, but all of them are, to a large extent, centralized and steered by the city authorities. This centralized approach has obvious advantages, especially for the authorities, who act as gatekeepers and orchestrate the development of the city data infrastructures. At the same time, the benefits from this setting might be sub-optimal, in addition to the risks of being locked into a particular proprietary technology or cloud infrastructure. The cities that took part in the study understood centralized data integration (such as setting up a data warehouse) as a first step toward performing more advanced analytical work. In their view, the establishment of a single repository would finally allow them to break silos, set the basis for turning data into actionable information, and prepare the ground for more members to join the ecosystems.
Our findings highlight that city governments understand local data ecosystems as the result of a step-by-step process. To be able to set up data ecosystems with a wider network of actors, cities feel the urgency to enhance data sharing internally first, breaking technological silos across departments and public offices (for instance, with centralized data infrastructures). Informed by this finding, future empirical research into the practicalities of local data ecosystems could adopt a longitudinal perspective and explore the temporal dimension, for instance, by examining in more depth the steps needed to progress toward the establishment of wider and heterogeneous socio-technical networks for data sharing and use.
The actors’ focus on technical matters over organizational ones is another remarkable finding of the study. The reasons for such emphasis could depend on several factors; for instance, technicalities are an easier area to concentrate on than the social issues that are connected to the appropriate use of technology. However, the observation that social and cultural aspects are not addressed as much as the technical ones does bear substantial risks. First, a lack of understanding of the underlying societal and behavioral system might result in wrongly perceived user requirements. Consequently, a technical solution might not address the actual issues at hand. Second, without rooting the practice of data sharing in societal structure and a cultural shift of all actors involved, there is a risk of failing to adopt the desired governance approach. Third, focusing on technical solutions might introduce a dependency on the technology used, and lead to vendor lock-in or lack of flexibility regarding technological changes.
To conclude, the findings of this empirical study could inform policymakers on the key areas to address and support with dedicated measures and interventions. An overall recommendation that derives from the current study is the need to enhance the capacity of local governments to deal with data, not only at the technical level, but also at the legal, organizational, and cultural ones, which tend to be overlooked by cities as well. Such improved capacity could be a prerequisite for setting up wider and sustainable data ecosystems that enable the creation of better public services and data-driven policies. As the findings show, “data governance skills” are needed for setting up data-sharing relations both within and across single organizations. Local administrations need to develop skills for, and be supported in, understanding the incentives of other stakeholders to share data with them, establishing collaborative “win–win” relations and coordinating with partners, setting up ad hoc legal instruments or on-demand solutions for data sharing, and understanding what opportunities data has to offer. Even if technical capacity is required for setting up the appropriate data infrastructures, it needs to be complemented with data skills in the legal, organizational, and cultural realms.
Acknowledgments
The article is part of a research project “Regional and local data-driven innovation through collective intelligence and sandboxing,” implemented by the research team of the consortium consisting of Open Evidence, Intellera Consulting, Technopolis Group, and Open & Agile Smart Cities (OASC), in collaboration with the European Commission Joint Research Centre (Unit B.6 Digital Economy). The results of the article have benefited from the contribution of other researchers, to whom the authors are extremely thankful: Carlo Montino, Emanuele Rebesco, Cornelia Dinca, and Morten Rasmussen.
Funding statement
This research received no specific grant from any funding agency, commercial or not-for-profit sectors.
Competing interest
The authors declare no competing interests exist.
Author contribution
Conceptualization: M.M., S.S., A.K., C.C.; Data curation: M.G.; Methodology: C.C., G.L., M.G.; Reviewing and editing: M.M., S.S., A.K.; Writing original draft: G.L., C.C., M.G. All authors approved the final submitted draft and revised version.
Data availability statement
Data availability is not applicable to this article as no new data were created or analyzed in this study.