Policy Significance Statement
It is necessary to widen access to valuable data to contribute to the public good and address a persistent data asymmetry in society that disempowers consumers and consolidates the power of a few market players. Given that most data are generated by the private sector, a critical policy challenge is to increase trust in data sharing among firms. Yet, when offered opportunities to engage in data sharing, firms can hesitate because they perceive that they lack sufficient control over their data in the face of unmanageable risks. This paper addresses this challenge by proposing, analyzing, and assessing the trustworthiness of seven trust-enabling mechanisms.
1. Introduction
Society needs to urgently broaden access to valuable data generated in the private sector to contribute to the public good (Kirkpatrick, Reference Kirkpatrick2013; Alemanno, Reference Alemanno2018; Susha et al., Reference Susha, Rukanova, Ramon Gil-Garcia, Tan and Hernandez2019a). Fortunately, data can be exploited multiple times by different actors “without reducing the amount of data available to anyone else” (Tonetti and Jones, Reference Tonetti and Jones2020, p. 2819). Data sharing is expected to accelerate scientific research in key areas such as medicine and environmental sciences (Tenopir et al., Reference Tenopir, Allard, Douglass, Aydinoglu, Wu, Read and Frame2011), while also proving beneficial in addressing global challenges such as pandemics and migration (Spyratos et al., Reference Spyratos, Vespe, Natale, Weber, Zagheni and Rango2019; Iacus et al., Reference Iacus, Santamaria, Sermi, Spyratos and Vespe2020). From an economic perspective, the aggregation of data from different sources has the potential to generate economies of scope, which exist when the cost of production is lowered through diversification (Teece, Reference Teece1980). Instead of keeping data separated in silos, two or more parties can share complementary datasets to derive more insights and value while decreasing the total amount spent per dataset (Martens, Reference Martens2021). In short, data sharing is practically feasible, economically viable, and advantageous in numerous instances.
Most data sharing occurs in a bilateral Business-to-Business (B2B) context, as businesses possess the necessary technological means and a unique position in society to collect, retain, share, and profit from data generated by a large number and range of actors (Spiekermann, Reference Spiekermann2019). However, despite the abundance and value of datasets, widespread data sharing and trading remain limited (Koutroumpis et al., Reference Koutroumpis, Leiponen and Thomas2020). The status quo threatens to further widen a data asymmetry in society that disempowers consumers and consolidates the power of a few market players (Alemanno, Reference Alemanno2018; Zuboff, Reference Zuboff2019). Thus, unleashing the potential of private sector data sharing becomes a critical policy challenge (Koutroumpis et al., Reference Koutroumpis, Leiponen and Thomas2020).
Recognizing the risks and benefits at stake, the European Commission designed the European strategy for data, which includes the creation of common European data spaces for sector and domain-specific initiatives to spur competition and innovation while addressing societal challenges and improving public services (European Commission, 2020a). Essential to the strategy are the Data Governance Act and Data Act, which have entered the implementation phase and govern two regulatory models for data sharing—voluntary and mandatory.
While opening channels for businesses to share data is an integral tactic to this strategy, a significant barrier arises due to the distrust among firms that sharing data might harm their business interests. To overcome this trust deficit, policymakers can implement “trust-enabling mechanisms,” which can be succinctly defined as measures and tools to build business confidence around data sharing.
1.1. Trust as the glue for data sharing
In the official communication of the European strategy for data, lack of trust among businesses is cited as a major cause for the suboptimal levels of data sharing (European Commission, 2020a). The rationale behind the policy interventions of the Data Governance Act and Data Act is rooted in the recognition that without enough legal clarity about who can use the shared data for what objectives, businesses will fear that their data will be used for unagreed purposes by third parties against their business interests. This deficit of trust is present in both B2B and Business-to-Government (B2G) data sharing relations (Klievink et al., Reference Klievink, Van Der Voort and Veeneman2018; Spiekermann, Reference Spiekermann2019).
Trust is understood as a “relationship in which an agent (the trustor) decides to depend on another agent’s (the trustee) foreseeable behaviour in order to fulfil his expectations” (Taddeo and Floridi, Reference Taddeo and Floridi2011, p. 1). In this context, the trustor is a private firm that holds valuable data but is vulnerable to the behavior of a trustee, who might defy its expectations by exploiting data at its expense. Because data sharing is still a relatively new phenomenon, this lack of trust in data sharing could be attributed to the absence of opportunities to cultivate trust in the past. Yet, building trust will continue to be particularly challenging in scenarios where data sharing creates unmanageable conditions of uncertainty for the trustor.
Under such circumstances, the “perceived risk” by a firm, which refers to the perception of experiencing a loss as a result of an uncertain event, ought to be taken into account (Agahari et al., Reference Agahari, Ofe and de Reuver2022, p. 1582). Equally important is the notion of “usage control,” or a firm’s perceived capacity to influence how the data are used by others (Zrenner et al., Reference Zrenner, Möller, Jung, Eitel and Otto2019, p. 478). Perceived risk inhibits trust while perceived usage control enables it, so balancing the two is necessary. Nevertheless, the notion of “trust” transcends interpersonal relations; in this context, it refers to the subjective beliefs of decision-makers in a company. Therefore, a more comprehensive understanding is required to account for the dynamics of trust production when an external agent, like a regulator, implements trust-enabling mechanisms to instil trust.
While understanding businesses’ risk and usage control perceptions toward trust-enabling mechanisms is key, it is also crucial to assess the trustworthiness of these mechanisms and of the regulatory attempts that rely on them to produce trust. Thus, the paper considers whether such measures and tools can operate in a fair, equitable, and transparent manner, which opens up the space to examine the institutional dimension of trust production (Bodó, Reference Bodó2021).
1.2. Methodological remarks
Addressing the question of how to overcome the barriers to enable trust in data sharing scenarios involving businesses becomes crucial to contributing to the success of the European strategy for data and similar initiatives. In light of this challenge, the paper poses the following guiding research question: Can trust-enabling mechanisms to increase data sharing be trusted?
Section 2 presents the problem statement by illustrating how three types of common risks—competition, privacy, and reputational—generate distrust when businesses engage in three emerging approaches to sharing data: data marketplaces, data collaboratives, and data philanthropy, respectively. The selection of these exemplary cases has been made taking into consideration that, despite being perceived as avenues to unlock the value of private sector data, they have failed to scale to a satisfactory extent (Lev-Aretz, Reference Lev-Aretz2019; Spiekermann, Reference Spiekermann2019; Ruijer, Reference Ruijer2021).
Section 3 starts from the premise that, in order to increase trust, it is essential to first enable it by increasing the usage control perceived by firms and mitigating perceived risks. Given the multidimensional complexity of data sharing and the interplay of technological and social factors that shape trust in this practice, this is expected to be achieved by exploiting technological, legal, and organizational trust-enabling mechanisms. Seven diverse mechanisms have been selected from grey and academic literature and analyzed for their potential to increase data sharing. From a critical angle, their capacity to operate in a trustworthy manner is also assessed.
Finally, Section 4 discusses the regulatory context in the EU, applying the insights drawn from the previous sections to the Data Governance Act and Data Act and assessing the regulatory attempts to build trust around data sharing.
2. Perceived risks
Among the most common risks faced by businesses hindering data sharing are competition, privacy, and reputational risks. To exemplify how these risks can affect a firm’s behavior, they are introduced, respectively, in the context of three emerging models for sharing data: data marketplaces, data collaboratives, and data philanthropy. It is worth noting that the list of risks extends well beyond those discussed in this section and that there are important synergies among them: they can compound one another.
2.1. Data marketplaces and competition risks
As multisided platforms for connecting data holders with data users, data marketplaces facilitate the collection and aggregation of data from various sources, allowing it to be processed, refined, and traded (Stahl et al., Reference Stahl, Schomm, Vossen and Vomfell2016; Abbas et al., Reference Abbas, Agahari, van de Ven, Zuiderwijk and de Reuver2021). Well-functioning data marketplaces are perceived as essential for unlocking the value of business data across sectors (European Commission, 2018). However, their unsatisfactory growth shows that, despite the possibility of making profits by monetizing their data, firms continue to be discouraged by persistent competition risks (Spiekermann, Reference Spiekermann2019). Although maturity differs significantly across industries, data marketplaces in the automotive industry are relatively well developed (Bergman et al., Reference Bergman, Abbas, Jung, Werker and de Reuver2022), offering an example to analyze how competition risks work in practice.
Connected cars produce massive amounts of in-vehicle data that businesses can use to optimize performance, ensure safety, enhance automation, and more (Siegel et al., Reference Siegel, Erb and Sarma2017). Car manufacturers or original equipment manufacturers (OEMs) have exclusive access to the generated data, as it can be transferred directly to their servers. As a result, other firms within the ecosystem are excluded from competing, leading to less consumer choice and innovation (Kerber, Reference Kerber2018). Given this market failure driven by monopolistic dynamics, a policy intervention was deemed necessary. In response, the EU formulated the Sustainable and Smart Mobility Strategy and proposed a common European mobility data space to broaden access to these data, among other means, through the promotion of data marketplaces (European Commission, 2020b).
A major reason behind the unsatisfactory growth of data marketplaces in the automotive industry is that OEMs are concerned about losing control over their data, as it could be exploited against their business interests (Agahari et al., Reference Agahari, Ofe and de Reuver2022). Retaining exclusive access to the insights their data might generate becomes a safer bet than trusting others with it, even if this entails renouncing profits. Furthermore, even if an OEM sells its data through a data marketplace to a selected third party that appears to pose no competition risk a priori, there is no absolute guarantee that these data will not be leaked and fall into the hands of a direct business competitor. In other words, firms have strategic concerns about trusting others with their data, as there is a possibility of losing their competitive advantage if commercially sensitive data fall into the hands of business competitors (van den Broek and van Veenstra, Reference van den Broek and van Veenstra2015; Martens and Zhao, Reference Martens and Zhao2021).
2.2. Data collaboratives and privacy risks
Data collaboratives, which are understood as cross-sector partnerships involving private and public actors in the collection, processing and/or exchange of data to address a specific social issue, have been underscored by the expert group on B2G Data Sharing appointed by the European Commission (Susha et al., Reference Susha, Janssen and Verhulst2017a, p. 2691; European Commission, 2020c). Exploiting data collaboratives and similar forms of data-driven social collaborations that engage a variety of actors in the sharing of data is a major component of the European strategy for data (European Commission, 2020a). Businesses can be incentivized to participate in these partnerships through both financial and non-financial means (Susha et al., Reference Susha, Janssen and Verhulst2017b). Nonetheless, privacy risks are an important barrier to the growth of this novel approach to data sharing.
For firms, it makes a big difference whether the data to be shared are personal or non-personal. Due to the presence of privacy regulations, when personal data enter the equation, companies adopt a more stringent and hierarchical control mechanism, even in situations where there are no apparent risks of privacy breaches (van den Broek and van Veenstra, Reference van den Broek and van Veenstra2015). Processing and sharing personal data in the EU entail complying with the General Data Protection Regulation (GDPR), which establishes six legal bases for processing data, enforces seven data protection principles, and grants eight rights to data subjects (European Commission, 2016). To avoid the complex process of adhering to this regulation and the risk of penalties for non-compliance, a company might refrain from participating in data collaboratives. Moreover, even for non-personal data, re-identification risks persist, given that combining data from various sources (e.g., geographical data containing addresses) can expose individuals (van den Broek and van Veenstra, Reference van den Broek and van Veenstra2015).
When the sharing of data occurs across multiple entities, the likelihood of security breaches and unauthorized access that exposes the identity of individuals increases. Things get even trickier when a data collaborative involves the sharing of data across borders, as participant firms risk overlooking other countries’ privacy laws. Ultimately, privacy risks can result in regulatory risks (i.e., the legal and financial repercussions of failing to adhere to data protection regulations) as well as reputational risks.
2.3. Data philanthropy and reputational risks
Data philanthropy, also referred to as data donorship, is the donation of data by firms for a humanitarian purpose (Kirkpatrick, Reference Kirkpatrick2013; UN Global Pulse, 2014; Taddeo, Reference Taddeo2017). The purpose of a data philanthropy project is closely linked to that of data collaboratives, namely, contributing to a social cause (Susha et al., Reference Susha, Grönlund and Van Tulder2019b). Yet, a fundamental difference between the two approaches is that data philanthropy exclusively engages private firms in the sharing of data without a profit incentive (Taddeo, Reference Taddeo2017). In addition, data collaboratives are more encompassing and imply a greater degree of collaboration among participants than data philanthropy (Lev-Aretz, Reference Lev-Aretz2019).
The expert group on B2G data sharing considered data philanthropy to be a form of corporate social responsibility key for making data available during public emergencies (European Commission, 2020c). An example is Facebook’s Data for Good initiative, which contributed to the response to the COVID-19 emergency by making mobility datasets available to Italian health researchers (Kang-Xing and McGorman, Reference Kang-Xing and McGorman2020; Scotti et al., Reference Scotti, Pierri, Bonaccorsi and Flori2022). In this apparent win–win data sharing scenario, researchers could access valuable data and Facebook could expand its market reach and improve its brand’s reputation. Nonetheless, despite providing many marketing opportunities for firms, data philanthropy can also produce negative outcomes that damage their reputation (Lev-Aretz, Reference Lev-Aretz2019).
Firms are exposed to reputational risks when data sharing results in negative public perceptions of them. When sharing data with others, there is a risk that these data are used for purposes other than those that were agreed upon, including unethical ones. Such mishandling of data could produce unintended consequences that might result in a public backlash. In addition, firms might be hesitant to engage in data philanthropy because they might not want to disclose what type of data they collect and how much of it. Revealing this information could create negative perceptions and open the door for misconceptions. Lastly, a company may fear sharing inaccurate or incomplete data that lead to erroneous conclusions.
3. Trust-enabling mechanisms
The previous three examples show that the success of emerging forms of data sharing with the potential to contribute to tackling societal challenges (as envisioned by the European Commission) is dependent on mitigating associated business risks. Against this backdrop, trust-enabling mechanisms enter the picture due to their capacity to foster trust by mitigating perceived risks and increasing usage control.
Relying exclusively on technological solutions to solve social problems can fail to address the underlying issue and lead to unintended outcomes (Morozov, Reference Morozov2014). Building trust should not exclude the social and cultural aspects in which this form of social capital is intrinsically embedded; hence, technological solutions must be complemented by governance and external accountability mechanisms (Bodó, Reference Bodó2021). This more holistic approach aligns more closely with the European Commission’s emphasis on creating a data sharing culture (European Commission, 2020a). Furthermore, the cost of certain trust-enabling technologies is often set by market dynamics and can be significant.
Without rigidly classifying them as either technological, legal, or organizational, seven trust-enabling mechanisms (privacy-enhancing technologies, data intermediaries, data exchange platforms, government support, data sharing agreements, regulatory sandboxes, and data stewardship) to increase data sharing are introduced, emphasizing their individual and collective contribution to mitigating perceived risks and increasing usage control, as well as their capacity to operate in a fair, equitable, and transparent manner.
3.1. Privacy-enhancing technologies
Privacy-enhancing technologies (PETs) consist of a range of cryptographic as well as non-cryptographic techniques and methodologies to protect data from malicious forces and safeguard privacy (Heurix et al., Reference Heurix, Zimmermann, Neubauer and Fenz2015). In a data sharing scenario, the involvement of a trusted third party may be required to mediate between two or more parties and to possibly act as a certification authority to handle user registration and authentication (Heurix et al., Reference Heurix, Zimmermann, Neubauer and Fenz2015).
PETs have varying degrees of applicability, so there is no one-size-fits-all solution to every data sharing scenario. The suitability of each PET is highly context-dependent, as each has its own specific strengths and limitations. For instance, secure multiparty computation,Footnote 1 federated learning,Footnote 2 and trusted execution environmentFootnote 3 regulate access to the data (Beaver, Reference Beaver1992; Sabt et al., Reference Sabt, Achemlal and Bouabdallah2015; Kairouz et al., Reference Kairouz, McMahan, Avent, Bellet, Bennis, Bhagoji, Bonawitz, Charles, Cormode, Cummings, D’Oliveira, Eichner, El Rouayheb, Evans, Gardner, Garrett, Gascón, Ghazi, Gibbons, Gruteser, Harchaoui, He, He, Huo, Hutchinson, Hsu, Jaggi, Javidi, Joshi, Khodak, Konecní, Korolova, Koushanfar, Koyejo, Lepoint, Liu, Mittal, Mohri, Nock, Özgür, Pagh, Qi, Ramage, Raskar, Raykova, Song, Song, Stich, Sun, Suresh, Tramèr, Vepakomma, Wang, Xiong, Xu, Yang, Yu, Yu and Zhao2021), while homomorphic encryptionFootnote 4 and zero-knowledge proofsFootnote 5 hide the data (Rivest et al., Reference Rivest, Adleman and Dertouzos1978; Goldwasser et al., Reference Goldwasser, Micali and Rackoff1989) and differential privacyFootnote 6 and anonymizationFootnote 7 camouflage it (Samarati and Latanya, Reference Samarati and Latanya1998; Dwork and Roth, Reference Dwork and Roth2013). Moreover, in a data sharing scenario, the use of multiple PETs may be required (even in combination with other technologies like blockchain) (Jia et al., Reference Jia, Zhang, Liu, Zhang, Huang and Liang2022).
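To make the “camouflage” category more concrete, the following minimal sketch illustrates the intuition behind differential privacy using the Laplace mechanism. The dataset, sensitivity, and privacy budget (epsilon) are hypothetical values chosen purely for illustration, not a recommendation for any particular data sharing scenario.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Return a noisy, differentially private version of a numeric query result.

    The noise scale (sensitivity / epsilon) is calibrated so that the shared
    statistic reveals little about any single record in the underlying data.
    """
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_value + noise

# Hypothetical example: a firm shares the average trip length of its connected cars.
# Assume a single vehicle can change the average by at most 0.5 km (the sensitivity).
noisy_average = laplace_mechanism(true_value=12.3, sensitivity=0.5, epsilon=1.0)
print(f"Shared (noisy) average trip length: {noisy_average:.2f} km")
```

A lower epsilon adds more noise and thus stronger camouflage, at the cost of data utility, which is precisely the privacy–utility tension discussed below.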
PETs can be difficult and costly to implement (Eurich et al., Reference Eurich, Oertel and Boutellier2010) and may contradict ethical principles such as data minimization (i.e., limiting the collection of data to what is absolutely necessary). Because PETs cannot be flawless, they can produce a false sense of security (Renieris, Reference Renieris2021). Furthermore, it is unclear whether the mere presence of PETs enables trust or whether the parties involved in the sharing of data also need a sufficient understanding of the technique. Another persistent challenge in implementing PETs is securing information while maximizing the utility of the data. Ultimately, as PETs continue to evolve and be employed in combination with other technologies, questions about their efficacy remain, especially in light of present and emerging instruments employed for re-identification purposes (Ohm, Reference Ohm2010).
3.2. Data intermediaries
Data intermediaries are an emerging type of actor in the data economy that can be broadly defined as mediators “between those who wish to make their data available and those who seek to leverage that data” (Janssen and Singh, Reference Janssen and Singh2022a, p. 2). In certain commercial and non-commercial settings, public and private actors can coordinate the supply and demand of personal and non-personal data more efficiently through a data intermediary that provides the necessary infrastructure to match data holders and users (Richter and Slowinski, Reference Richter and Slowinski2019).
Data intermediaries can reduce transaction costs (e.g., by guaranteeing interoperability), exploit economies of scale, and capture positive externalities through the application of new technologies and organizational techniques (Martens et al., Reference Martens, de Streel, Graef, Tombal and Duch-Brown2020). They can provide a robust contractual framework for enforcing obligations and facilitating compliance with data protection regulations (von Grafenstein, Reference von Grafenstein2022; Richter, Reference Richter2023). Regarding risks related to data security and privacy, an intermediary can supply neutral tools for managing data access and permissions and can also act as a trusted third party that supervises the use of PETs. Furthermore, intermediaries can incorporate into data sharing agreements contractual provisions, absent in bilateral B2B contexts, that mitigate post-contractual risks (Martens et al., Reference Martens, de Streel, Graef, Tombal and Duch-Brown2020, p. 29). For example, Advaneo Footnote 8 is a data marketplace for businesses to monetize and manage their data in a privacy-preserving way, setting rights and obligations for buyers and sellers. Thanks to its architecture, data providers always retain sovereignty over the raw data, which is only accessible through metadata.
Although the central position of data intermediaries like data marketplaces is advantageous for mitigating power and information asymmetries, a key informational challenge persists due to the intrinsic characteristics of data. The difficulty of estimating its value prior to its utilization makes data an asset that can be harder to share and trade than tangible goods (Spiekermann, Reference Spiekermann2019). Buyers might not seek to acquire certain data without a clear vision of the benefits it can provide, while a seller might decide not to assume the abstract risks (e.g., competition, privacy, and reputational risks) involved in disclosing data without a sufficient understanding of its concrete value (von Grafenstein, Reference von Grafenstein2022). As a result, data can be perceived as overvalued by buyers and undervalued by sellers. Therefore, when data are priced according to market dynamics, intermediaries cannot prevent the negotiated price from reflecting the relative bargaining power and information held by the parties.
While a data intermediary may at first appear to be neutral, avoiding prioritizing the interests of any of the parties involved, its role, governance model, and incentive structures are highly contextual, so the strategic aims of the organization will be reflected in its business model (Richter and Slowinski, Reference Richter and Slowinski2019). In other words, the growth objectives of the entity behind the data intermediary will influence the way it operates, possibly affecting its neutrality. How a data intermediary configures its services, pricing strategies, partnerships, technologies, and other aspects of its business is likely to reflect its long-term goals, which can influence its capacity to enable trust.
3.3. Data exchange platforms
Data exchange platforms provide businesses with the software tools to share their data. To enable trust, their architecture ought to be designed in a way that maximizes usage control and minimizes perceived risks, with a data intermediary potentially serving as the organization responsible for overseeing the management of these data sharing systems.
Merely determining who can access specific data may be insufficient for a business that also considers it crucial to define the purposes for which its data may be used or the duration for which the data are made available (Pearson and Casassa-Mont, Reference Pearson and Casassa-Mont2011). To further increase usage control, data exchange platforms could allow a company to monitor data usage in real time. This way, a company will be more likely to trust other agents with its data, as it can reverse or amend its decisions if expectations are not met (Carballa Smichowski et al., Reference Carballa Smichowski, Duch Brown and Martens2021, p. 4).
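As a purely illustrative sketch, the kind of usage control described above could be expressed in software as a policy object that records permitted users, purposes, and an expiry date, and logs every request for later auditing. The class and field names below are hypothetical and do not correspond to any existing platform’s API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class UsagePolicy:
    """Usage conditions a data holder attaches to a shared dataset."""
    allowed_users: set
    allowed_purposes: set
    expires_at: datetime
    access_log: list = field(default_factory=list)

    def authorize(self, user: str, purpose: str) -> bool:
        """Check a request against the policy and log it so the holder can audit usage."""
        now = datetime.now(timezone.utc)
        granted = (
            user in self.allowed_users
            and purpose in self.allowed_purposes
            and now < self.expires_at
        )
        self.access_log.append((now, user, purpose, granted))
        return granted

# Hypothetical example: an OEM shares in-vehicle data for road-safety research only.
policy = UsagePolicy(
    allowed_users={"road-safety-institute"},
    allowed_purposes={"traffic-safety-research"},
    expires_at=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
print(policy.authorize("road-safety-institute", "traffic-safety-research"))  # True
print(policy.authorize("insurance-company", "risk-pricing"))                 # False
```

In a real deployment, such checks would sit behind authentication and be enforced server-side by the platform or intermediary; the point here is only that purpose, duration, and auditability can be encoded explicitly rather than left to informal trust.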
An example of a data exchange platform is the Snowflake Data Exchange Footnote 9, which allows organizations to share data with business partners. This private platform gives users the possibility to supervise access, audit usage, and implement security controls. Regarding the latter, secure authentication mechanisms and intrusion detection systems can provide a greater sense of control to enable trust. Other platforms like Dawex Footnote 10 operate with distributed technologies such as blockchain to lessen the need to place trust in others by constraining data users’ actions. To ensure privacy and further protect data, PETs can be integrated into these platforms. Such a mix of technological solutions employed to enable trust in data sharing illustrates the societal shift from interpersonal trust relations mediated by humans to trust produced by technological intermediaries.
Even if a platform claims to have robust security measures, the lack of control over the data can be unacceptable for certain businesses, which must ensure business confidentiality and comply with relevant privacy and data protection regulations. Furthermore, purchasing the services of a platform can be costly. Hence, governments can intervene by constructing infrastructure to operate data exchanges and offering them as a public service alternative. Gaia-X Footnote 11 is the most prominent example in the European context. In addition, notable examples include the Asynchronous Data Exchange (ADEX) Footnote 12 in Singapore, the Amsterdam Data Exchange (AMdEX) Footnote 13, and the Shanghai Data Exchange Footnote 14 (which saw the trading of data products surpass $1 billion in 2023).
3.4. Government support
Government support for enabling trust in private sector data sharing can be implemented through regulatory frameworks that establish guidelines for sharing data within or across different sectors. This is crucial, as the lack of legal recognition might have contributed to the unsatisfactory growth of the three approaches previously discussed: data marketplaces, data collaboratives, and data philanthropy (Lev-Aretz, Reference Lev-Aretz2019; Spiekermann, Reference Spiekermann2019; Susha et al., Reference Susha, Grönlund and Van Tulder2019b). A regulatory framework can increase accountability, an important element to enable trust (Bodó, Reference Bodó2021).
A more structured regulatory environment with clearer guidelines can pave the way for government support to take the form of incentives, including subsidies, tax breaks, and grants. For example, Data4Industry-X Footnote 15 is a decentralized data exchange for Industry 4.0 that has been backed by the French government’s France 2030 initiative and the Next Generation EU funding program.
Nevertheless, unlocking the value of data is partly the result of constant engagement among stakeholders, trial-and-error experimentation, and the harmonization of conflicting interests (Günther et al., Reference Günther, Rezazade Mehrizi, Huysman and Feldberg2017). Taking shortcuts through government support is therefore susceptible to several challenges. Support can backfire if it is perceived as overly stringent and an impediment to firms’ ability to engage in data sharing independently. An enforceable legal framework like the Data Governance Act and Data Act can lead to risk aversion among firms that want to avoid complications. Government support can also lead to regulatory capture, where the interests of certain industry players are prioritized over others. Moreover, in fast-evolving data-driven industries, support can arrive late due to the difficulties of reaching agreements swiftly. Ultimately, since public institutions can both build and undermine trust, poorly designed support for data sharing can end up eroding trust in public institutions themselves (Bodó, Reference Bodó2021).
3.5. Data sharing agreements
An agreement can enable trust by increasing transparency and further guaranteeing accountability among the parties, thereby contributing to firms’ confidence that their rights are defined and backed by legal assurances. Such a framework clarifies the roles of the parties involved, the purposes of sharing data, the procedures to be followed and the standards to be met (Information Commissioner’s Office, 2021; Sitra, 2022). Regarding special categories of data, agreements are complemented with additional clauses that deal with relevant data protection principles. In the case of ambiguities, the data holder can impose specific restrictions, exceptions, or territorial limitations on the use of data as well (Association of Banks in Singapore, 2019).
The scope of the data sharing agreement can be delineated and any technical terms can be defined in a glossary. Responsibilities should be assigned to the parties and their rights and obligations should be defined. Additional clauses of the agreement can include elements such as confidentiality, intellectual property, liabilities, force majeure, auditing, termination, validity, and dispute resolution. For certain activities, agreements state the specific purposes for which the data will be used and the reasons why that particular data are needed to help ensure that the data will not be misused. Furthermore, a section stating the penalties for non-compliance can be added. Finally, any relevant laws and regulations should be invoked.
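By way of illustration only, the elements listed above could be captured as a machine-readable checklist that parties complete and review before signing; the structure and field names below are hypothetical and not drawn from any official template.

```python
# Hypothetical, simplified checklist of data sharing agreement sections (illustrative only).
agreement = {
    "parties_and_roles": ["data holder", "data user"],
    "scope_and_glossary": "datasets covered and definitions of technical terms",
    "purposes": ["specified purposes for which the data will be used"],
    "rights_and_obligations": "responsibilities assigned to each party",
    "clauses": [
        "confidentiality", "intellectual property", "liabilities", "force majeure",
        "auditing", "termination", "validity", "dispute resolution",
    ],
    "penalties_for_non_compliance": "",   # left empty to demonstrate the completeness check
    "applicable_law": "relevant laws and regulations invoked",
}

# A trivial completeness check before signing: flag any section left empty.
missing = [section for section, content in agreement.items() if not content]
print("Sections still to be completed:", missing or "none")
```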
Nonetheless, even a well-defined agreement has important drawbacks to consider. Since contracts will always be open to interpretation, there will inevitably be room for disputes. Additionally, the framework cannot be fully future-proof because of its inability to account for unforeseen events. This creates a tension: the agreement must be clear enough to enable trust while remaining flexible enough to accommodate the latest developments in the changing technological landscape of data-driven industries. Furthermore, as business strategies change, what was once a mutually beneficial agreement could turn into a situation where the interests of the parties diverge and, in cases where the contract is not adequately enforced, trust will be eroded.
3.6. Regulatory sandboxes
Regulatory sandboxes are a novel regulatory tool in emerging industries to foster innovation while ensuring consumer protection (Allen, Reference Allen2019). They provide a controlled environment for companies to test new products and services under the supervision of regulators (European Council, 2020). By granting exemptions from certain regulatory requirements, this tool offers an opportunity for firms to engage in less risky data sharing initiatives in order to understand how new data sharing technologies work in practice (Datasphere Initiative, 2022).
For instance, the government of Singapore invited firms interested in working with PETs to participate in a sandbox.Footnote 16 In a similar vein, to foster innovation, the government of Japan encouraged firms operating in the country to test Internet of Things (IoT), artificial intelligence (AI), and blockchain technologies.Footnote 17 The field of IoT can especially benefit from regulatory sandboxes, as there is an increasing number of data-driven social partnerships between public, private, and non-governmental actors based on the collection and aggregation of data from devices (Susha et al., Reference Susha, Rukanova, Ramon Gil-Garcia, Tan and Hernandez2019a).
Although sharing data in a supervised setting within a specified territory for a predetermined period can enable private sector trust, sandboxes can amount to a government-granted privilege in favor of participant organizations at the expense of non-participant competitors and newcomers (Poncibò and Zoboli, Reference Poncibò and Zoboli2022). A paradox arises when a sandbox achieves its goal of being attractive enough for firms to join but thereby weakens those that did not (Poncibò and Zoboli, Reference Poncibò and Zoboli2022). Despite lowering entry barriers for newcomers in innovative industries, sandboxes can thus weaken competition, creating a division between participants and non-participants. Since sandboxes can be considered a form of government support, when they are not implemented properly, trust in both data sharing and public institutions can be eroded.
3.7. Data stewardship
Data stewardship is an approach to the governance of data that integrates technical and organizational infrastructures in an organization to responsibly collect, store, use, and share data (Rosenbaum, Reference Rosenbaum2010; Stalla-Bourdillon et al., Reference Stalla-Bourdillon, Carmichael and Wintour2021). The hands-on management and care of data on a day-to-day basis can be taken on by the role of a data steward (GovLab, 2019), an agent responsible for the operational aspects of an organization’s data governance, that is, the actual policies, methods, and procedures to manage data (Rosenbaum, Reference Rosenbaum2010; Plotkin, Reference Plotkin and Plotkin2014). The expert group on B2G data sharing referred to data stewards as “champions of data sharing” (European Commission, 2020c, p. 37).
Lessons drawn from the field of medical data donation show that accountability and transparency are key normative trust-enablers to increase data sharing (Vayena et al., Reference Vayena, Haeusermann, Adjekum and Blasimme2018). To ensure accountability, a data steward can be in charge of a system to handle complaints, investigate incidents, and perform other functions for managing conflicts within or across organizations. Meanwhile, stewards can promote transparency by providing clear information about the data, including how it was collected, what it represents, and how it can be used. Furthermore, by adhering to the findability, accessibility, interoperability, and reusability (FAIR) principles for data management (Wilkinson et al., Reference Wilkinson, Dumontier, Aalbersberg, Appleton, Axton, Baak and Mons2016), data stewards can build trust and encourage firms hesitant to share data to engage in this practice (Ada Lovelace Institute, 2021).
The implementation of data stewardship in an organization can be complex and costly. It might require organizational restructuring and extensive training. Aside from a company’s own policies, stewards need to be knowledgeable about the latest data privacy and security regulations. As a new occupation, the role of the data steward can be unclear and confused with other roles present in an organization. Ultimately, the responsibilities of a data steward should not be allowed to expand indefinitely to the point where the role becomes a single point of failure.
4. Regulatory context
The expert group on B2G data sharing emphasized the importance of promoting voluntary collaboration to the greatest extent possible, aligning the interests of involved parties for the common good (European Commission, 2020c). However, voluntary collaboration is not always adequate, as there are instances where the commercial interests of a firm cannot be prioritized over the general welfare of society, which forces the government to intervene and oblige organizations to share data (Mercille, Reference Mercille2021).
Despite the need to move beyond voluntary data sharing initiatives, aggressively advocating for mandatory B2G data sharing can be regarded as too interventionist in the EU, where industry self-regulation and market processes tend to set the standards for sharing data (Martens and Zhao, Reference Martens and Zhao2021, p. 8). Part of the European tradition of regulating the economy is a bottom-up, industry-driven approach, which, despite being slower than other approaches in producing results due to conflicting interests among stakeholders, is relatively more successful in protecting private property, privacy, and other individual rights (Martens and Zhao, Reference Martens and Zhao2021; Roberts et al., Reference Roberts, Cowls, Morley, Taddeo, Wang and Floridi2021a). Hence, in the EU, accessing private sector data must be justified on the basis of a well-defined public objective; in other words, mandatory B2G is often regarded by public authorities as a “last resort” among the policy options to increase access to private sector data (Richter, Reference Richter2021, p. 538).
This contrasts, for example, with the Chinese approach, which “seeks to maximize the social value of data” through channels that would be politically unfeasible in the EU, placing national interest and collective welfare above private interests and individual welfare (Martens and Zhao, Reference Martens and Zhao2021, p. 3). Yet, observations made by Chinese researchers suggest that top–down mandates are not enough to boost data sharing among the scientific community and that more principle-based guidelines as well as incentive mechanisms are needed (Li et al., Reference Li, Cheng, Wang, Wang, Ran, Che and Zhao2021). Thus, there is a need to find an adequate balance between the voluntary and mandatory models (Vigorito, Reference Vigorito2022).
Shkabatur (Reference Shkabatur2019, p. 354) proposes that to get access to private sector data, policymakers can adopt an “open, collaborative, and incentives-based stance.” Richter (Reference Richter2021) also emphasizes that governments can make use of soft law approaches that are less interventionist and more based on incentives. These incentives can better contribute to building a data sharing culture where more stakeholders are aware of the benefits of sharing data and are willing to engage in collaboration (European Commission, 2020c). In the long run, it is expected that the framework and procedures established to govern the sharing of data between private firms and public authorities will also contribute to the development of B2B data relations (European Commission, 2020c).
Broadening access to private sector data through voluntary and mandatory approaches shares similarities with other regulatory challenges where a middle ground must be found between prioritizing public or private interests. The example of intellectual property rights highlights this tension: as with data, exclusive control limits public access to valuable information, reflecting the anticommons dilemma of “socially suboptimal information availability because of excessive privatisation” (Koutroumpis et al., Reference Koutroumpis, Leiponen and Thomas2020, p. 657). The future of private sector data sharing might be characterized by the pursuit of a middle ground that accommodates the interests of private and public actors; ideally, for the European Commission, such a middle ground would foster a data sharing culture.
4.1. Data Governance Act
As the first legislative act of the European strategy for data, the Data Governance Act recognized that “action at Union level is necessary to increase trust” (European Commission, 2022a, p. 4). The Data Governance Act sets a harmonized regulatory framework for facilitating voluntary data sharing, promoting the rise of data intermediaries and advancing data altruism, a principled approach to data sharing. The regulation entered into force in June 2022 and became applicable in September 2023.
4.1.1. Data intermediaries
Chapter III of the Data Governance Act introduces new harmonizing requirements for providers of data sharing services (data intermediaries) with the intention of ensuring “the trustworthy exchange of data” (European Commission, 2022a, p. 13). These new rules translate into a soft law approach to increasing data sharing by supporting the emergence and uptake of these promising actors in the data economy.
The European Commission defined data intermediaries as “technical enablers” to harness the potential of data (European Commission, 2018, p. 11). Beyond the EU, the governments of the United Kingdom and Singapore are also directing efforts toward creating the right conditions for data intermediaries to thrive and exchange data in innovative ways (Personal Data Protection Commission, 2020; Centre for Data Ethics and Innovation, 2021). Because legal recognition might not be enough for data intermediaries to address the massive data asymmetry in society that benefits a few hyperscalers, public institutions are supporting their growth with financial incentives such as subsidies, grants, and tax breaks. Nevertheless, it should be kept in mind that over-focusing on data intermediaries and creating excessively high expectations can undermine the development of alternative instruments with greater potential to increase data sharing (Carovano and Finck, Reference Carovano and Finck2023).
As mentioned in Section 3.2, while an intermediary can provide the necessary technical and organizational infrastructure to enable trust, an important question is whether it will remain neutral and independent (Richter, Reference Richter2023). In a hypothetical future where data intermediaries have scaled, a concentration of power within a limited number of intermediaries should be avoided, as should their entanglement in political agendas.
4.1.2. Data altruism organizations
Chapter IV of the Data Governance Act introduces data altruism as “the voluntary sharing of data… without seeking or receiving a reward… for objectives of general interest” (European Commission, 2022a, p. 20). The Data Governance Act does not provide a definition of “general interest” but offers a series of examples where the welfare of society is at stake in the areas of health care, sustainability, mobility, and scientific research. Registered data altruism organizations (DAOs) are expected to act as trustworthy data intermediaries in charge of managing donated data. Any entity that aims to register as one must operate on a non-profit basis and simultaneously meet special transparency and technical requirements. An example is DATALOG, Footnote 18 a platform for citizens to visualize and better manage their energy expenses.
Chapter IV also introduces a data altruism consent form to harmonize the collection of consent from data subjects. A similar procedure would be beneficial in the context of private sector data. More guidance on how to grant, modify, or withdraw access to data can be provided by the rulebook introduced in Article 22 of the Data Governance Act, which will be established through the adoption of delegated acts and will specify requirements to protect the rights and interests of data subjects and data holders (European Commission, 2022a).
However, like with data intermediaries, the neutrality and independence of a DAO will be crucial for enabling trust. In theory, DAOs operate without seeking a reward other than contributing to a social cause and are allowed to share data with third parties for altruistic purposes only, yet there is the possibility that data are used with other intentions. In this regard, it should be taken into account that DAOs could have ties with or even be directly or indirectly funded by big players in the data economy seeking to get access to donated data. For firms, this creates reputational risks that can make them more reluctant to donate data to a DAO.
A possible adverse outcome is that data altruism fails to scale because firms lack incentives to donate data. Aside from the marketing opportunities that this practice can offer, firms would be more incentivized to participate if they could learn something about their own business or the industry in which they operate. For example, in the field of environmental sustainability, a DAO could allow donors to compare indicators and derive analytical insights in order to optimize their supply chains. Another detrimental scenario is that firms do not engage in data altruism due to a lack of trust toward DAOs. In this regard, designing a logo to be exclusively used by certified DAOs was a positive step toward building trust (European Commission, 2023).
4.2. Data Act
The Data Act is the second legislative proposal of the European strategy for data and introduces mandatory rules for accessing and using industrial data (European Commission, 2022b). The Regulation aims to make data more accessible by giving access rights, addressing unfair contractual terms that arise from vendor lock-in effects and anticompetitive practices, establishing rules for public bodies to access private sector data, and allowing customers to switch between service providers. EU policymakers came to an agreement in June 2023 that will make the Data Act applicable in 2025.
4.2.1. Mandatory B2B data sharing
The Data Act establishes requirements to share data generated by connected devices across all sectors. It introduces rights for consumers and businesses to access and share data generated by their devices with third parties, including data intermediaries. Enacting these rights would oblige data holders to make data under their control available to other parties.
During the final negotiations of the Data Act, major European and US-based data-rich firms complained that the original draft lacked sufficient safeguards to protect their competitive interests from potential misuses of their data by third parties (Yun Chee, Reference Yun Chee2023). The concept of “trade secret holder” was included in the final text of the legislation to let data holders protect the confidentiality of data and decline to share them if doing so was likely to result in serious economic damage. As a result, the Data Act gives trade secret holders the possibility to demand that data receivers guarantee the confidentiality of data, for example, by agreeing on contractual terms, confidentiality agreements, access protocols, standards, and codes of conduct. In the event of a disagreement or a failure to implement appropriate measures to safeguard the data, the data holder would be entitled to cancel the sharing of the data.
Adding these safeguards to the final text of the Data Act underscores the significance of competition risks and the need to address them. Future research could monitor how frequently the trade secret holder exemption is employed to prevent data sharing and how often its use is justified—it is not uncommon for data-rich firms to invoke trade secrecy laws to justify their exclusive control over the data they hold (Cohen, Reference Cohen2019).
4.2.2. Mandatory B2G data sharing
Chapter V of the Data Act details under what exceptional circumstances it would be mandatory for data holders to make their data available to public bodies. This includes responding to a public emergency or fulfilling “a specific task in the public interest that has been explicitly provided and defined by national law” when none of the following three alternatives for obtaining access were viable: (1) requesting it voluntarily; (2) purchasing it on the market; or (3) relying on existing obligations (European Commission, 2022b, p. 48).
The issue of including or excluding personal data from the scope of Chapter V was a matter of debate during the trilogue negotiations of the Data Act. While the European Commission and the European Council pushed for including personal data in the scope, the European Parliament proposed excluding personal data altogether (Bertuzzi, Reference Bertuzzi2023a). A major EU trade association that represents the interests of several data-rich firms raised concerns about including personal data in the scope, highlighting the risks associated with data leakages and misuses (DIGITALEUROPE, 2023). Indeed, such risks extend beyond B2B contexts and can discourage firms from sharing personal data with public authorities due to privacy and reputational risks. In the end, personal data were only included for responding to public emergencies (Bertuzzi, Reference Bertuzzi2023b).
Lessons from the B2G data sharing initiative between the European Commission and European mobile network operators to predict and contain the spread of COVID-19 show that given the sensitivity of the issue and the type of data involved, it was crucial to guarantee that the reputation of the firms involved was not harmed due to potential misunderstandings about the use of the data (Vespe et al., Reference Vespe, Iacus, Santamaria, Sermi and Spyratos2021).
Competition risks transcend B2B relations. A public body may use a firm’s data to independently develop public services or to aid a business competitor (Klievink et al., Reference Klievink, Van Der Voort and Veeneman2018). Furthermore, public bodies themselves can also operate successful commercial services (Carballa Smichowski, Reference Carballa Smichowski2018; Martens and Duch-Brown, Reference Martens and Duch-Brown2020).
5. Conclusion
The upcoming 2024 European Commission will be occupied with implementing the policies that compose the European strategy for data. In order for the Data Governance Act and Data Act to achieve their goal of increasing private sector participation in data sharing, EU policymakers can deploy a targeted set of tactics that leverage the trust-enabling mechanisms discussed in this paper, in addition to others (Farrell et al., Reference Farrell, Minghini, Kotsev, Soler Garrido, Tapsall, Micheli, Posada Sanchez, Signorelli, Tartaro, Bernal Cereceda, Vespe, Di Leo, Carballa Smichowski, Smith, Schade, Pogorzelska, Gabrielli and de Marchi2023). Government support (e.g., funding, tax reductions, regulatory sandboxes) for the growth of trustworthy data intermediaries and DAOs that host data exchange platforms, steward data, guarantee privacy and security through the use of PETs, and provide data sharing agreements will be essential but not sufficient to satisfactorily boost participation. Increasing awareness about the benefits of this practice and creating incentive structures remain two other major challenges that require closer examination by academia. Future research could take an empirical angle, using surveys and interviews to understand firms’ perceived risks and incentives toward different modalities of data sharing.
Furthermore, as more data policies enter into force, providing instructional materials for companies will also be key, especially for those with more limited resources, such as startups and small and medium-sized enterprises (SMEs), which might require assistance in understanding the requirements, scope, and interplay of regulations, as well as the processes and technologies involved in complying with them.
Considering that the majority of data are produced by businesses, enabling the creation of trust in the private sector can have a positive effect in other realms where data can be shared (European Commission, 2020c). Furthermore, it is essential to encourage the voluntary participation of not only firms but also individuals, non-governmental organizations, and public institutions.
In the short term, mandating extensive access to private sector data appears politically unfeasible. However, as the EU’s overarching goal of strengthening its technological sovereignty, competitiveness, and resilience intensifies, in the longer term, mandatory data sharing could become a more attractive option. It would be positive if this struggle to open new channels for data to flow also contributed to tackling societal challenges and reducing the pervasive data asymmetry in society.
As a highly strategic resource, data are increasingly the object of competition between sovereign states eager to control their flow (Chander and Sun, Reference Chander and Sun2023). As a result of this struggle, multiple data access regimes continue to emerge and compete with each other (Martens and Zhao, Reference Martens and Zhao2021), making distinct sovereignty claims and increasingly geopoliticizing data sharing (Amoore, Reference Amoore2018). In the public discourse, the Chinese, European, and American approaches receive special attention (Bradford, Reference Bradford2023).
The latest digital policies pushed by the EU under the digital sovereignty agenda aim to assert greater control over critical infrastructure and reduce the region’s heavy reliance on external actors (Von der Leyen, Reference Von der Leyen2019; Roberts et al., Reference Roberts, Cowls, Casolari, Morley, Taddeo and Floridi2021b). Yet, in this technological era, the United States has unparalleled leverage over the infrastructure of the European digital economy, facilitated not only by the presence of American companies in key industries such as AI, e-commerce, search engines, social media, and cloud computing, but also by undersea cables, data centers, and communication networks (Farrell and Newman, Reference Farrell and Newman2023). Furthermore, China continues deploying digital infrastructure and gaining ground in high-tech not just in Europe but across the world, which could further complicate the pursuit of digital sovereignty.
While data sharing leads to better intelligence, innovation, economic growth, and other key elements for states to remain competitive on the global stage, as explained in the introduction, it is also necessary for tackling the grand societal challenges of the 21st century, which are fundamentally global. To effectively address these challenges, society may need to envision frameworks for facilitating the flow of data between businesses and governments across international borders.
Acknowledgments
The author, Jaime Bernal, initiated work on this article during a scientific traineeship at the Digital Economy Unit of the Joint Research Centre in 2022, and wishes to express gratitude for the guidance provided.
Author contribution
Conceptualization: J.B.; Data curation: J.B.; Funding acquisition: J.B.; Investigation: J.B.; Methodology: J.B.; Project administration: J.B.
Funding statement
This work received no specific grant from any funding agency, commercial or not-for-profit sectors.
Competing interest
The author declares none.