
A comparative study on public interest considerations in data scraping dispute

Published online by Cambridge University Press:  06 September 2024

Lanfang Fei*
Affiliation:
Law School, Jinan University, Guangzhou, Guangdong, China

Abstract

New technologies and the new business practices that they bring often raise difficult questions about the application of the law. This often stems from the difficulty of clarifying the impact of new technologies on the interests of different groups in society and, in particular, the difficulty of measuring the public interest brought about by new technologies. In practice, in disputes arising from new technologies, the users of the new technologies often justify their actions in terms of the public interest. In this paper, we compare data collection cases in the US, EU and China and extract the types of public interests discussed in typical cases in different countries, including (1) data-related property rights protection and commercial principles to prevent free-riding; (2) privacy, personal data protection and data security; (3) competition and innovation interests related to the free flow of data; and (4) freedom of expression. The comparison shows that an appropriate focus on the public interest in data flow has led Chinese and US courts to rule in favour of scrapers in a few recent cases, in contrast to the judicial attitude of EU courts, which value privacy. The author applauds the attitude of the courts in the US and China and argues that free competition and innovative interests based on data flows are public interests that should be prioritized in data scraping cases.

Type
Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press

