Introduction
An alarming number of wildlife species and parts are traded globally, threatening conservation efforts (Wilson-Wilde 2010). For example, 20 tonnes of pangolins and their parts are trafficked internationally annually, and c. 1 000 000 pangolins have been poached in the last decade (TRAFFIC n.d.a). Populations of black rhino have been decimated from c. 70 000 in 1970 to just 2410 in 1995, resulting in the species now being classified as Critically Endangered. Growing demand has increased the number of threatened, vulnerable and endangered animal species (Butchart et al. 2010). The illegal wildlife trade has also increased the risk of pathogen transmission, raising concerns about health security and infectious disease spread among humans (Daszak et al. 2000). Despite these dual risks to animal and public health, an estimated US$19 billion continues to be generated by this criminal enterprise yearly (Global Financial Integrity 2011).
With the rapid rise of information and communication technology, virtual marketplaces on the Internet are now conduits for the distribution, trafficking and sale of wildlife products (Rosen & Smith 2010, Herrel & Van der Meijden 2014, Guan & Xu 2015). China has been identified as a primary consumer destination, especially for ivory and rhino horn (Guan & Xu 2015, TRAFFIC n.d.b). In 2017, the revised law of the People’s Republic of China on the Protection of Wildlife came into effect and, combined with the efforts of the Coalition to End Wildlife Trafficking Online, the number of new wildlife product advertisements has reportedly declined (Xin & Xiao 2019). However, wildlife product trading continues among Chinese nationals and Chinese-language speakers in other countries, including a rise in social media-based marketing and sale (Xiao & Wang 2015, Xiao et al. 2017, Xin & Xiao 2019). Popular platforms include those operated in China (e.g., QQ, WeChat) and platforms outside of China that include Chinese-speaking populations (e.g., Facebook, Twitter, Instagram).
Mainland China, Hong Kong, Macao and Taiwan are the four main countries/territories populated by Chinese speakers. All of these jurisdictions have domestic policies addressing wildlife conservation. In mainland China, endangered species trading is forbidden by the Wild Animal Conservation Law of the People’s Republic of China, and all sanctioned wildlife sales require a physical authorized store, with no wildlife allowed to be marketed or sold on e-commerce or social media. Other jurisdictions have similar legislation (e.g., Hong Kong’s Wild Animals Protection Ordinance, Macao’s Law No. 2/2017 Enforcement of the Convention on International Trade in Endangered Species of Wild Fauna and Flora and Taiwan’s Wildlife Conservation Law). These policies prohibit the purchase, sale, export or offer for sale of any protected wildlife, except in accordance with a special permit.
One of the world’s most popular social media platforms is Facebook, which now serves 2.27 billion monthly users globally (Abbruzzese 2018). Facebook is a multi-language social networking site (SNS) that allows users and organizations to create profiles, share information and communicate through open and direct messages and has a number of embedded web-based applications (Harris & Dennis 2011). Recent investigations and news reports have detected wildlife trading advertisements on the platform (Bach 2018, CNBC 2018, Sy 2018). However, users located in China are generally blocked from using Facebook, along with platforms such as Twitter and Google.
In July 2009, Chinese authorities blocked Facebook following riots in Xinjiang, a special autonomous region in western China. The crackdown was aimed at curtailing communications among independent activists using social media and, as a result, only 0.1% of Facebook subscribers are currently identified as Chinese nationals (Internet World Stats 2020). These users may be located in another country or district outside of mainland China or use a foreign virtual private network (VPN) to access Facebook if residing in China. Although the majority of Chinese nationals are blocked from Facebook, this does not preclude Chinese-language advertisements, including those targeting Chinese-speaking populations living outside of the country.
Despite national and international laws prohibiting the marketing and sale of endangered wildlife online, there is a growing body of research documenting this activity, including on e-commerce sites, the dark web and social media platforms (Harrison et al. 2016, Di Minin et al. 2018, 2019, Fink & Di Minin 2018, Sy 2018). Studies using manual searches to assess Chinese consumer wildlife purchasing behaviour and other studies specifically examining Facebook-based wildlife trading have been conducted (Xiao & Wang 2015, Xiao et al. 2017, WWF n.d.). To advance this area of research, this study used web scraping (i.e., automatically collecting data from websites) to identify and characterize Chinese-language wildlife sales on Facebook for elephants, rhinos and hawksbill turtles and analysed user engagement.
Methods
This study was conducted in two phases: data collection and then manual annotation for data analysis. We first used web scraping (a technique for extracting large amounts of data from websites with an automated program), implemented in the programming language Python™, to collect a set of Facebook posts containing wildlife-associated keywords in simplified and traditional Chinese (Mitchell 2018). We then manually annotated all posts for: (1) individual posts suspected of illegal wildlife online marketing and sale; and (2) user interactions we suspected of being engaged in wildlife sale transactions (details of data collection and the analysis methodology are available in Supplement S1, available online).
Data collection
After the Facebook–Cambridge Analytica data breach in 2018, Facebook disabled, restricted or shut down many functions of its public application programming interface (API), which enables apps to programmatically query published content, and set other limitations on public data collection. In order to collect an adequate sample of wildlife-related Facebook posts, we built a web scraper that, from March to April 2019, simulated users searching for wildlife species and products in the search bar on the Facebook homepage.
We selected elephants (Loxodonta spp.), rhinos (Rhinocerotidae spp.) and hawksbill turtles (Eretmochelys imbricata) as our target taxa, as these three animals had the highest proportions of advertisements in studies examining Chinese users’ behaviour on e-commerce and social media sites (Xin & Xiao 2019). TRAFFIC China found that, since 2012, ivory products had the highest share (63%) of all wildlife product advertisements online, followed by rhino horn (18%) and hawksbill shell (8%) (Xiao & Wang 2015, Xiao et al. 2017).
As wildlife selling posts could also use generic terms (e.g., ‘for sale’), involve content collapse (when discussion about marketing or sale is broken up into multiple messages/posts instead of one) or use emojis instead of actual wildlife product-related keywords, we conducted additional analysis of Facebook user account pages (Xin & Xiao 2019). Using the permanent link collected from each selling post, the web scraper was also set to collect the last ten posts from a Facebook user’s profile page.
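The post-processing implied by this collection step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that scraped results have already been parsed into dictionaries, and the field names (`permalink`, `author`, `timestamp`, `keyword`), the function names and the sample records are all hypothetical. It shows how results from several keyword searches might be de-duplicated by permanent link and how the most recent posts per profile might be selected.

```python
from collections import defaultdict


def deduplicate(posts):
    """Keep the first record seen for each permanent link."""
    seen = {}
    for post in posts:
        seen.setdefault(post["permalink"], post)
    return list(seen.values())


def latest_posts_by_author(posts, limit=10):
    """Group posts by author and keep the `limit` most recent for each."""
    grouped = defaultdict(list)
    for post in posts:
        grouped[post["author"]].append(post)
    return {
        author: sorted(items, key=lambda p: p["timestamp"], reverse=True)[:limit]
        for author, items in grouped.items()
    }


# Hypothetical records: the same post can be returned by two keyword searches.
sample = [
    {"permalink": "p/1", "author": "a", "timestamp": 3, "keyword": "k1"},
    {"permalink": "p/1", "author": "a", "timestamp": 3, "keyword": "k2"},
    {"permalink": "p/2", "author": "a", "timestamp": 5, "keyword": "k1"},
    {"permalink": "p/3", "author": "b", "timestamp": 1, "keyword": "k3"},
]

unique = deduplicate(sample)
recent = latest_posts_by_author(unique, limit=10)
```

Keying on the permanent link rather than on the post text is a design choice of this sketch: identical advertisement text can legitimately appear in several distinct posts.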
Data analysis
In order to accurately classify Chinese-language posts, we manually annotated all posts collected (including text and related images and videos) to confirm those we suspected of being engaged in illegal wildlife marketing and sale (referred to as ‘signal’ posts). There were two criteria for identifying a post as ‘signal’: (1) it included the name and description (including text and/or images) of a suspected wildlife product or live animal; and (2) it was a purported seller attempting to communicate with a buyer for a wildlife sale transaction. We did not include posts that purported to hold a valid permit authorizing the sale of wildlife products.
After manually annotating posts for signals, we then characterized posts for their textual content, which included what specific wildlife products were being marketed and sold and the characteristics of any sale transactions. We first categorized all of the wildlife by claimed species, limited to animals or any aquatic species included in the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) appendices. We also recorded the metadata associated with the posts and the associated users’ profiles, as well as the social interactions and sharing of these posts (e.g., the number of comments and likes on a post, the number of followers for community accounts and the number of members in public groups). We also collected users’ self-reported geolocation information if available.
In addition to selling posts, we categorized the types of Facebook pages that included signal posts and their levels of user engagement. Facebook page types included: (1) personal user accounts; (2) public group pages; and (3) community pages. From these data, we calculated a user interaction engagement ratio per post based on comments from other users and the metadata associated with signal posts (Tables 1 & 2).
The first and second authors independently coded all of the posts in Chinese. For inconsistent results, both authors met and reviewed the posts together and conferred on the correct classification of the post.
Results
We collected 10 303 unique Facebook posts over a 45-day period. After coding all of these independently, the first and second authors achieved high inter-coder reliability for signal coding results (κ = 0.991; see Table S1). Of the total number of posts, 9079 (88%) were collected using 36 keywords related to ivory, 792 posts were collected using four hawksbill-related keywords and 432 posts were collected using three rhino-related keywords. After manually annotating all of the posts, we identified 639 signal posts (6% of the total dataset) from 268 unique Facebook users that we suspected of directly promoting or engaging in the sale of live animals or related wildlife products. The species breakdown included 451 posts for ivory, 147 for hawksbill shell and 41 for rhino horn. The oldest post collected was from December 2015 and the newest post collected was from April 2019.
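For readers unfamiliar with the reliability statistic, the following minimal Python sketch shows how an agreement coefficient of this kind can be computed. It assumes that κ here denotes Cohen's kappa for two coders, κ = (p_o − p_e) / (1 − p_e), which the text does not state explicitly; the function name and the toy labels are hypothetical and are not the study's data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy example (1 = signal, 0 = not signal); the coders disagree on one item.
coder_1 = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
coder_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(coder_1, coder_2)
```

In this toy case the observed agreement is 0.90 and the chance agreement is 0.62, so kappa is 0.28 / 0.38, or roughly 0.74. Kappa is lower than raw agreement because it discounts the agreement expected when most items are non-signal.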
Post-content analysis
Based on the content analysis of text in the Facebook messages, we identified three main types of selling posts: (1) direct sale posts; (2) auction posts; and (3) external sale posts (see Table 3). We also detected other species outside of our study wildlife keywords, including whales, tigers (Panthera tigris), walrus (Odobenus rosmarus), Tibetan antelope (Pantholops hodgsonii), bears (Ursus arctos/Ursus thibetanus) and African spurred tortoise (Centrochelys sulcata). We calculated the number of posts for each wildlife product or live animal detected. Overall, elephant ivory products had the highest number of posts in our dataset (66.5%) followed by hawksbill shell products (23.2%) and rhino products (6.9%) due to our purposeful sampling of these species. Among all species detected, Loxodonta africana, Rhinocerotidae, Cheloniidae, U. arctos (population in China), P. tigris and P. hodgsonii are listed in CITES Appendix I. Although the majority of our observed posts were for wildlife products, we also detected several posts for live animals, particularly for reptiles.
Among the 639 wildlife marketing and sale posts identified (see Figs S1 & S2 for anonymized screenshots), 638 contained both text and images, with only one post containing images only. When manually annotating the 638 posts for text, we found that 301 (47%) used code words instead of species-related keywords, with Tibetan antelopes (100%), elephants (64%), rhinos (46%) and tigers (27%) having the highest frequency of code words. No code words were detected for whales, walruses, bears and African spurred tortoises. Additionally, we detected five wildlife selling posts that did not contain product-related keywords, but only contained text stating “Check it out!” or “Needless to say!” and 21 posts contained an animal graphic emoji instead of a species or code word.
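The distinction between species keywords, code words, emojis and generic text can be made concrete with a short Python sketch. Everything in it is an assumption for illustration: the three Chinese terms are ordinary words for ivory, rhino horn and hawksbill and are not the study's keyword list, the code words are placeholders because the actual code words are not given in the text, and `classify_marker` is a hypothetical helper.

```python
# Illustrative lexicons only; none of these is taken from the study.
SPECIES_KEYWORDS = {"象牙", "犀牛角", "玳瑁"}   # ivory, rhino horn, hawksbill
CODE_WORDS = {"CODEWORD_A", "CODEWORD_B"}      # placeholders for real code words
ANIMAL_EMOJIS = {"🐘", "🦏", "🐢"}


def classify_marker(text):
    """Return which kind of marker, if any, appears in a post's text."""
    if any(keyword in text for keyword in SPECIES_KEYWORDS):
        return "species_keyword"
    if any(code in text for code in CODE_WORDS):
        return "code_word"
    if any(emoji in text for emoji in ANIMAL_EMOJIS):
        return "emoji"
    return "none"


examples = [
    "象牙 手镯",              # contains a species keyword
    "CODEWORD_A available",   # contains a placeholder code word
    "🐢 new arrival",         # emoji only
    "Check it out!",          # generic text, no marker at all
]
labels = [classify_marker(text) for text in examples]
```

A filter restricted to `SPECIES_KEYWORDS` would flag only the first example. Posts in the last category carry no textual marker, so they can only be identified from their images or their surrounding context.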
Facebook page interaction analysis
All of the three types of Facebook pages were used for wildlife product marketing and sales. The main differences among these three types of pages are: (1) the number of users that can post to the page; (2) the number of users exposed to messages; and (3) the ways by which to connect to other users and wildlife product selling posts for possible sales transactions (see Table 1). Based on our signal posts, we identified 183 wildlife selling posts from 67 unique personal account pages, 150 posts from 68 unique public group pages and 308 posts from 73 unique community pages. Community pages appeared to have the highest percentage (48.2%) of wildlife selling posts.
In total, we found 8538 user engagement comments in the 639 detected signal posts (see Table 4). Of these comments, community account pages had the highest average engagement per post (16.46, n = 5071), but also received a high volume of weaker-level engagements (16.31, n = 5023). Personal account pages had the highest interest-level engagements (1.90, n = 348) compared to other pages, and public group pages had the highest average number of strong-level engagements (1.21, n = 182). Hence, it appears that Facebook page types may modulate the levels of user engagement and the potential for an interacting buyer entering into a wildlife purchasing transaction.
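The averages above can be checked with simple arithmetic. The sketch below assumes that each average is the number of comments divided by the number of signal posts reported for that page type earlier in this section (308 community, 183 personal and 150 public group posts). The text does not state this formula, but these denominators reproduce the reported values. The dictionary keys and the function name are hypothetical.

```python
# (comments, signal posts) for each page type and engagement level,
# using only figures quoted in the Results text.
ENGAGEMENT = {
    ("community", "all"): (5071, 308),
    ("community", "weaker"): (5023, 308),
    ("personal", "interest"): (348, 183),
    ("public_group", "strong"): (182, 150),
}


def engagement_per_post(comments, posts):
    """Average number of engagement comments per signal post."""
    if posts <= 0:
        raise ValueError("posts must be positive")
    return comments / posts


ratios = {
    key: round(engagement_per_post(comments, posts), 2)
    for key, (comments, posts) in ENGAGEMENT.items()
}
```

Under this assumption, `ratios` holds 16.46, 16.31, 1.90 and 1.21, matching the values reported in the text.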
Based on the profile metadata self-reported by users and account managers on Facebook pages, we were able to conduct a descriptive analysis of the purported locations of users. A total of 268 unique accounts were detected, with 183 having location information in their profile (67%, n = 123 of personal accounts and 71%, n = 60 of community accounts included geolocation metadata). Users appeared to be located in 14 different countries and regions, with the most being in Taiwan (26%, n = 70), followed by Malaysia, Hong Kong, China and Thailand (Fig. 1). With the exception of two accounts located in the UK and the USA, all of the remaining 181 accounts were based in East and Southeast Asia.
Discussion
In spite of the number of Chinese wildlife e-commerce product advertisements dropping from 1500 to 1000 per month from October 2014 until December 2016 (Xiao et al. 2017, Xin & Xiao 2019), our study found that there remain hundreds of wildlife sales posts targeting three specific endangered species and, more alarmingly, thousands of user interactions with these posts. Although wildlife trafficking in Chinese online markets (including other platforms, such as WeChat, QQ, Taobao, etc.) has received increased attention from conservation groups, law enforcement, technology companies and researchers, our findings indicate that online Chinese-language wildlife sales remain a blind spot in detection and enforcement (WWF n.d.).
However, our study has certain limitations. First, our study was limited to the Facebook platform and posts that contained Chinese language, and it focused on three specific animal species, so results are not generalizable to global wildlife trafficking trends. Future studies should conduct multilingual searches on Facebook in order to better estimate the overall volume of wildlife trafficking globally or examine other popular Chinese platforms such as WeChat. Furthermore, we were not able to confirm the accuracy of the geolocation data based on personal profile and self-reported metadata, although we did implement additional steps to cross-reference location information. This could impact the legal determination of whether sellers and buyers are engaged in illegal cross-border wildlife trafficking based on their actual location. Future studies should develop more robust approaches to accurately confirming geolocation metadata.
Despite uncertainty regarding the exact geolocation of users and the lack of Facebook access for Chinese nationals, there are large numbers of Chinese speakers residing outside of mainland China who may engage in online wildlife sale and purchasing. There is also the possibility that traders or consumers in China are using VPN services to participate in this trade. This activity occurs openly despite Facebook’s policy expressly prohibiting it. Facebook’s ‘Commerce policy section 3 Prohibited Content – 4 Animals’ specifically prohibits the sale of “any product or part … [from] endangered or threatened animals; live animals, livestock, pets, and prohibited animal parts, including but not limited to bone, teeth, horn, ivory, taxidermy, organs, external limbs, secretions, or carcasses on Facebook and Instagram” (Facebook Commerce Policies n.d.).
Our results indicate that despite these prohibitions, a variety of wildlife products (including those of endangered and threatened species) are being offered for sale on the platform. Although we were not able to purchase or test the purported wildlife products in order to verify their authenticity, we used visual assessment of images in combination with text analysis. However, it is possible that not all products were authentic wildlife products, and some offers may have been scams. Future studies should develop additional verification approaches such as using deep learning for image classification and assessing other product-specific features in order to confirm whether the products are authentic.
Furthermore, Facebook’s official help centre page states that the company does not allow organizations that engage in terrorism or organized crime to be on the platform and will remove content that expresses support for groups involved in violent or criminal behaviour (Facebook Help Center n.d.). However, organizations including the UN Office on Drugs and Crime, USAID and INTERPOL have provided clear evidence of the link between illegal wildlife trade and professional criminal organizations (UNODC 2016, INTERPOL 2019). Hence, our findings provide clear evidence that Facebook’s self-regulation of this content, which likely violates applicable national law, international treaties and the platform’s own policies, is insufficient and represents a policy failure in the context of combatting online wildlife trafficking directly enabled by the platform.
Other investigations have also uncovered wildlife trafficking on Facebook (Enano 2019). TRAFFIC, an international conservation non-profit organization, found 2245 advertisements in Tagalog and English associated with the live reptile trade on Facebook groups in the Philippines (Sy 2018). CNN found that cheetahs were being offered for sale on social media platforms including YouTube and Instagram (owned by Facebook), including from Gulf countries in Arabic, while the International Consortium of Investigative Journalists has collaborated on uncovering illegal wildlife trafficking of pangolins (Böhler 2019, Formanek et al. 2019). Our results provide additional evidence of the specific ways by which Facebook enables user engagement to facilitate sales between wildlife sellers and buyers.
From a methodological perspective, a growing number of studies have utilized data mining and machine learning to classify illegal wildlife product sales (Hernandez-Castro & Roberts 2015, Harrison et al. 2016, Austen et al. 2018, Di Minin et al. 2018, Parham et al. 2018). Our study used web scraping to simulate user searches on Facebook both retrospectively and prospectively. This approach can be replicated for the further detection of wildlife trafficking in other languages on Facebook and other SNS platforms and be adapted for the detection of other online criminal activity. Manually annotated results can also be used to develop machine learning classifiers for future automated detection of Chinese wildlife marketing and sales content.
Text analysis is an important part of ensuring the accurate identification and characterization of wildlife sales posts and identifying changes in selling and code word strategies. Based on our findings, it appears that SNS users are starting to use emojis and new code words instead of species-related keywords (Alfino & Roberts 2018). This change could increase the number of false-negative posts when using machine learning and emphasizes the need for continued manual annotation to uncover new trends in the detection of covert activities. For example, some posts only contained text or generic terms, but did not mention specific animals. This may specifically impact the accuracy of algorithms in detecting content, particularly if it is created in non-English characters. Although we did not use machine learning approaches to code content, based on these results, we plan to develop a Chinese-language classifier to detect illicit online sales of wildlife products.
Furthermore, we observed that many wildlife selling interactions are initiated on Facebook, but have their transactions finalized in private messaging. In the public comments section of detected Facebook posts, we observed many conversations between seller and buyers. For example, in one detected post, a seller shared an image of a hawksbill shell bracelet, with a user asking what the product was in the comments section. After being told that it was a turtle shell, the buyer responded that they wanted elephant teeth instead, with the seller leaving a final comment to contact them directly.
Hence, buyers will often view a wildlife sale offer posted by a Facebook user, group or community page and then enquire for more information about the wildlife product, other products or transaction details (e.g., price and method of delivery) in the Facebook comments section. When the seller confirms interest from a potential buyer, the seller will then redirect the user to start a private conversation by using the ‘PC’ feature within the Facebook platform or direct them to another private communication application (e.g., WeChat). Hence, transaction data confirming the actual terms of the sale and exchange of payment often reside in private or encrypted messages.
Finally, we observed that Facebook community pages had the most signal posts and the highest volume of interactions. This could reflect the special purpose of community pages (generally created by companies or other groups to connect with customers for marketing purposes). Our results show that these community pages exhibited characteristics of online selling pages where sellers could share and curate many wildlife posts and interact with customers who are followers. In this sense, these pages shared certain characteristics with other online criminal marketplaces, such as dedicated chatrooms and forums, where the platform allows for the development of a community of buyers and sellers and facilitates their direct social interaction, but does not directly enable e-commerce transactions (Kreibich & Jahnke 2010, BBC News 2017, IACA 2018, Johnsen & Franke 2018).
Conclusions
With the rapid advancement and diffusion of the Internet, cybercrimes including computer and network intrusions, ransomware, identity theft and illegal online trafficking of goods and services are now globalized phenomena that are growing in scope and severity (FBI n.d.). Virtual marketplaces that leverage the Internet’s accessibility and anonymity are conduits for the distribution, trafficking and sale of a variety of prohibited services, products and commodities, including illicit drug trafficking, antiquities, human trafficking and illegal wildlife trafficking (Rosen & Smith 2010, Latonero 2011, Campbell 2013, Mackey & Liang 2013, Herrel & Van der Meijden 2014, Guan & Xu 2015, Mendel & Sharapov 2016, Mackey et al. 2017).
Combating illegal online SNS wildlife sales is a race against time with criminal actors. By using big data and advanced data mining approaches, rapid detection of wildlife-related cybercrime is possible, particularly when leveraging computational methods of natural language processing, machine learning and deep learning informed by in-depth manual annotation to characterize content and interactions between sellers and buyers (Mackey et al. 2017, Li et al. 2019, Xu et al. 2019). Our study represents one phase of this solution through developing an approach to collecting data from one of the world’s most popular SNS platforms by simulating user searches and developing a manually labelled dataset that can be used to train machine learning models. Importantly, this methodology can be used to eventually develop surveillance that occurs close to real time in order to detect, classify and report illegal wildlife posts for action by platforms in conjunction with law enforcement and conservation groups, as has been pursued for illicit online drug sales (Mackey et al. 2018).
Such solutions are crucial in the face of an evolving and growing wildlife cybercrime landscape. However, these efforts need to be supported with meaningful collaboration from international organizations, government agencies, technology companies, academia and non-governmental organizations and a decrease in demand for these products from the general public. Coordination of efforts to address Internet and SNS-based wildlife trafficking should start with more robust surveillance in multiple languages, including incorporating localized keywords and vocabulary specific to wildlife trade, while also being adaptive to changes in selling strategies and language. Policy coherence is also necessary for developing national, regional or international policies that are specific to the unique technical challenges of online wildlife trafficking as distinct from those actions needed in the non-cyber field.
Future policies should include an explicit mandate and affirmative obligation for SNS platforms, e-commerce sites, domain registrars, Internet service providers and search engines to remove content that violates their own terms of use or obligations under national law or international conservation treaties, while also sharing data on illegal traffickers with law enforcement and conservation authorities in order to aid offline investigations. Researchers also need to continue to innovate on detection and classification methods in order to complement the work of conservation groups. Finally, the public needs greater education about the disastrous impacts of wildlife trafficking on the ecosystem in order to stem demand. Only with such a comprehensive approach can online wildlife trafficking become less ‘social’ and more criminal.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S0376892920000235
Acknowledgments
We thank the World Wildlife Fund and Microsoft for inviting the authors to present their preliminary results at the ‘Detect and Prevent: AI Collaboration to End Wildlife Trafficking Online’ workshop, made possible by the Global Coalition to End Wildlife Trafficking Online.
Author contributions
This manuscript has been seen by all authors, who have approved its content. This piece is not under consideration in any other forum. We note that with respect to author contributions, QX, MC and TKM jointly designed the study, collected data and worked on the manuscript text. All authors contributed interpretations of the study findings.
Financial support
None.
Conflicts of interest
None.
Ethical standards
Facebook messages collected for the purposes of this study were in the public domain and web scraping was done solely for research purposes. Although certain websites and online platforms prohibit web scraping of content without permissions, recent court cases have found that web scraping online platforms without consent may be legally permissible in the context of commercial activity. Additionally, since using the Facebook public API is not feasible, other methods of data collection for the purposes of detecting illegal activity are necessary. Criminal activity is inherently difficult to detect; hence, methods such as web scraping can be used to identify potential platform violations and illegal activity. Our study does not disclose any personally identifiable information, as we have removed identifiers from our dataset, blurred images/screenshots and aggregated results. Hence, no ethics approval was required for this study as we relied on publicly available data, did not include any identifiable information, did not include any private messages between users and there was no interaction between Facebook users and researchers.