INTRODUCTION
The internet provides individuals with an essentially limitless amount of information and a high degree of choice in which elements of that information to consume. This has prompted concerns over whether the internet threatens societies’ abilities to establish common bases of reliable information and, by extension, the sustainability of democracy in the twenty-first century. One side of this debate argues that the internet fuels political polarization, as users may choose to avoid conflicting viewpoints (Sunstein Reference Sunstein2002; Reference Sunstein2017) or may be algorithmically steered toward politically favorable content (Pariser Reference Pariser2011). Conversely, an extensive empirical literature on news consumption shows that incidental exposure to politically diverse sources is extremely common (Bakshy, Messing, and Adamic Reference Bakshy, Messing and Adamic2015; Gentzkow and Shapiro Reference Gentzkow and Shapiro2011; Guess Reference Guess2021; Messing and Westwood Reference Messing and Westwood2014), and therefore suggests that concerns over “echo chambers” or “filter bubbles” are overblown.
We argue that networked curation processes lead information consumption on social media in particular to be more politically homogeneous than this empirical literature has thus far suggested. However, this is more a reflection of democracy than a threat to democracy—a product of individuals engaging with information, and each other, on their own terms—highlighting trade-offs between cross-cutting exposure and active participation (Kreiss and McGregor Reference Kreiss and McGregor2023; Mutz Reference Mutz2006; Stroud Reference Stroud2011). Users on social media platforms curate the information they share with others and simultaneously receive curated streams of information tailored to their interests (Davis Reference Davis2017). This involves “unbundling” discrete pieces of information from their parent sources and re-bundling them into user-level streams of content—transforming a hierarchical distribution of information (from sources to consumers) into a networked distribution of information (from users to users).
One implication of this process, which we develop in this article, is that source-level estimates of audience partisanship may mistake heterogeneity for moderation. Users choose other users to follow based on their tendencies to share useful or otherwise appealing information (Barberá Reference Barberá2015), and those accounts will in turn selectively share information from a given source with their network ties (such as followers, friends, or group members) based on the extent to which that content serves social as well as informational functions (Epstein et al. Reference Epstein, Sirlin, Pennycook and Rand2023; Marwick and boyd Reference Marwick and boyd2011). That is, users share information not only to inform others, but also to perform their identities, advance their interests, and generate social returns (such as likes, retweets, or followers). Since individual stories are subject to these networked curation processes, source-level estimates do not reflect cases where individual stories from a given source are useful for different partisan audiences.
We test this implication of networked curation by comparing common measures of audience partisanship at the source and story levels. Virtually all the literature in this area aggregates partisan consumption to the level of the domain, or source, rather than examining the partisan audiences of individual stories (Eady et al. Reference Eady, Nagler, Guess, Zilinsky and Tucker2019; Guess Reference Guess2021; Peterson, Goel, and Iyengar Reference Peterson, Goel and Iyengar2021; Robertson et al. Reference Robertson, Jiang, Joseph, Friedland and Lazer2018). Bakshy, Messing, and Adamic (Reference Bakshy, Messing and Adamic2015) and González-Bailón et al. (Reference González-Bailón, Lazer, Barberá, Zhang, Allcott, Brown and Crespo-Tenorio2023) are notable exceptions, discussed below. Source-level aggregation implicitly assumes that every story from a given source is drawn from a consistent distribution of partisan appeal that attracts a stable ratio of Democratic to Republican users. By contrast, we find evidence of partisan curation bubbles, defined as sets of users who share and consume content with consistent appeal from a variety of sources.Footnote 1 When users in curation bubbles are able to identify and circulate congenial information from a variety of sources, individual stories may reach audiences atypical of the sources that produced them—introducing heterogeneity into the source’s aggregate audience that doesn’t necessarily reflect heterogeneity in the partisan valence of the information users consume.
We use two large-scale datasets to test for the presence and extent of partisan curation bubbles. First, we analyze sharing patterns on Twitter in 2017 and 2018 using a panel of over 1.6 million user accounts linked to a commercial voter file. We then examine sharing and exposure patterns on Facebook between 2017 and 2021 using data made available through Social Science One (King and Persily Reference King and Persily2020). We consistently find evidence of partisan curation bubbles. The fact that we find substantively similar results in all three analyses (of sharing on Twitter and on Facebook, and of exposure on Facebook) suggests a robust pattern.
CURATION ON SOCIAL MEDIA
The online information ecosystem in the early twenty-first century is characterized by unbundling and abundance. An individual’s news consumption near the end of the twentieth century would typically be clustered in a small number of sources offering packages of information. As one could not read a story in a newspaper without buying at least a single copy of the whole newspaper, information search was largely a search for preferred packages, or sources, from which to habitually consume a variety of information. This could take the form of a subscription to a newspaper that covered news, opinion, sports, and culture—or an opinion magazine that offered a particular editorial direction. Contemporary information consumption presents a fundamentally different proposition as it is largely unbundled at the story level, such that it is practical for individuals to consume information à la carte from a wide range of sources. The task of information search is now less about identifying the most desirable sources and is instead about identifying content of interest from a functionally infinite set of options.
The internet offers individuals several different strategies to manage the task of wading through an ocean of information to identify what they want to see. Centralized aggregators such as search engines and news portals (Fischer, Jaidka, and Lelkes Reference Fischer, Jaidka and Lelkes2020; Robertson et al. Reference Robertson, Green, Ruck, Ognyanova, Wilson and Lazer2023) are perhaps the most obvious and widely used, allowing individuals to input queries ranging from general (“political news”) to specific (“2024 Nevada caucus results”) and receive relevant information in return. Here, we focus on a different, commonly used setting: feed-based social media, in which users follow accounts and posts from these accounts are aggregated into a flow of content. We take curation to be the processes through which people are matched with content that appeals to them. We consider curation to encompass both platform architecture, such as a ranking algorithm, and user choice within that architecture. Following Davis (Reference Davis2017), this includes both consumptive curation, or users’ selection of accounts from which to receive information, and productive curation, or users’ choice of what to share with others. Importantly, consumptive curation effectively delegates the search for relevant information to others. Rather than actively searching for specific information, and rather than choosing news sources to habitually consume information, users choose other users from whom to habitually consume information and then scroll through whatever those users choose to post.
Online curation is analogous to prior accounts of the “two-step flow” of information from radio and print media to opinion leaders, and from opinion leaders to ordinary citizens (Katz and Lazarsfeld Reference Katz and Lazarsfeld1955; Lazarsfeld, Berelson, and Gaudet Reference Lazarsfeld, Berelson and Gaudet1948). However, on social media, this has the potential to happen with more structure and on a far larger scale. Rather than opinion leaders (perhaps haphazardly) recounting news they read earlier in the day to another individual, users on social media can immediately and directly share news with hundreds or thousands of other users at a time. Moreover, opinion-leading relationships as envisioned by the Columbia school are largely formed incidentally, as a consequence of proximity within one’s local community (e.g., Lazer et al. Reference Lazer, Rubineau, Chetkovich, Katz and Neblo2010; Minozzi et al. Reference Minozzi, Song, Lazer, Neblo and Ognyanova2020). By contrast, social media allows users much more choice in who they form ties with and why, potentially including the provision of information.
These affordances of social media renewed long-standing concerns over how much choice in information consumption is too much. The ability to pick and choose individual accounts from which to receive political information carries the potential for users to select into politically homogeneous “echo chambers” (Sunstein Reference Sunstein2002; Reference Sunstein2017). The increased reliance on platforms that algorithmically filter, sort, and recommend content prompts parallel concerns over “filter bubbles” (Pariser Reference Pariser2011; Ribeiro et al. Reference Ribeiro, Ottoni, West, Almeida and Meira2020) in which consuming partisan information begets exposure to more partisan information. These related concerns involve the same outcome: politically homogeneous information diets that, in theory, frustrate democratic societies’ abilities to make collective decisions using common bases of reliable information.
Empirical research regarding the extent to which these potentially undesirable outcomes manifest is mixed (Barberá Reference Barberá, Persily and Tucker2020; Dahlgren Reference Dahlgren2021; Prior Reference Prior2013). This is in part because individuals’ tendencies to engage in selective exposure within their information environments are not as straightforward as early theories regarding the concept predict—in line with early skepticism (Freedman and Sears Reference Freedman and Sears1965; Reference Freedman and Sears1967). While some individuals do select pro-attitudinal sources (Stroud Reference Stroud2011), this does not necessarily mean that they are actively avoiding counter-attitudinal information. Indeed, individuals are especially likely to seek (and subsequently share) pro-attitudinal information when they are exposed to counter-attitudinal information (Garrett Reference Garrett2009; Weeks et al. Reference Weeks, Lane, Kim, Lee and Kwak2017). This dynamic is less obviously concerning, and can take place in the context of healthy deliberative exchange. In addition, people often rely on heuristics other than partisanship when deciding which information to consume, such as topical relevance (Kobayashi and Ikeda Reference Kobayashi and Ikeda2009; Mummolo Reference Mummolo2016) or social endorsements (Messing and Westwood Reference Messing and Westwood2014). As a result, partisan segregation in aggregate news consumption online is typically found to be relatively low (Flaxman, Goel, and Rao Reference Flaxman, Goel and Rao2016; Gentzkow and Shapiro Reference Gentzkow and Shapiro2010; Guess Reference Guess2021).
This finding initially extended to social media. Early research on Facebook showed that since friendship ties formed for a variety of reasons—many of which were incidental to politics—Facebook users were frequently exposed to politically distant sources (Bakshy, Messing, and Adamic Reference Bakshy, Messing and Adamic2015; Bakshy et al. Reference Bakshy, Rosenn, Marlow and Adamic2012). However, when extending this analysis on Facebook to Pages and Groups, which form for more specific reasons, González-Bailón et al. (Reference González-Bailón, Lazer, Barberá, Zhang, Allcott, Brown and Crespo-Tenorio2023) find stronger evidence of political segregation in information consumption—consistent with other work finding evidence of political homophily on social media (Conover et al. Reference Conover, Ratkiewics, Francisco, Goncalves, Menczer and Flammini2021; Reference Conover, Goncalves, Flammini and Menczer2012).Footnote 2 Put simply, one may be friends with a politically distant acquaintance or relative on Facebook in spite of their politics but follow a Page because of its politics, which will have consequences for the diversity of information to which one is exposed. Moreover, pro-attitudinal information spreads more quickly, is consumed more frequently, and is received more approvingly within political communities on social media sites than counter-attitudinal information (Garz, Sörensen, and Stone Reference Garz, Sörensen and Stone2020; Halberstam and Knight Reference Halberstam and Knight2016). This imbalance is likely attributable to the political information users choose to share on social media sites. Sharing information with one’s followers is inherently more public than consuming it oneself, and can be used to signal (or, from the opposite perspective, infer) political identities and commitments (Marwick and boyd Reference Marwick and boyd2011; Settle Reference Settle2018). In the rare instances in which users share political information from opposing partisans, it is often accompanied by negative comments that indicate disagreement (Cinelli et al. Reference Cinelli, Morales, Galeazzi, Quattrociocchi and Starnini2021; Wojcieszak et al. Reference Wojcieszak, Casas, Yu, Nagler and Tucker2022).
The Facebook Page, in the above example, is acting as a curator—an account that shares or reshares content. An account that posts a link to a story in the New York Times is identifying that content as worthy of attention. Consumers are accounts on social media that are exposed to content. Users have the ability to act as both a consumer and a curator, though in practice the vast majority of productive curation is done by a small number of users (Grinberg et al. Reference Grinberg, Joseph, Friedland, Swire-Thompson and Lazer2019; Hughes et al. Reference Hughes, McCabe, Hobbs, Remy, Shah and Lazer2021; Wojcik and Hughes Reference Wojcik and Hughes2019) who are more politically active offline (and exhibit more partisan extremity) than users who do not post about politics themselves (Hughes Reference Hughes2019). These curators, in turn, take an active role in purposively identifying individual stories to share and deciding how to frame those stories for their followers (Billard Reference Billard2021; Park and Kaye Reference Park and Kaye2018). Importantly, curators do not necessarily share information solely for information’s sake—the act of sharing specific information (as opposed to other information one could potentially share) is a means by which users can signal aspects of their identity that are important to them (e.g., Osmundsen et al. Reference Osmundsen, Bor, Vahlstrup, Bechmann and Petersen2021; Van Bavel et al. Reference Van Bavel, Harris, Pärnamets, Rathje, Doell and Tucker2021). Political information sharing on social media will therefore likely feature partisan curators—that is, users who selectively share information that promotes their political in-group or detracts from political out-groups. These users are, in a sense, performing “hidden labor” for their preferred party, attempting to shape the character of online discourse by selectively sharing politically favorable information.Footnote 3
The underlying logic and architecture of information sharing on social media is therefore likely to produce curation bubbles, or sets of users who share and consume content with consistent appeal from a variety of sources. We view curation bubbles as a general property of social media not limited to partisanship. For example, Taylor Swift fans will curate information related to Taylor Swift from a variety of sources. This will include atypical sources, such as ESPN, when ESPN publishes stories about Taylor Swift. Here, we are interested in partisan curation bubbles, or users who tend to share (and, through homophilous tie formation, see) politically consistent information from a variety of sources.
If and when politically neutral or distant sources publish individual stories that are useful for promoting partisan identities and interests, partisan users will share them with their followers (who are in turn likely to be co-partisans themselves), introducing heterogeneity into those sources’ audience for their constituent stories. By extension, partisan curation bubbles are formed via co-partisan users sharing information favorable to their party. The breadth and variety of information available on the internet allows partisan users to easily find politically favorable information (Peterson and Iyengar Reference Peterson and Iyengar2021). This information can originate from a variety of sources, and indeed partisans tend to overestimate the extent to which mainstream outlets perceived as ideologically distinct offer substantively different coverage (Peterson and Kagalwala Reference Peterson and Kagalwala2021). Furthermore, politically favorable information may be most useful for promoting one’s party precisely when it is attributable to a source perceived to be politically neutral or distant, as this can increase its credibility (Baum and Groeling Reference Baum and Groeling2009). Temporal variation in whether the news is broadly favorable to the political left or right can also introduce selective engagement with the news itself (Kim and Kim Reference Kim and Kim2021), which would lead to variation in which partisan curation bubbles are circulating more or less raw information at any given time.
Figure 1 provides an illustrative characterization of the partisan curation process, in comparison to a process solely driven by users consuming information directly from sources. In both cases, there are three sources that, based on users’ overall consumption behavior, appear to be left-leaning, neutral, and right-leaning, respectively. In Figure 1a, this is reflected by two left-leaning users consuming the left-leaning source, two right-leaning users consuming the right-leaning source, and all four users consuming the neutral source. In Figure 1b, there are two curators mediating these users’ consumption. Curator A only shares blue stories with the two left-leaning users who follow them and Curator B only shares red stories with the two right-leaning users who follow them, irrespective of the sources that produced those stories. The pattern of consumption is quite integrated at the producer level yet completely segregated at the story level. Source B, in particular, appears neutral overall not by producing stories that all users consume, but by producing stories that are curated by either left-leaning or right-leaning users.
Partisan curation bubbles carry implications for how researchers understand the political valence of the information being shared on social media. Empirical researchers frequently quantify source-level partisan slant using estimates of the partisanship of news outlets’ overall audiences (e.g., Eady et al. Reference Eady, Nagler, Guess, Zilinsky and Tucker2019; Garimella et al. Reference Garimella, Smith, Weiss and West2021; Guess Reference Guess2021; Robertson et al. Reference Robertson, Green, Ruck, Ognyanova, Wilson and Lazer2023). These estimates typically represent a normalized ratio of how often URLs from the given domain were shared by Democrats compared to Republicans. For instance, a domain shared exclusively by Democrats would receive a score of −1, a domain shared exclusively by Republicans would receive a score of 1, and a domain shared by equal numbers of Democrats and Republicans would receive a score of 0. The major exceptions that construct scores at the URL as well as domain level are Bakshy, Messing, and Adamic (Reference Bakshy, Messing and Adamic2015) and González-Bailón et al. (Reference González-Bailón, Lazer, Barberá, Zhang, Allcott, Brown and Crespo-Tenorio2023). The former evaluates exposure to cross-cutting partisan content on Facebook; the latter examines segregation in news consumption on Facebook. Both sets of results are consistent with the possibility of partisan curation bubbles, but neither directly studies their presence.
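To make this conventional measure concrete, the following minimal Python sketch implements the normalized sharing ratio described above; the function name and counts are illustrative rather than drawn from any specific dataset.

```python
# A minimal sketch of the conventional domain-level audience measure: a normalized
# ratio of Republican to Democratic sharing, so that -1 means a domain shared only
# by Democrats, +1 only by Republicans, and 0 equal sharing by both.
def domain_audience_score(dem_shares: int, rep_shares: int) -> float:
    return (rep_shares - dem_shares) / (rep_shares + dem_shares)

print(domain_audience_score(1000, 0))    # -1.0: exclusively Democratic audience
print(domain_audience_score(250, 750))   #  0.5: mostly Republican audience
```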
The core assumption of this approach—common to many approaches for quantifying political valence on social media (e.g., Barberá Reference Barberá2015)—is that behavior reflects revealed preferences. This is agnostic to the substance of the content in question, in contrast with methods that infer the slant of a given story or source based on its text (Gentzkow and Shapiro Reference Gentzkow and Shapiro2010; Ho and Quinn Reference Ho and Quinn2008).Footnote 4 This can make reputations self-fulfilling. If, for example, Republican users avoid the New York Times because it is regarded as left-leaning, then the New York Times will garner a left-leaning audience regardless of what the newspaper publishes (Peterson and Kagalwala Reference Peterson and Kagalwala2021)—which will carry through to its location on the [−1,1] scale. Similarly, a score of 0 doesn’t mean that the domain is “neutral” in any sense deeper than that it was shared by Democrats and Republicans at equal rates. In other words, the average partisanship of sources’ audiences is a relative measure of partisanship, not an absolute one (Guess Reference Guess2021; Robertson et al. Reference Robertson, Jiang, Joseph, Friedland and Lazer2018).
As the stylized example in Figure 1 suggests, partisan curation bubbles have the potential to distort estimates of partisan appeal at the source level because they can introduce substantial heterogeneity at the story level. Importantly, this distortion is unlikely to be uniform—there will be more story-level heterogeneity in audience partisanship within sources that carry more moderate overall estimates. The reason for this is mechanical as well as theoretical: there is only one way for individual stories to aggregate to an extreme domain score (circulation among consistently partisan audiences), but there are two ways to produce a moderate score. A moderate domain can produce stories that are consistently circulated by both Democrats and Republicans at relatively even rates, or it can produce stories that are disproportionately circulated by either Democrats or Republicans. The moderate domain-level average will only reflect the individual stories that produced it in the former case. However, as partisan curators selectively share stories that are socially useful for them, the latter will frequently occur (we expand on this point in Appendix A of the Supplementary Material).
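The following toy calculation (our own construction, not real data) illustrates this aggregation problem: two hypothetical domains receive the same moderate average score, but only one of them produces stories whose individual audiences resemble that average.

```python
# A toy numeric illustration of the two ways a domain can end up with a
# "moderate" average audience score.
import numpy as np

consistently_moderate = np.full(100, 0.0)            # every story's audience near 0
alternating_partisan = np.repeat([-0.8, 0.8], 50)    # stories swing between partisan audiences

# Both domains aggregate to the same moderate score of 0.0 ...
print(consistently_moderate.mean(), alternating_partisan.mean())   # 0.0  0.0
# ... but only the first domain's score describes its individual stories.
print(consistently_moderate.std(), alternating_partisan.std())     # 0.0  0.8
```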
Finally, we note that our theoretical framework is agnostic as to the potential role of social media platforms’ recommendation algorithms. While algorithmic curation is undoubtedly important for determining which information users see, algorithms themselves do not inevitably lead to the consumption of politically homogeneous information. Algorithms can optimize on a variety of criteria, some politically salient and others not (Bandy and Diakopoulos Reference Bandy and Diakopoulos2021; Fischer, Jaidka, and Lelkes Reference Fischer, Jaidka and Lelkes2020), and different platforms may make different design choices that could (either intentionally or incidentally) encourage or discourage exposure to counter-attitudinal information (Garrett and Resnick Reference Garrett and Resnick2011). For example, algorithm-based recommendations from centralized news aggregators such as MSN or Google may be more likely to direct users toward large, mainstream sources than they are to direct users toward niche, ideological sources (Guess Reference Guess2021). The best evidence in this area on social media in particular comes from a platform-wide experiment on Twitter, which found that its algorithmic timeline led users to be exposed to more political content than users who remained on chronological timelines (Huszár et al. Reference Huszár, Ktena, O’Brien, Belli, Schlaikjer and Hardt2022). However, that same study found inconsistent effects with respect to whether the amplification of political content was disproportionately in favor of left- or right-leaning content. While we are unable to isolate the potential contributions of platforms’ algorithms to our empirical findings, we view it as exceedingly unlikely for individual stories to circulate among politically atypical audiences in the absence of users intentionally curating those stories for their social ties.
Hypotheses
Our theoretical framework carries a set of empirical implications that we test in this article.
First, because networked curation occurs at the story level, we expect to observe audience heterogeneity within sources.
Hypothesis 1a: Productive curation. The partisan composition of sharing behavior will exhibit story-level heterogeneity within sources.
Hypothesis 1b: Consumptive curation. The partisan composition of viewing behavior will exhibit story-level heterogeneity within sources.
We further expect that this heterogeneity corresponds with substantive differences in the latent partisan appeal of the information being circulated and that it is not, in expectation, due to idiosyncrasies such as “hate-sharing” or noise.
Hypothesis 2: Partisan audience scores estimated at the story level will reflect the substantive partisan appeal of those stories.
Finally, we expect systematic variation in the extent to which partisan curation bubbles pose a challenge to interpreting and using source-level estimates of audience partisanship. Specifically, moderate domain-level estimates of audience partisanship are more likely to mischaracterize the partisan audience for any given story. Relatively more extreme source-level estimates, by contrast, will more frequently reflect the audiences for each individual story.
Hypothesis 3: Moderate source-level scores will more frequently mischaracterize the partisan appeal of their constituent stories.
We illustrate these general points in Table 1, which reports the stories (URLs) from the Wall Street Journal with the 10 most Republican audience scores and the 10 most Democratic audience scores. Domain-level scores often identify the Wall Street Journal as “neutral” (Bakshy, Messing, and Adamic Reference Bakshy, Messing and Adamic2015; Gentzkow and Shapiro Reference Gentzkow and Shapiro2010), and it carries a relatively centrist domain-level audience score of –0.34 in our Twitter data. However, we see that this score, if applied to every story produced by the Wall Street Journal, fails to adequately describe its cross-cutting content. The stories disproportionately circulated by Republicans are largely conservative opinion pieces from the editorial page. In contrast, none of the stories disproportionately circulated by Democrats are opinion pieces; they are mainstream reporting with content that is good news for Democrats and bad for Republicans. In other words, the Wall Street Journal does not have a consistent moderate audience; it produces individual stories with differential partisan appeal that reach different partisan audiences. This is the dynamic we will explore and test through our curation bubbles framework.
Note: On the left, the ten headlines with the most left-leaning audience scores; on the right, the ten headlines with the most right-leaning audience scores. Audience scores drawn from Twitter sample.
DATA AND METHODS
We use data from Twitter and Facebook to examine both domain- and story-level partisan curation bubbles in the United States. This cross-platform comparison allows for validation of our key results, but comes with challenges. People use different social media platforms for different reasons (Evans et al. Reference Evans, Pearce, Vitak and Treem2017), meaning that we would expect variation in engagement and exposure across the two platforms. However, both platforms are of scientific interest, with Twitter being particularly influential among journalists and Facebook being the social media platform the general public most frequently uses for news consumption overall (Jurkowitz and Gottfried Reference Jurkowitz and Gottfried2022; McGregor and Molyneux Reference McGregor and Molyneux2018; Molyneux and McGregor Reference Molyneux and McGregor2021). While these platforms’ user bases differ in size and composition, leading us to expect variation in the precise stories and domains which circulate on each site, users on both platforms engage in similar styles of networked curation.
Perhaps the bigger challenge to cross-platform analysis is methodological. Data on Twitter and Facebook are collected and structured differently, requiring slightly different approaches for estimating partisanship, as discussed in detail below. These differences, however, also come with opportunities. Our Twitter data contain information for individual users with fine-grained measures of their likely partisan affiliation. Our Facebook data do not contain such individual-level data, but they do include clicks, reactions, and views in addition to shares. Each platform therefore allows us to test phenomena that the other does not.
In total, our Twitter data consist of 405,531 unique URLs shared on that platform between January 1, 2017 and December 31, 2018. These URLs originated from a total of 8,378 domains. Our Facebook data consist of 218,395 unique URLs from 908 domains shared between January 1, 2017 and February 28, 2021. We analyze less content on Facebook because we focus on domains and URLs less affected by privacy-preserving noise. Each of these datasets and their partisanship measures are described in detail below.
Dataset 1: Twitter Users with Matched Voter Data
For this study, we collected tweets from a panel of Twitter users matched to U.S. voting records. Taking a user-focused approach to data collection allows us to identify a consistent population over time and to bring in user-level demographic information, including measures of party affiliation. A pilot version of this dataset was described in Grinberg et al. (Reference Grinberg, Joseph, Friedland, Swire-Thompson and Lazer2019) and more descriptives are provided in Hughes et al. (Reference Hughes, McCabe, Hobbs, Remy, Shah and Lazer2021) and Shugars et al. (Reference Shugars, Gitomer, McCabe, Gallagher, Joseph, Grinberg and Doroshenko2021). For purposes of this analysis, two details from those papers are relevant: first, although slightly more white and female than the population of American Twitter users, our panel is otherwise generally representative of Twitter users (Hughes et al. Reference Hughes, McCabe, Hobbs, Remy, Shah and Lazer2021); second, our vendor for voter data (TargetSmart) provides a modeled estimate of party identification that correlates well with aggregate electoral results, allowing us to avoid the vagaries of interpreting party registration across states (Shugars et al. Reference Shugars, Gitomer, McCabe, Gallagher, Joseph, Grinberg and Doroshenko2021).Footnote 5
Panel users were identified in 2017. Starting with 290 million profiles retrieved from Twitter’s 10% Decahose sample, we searched for profiles in which the Twitter names (display name or handle) and locations matched entries in the voter file that were unique at the city level (or state level, if the Twitter profile does not list a city). We successfully matched 1.6 million accounts corresponding to registered U.S. voters. Because some users may go inactive, this represents an upper bound on our population size. Once identified, we retroactively collected panelists’ past tweets dating back to 2010. Since 2017, we have regularly collected all new, publicly posted panelists’ tweets.
In this analysis, we examine URLs shared or retweeted by our panelists between January 1, 2017 and December 31, 2018. We include retweets because retweeting is the primary means of sharing content authored by others. While it is certainly the case that, on the margins, some sharing (and by extension in our Facebook data, clicking, reacting, and viewing) behavior is done with disapproval of the underlying content, past work has found that on Twitter this sort of “hate sharing” is concentrated in quote tweets (Wojcieszak et al. Reference Wojcieszak, Casas, Yu, Nagler and Tucker2022), and so we exclude URLs shared through this mechanism from our analysis.Footnote 6 We restrict our focus to URLs shared a minimum of ten times, giving us an initial set of 1,404,035 unique URLs originating from 82,293 domains. After excluding URLs not likely to contain political content (see discussion below) and domains with fewer than one thousand total shares, we focus our analysis on a subset of 369,675 politically relevant URLs originating from 718 domains.
Dataset 2: Facebook URLs
To analyze partisan circulation of content on Facebook, we use the Facebook Open Research & Transparency (FORT) URLs Shares dataset which is available to researchers through a collaboration with Social Science One (King and Persily Reference King and Persily2020; Messing et al. Reference Messing, DeGregorio, Hillenbrand, King, Mahanti, Mukerjee and Nayak2021). The dataset counts the number of people who viewed, clicked, reacted to, or shared any given URL. A URL must have received at least one hundred public shares to be included in the dataset, but data from private individuals are included in the released counts. The dataset fulfills differential privacy guarantees by adding Gaussian noise to all counts (Messing et al. Reference Messing, DeGregorio, Hillenbrand, King, Mahanti, Mukerjee and Nayak2021). Because the variance of the added noise is constant across URLs, the signal-to-noise ratio is highest for high-engagement URLs.
Specifically, we begin by collecting all URLs from domains shared on Facebook at least 1 million times in the US between January 1, 2017 and February 28, 2021. This initial sweep yields 5,545,381 URLs from 1,132 domains. We then impose three filtering processes. First, to avoid irregularities introduced by the added statistical noise (Buntain et al. Reference Buntain, Bonneau, Nagler and Tucker2023), we only consider URLs that have been shared at least 1,000 times, viewed 10,000 times, clicked 5,000 times, and reacted to 5,000 times. Second, we remove all URLs not classified as political. Finally, we remove domains with fewer than ten unique URLs. These three filtering processes trim the Facebook dataset to 214,995 unique political URLs from 780 domains.
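As an illustration, the following pandas sketch applies the three filters described above to a hypothetical DataFrame of URL-level engagement counts; the column names are assumptions about data layout rather than the FORT dataset’s actual schema.

```python
# A minimal sketch of the three Facebook filtering steps, assuming a hypothetical
# DataFrame `urls` with one row per URL and columns: shares, views, clicks,
# reactions, is_political, and domain.
import pandas as pd

def filter_facebook_urls(urls: pd.DataFrame) -> pd.DataFrame:
    # 1. Drop low-engagement URLs most affected by the added privacy-preserving noise.
    kept = urls[(urls["shares"] >= 1_000) & (urls["views"] >= 10_000) &
                (urls["clicks"] >= 5_000) & (urls["reactions"] >= 5_000)]
    # 2. Keep only URLs classified as political.
    kept = kept[kept["is_political"]]
    # 3. Drop domains with fewer than ten unique political URLs.
    urls_per_domain = kept.groupby("domain")["shares"].transform("size")
    return kept[urls_per_domain >= 10]
```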
Classifying Political Content
Since we are interested in the curation of political content, we filter both datasets to URLs we classify as political. To do this, for every URL, we retrieve the title and “blurb”—the short text which is displayed for a URL on social media. For Facebook, this information is directly available through the FORT URLs Shares dataset. For Twitter, we scrape this information.Footnote 7 For both platforms, we classify each URL as related to politics or not politics using a convolutional neural network and word vectors initialized with the GloVe pretrained embedding (Pennington, Socher, and Manning Reference Pennington, Socher and Manning2014). The final classifier is trained on New York Times, Wikipedia, and Facebook data and achieves a precision of 99% and a recall of 92%.Footnote 8
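The following sketch shows one way such a classifier could be assembled with TensorFlow/Keras; it is a simplified stand-in rather than our exact architecture or training data, and the GloVe file path, vocabulary size, and sequence length are illustrative assumptions.

```python
# A minimal sketch (not the authors' exact architecture) of a CNN text classifier
# over URL titles and blurbs, initialized with pretrained GloVe vectors. Assumes
# `texts` (one title+blurb string per URL) and `labels` (1 = political) exist.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

MAX_WORDS, MAX_LEN, EMB_DIM = 20_000, 100, 100

# Tokenize the title+blurb strings into padded integer sequences.
tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=MAX_WORDS)
tokenizer.fit_on_texts(texts)
X = tf.keras.preprocessing.sequence.pad_sequences(
    tokenizer.texts_to_sequences(texts), maxlen=MAX_LEN)

# Build an embedding matrix from a pretrained GloVe file (hypothetical path).
glove = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        word, *vec = line.split()
        glove[word] = np.asarray(vec, dtype="float32")
emb_matrix = np.zeros((MAX_WORDS, EMB_DIM))
for word, i in tokenizer.word_index.items():
    if i < MAX_WORDS and word in glove:
        emb_matrix[i] = glove[word]

# Convolutional classifier over the embedded token sequence.
model = tf.keras.Sequential([
    layers.Embedding(MAX_WORDS, EMB_DIM,
                     embeddings_initializer=tf.keras.initializers.Constant(emb_matrix)),
    layers.Conv1D(128, 5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # predicted probability the URL is political
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
model.fit(X, np.asarray(labels), epochs=3, validation_split=0.1)
```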
Estimating Partisanship
We estimate a URL’s partisanship as the average partisanship of interactions with that content. This means slightly different things on different platforms, though we have conducted our analysis so as to make the platforms as closely comparable as possible. We elaborate on relevant measurement considerations here.
Our Twitter panel is matched to a commercial voter file that includes a reliable modeled estimate of each user’s likelihood of identifying as a Democrat on a 0–100 scale.Footnote 9 Where possible, we use this numeric representation of likely Democratic identification rather than trichotomizing the measure to partisan categories in order to preserve as much information about model uncertainty as possible. For the purposes of capturing the partisanship of sharing behavior in a manner comparable to traditional approaches, we implement a linear transformation of this score by subtracting it from 50 and then dividing by 50 to put this modeled estimate on a $ [-1,1] $ scale running from most Democratic to least Democratic. This is preferable to relying solely on party registration, which is not collected in all states.
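A minimal sketch of this transformation, assuming a hypothetical variable holding the 0–100 modeled likelihood of identifying as a Democrat:

```python
def user_partisanship(p_dem: float) -> float:
    """Map a 0-100 Democratic-identification score onto [-1, 1].

    100 (most likely Democrat) maps to -1; 0 (least likely Democrat) maps to +1.
    """
    return (50 - p_dem) / 50

print(user_partisanship(100))  # -1.0
print(user_partisanship(25))   #  0.5
```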
Our Facebook data do not include a direct measure of user partisanship, but aggregate interaction counts into five categories of political ideology: −2, −1, 0, +1, and +2, from very liberal to very conservative. These labels are included with the FORT URL Shares dataset and are estimated based on the political pages a user follows, similar to Barberá et al.’s tweetscores (Barberá et al. Reference Barberá, Jost, Nagler, Tucker and Bonneau2015; Messing et al. Reference Messing, DeGregorio, Hillenbrand, King, Mahanti, Mukerjee and Nayak2021). While typically interpreted as ideology, this measure anchors Democratic politicians on one side and Republican politicians on the other. For consistency with our Twitter data, we therefore refer to this measure as partisanship.
Added Gaussian noise in the Social Science One data can make this calculation difficult: popular but hyperpartisan URLs in our Facebook dataset are shown as having negative share counts among out-partisans. Constructing audience scores with these negative counts could lead URLs to fall outside of the $ [-1,1] $ range when normalizing. To avoid this issue, for URLs with any negative counts in any political categories, we add the largest absolute value of category-level negative counts to all categories. This coerces the minimum category-level count to zero and constrains the resulting audience score to the $ [-1,1] $ range. This allows us to calculate political scores in a straightforward manner without substantively altering our methodological approach or eventual results.
For both platforms, we then calculate a URL’s partisan audience score as the average partisanship of its interactions. For Twitter, this means assigning each sharing event the modeled partisanship of the user who shared the URL, and then averaging those scores. For Facebook, we construct this average based on the counts of shares in the five partisanship categories, weighting interactions by the partisan values of −2 to +2, before normalizing to the common $ [-1,1] $ scale. Consistent with past work finding that Twitter has a more left-leaning user base than Facebook (Wojcik and Hughes Reference Wojcik and Hughes2019), the total average audience score on Twitter is −0.39, whereas on Facebook, the analogous score (based on sharing behavior) is −0.12.
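The following Python sketch illustrates both URL-level calculations; the data structures (a dictionary of category counts for Facebook, a list of sharer scores for Twitter) are illustrative assumptions about data layout rather than the datasets’ actual schemas.

```python
# A minimal sketch of the URL-level audience scores described above.
import numpy as np

def facebook_url_score(counts: dict) -> float:
    """Noise-shifted, weighted average of share counts over the five partisanship
    bins (-2 very liberal ... +2 very conservative), normalized to [-1, 1]."""
    values = np.array([counts[k] for k in (-2, -1, 0, 1, 2)], dtype=float)
    if values.min() < 0:                 # privacy-preserving noise can make counts
        values += abs(values.min())      # negative; shift so the minimum is zero
    weights = np.array([-2, -1, 0, 1, 2], dtype=float)
    return float((values * weights).sum() / values.sum() / 2)  # divide by 2 -> [-1, 1]

def twitter_url_score(sharer_scores: list) -> float:
    """Average the [-1, 1] modeled partisanship of every sharing event."""
    return float(np.mean(sharer_scores))

# Example: a URL shared mostly by conservative users on Facebook.
print(facebook_url_score({-2: 50, -1: 100, 0: 300, 1: 900, 2: 650}))  # ~0.5
```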
We use sharing behavior on both Twitter and Facebook as our primary estimate of content partisanship on these platforms. In order to preserve information regarding individuals who share the same content multiple times, we use the total number of shares rather than the unique number of sharers, which has been used in past work (Bakshy, Messing, and Adamic Reference Bakshy, Messing and Adamic2015; Robertson et al. Reference Robertson, Jiang, Joseph, Friedland and Lazer2018).Footnote 10
In addition to this cross-platform comparison, we calculate three more estimates of content partisanship on Facebook using measures of views, clicks, and reactions. While only available for our Facebook dataset, these measures establish the robustness of our main findings and give additional insight into the multifaceted curation process of social media. “Views” refer to the number of times a piece of content appeared within a user’s feed; “clicks” capture the consumption choices of what users click on once they are exposed; and finally, “reactions” indicate users’ public responses of “like,” “love,” “haha,” “wow,” “sorry,” or “anger.”
Statistical and Substantive Evidence of Curation Bubbles
We use a number of strategies to test the empirical implications of our curation bubbles framework. Our tests of H1a and H1b begin with comparing URL- and domain-level audience scores on each platform. If story-level partisan composition follows source-level composition, these distributions will be similar. However, if story-level partisan composition is heterogeneous as we hypothesize, these distributions will differ. Specifically, the distributions of URL-level scores should exhibit more extremity than the distributions of domain-level scores.
On Facebook, we are able to test both productive (H1a) and consumptive (H1b) curation. While the Twitter sharing data only allow us to test H1a, the user-level data from this platform allow for further tests by user partisanship. For example, we can compare the average domain-level audience score of URLs Democrats share with the average URL-level audience score of URLs Democrats share. This allows us to further test H1a by examining possible partisan drivers of our results.
Differences between URL and domain-level distributions are important because we expect that scores estimated at the story level will reflect the substantive partisan appeal of those stories (H2). To test this, we had a team of hand coders evaluate the partisan appeal of a sample of one thousand news stories drawn from our Twitter data. These hand coders, a collection of graduate students and postdocs, were asked to evaluate the appeal of selected stories to Democrats (−1) or to Republicans (1), or both equally (0). The full instructions are included in Appendix B of the Supplementary Material. We sampled URLs for coding with probability proportional to the absolute deviation between the URL-based audience score and domain-based audience score; that is, we oversampled stories in partisan curation bubbles. Each coder evaluated five hundred stories (Krippendorff’s $ \alpha $ = 0.673), and we averaged the results to produce a hand-coded score of partisan appeal on the same scale as the URL-based audience score.
Finally, we expect systematic variation in how well domain estimates capture the substantive partisan appeal of their constituent stories (H3). Specifically, we expect that moderate domain scores are more likely to mischaracterize the partisan appeal of stories (URLs) from that domain. We test this by estimating the extent to which we can statistically distinguish URL-level audience scores from their parent domains’ audience scores. These tests also serve as an important robustness check for H1a and H1b, demonstrating that differences in distributions are not merely due to partisan variation in the volume of URLs associated with different types of sources.
More formally, we test the extent to which individual URLs have partisan audiences that are statistically and substantively distinguishable from the aggregate audience of their associated domain. This assumes that, in the absence of partisan curation bubbles, URL-level estimates of audience composition would be sampled from a normal distribution centered at the domain-level audience score.Footnote 11 We can then test, for each constituent story, whether its observed audience score is statistically distinguishable from a story generated under this null hypothesis.
By way of example, consider individual URLs associated with the New York Times on Twitter. The mean partisanship (and therefore the domain score) for nytimes.com is $ -0.59 $ and the standard deviation is $ 0.59 $ . Under the null hypothesis of no partisan curation bubbles, we would expect most URLs to have audience scores that fall within an interval characterized by the standard error of the mean. For the New York Times, this interval is centered on $ -0.59 $ , with a width determined by our chosen confidence level, the domain-level standard deviation of shares $ (0.59) $ , and the square root of the number of URL shares. One such URL, an editorial criticizing a court decision on voter-registration policies,Footnote 12 was shared 75 times and has a URL score of −0.72. This point estimate is to the left of the domain-wide average, but is not statistically distinguishable from it at the 99% confidence level, as this score falls within the interval $ -0.59\pm 2.57\frac{0.59}{\sqrt{75}}=[-0.76,-0.41] $ . On the other hand, an editorial denouncing the shooting of Representative Steve ScaliseFootnote 13 (175 shares, URL score $ 0.25 $ ) falls well outside of the interval, $ -0.59\pm 2.57\frac{0.59}{\sqrt{175}}=[-0.70,-0.47] $ , allowing us to infer that the audience for this specific story does not reflect a data-generating process in which every New York Times story’s audience is sampled from an identical distribution.
One concern with this approach is that, given the large share volume for many stories, we could reject the null hypothesis of no difference despite trivial substantive differences. To address this, we conduct a further test for substantive significance by widening the confidence intervals by 0.1 in each direction, analogous to the use of two one-sided tests in equivalence testing (Rainey Reference Rainey2014). Testing for differences of at least 0.1 on the [−1,1] scale (i.e., 5% of the full possible range) corresponds to perceptible differences in audience composition on the left and the right. For example, the difference in Twitter-based domain scores between breitbart.com and nationalreview.com is 0.11 and the difference between thenation.com and theatlantic.com is 0.09.
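A minimal sketch of this combined statistical and substantive test, using the New York Times examples discussed above; the function and variable names are illustrative.

```python
# A minimal sketch of the story-level test: is a URL's audience score statistically
# AND substantively (by more than the 0.1 margin) distinguishable from its parent
# domain's average audience score?
import math

def url_differs_from_domain(url_score: float, n_shares: int,
                            domain_mean: float, domain_sd: float,
                            z: float = 2.57, margin: float = 0.1) -> bool:
    half_width = z * domain_sd / math.sqrt(n_shares)   # 99% confidence half-width
    lower = domain_mean - half_width - margin          # widen by the substantive
    upper = domain_mean + half_width + margin          # margin in each direction
    return url_score < lower or url_score > upper

# Worked examples from the text: nytimes.com has mean -0.59 and sd 0.59.
print(url_differs_from_domain(-0.72, 75, -0.59, 0.59))   # False: inside the interval
print(url_differs_from_domain(0.25, 175, -0.59, 0.59))   # True: outside even the widened interval
```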
For each domain, we then calculate the proportion of constituent URLs that have audience scores statistically and substantively distinct from the domain-level average. For example, for the New York Times, only 5% of stories are substantively different from the domain score of –0.59. Per H3, we expect this proportion to be higher for more moderate domains (e.g., those with a score close to 0), and lower for more extreme domains (e.g., those closer to –1 or 1).
RESULTS
We evaluate: whether story-level heterogeneity in partisan appeal is reflected in sharing behavior (H1a) and viewing behavior (H1b); whether partisan appeal is recognizable at the story level (H2); and whether this heterogeneity leads moderate source-level estimates to mischaracterize the partisan appeal of their constituent stories more frequently than extreme source-level estimates (H3).
Testing for Productive and Consumptive Curation
We first show evidence of productive curation (H1a) by plotting the distributions of domain- and URL-level audience scores based on sharing behavior on Twitter and Facebook in Figure 2. Here, we focus on domains within the top quartile by number of political URLs, though in Appendix I of the Supplementary Material, we show that these results are consistent across a variety of thresholds. In both cases, the distributions of URL and domain scores differ from one another. The differences are statistically significant under a Kolmogorov–Smirnov test (Facebook: $ D=0.12 $ , $ p=0.0101 $ ; Twitter: $ D=0.23 $ , $ p=0.0003 $ ). When visually comparing the distributions between platforms, it may seem counterintuitive that the Twitter distributions yield the larger test statistic. The KS statistic, however, captures only the largest absolute deviation between the empirical cumulative distribution functions rather than the full shape of the distributions. Under a distance measure that compares the full distributions (the Wasserstein distance), the larger substantive difference between the distributions on Facebook is apparent ( $ W=0.18 $ for Facebook; $ W=0.08 $ for Twitter). What is relevant here is that in both cases the Kolmogorov–Smirnov test rejects the null hypothesis of no difference between the domain- and URL-level distributions. As we show below, the relative visual similarity between the domain- and URL-level distributions on Twitter masks substantial heterogeneity across domains.
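For reference, both distributional comparisons can be computed with standard SciPy routines, as in the following sketch; the randomly generated arrays are stand-ins for the actual domain- and URL-level audience scores.

```python
# A minimal sketch of the two distributional comparisons reported above.
import numpy as np
from scipy.stats import ks_2samp, wasserstein_distance

rng = np.random.default_rng(0)
domain_scores = rng.uniform(-1, 1, 200)    # stand-in for domain-level audience scores
url_scores = rng.uniform(-1, 1, 5000)      # stand-in for URL-level audience scores

# KS statistic: the largest gap between the two empirical CDFs.
ks_stat, p_value = ks_2samp(domain_scores, url_scores)

# Wasserstein distance: compares the full shape of the two distributions.
w_dist = wasserstein_distance(domain_scores, url_scores)
print(ks_stat, p_value, w_dist)
```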
In our Twitter sample, there is considerably more political news sharing on the political left than on the right (the platform-wide audience score is –0.39), which is at least partially attributable to the partisan composition of Twitter’s user base and our sample of Twitter users (Hughes et al. Reference Hughes, McCabe, Hobbs, Remy, Shah and Lazer2021), while Facebook is more balanced (its corresponding score is –0.14). However, on both platforms—and especially on Facebook—the distributions of audience scores at the URL level exhibit more extremity than they do at the domain level —providing preliminary support for H1a.
To better understand productive curation in the context of a platform’s user base, we further examine variation in domain- and URL-level audience scores by users’ modeled partisanship (Figure 3). For ease of visualization, here we trichotomize our user-level measure of modeled partisanship into likely Democrats ( $ p(Dem)>0.65 $ ), unlikely Democrats ( $ p(Dem)<0.35 $ ), and users for whom likely partisanship is uncertain ( $ 0.35\le p(Dem)\le 0.65 $ ). The figure shows that domain-level scores mischaracterize the sharing profiles of a significant number of users in all three groups,Footnote 14 but especially those who are unlikely to be Democrats. Under a domain measure, 28% of unlikely Democrats’ political information sharing is, on average, to the right of 0; and 1.5% is to the right of 0.5. Using a URL measure, these respective percentages are 39.1% and 7.3%. This indicates that there are a substantial number of unlikely Democrats who tend to share political information from sources with generally left-leaning audiences when specific stories from those sources are disproportionately shared by right-leaning users, further supporting H1a.
We illustrate this dynamic in Figure 4, which plots individual URLs’ audience scores and share volume on both Twitter and Facebook for the New York Times, the Wall Street Journal, Mediaite, Fox News, RT, and Reason. These cases are illustrative in that they vary in size and overall audience partisanship. The domain-level audience score for each is shown with a vertical dashed line; individual stories with audience scores statistically and substantively distinguishable from the domain score at the 99% level are shown with greater opacity, and in blue (more Democratic than expected under the null) or red (less Democratic than expected under the null), relative to those that are within this uncertainty interval. Stories that could be statistically significantly distinguished from the source under a null hypothesis of no difference, but whose difference did not meet our threshold of substantive significance, are shown in yellow. This figure shows that, even for sources with more extreme overall audience scores and a relatively lower share of stories in partisan curation bubbles, atypical partisan audiences do often find specific information from those sources to circulate at high volume, consistent with H1a.
This figure also previews the dynamic we will systematically test in H3. For sources with neutral domain scores, story-level fluctuations between different partisan audiences are in some cases the norm—especially for the stories that circulate at high volume. The Wall Street Journal’s well-known divide between its “hard news” and editorial content, discussed above, is further apparent here—as is Mediaite’s idiosyncratic audience.Footnote 15 While we further test H3 across all sources, these results provide preliminary evidence that domain-level scores near zero cannot be straightforwardly interpreted as indicating reliably neutral content. Moreover, it is precisely the domains with the most neutral audience scores that exhibit the most within-domain partisan heterogeneity. These domains are not garnering neutral audience scores solely by producing content that is consistently shared by Democrats and Republicans at equal rates; they often produce content that is alternately shared by either Democrats or Republicans disproportionately.
We next test whether the heterogeneity we observe in productive curation (H1a) extends to consumptive curation (H1b). Using Facebook’s FORT URLs dataset, we recalculate domain- and URL-level scores using the consumptive measures of clicks, reactions, and views, along with the productive measure of shares for comparison. The results are shown in Figure 5. While we find that public-facing behaviors (shares and reactions) exhibit more extremity than private behaviors (clicks and especially views), we again find that the domain-based approach consistently understates the partisan extremity of engagement, regardless of how engagement is measured. This supports H1b, showing that story-level heterogeneity in sharing behavior (productive curation) is carried through to story-level heterogeneity in viewing behavior (consumptive curation).
Substantive Differences in Partisan Appeal
We argue that this heterogeneity is attributable to substantive differences in story-level partisan appeal (H2). To test this, we use the subset of one thousand URLs for which we have hand coded partisan appeal. Specifically, we test whether variation in humans’ assessments of whether Democrats or Republicans would view a story more favorably is better explained by URL-level or domain-level audience scores. We find there is a strong correlation ( $ r=0.75 $ ) between the URL audience scores and the evaluated partisan appeal, and that this relationship is much stronger at the URL level than at the domain level ( $ r=0.55 $ ). This is further illustrated in Figure 6, which plots human-evaluated partisan appeal against URL-level audience scores, with domain-level audience scores reflected in the color gradient. When partisan appeal is regressed against URL and domain scores together (Table E.1 in the Supplementary Material), an F-test supports the inclusion of domain scores as improving model fit ( $ F=41.01 $ ), but the substantive improvement is minimal, shifting the adjusted $ {R}^2 $ from 0.554 to 0.571. Put simply, we find support for H2: variation in URL-based audience scores reflects variation in the substantive partisan appeal of a story, and this is not a product of source cues.
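A minimal sketch of this model comparison using statsmodels, assuming a hypothetical DataFrame of the one thousand hand-coded URLs with columns for the coded appeal and the two audience scores:

```python
# A minimal sketch of the nested model comparison, assuming a hypothetical
# DataFrame `coded` with columns: appeal (hand-coded, -1 to 1), url_score,
# and domain_score.
import statsmodels.formula.api as smf

restricted = smf.ols("appeal ~ url_score", data=coded).fit()
full = smf.ols("appeal ~ url_score + domain_score", data=coded).fit()

# F-test for whether adding domain scores improves fit beyond URL scores alone.
f_stat, p_value, df_diff = full.compare_f_test(restricted)
print(f_stat, restricted.rsquared_adj, full.rsquared_adj)
```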
Differential Implications for Domain-Level Estimates
Finally, we test for systematic differences in the extent to which curation bubbles distort source-level estimates of partisan audiences across the [−1,1] scale (H3). Figure 7 plots, for our Twitter data, the proportion of each domain’s stories whose audience scores are statistically and substantively distinguishable from the domain-level audience score (i.e., the proportion of stories that differ from the domain-level average by more than 0.1 after accounting for sampling uncertainty) against the domain-level audience score, with illustrative sources labeled. We supplement this analysis in Figure 8, showing the same dynamics across both productive and consumptive measures using our Facebook data.
While Facebook’s larger user base and higher volume narrow confidence intervals such that larger proportions of stories are statistically distinguishable from their domain-level averages in general, both platforms show a clear trend. Although every domain at least occasionally produces stories that circulate among atypical partisan audiences, this is significantly more common as domains’ audience scores approach zero (see OLS regressions in Appendix H of the Supplementary Material). That is, the more neutral the domain-level audience score, the more frequently that domain’s constituent stories have partisan audience scores that are substantively different than the domain’s audience as a whole. This supports H3, showing that moderate domain scores tend to mischaracterize the partisan appeal of their constituent stories at the highest rates.
Results for specific domains, shown on the plot, reflect qualitative understandings of those domains’ audiences as well. For example, hyper-partisan outlets such as Daily Kos and the Huffington Post on the left, and Breitbart and Fox News on the right, have extreme domain scores and fewer stories that circulate among atypical partisan audiences. By contrast, the Wall Street Journal’s domain-level audience score of –0.34 is frequently ill-suited to describe individual stories the newspaper publishes, as previously indicated in Table 1. Furthermore, the domains with audience scores that least frequently capture the partisan appeal of their constituent stories are those with audience scores near zero, such as Mediaite or the New York Post. It is also worth noting that domains with audience scores near zero and relatively less within-domain heterogeneity are often outlets that are ideological in ways that do not neatly reflect partisanship in the US, such as the Russian state-sponsored RT.
DISCUSSION
We find evidence of partisan curation bubbles across our analyses, as users share and consume information with consistent partisan appeal from a variety of sources. These partisan curation bubbles frequently lead to story-level heterogeneity within sources for both productive (H1a) and consumptive (H1b) curation. This audience heterogeneity likely reflects heterogeneity in within-source partisan appeal, as audience scores estimated at the story level do reflect the partisan appeal of content (H2). Furthermore, we find systematic variation in this heterogeneity—with more moderate estimates of domain-level audience partisanship more frequently mischaracterizing the partisan valence of individual stories (H3). This suggests that relatively moderate domain-level scores are often the result of different stories circulating among different partisan audiences, rather than every story reaching a consistently balanced audience.
It is likely that elements of these curation processes predate the internet and social media. For example, opinion leaders who subscribed to a given newspaper may have tended to read and talk about particular stories that matched their prior political preferences. However, observing this process prior to the internet would have required an impossible scale of instrumentation; in a sense, the internet has merely made the process visible. For example, recent work examining user behavior within Google Search indicates that even though users’ search results do not systematically vary by partisanship, their choice of which search results to click on does (Robertson et al. 2023). Future work could probe the theoretical mechanisms underlying these findings, for example by experimentally manipulating the pairing of politically (in)congruent stories with politically (in)congruent sources to directly test the extent to which users are willing to share politically favorable information from ideologically distant sources.
However, we argue that the internet has dramatically changed the structure of supply and demand, making networked processes of curation far more important. The fundamental logic of the internet is competition for attention at the level of individual stories, and the ratio of available information to human attention has increased by many orders of magnitude. This contrasts with pre-internet competition at the outlet level, in which consumers chose stations to watch and newspapers to subscribe to. The networked curation processes of social media allow individuals to delegate the task of navigating a functionally infinite amount of information to other users who regularly share information that appeals to their identities and interests. One natural result of this process is partisan curation bubbles.
Purely with respect to measurement, our findings suggest that source-level measures of audience partisanship should be used with caution, as they risk overestimating the partisan diversity of information consumption. All but the most extreme sources have a meaningful amount of partisan heterogeneity at the story level, and for some sources this is the rule rather than the exception. There will be times when source-level aggregation is theoretically warranted or practically necessary. In settings outside of social media, where information consumption is not characterized by networked curation, source- and story-level estimates may generate similar results. Our point is to emphasize that source-level aggregation is a measurement choice that must be considered on a case-by-case basis.
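As a toy numerical illustration of this risk (all values are invented), a source whose individual stories circulate among strongly partisan audiences can still carry a near-zero source-level score:

```python
# Invented numbers: individual stories reach strongly partisan audiences,
# yet averaging to the source level suggests a "moderate" audience.
import numpy as np

story_scores = np.array([0.80, -0.75, 0.85, -0.80, 0.70, -0.82])  # [-1, 1] scale

print(f"source-level score: {story_scores.mean():+.2f}")               # ~ -0.00
print(f"mean absolute story score: {np.abs(story_scores).mean():.2f}")  # ~ 0.79
```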
More broadly, these findings shed new light on the macro structure of information consumption on feed-based social media. While we empirically demonstrate that information consumption on these platforms is more politically homogeneous than prior empirical accounts suggest, we view the networked curation processes that produce these results as a feature of democratic participation, even as others might view the resulting polarized consumption as a normative concern. Social media is, at its core, social, allowing users to deploy information to perform their identities and advance their interests in the context of democratic participation. To the extent that these identities and interests diverge—particularly among the most politically engaged, who are the most likely to perform opinion-leading functions on social media (Hughes 2019)—so too will the information that circulates among different audiences. While much of the literature takes polarized information consumption as distressing for democracy, it is not obvious that this, in and of itself, is a problem to solve (Kreiss and McGregor 2023). In this sense, these findings underscore the long-standing trade-offs between exposure to opposing views and democratic participation (Mutz 2006; Stroud 2011)—with different sites at which individuals express themselves and exchange their views being better suited for one or the other.
While the analyses here focus on audience partisanship, our theoretical framework problematizes source-level analyses of information consumption on social media more generally. For example, with respect to the study of political misinformation, preliminary evidence indicates that users interested in promoting false or misleading narratives often strategically repurpose factually true information from reliable sources in order to do so (Goel et al. 2024). Domain-level measures of political information cannot detect this behavior, but it follows naturally from individuals engaging with and using information on social media to perform their identities and advance their interests. Furthermore, individuals who report low levels of trust in mainstream sources on surveys may base these evaluations more on the sources’ reputations than on their specific interactions with information those sources produce (Peterson and Kagalwala 2021), and likely still recognize that such sources are perceived as credible by others (see also Pennycook and Rand 2019). Despite their stated distrust of mainstream sources overall, these individuals may nevertheless find specific information from those sources useful when it suits their purposes (Baum and Groeling 2009). Accounting for networked curation is crucial for aligning theory and measurement on large-scale platforms where such affordances are available.
Finally, it is important to note that socio-technical systems are elastic and that different design choices may lead to different outcomes (Bail 2021). For example, newer platforms such as TikTok have de-emphasized the curation influence of followed accounts in favor of relying more directly on the estimated relevance of specific pieces of content. This variation in platform features and affordances suggests a promising line of future research examining the dynamics and democratic outcomes of networked curation across different platforms. We believe that content choice for information consumers is permanently expanded relative to the twentieth century. While individual curation bubbles may pop, the process of networked curation connecting people to content they want to see from a vast set of choices is a permanent feature of the information landscape.
SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit https://doi.org/10.1017/S0003055424000984.
DATA AVAILABILITY STATEMENT
The raw data underlying this article cannot be shared due to privacy concerns arising from matching data to administrative records, data use agreements, and platforms’ terms of service. Research documentation (including all code) and secondary Twitter data that preserves user anonymity are openly available at the American Political Science Review Dataverse: https://doi.org/10.7910/DVN/1ONKDX.
ACKNOWLEDGMENTS
The authors would like to thank Taylor Carlson, Bruce Desmarais, Shannon McGregor, and C. Daniel Myers; participants at the 2021 meetings of the American Political Science Association, Midwest Political Science Association, and Society for Political Methodology; participants at the Duke Behavior and Identities Workshop; and three anonymous reviewers for helpful feedback on earlier versions of this manuscript. The authors would also like to thank Ronald Robertson for making reference data available and John Harrington for research assistance.
FUNDING STATEMENT
S.M. was supported by the John S. and James L. Knight Foundation through a grant to the Institute for Data, Democracy & Politics at the George Washington University. S.C. is supported by a Bloomberg Data Science Ph.D. Fellowship. D.L. acknowledges support from the William & Flora Hewlett Foundation and the Volkswagen Foundation.
CONFLICT OF INTEREST
The authors declare no ethical issues or conflicts of interest in this research.
ETHICAL STANDARDS
Facebook data in this study were obtained from Meta, as part of Facebook Open Research & Transparency (FORT), an initiative to facilitate the study of social media’s impact on society. Researchers seeking permission to use the FORT platform must (1) apply to become an approved partner and (2) sign the Research Data Agreement (RDA), a publicly available legal agreement. The RDA prohibits sharing Facebook data with any third party. Researchers may request access to Facebook data at https://socialscience.one/rfps. Collection of Twitter data and linkage to administrative records was approved by the Institutional Review Board at Northeastern University (#17-12-13). The authors affirm that this article adheres to the principles concerning research with human participants laid out in APSA’s Principles and Guidance on Human Subject Research (2020).