Algorithmic recommender systems underlie much of the technology we interface with today, influencing everything from what products we buy, to what news sources or political viewpoints we’re exposed to, to how much income we earn from content that we create. In the context of social media, recent attention has turned to how these algorithmic systems can increase the reach of certain types of content relative to the reach they would have gotten under some other neutral baseline—a phenomenon called “algorithmic amplification.” While the concept of algorithmic amplification could be applied to any type of content—for example, we could study the excess reach of cat memes on a platform—by and large, existing efforts to study, measure, or regulate algorithmic amplification have sprung from concern that algorithmic amplification acts as a vector for specific societal problems or harms.
One societal problem that has been central to the discourse and scholarship around algorithmic amplification is the capacity of algorithmic systems to expose users to overtly harmful content, such as extremist or radicalizing material (Whittaker et al. 2021) and misinformation (Fernández, Bellogín, and Cantador, n.d.). For example, in 2021 Reps. Anna Eshoo (D-CA) and Tom Malinowski (D-NJ) introduced the Protecting Americans from Dangerous Algorithms Act, which intended to hold companies “liable if their algorithms amplify misinformation that leads to offline violence,” according to a press release from Representative Eshoo’s office (“Reps. Eshoo and Malinowski Reintroduce Bill to Hold Tech Platforms Accountable for Algorithmic Promotion of Extremism” 2021). More recently, the Supreme Court heard a pair of cases, Gonzalez v. Google and Twitter v. Taamneh, which considered whether social media companies could be held liable for allowing extremist content from ISIS on their platforms (Granick 2023).
When it comes to overtly harmful content, amplification is undesirable in an absolute sense—any amplification is unwanted. By contrast, the second concern about algorithmic amplification is relative; it hinges on disproportionalities or unfairness in amplification, especially as it relates to disparities in amplification among groups. In this case, views on which amplification is justified can vary; some argue that amplification is not an issue if it aligns with user interests, while others believe that amplification should act as a positive social force to amplify traditionally marginalized voices. The interest in disproportionalities in amplification is founded on the concern that social media platforms unfairly allocate more influence to some types of people than others or tip the scales of public opinion by amplifying certain viewpoints over others, for example the U.S. political left over the political right (Frenkel 2018). Additionally, as people increasingly gain economic opportunities via their social media presence, disparities in algorithmic amplification could unfairly allocate the economic spoils that follow from having a robust social media presence to already privileged groups.
Although the topic of algorithmic amplification has been prevalent in the public discourse, the concept remains murky. In particular, although algorithmic amplification as an abstract concept is clear, operationalizing this definition to detect and measure algorithmic amplification remains challenging. In our view, this largely stems both from imprecision about which models and systems are included in “the algorithm” and a lack of clarity about an appropriate “neutral baseline” against which to measure amplification. In this essay, we will disentangle how various components of a typical social media platform beyond what is commonly considered “the algorithm” can contribute to amplification and simultaneously confound its measurement. In doing so, we enumerate the various components that go into producing curated feeds of content on social media beyond just the ranking algorithm itself, including content moderation, account recommendation, search, business rules, auxiliary models, and user behavior. We then challenge the notion of a “neutral baseline” and illustrate how, in practice, the most common choice of baseline fundamentally depends on the state of some components of the system. By establishing a baseline dependent on the state of any individual system component, we assume away the effects of past bias and amplification that brought the system component to its current state. This leads to unsatisfying and potentially misleading conclusions about whether and to what extent certain content is amplified. We conclude by returning to the underlying concerns spurring the discussion of algorithmic amplification and propose concepts and measures we believe may more effectively address those concerns because they do not rely on the assumption of a neutral baseline for comparison.
In order to study algorithmic amplification, we must first define it. We adopt the working definition: “The extent to which the algorithm gives certain content greater reach than it would have gotten with some other neutral baseline algorithm.” Operationalizing this definition requires us to further define “the algorithm” and the neutral baseline against which it is measured. We take these in turn.
Defining “The Algorithm”
Algorithmic amplification is ultimately interested in the different levels of attention and exposure content receives. Thus, we begin our tour of “the algorithm” by defining the surfaces through which the algorithm delivers content to users and the different technical components that drive each of the surfaces. The pieces we define here are not specific to a particular social media platform but are instead amalgamations of features we have observed or that have been publicly documented across the industry. This general framework can be compared to the more technical documentation recently released by Twitter regarding the structure of their system.
Surfaces of Exposure
When envisioning a typical social media system, we often first think of the “main content feed”—for example, Twitter’s For You timeline, Facebook’s Feed, TikTok’s For You Page. This is usually the first surface that users encounter when they log in, and it contains content curated to their tastes. This feed can include content from accounts that the user has specifically opted into (which we will call “followed” or “in-network” accounts), as well as content from accounts that the user has not opted to follow (which we will call “unfollowed” or “out-of-network” accounts). The policies as to what kind of out-of-network content can be included in a user’s feed vary by platform, with some platforms displaying mostly “in-network” content and others relying almost entirely on machine learning algorithms to infer what—from the universe of all content—a user would most like to see (Narayanan 2023).
In addition to the main feed, users have other ways of finding content. Most platforms include some space primarily intended for exploration of out-of-network content. For example, Twitter and Instagram both have an “Explore” page. These pages contain trending content, but are often still curated to a user’s inferred preferences. Another path to a particular account’s content is through the search bar. By searching for a particular topic or a particular account name, users can discover content relevant to their search query; Twitter and TikTok both have examples of this functionality. Additionally, content creators often collect the links to their social media pages on a single page—for example, linktr.ee provides this capability, and external websites often link to specific authors or content when citing sources. As such, it is possible for a user to find a creator’s content through a direct profile view or directly landing on a particular piece of content as well. Finally, there are some app-specific features, such as notifications, curated lists and topics, hashtags, etc. that give users pathways to different kinds of content.
Primary Ranking Models
The surfaces enumerated above are all driven to some extent by recommendation and ranking models—machine learning models that decide how relevant a particular piece of content is to a user. Such a model takes in a set of candidates and scores them according to how likely each piece of content is to be relevant to the user. The candidate pieces of content are then sorted by their score and displayed. Because it would take an infeasible amount of computation to score every piece of content for every user, the ranking model is usually fed a smaller set of candidates by a computationally cheaper model or set of heuristics. This first step is called the candidate generation step. When a recommendation system works in this way, it is called a two-stage recommender system (Wang and Joachims 2023). In practice, the rules and criteria used to generate the set of candidates can have a great impact on what content is ultimately seen by the user (Bower et al. 2022). Depending on the design of the system, different surfaces may each have their own two-stage recommender, trained for that surface’s objective. For example, an Explore page likely has a different model driving it than the main content feed. Whether models are shared across surfaces or trained separately depends on what each surface is optimized for and how much model reuse is feasible; it is a technical design choice for the platform.
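To make the two-stage pattern concrete, here is a minimal sketch in Python. The data structures, the recency-based candidate generator, and the topic-overlap relevance score are all invented stand-ins for the cheaper retrieval heuristics and learned ranking models a real platform would use:

```python
import random

random.seed(0)

def generate_candidates(all_posts, user, k=100):
    # Stage 1: cheap candidate generation -- here, simply the k most recent
    # posts (a stand-in for cheaper retrieval models or heuristics).
    return sorted(all_posts, key=lambda p: p["timestamp"], reverse=True)[:k]

def relevance_score(post, user):
    # Stage 2 stand-in: a "ranking model" scoring relevance; here, the
    # overlap between the post's topics and the user's inferred interests.
    return len(set(post["topics"]) & set(user["interests"]))

def rank_feed(all_posts, user, feed_size=10):
    # The ranking model only ever sees the candidate set, so candidate
    # generation bounds what can possibly appear in the feed.
    candidates = generate_candidates(all_posts, user)
    candidates.sort(key=lambda p: relevance_score(p, user), reverse=True)
    return candidates[:feed_size]

user = {"interests": ["cats", "politics"]}
posts = [{"timestamp": t,
          "topics": random.sample(["cats", "politics", "sports", "music"], 2)}
         for t in range(1000)]
feed = rank_feed(posts, user)
```

Note that no post older than the 100 most recent can reach the feed regardless of its relevance score, illustrating how candidate generation alone shapes exposure.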
Peripheral Ranking Models
In addition to the models deciding how to order content on different surfaces of exposure, there are other models that drive various interactions on the platform. One prime example is the account recommendation model, which suggests other users that you may be interested in following (Twitter Account Suggestion Help Page; Facebook People You May Know Help Page). Another is the search ranking algorithm, which will rank both content and user accounts based on their relevance to a user’s query. Your past activity, inferred interests, and existing connections can all factor into what these models rank highly. Generally speaking, a peripheral model is any model that is ranking content of some kind that does not appear on one of the main exposure surfaces. This could include ad ranking, trending news, and many other types of content.
On top of models that do ranking, there are also many machine learning models that compute scores or vector representations for content or accounts. The most prominent of these are related to content moderation, or the process of monitoring potentially harmful content on platforms. This monitoring can include both content that explicitly violates platform terms of service or rules and content that is not clearly violative of terms of service but does detract from a healthy platform, often referred to as “toxic.”
Models are used at many different stages of content moderation. Some are used to decide what content to refer to human reviewers (Lai et al. 2022). Other models assign a score to determine how toxic or marginally abusive a particular piece of content is, and this score may be used in determining overall ranking scores (Bandy and Lazovich 2022; Yee et al. 2022). In other cases, users might even be prompted to reconsider their content before posting it based on a model’s toxicity score (Katsaros et al. 2021).
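One way a toxicity score might feed into ranking can be sketched as follows. The blending rule, weight, and removal threshold here are hypothetical illustrations, not any platform’s actual policy:

```python
def blended_score(relevance, toxicity, toxicity_weight=2.0, removal_threshold=0.9):
    # Hypothetical policy: content scored above a removal threshold is
    # filtered outright; below it, toxicity down-weights the ranking score.
    if toxicity >= removal_threshold:
        return None  # excluded from ranking entirely
    return relevance - toxicity_weight * toxicity

posts = [("post_a", 0.90, 0.05),   # (id, relevance, toxicity score)
         ("post_b", 0.70, 0.02),
         ("post_c", 0.95, 0.95)]   # highly relevant but over the threshold

scored = [(pid, blended_score(rel, tox)) for pid, rel, tox in posts]
ranked = sorted([(pid, s) for pid, s in scored if s is not None],
                key=lambda x: x[1], reverse=True)
```

Even this toy version shows how a moderation model’s output directly alters what is shown and in what order, which is why such auxiliary scores matter for amplification.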
In addition to content moderation scoring, there are often models which seek to extract representations of items, like content or accounts, that can be fed into downstream models. Some approaches include extracting joint “embeddings”—numerical representations that capture semantic similarity—for accounts, advertisements, and posts (El-Kishky et al. 2022). Others label content as pertaining to different topics or interests that are relevant to a user.
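To illustrate what such embeddings enable, here is a minimal sketch using hand-picked three-dimensional vectors and cosine similarity to find the account most similar to a given one. The account names and vectors are invented; real systems learn embeddings with hundreds of dimensions from interaction data:

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two vectors: near 1 for semantically
    # similar items, near 0 for unrelated ones.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings for three accounts.
embeddings = {
    "sports_fan": [0.9, 0.1, 0.0],
    "sports_news": [0.8, 0.2, 0.1],
    "cooking_blog": [0.0, 0.1, 0.9],
}

def most_similar(name):
    # Nearest neighbor by cosine similarity -- the basic operation behind
    # "similar account" and "similar content" features downstream.
    others = [k for k in embeddings if k != name]
    return max(others, key=lambda k: cosine_similarity(embeddings[name], embeddings[k]))
```

Downstream models can consume these vectors as features, which is one way a single embedding model’s biases can propagate through many parts of the system.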
In this work, we refer to these models as auxiliary models. While they are usually not directly responsible for ranking or suggesting content, they do greatly affect what content is seen, removed, or amplified on the platform. These are also auxiliary in the sense that the scores or vectors these models produce are often used as inputs to other models.
Manual Components or “Business Logic”
In addition to all of the data-driven pieces of the system, there are often also manually coded policy and business decisions that influence and moderate the display of content. For example, “brand safety” policies might be implemented to ensure that certain types of content are not shown next to particular brands’ advertisements. Other heuristics might be used to ensure a user’s feed is not too repetitive, such as limiting the number of times ads or posts from the same user can be shown consecutively. Other business rules might decide to highlight a particular new feature by displaying it prominently in the content feed. While the existence of such policies is often opaque to the end user, these too have a profound effect on what content gets shown and where.
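A repetition-limiting heuristic of the kind described above might look like the following toy post-processing pass. The rule and its parameters are invented for illustration; production systems layer many such rules with varying designs:

```python
def enforce_author_diversity(ranked_posts, max_consecutive=2):
    # Re-orders a ranked feed so no author appears more than max_consecutive
    # times in a row; demoted posts fall to the end of the feed. (A sketch:
    # deferred posts could in principle still cluster at the tail.)
    result, deferred = [], []
    for post in ranked_posts:
        run = 0
        for prev in reversed(result):
            if prev["author"] == post["author"]:
                run += 1
            else:
                break
        (deferred if run >= max_consecutive else result).append(post)
    return result + deferred

feed = [{"author": "alice", "id": i} for i in range(4)] + [{"author": "bob", "id": 4}]
diversified = enforce_author_diversity(feed)
```

Even though no model is involved, this hand-coded rule changes which content appears near the top of the feed, and thus contributes to amplification.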
How Everything Is Connected
When it comes to algorithmic amplification, which part of this complex system is “the algorithm”? Typically, the content ranking model underlying the main feed is thought of as “the algorithm.” However, the content that is ultimately shown to users is impacted by every part of this system, either directly or indirectly. Figure 1 presents a bird’s-eye view of the interconnected “model spaghetti monster” that is a social media platform. It shows the different surfaces of exposure and the various models that drive them, along with other encapsulations of the state of the system, such as the follow network. Gray arrows show how various models and algorithms feed into one another to ultimately deliver content to a user. Colored arrows show the pathways by which a user’s interactions with the content they are shown impact system components, either by serving as new training data for an underlying model or by directly altering the state of the follow network. One obvious pathway by which all models ultimately contribute to the main content ranking model is the follow network: every system component plays some role in displaying content to users, and the content a user is shown (or not shown) shapes which accounts they choose to follow. The follow network then serves as an input to the main content ranking models, creating an indirect effect of every system component on what appears in the main feed. Because of this interconnectedness, “the algorithm” is, in fact, the entire system.
Figure 1: A schematic diagram of a typical social media platform.
Defining the baseline
Per our working definition, amplification is defined relative to some “neutral baseline” algorithm. More broadly, this baseline could be defined as exposure that would have been obtained without the existence of the platform at all, though in practice, the neutral baseline is typically thought of as the outcome that would have occurred if some other, more “neutral” algorithm had been used on the platform as it otherwise exists. What, then, is a reasonable neutral baseline against which to compare?
Recently proposed legislation also points toward some candidates for what could be considered a “neutral baseline.” While the previously mentioned Protecting Americans from Dangerous Algorithms Act did not venture into technical definitions of algorithmic amplification, it did specify what types of systems the proposed bill would and would not apply to, implying by exclusion what types of algorithms its authors might consider neutral. Specifically, the bill designated that the proposed law would apply to any interactive computer service that “used an algorithm, model, or other computational process to rank, order, promote, recommend, amplify, or similarly alter the delivery or display of information” and excluded systems that are easily understood by humans, such as displaying content in reverse chronological order, in order of overall popularity, or cases in which a user explicitly seeks out the content. While this does not specify how amplification ought to be measured, by stating what types of rank ordering are ineligible, it provides useful guidance on what the authors might consider a neutral baseline against which to measure excess reach.
Of those options, the most commonly proposed baseline algorithm is “reverse chron,” i.e., an algorithm that displays in-network posts in reverse chronological order. This baseline was recently used by Huszár et al. (2021) to compare amplification of different elected officials on Twitter in what is—in our opinion—the best example to date of measuring algorithmic amplification on a real social media platform. In light of this, going forward we focus on analyzing amplification with reverse chron as the baseline, and the implications of this choice of baseline for our measurement of algorithmic amplification. In mathematical terms, our working definition of amplification then becomes

amplification(x) = (impressions on content from x under “the algorithm”) / (impressions on content from x under reverse chron),

where x defines the group of users or content type for which we are calculating algorithmic amplification and an impression is the event of a user viewing a piece of content. In practice, this value could be calculated counterfactually by keeping track of how many impressions would have gone to x under each algorithm, regardless of which was actually used. Or, it could be calculated by comparing the number of impressions to x by users using the current system to the number using reverse chron, normalizing appropriately for the relative number of users in each condition.
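Under the A/B-test style measurement just described, the calculation reduces to a ratio of per-user impression rates. A minimal sketch, with hypothetical counts:

```python
def amplification(impressions_algo, users_algo, impressions_chron, users_chron):
    # Per-user impression rate on content from group x in the algorithmic
    # condition, divided by the per-user rate in the reverse-chron condition.
    # Values above 1 indicate amplification; below 1, de-amplification.
    rate_algo = impressions_algo / users_algo
    rate_chron = impressions_chron / users_chron
    return rate_algo / rate_chron

# Hypothetical A/B test: 2,000 impressions on x's content across 100 users
# served by "the algorithm" vs. 500 impressions across 50 users on reverse chron.
amp = amplification(2000, 100, 500, 50)  # 20 per user vs. 10 per user -> 2.0
```

The normalization by users in each condition is what makes the two arms comparable when the experiment groups differ in size.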
The implicit assumption in adopting reverse chron as a neutral baseline is that the accounts that a user follows are a neutral representation of the content that user wants to see. But, is this necessarily the case? Suppose the peripheral account recommendation model preferentially recommends accounts of a certain type. Or, similarly, suppose that the search function preferentially places certain types of accounts near the top of the returned results, making it easier for users to find some accounts than others. This amplification of the “preferred” accounts by search or account recommendation would cause those accounts to be over-represented among the accounts users follow relative to other accounts that received no such amplification from peripheral models. Thus any “algorithmic amplification” that would be detected by comparing the exposure the preferred accounts received to the exposure those accounts would have gotten under reverse chron would only account for the marginal amplification due to the ranking model alone. The amplification due to the biased search or account recommendation would be subsumed by the “neutral baseline” and subtracted away.
Figure 2: Example of reverse chronological feed (left) vs. ranked feed (right).
Extending this argument, adopting a reverse chronological baseline bakes in the assumption that the past behavior of the system, including the ranking model, was neutral. For example, suppose the system has historically preferentially placed content produced by certain accounts near the top of the ranking, allowing those accounts more opportunities to amass followers. If we adopt a baseline built upon the follow graph, we bake the historical advantage enjoyed by some accounts into the calculation, thus under-estimating the advantage they currently receive relative to what we would have calculated if the past had truly been neutral. By establishing a baseline dependent on the state of any individual system component, we assume away the effects of past bias and amplification that brought the system component to its current state.
An illustrative simulation
To concretize the dynamics described above, we present an illustrative simulation. First, suppose we have a population of n = 100 accounts, all of which are “identical” in the sense that their interests are all drawn from the same distribution. That is, at each time point t, each account i draws a value v_it ~ N(0, 1), which represents the interests of user i at time t, and broadcasts this value in a post. Under our model, posts that are most similar to a user’s posted value at each time point are the most relevant to that user. Specifically, for d_ijt = |v_it − v_jt|, the absolute difference between the value user i posted and the value user j posted at time t, we define the most relevant posts for user i at time t to be those for which d_ijt is smallest.
In this simulation, we are interested in calculating the extent of algorithmic amplification of a set of m users that “the algorithm” treats preferentially. We select the users to be treated preferentially by the system to be the first m users—the ordering is arbitrary, so this is essentially a random selection.
In this case, content is ranked as follows. For each user i, we sort the values d_ijt − b·I_j from smallest to largest and display the first K. Here, b is the amount by which the advantaged accounts are artificially boosted, and I_j is an indicator of whether the jth account is one of the accounts that gets the unfair benefit. In a nutshell, under our biased ranking algorithm, each user is shown the accounts that are most relevant to them, with the exception that some accounts get a boost in the ranking that is unrelated to the relevance of the content they produced to the other users. Note that when b = 0, the algorithm does not preferentially amplify any accounts. At each time point, each user then elects to follow each of the accounts it was shown with probability p = 0.05. After an initial period of 50 iterations, we then calculate algorithmic amplification over the subsequent five iterations by dividing the number of times the advantaged accounts appeared on the generated timelines using “the algorithm” by the number of times the advantaged accounts appeared in timelines using reverse chron. Referring back to our mathematical definition of algorithmic amplification, x in that equation refers to the advantaged users in this simulation.
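A compact implementation of this simulation might look like the following sketch. It follows the setup above, with some choices of our own: the feed size parameter K is set to 5, and since all posts in this toy model share a timestamp, “reverse chron” among simultaneous posts is approximated by a random ordering of followed accounts:

```python
import random

def run_simulation(n=100, m=10, K=5, b_history=1.0, b_measure=1.0,
                   p_follow=0.05, warmup=50, measure=5, seed=0):
    rng = random.Random(seed)
    follows = [set() for _ in range(n)]   # follows[i]: accounts user i follows
    boosted = set(range(m))               # the m advantaged accounts

    def ranked_feed(i, values, b):
        # "The algorithm": sort by d_ijt - b*I_j and show the first K.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: abs(values[i] - values[j]) - b * (j in boosted))
        return others[:K]

    def chron_feed(i):
        # Reverse chron over followed accounts; all posts here are
        # simultaneous, so ordering among them is approximated as random.
        followed = list(follows[i])
        rng.shuffle(followed)
        return followed[:K]

    # Warm-up period: users follow shown accounts with probability p_follow.
    for _ in range(warmup):
        values = [rng.gauss(0, 1) for _ in range(n)]
        for i in range(n):
            for j in ranked_feed(i, values, b_history):
                if rng.random() < p_follow:
                    follows[i].add(j)

    # Measurement period: impressions on boosted accounts under each feed.
    algo_hits = chron_hits = 0
    for _ in range(measure):
        values = [rng.gauss(0, 1) for _ in range(n)]
        for i in range(n):
            algo_hits += sum(j in boosted for j in ranked_feed(i, values, b_measure))
            chron_hits += sum(j in boosted for j in chron_feed(i))
    return algo_hits / chron_hits

# Biased warm-up (b = 1), then measure amplification with the same bias.
amp = run_simulation(b_history=1.0, b_measure=1.0)
```

Varying `b_history` and `b_measure` separately reproduces the two scenarios discussed below: a biased past baked into the follow network versus a hypothetical neutral past.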
We study this under two scenarios. In the first, we set the bias to be b = 1 during the initial 50 iterations. Then, we calculate algorithmic amplification for several values of b. This is shown by the dark blue line in the left figure. For convenience, we also provide a comparison to what the calculation of algorithmic amplification would have been if in the past, the algorithm had not shown preferential treatment to the advantaged accounts, i.e. what we would have calculated if our “neutral baseline” were, in fact, “neutral.” This is shown in light blue.
Figure 3: Simulations of biased past with b=1 (left) and constant b (right).
Two things stand out in this figure. First, for small values of b, we estimate amplification with a value less than one, meaning de-amplification. This measurement would imply that the preferred accounts are actually getting less exposure than they should under the chosen neutral baseline. Not only would we conclude that the preferred accounts did not get amplified by the system, we may infer that they are being unfairly treated. Even for very large values of b (i.e. when the preferred accounts have a very large advantage), we still estimate amplification of those accounts as barely greater than one, indicating negligible amounts of amplification of those accounts. The second notable facet of this figure is that as the preferred accounts’ advantage grows (i.e. as b gets larger), so does the gap between the true level of amplification under an unbiased past and what we actually measure: for the largest amounts of algorithmic amplification, we underestimate it the most.
In the second scenario, we allow “the algorithm” to maintain the same level of bias in the initial 50 iterations as it has in the subsequent iterations during which amplification is calculated. The results of this simulation are shown in the right figure. Here, we see that across all values of b, we measure no amplification. This is because the follow network on which the baseline (reverse chron) is based is imbued with exactly the same bias as “the algorithm.” This fundamentally limits our ability to test for algorithmic amplification. By comparison, if we had not had a biased content recommendation algorithm in the past (light blue), we would have correctly been able to infer that the advantaged accounts enjoy a significant amount of amplification for large values of b—the algorithm allocates about four times as many impressions to the advantaged accounts as does reverse chron.
This simulation clearly falls short of representing the complexity of link formation and content delivery on social media platforms. For example, in reality people would not follow accounts they are shown at random, but rather, would likely select those they find most interesting or relevant to them. Similarly, in the real world all users are not identically distributed, and heterogeneity of preference complicates the measurement of algorithmic amplification even further. We have chosen this very simple setting because only in an extremely simplified universe can we easily develop a non-controversial truly neutral baseline against which to compare measures of algorithmic amplification.
We’ve given some examples of how measuring algorithmic amplification is complicated by the realities of a complex system that evolves over time. In the absence of a neutral baseline that has not been influenced by the past behavior of the system, directly measuring algorithmic amplification is difficult if not impossible. In light of this, what can we do?
It is useful to return to the underlying concerns motivating the desire to study and address algorithmic amplification in the first place. First, when it comes to overtly harmful content, any exposure to the harmful content is undesirable. Perhaps the question is less about how many times people using different algorithms for sorting content see the harmful content, but rather, how many people see it at all, regardless of whether they follow its creator or not. This suggests it may be useful to track impressions on harmful content without baselining to any algorithmic comparator or neutral baseline.
When it comes to the unfair allocation of impressions, the underlying concern is that algorithmic amplification causes some groups or individuals to receive an unfair amount of exposure relative to others. We can further break down what is meant by an “unfair amount of exposure”. One interpretation might be that there is some component of the system that is unjustifiably biased towards or against some group or individual. Or, reversing that, we might say that the system is not unfair if all of its components are operating in such a way that no group or individual is unjustifiably advantaged by any component. This suggests that an audit be performed to ensure that each of the components of the system is behaving “fairly”—that is to say, does not exhibit predictive bias towards or against any socially salient group or individual. Accompanying metrics might be things like measures of group-wise model performance disparities for each system component. Mitigating unjustified disparities in amplification under this approach would then be equivalent to minimizing each system component’s performance disparities, however defined.
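As one concrete (and deliberately simple) example of such a per-component audit metric, the sketch below computes groupwise false positive rates for a hypothetical classifier and their spread. Real audits would examine many metrics across many components; the group labels and predictions here are invented:

```python
from collections import defaultdict

def false_positive_rates(records):
    # records: iterable of (group, y_true, y_pred) for one system component,
    # e.g. a toxicity classifier. Groupwise false positive rate is one simple
    # audit metric among many possible choices.
    fp = defaultdict(int)
    neg = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 0:              # truly non-violative content
            neg[group] += 1
            fp[group] += (y_pred == 1)
    return {g: fp[g] / neg[g] for g in neg}

# Hypothetical labeled evaluation data: (group, true label, model prediction).
records = [
    ("group_a", 0, 1), ("group_a", 0, 0), ("group_a", 0, 0), ("group_a", 0, 0),
    ("group_b", 0, 1), ("group_b", 0, 1), ("group_b", 0, 0), ("group_b", 0, 0),
]
rates = false_positive_rates(records)
disparity = max(rates.values()) - min(rates.values())
```

Mitigation under this approach would amount to driving `disparity` (and its analogues for other metrics and components) toward zero.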
While this approach is appealing, it fails to account for the ways in which all of the system components interact. In a complex system like a social media platform, the constituent models, policies, and heuristic rules can interact in unpredictable ways, leading to “unfair” allocation even if each component is “fair” in isolation. In light of this, one potential alternative definition of unfair allocation of exposure is when content from some groups or individuals is given higher levels of exposure, relative to the size of that content’s receptive audience, than is given to content from other groups or individuals. In some sense, this is already at the heart of our operationalized definition of algorithmic amplification, where we compare the exposure that was given by the algorithm (the numerator) to a baseline (reverse chron) that serves as a proxy for the size of the receptive audience (the denominator). Even in this case, however, it is clear that aligning to the receptive audience is not a neutral baseline, because the measure of the audience (the “follow graph”) is itself influenced by the algorithmic system.
Conceptually, comparing algorithmic amplification across groups or individuals is a way of considering disparities in exposure (via the numerators) while simultaneously accounting for whether the content is reaching a receptive audience (via the denominator). To address the same desiderata, we could instead turn to measures of exposure inequality across socially salient groups—a venture akin to comparing the numerators directly. Separately, we could pair this with a counterbalancing metric of how satisfied consumers are with the content they are shown, to ensure that flattening the distribution of impressions does not come at the expense of delivering readers content they actually enjoy. Several methods for calculating inequality of exposure or groupwise disparities in exposure have been proposed recently, and there exist many metrics for inferring the quality of the content that has been displayed to users (Saint-Jacques et al. 2020; Lazovich et al. 2022). This could be implemented via experimentation or A/B testing that tracks both inequality metrics and reader-side satisfaction metrics, addressing the problem of the system allocating disproportionate influence to some users relative to the size of the audience that is receptive to their content.
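One standard inequality measure that could serve this purpose is the Gini coefficient of impressions across accounts. A minimal sketch:

```python
def gini(impressions):
    # Gini coefficient of impressions across accounts: 0 means perfectly equal
    # exposure; values approaching 1 mean exposure concentrated in few accounts.
    xs = sorted(impressions)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # Standard rank-weighted formula over the sorted values.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

For example, four accounts with equal impressions give a Gini of 0, while all impressions going to one of four accounts gives 0.75, the maximum for n = 4. Tracking such a metric alongside reader-satisfaction metrics in an A/B test is one way to operationalize the approach described above.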
Its connection to algorithmic amplification aside, this approach to addressing the concerns underlying discussion of disparities in algorithmic amplification may be beneficial in its own right. Reducing inequality can create a platform where more voices can be heard and can help avoid concentrating influence and exposure in fewer people and perspectives. Given the increasing reliance on social media for building professional networks and marketing, reducing algorithmic amplification could also be a useful tool for flattening real world access to economic and career opportunities.
Defining algorithmic amplification to account for all of the ways in which a social media system can increase exposure to certain types of content requires us to expand our definition of “the algorithm” from simply a single recommender system to a complex web of interacting models, policies, and actors. In doing so, we are forced to grapple with the fact that commonly proposed baselines against which to measure algorithmic amplification are not as neutral as they first appear—they condition on the state of system components that have been influenced by the past behavior of “the algorithm” or could arguably be considered part of “the algorithm” themselves. Indeed, any neutral baseline that conditions on the state of the system will suffer from this same issue, confounding our ability to measure amplification as it is currently defined.
At its heart, the term algorithmic amplification is used in relation to concern about two related but distinct issues: exposure to overtly harmful content and unjustifiable disparities in exposure between groups and individuals. In the interest of clarifying what is already a complicated and politically fraught issue, going forward it may be useful to discuss “algorithmic exposure” in the context of overtly harmful content. In this case, any algorithmic exposure is undesirable, whether the algorithm amplified it or not. In the context of disparities in exposure on a platform, “algorithmic inequality” may more appropriately address the underlying concern that these systems disproportionately allocate influence and benefits. Semantics aside, further careful study and regulation of algorithmic amplification are critical for ensuring equitable benefits from online platforms and reducing the damage that can result from exposing and promoting harmful content.
© 2023, Kristian Lum and Tomo Lazovich.
Cite as: Kristian Lum and Tomo Lazovich, The Myth of "The Algorithm": A System-level Overview of Algorithmic Amplification, 23-07 Knight First Amend. Inst. (Sept. 13, 2023), https://knightcolumbia.org/content/the-myth-of-the-algorithm-a-system-level-view-of-algorithmic-amplification [https://perma.cc/55JV-R2YA].
“About Our Approach to Recommendations.” n.d. Twitter Help Center. Accessed April 13, 2023. https://perma.cc/557T-8QG4.
“About Twitter's Account Suggestions.” n.d. Twitter Help Center. Accessed April 13, 2023. https://help.twitter.com/en/using-twitter/account-suggestions.
“About Your For You Timeline on Twitter.” n.d. Twitter Help Center. Accessed April 13, 2023. https://perma.cc/4ZCC-MSMF.
Bandy, Jack, and Tomo Lazovich. 2022. “Exposure to Marginally Abusive Content on Twitter.” 17th International AAAI Conference on Web and Social Media (ICWSM 2023), (August). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4175612.
Bower, Amanda, Kristian Lum, Tomo Lazovich, Kyra Yee, and Luca Belli. 2022. “Random Isn't Always Fair: Candidate Set Imbalance and Exposure Inequality in Recommender Systems.” (September). https://doi.org/10.48550/arXiv.2209.05000.
“Brand Safety @Twitter.” n.d. Twitter for Business. Accessed April 13, 2023. https://business.twitter.com/en/help/ads-policies/brand-safety.html.
“Discover and Search.” n.d. TikTok Help Center. Accessed April 13, 2023. https://perma.cc/3K28-UPS4.
Eckles, Dean. 2021. “Algorithmic Transparency and Assessing Effects of Algorithmic Ranking,” Testimony before the Senate Subcommittee on Communications, Media, and Broadband. https://www.commerce.senate.gov/services/files/62102355-DC26-4909-BF90-8FB068145F18.
El-Kishky, Ahmed, Thomas Markovich, Serim Park, Baekjin Kim, Ramy Eskander, Yury Malkov, Frank Portman, Sofia Samaniego, Ying Xiao, and Aria Haghighi. 2022. “TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation.” KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, (August). https://doi.org/10.1145/3534678.3539080.
“Explore Tab.” n.d. Instagram Help Center. Accessed April 13, 2023. https://perma.cc/BK2R-YY2N.
Fernández, Miriam, Alejandro Bellogín, and Iván Cantador. n.d. “Analysing the Effect of Recommendation Algorithms on the Amplification of Misinformation.” arXiv. https://arxiv.org/abs/2103.14748.
“For You.” n.d. TikTok Help Center. Accessed April 13, 2023. https://perma.cc/V763-MXM5.
Frenkel, Sheera. 2018. “Republicans Accuse Twitter of Bias Against Conservatives.” The New York Times, September 5, 2018. https://www.nytimes.com/2018/09/05/technology/lawmakers-facebook-twitter-foreign-influence-hearing.html.
Granick, Jennifer S. 2023. “Is This the End of the Internet As We Know It?” American Civil Liberties Union. https://www.aclu.org/news/free-speech/section-230-is-this-the-end-of-the-internet-as-we-know-it.
“How Feed Works.” n.d. Facebook. Accessed April 13, 2023. https://perma.cc/CZ9G-FDBN.
“How to Use Twitter Search – Search Tweets, People, and More.” n.d. Twitter Help Center. Accessed April 13, 2023. https://perma.cc/NZF5-ND65.
Huszár, Ferenc, Sofia I. Ktena, Conor O'Brien, Luca Belli, Andrew Schlaikjer, and Moritz Hardt. 2021. “Algorithmic Amplification of Politics on Twitter.” Proceedings of the National Academy of Sciences 119, no. 1 (December).
Katsaros, Matthew, Kathy Yang, and Lauren Fratamico. 2022. “Reconsidering Tweets: Intervening during Tweet Creation Decreases Offensive Content.” 16th International AAAI Conference on Web and Social Media (ICWSM 2022), (May). https://ojs.aaai.org/index.php/ICWSM/article/view/19308.
Keller, Daphne. 2021. “Amplification and Its Discontents.” Knight First Amendment Institute, June 8, 2021. https://knightcolumbia.org/content/amplification-and-its-discontents.
Lai, Vivian, Samuel Carton, Rajat Bhatnagar, Q. Vera Liao, Yunfeng Zhang, and Chenhao Tan. 2022. “Human-AI Collaboration via Conditional Delegation: A Case Study of Content Moderation.” Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, (April). https://doi.org/10.1145/3491102.3501999.
Lazovich, Tomo, Luca Belli, Aaron Gonzales, Amanda Bower, Uthaipon Tantipongpipat, Kristian Lum, Ferenc Huszár, and Rumman Chowdhury. 2022. “Measuring Disparate Outcomes of Content Recommendation Algorithms with Distributional Inequality Metrics.” Patterns 3, no. 8 (August). https://doi.org/10.1016/j.patter.2022.100568.
Belli, Luca, Kyra Yee, Uthaipon Tantipongpipat, Aaron Gonzales, Kristian Lum, and Moritz Hardt. 2023. “County-level Algorithmic Audit of Racial Bias in Twitter's Home Timeline.” arXiv.
Narayanan, Arvind. 2023. “Understanding Social Media Recommendation Algorithms.” Knight First Amendment Institute. https://knightcolumbia.org/content/understanding-social-media-recommendation-algorithms.
“People You May Know.” n.d. Facebook. Accessed April 13, 2023. https://www.facebook.com/help/336320879782850.
“Reps. Eshoo and Malinowski Reintroduce Bill to Hold Tech Platforms Accountable for Algorithmic Promotion of Extremism.” 2021. Congresswoman Anna G. Eshoo. https://eshoo.house.gov/media/press-releases/reps-eshoo-and-malinowski-reintroduce-bill-hold-tech-platforms-accountable.
Saint-Jacques, Guillaume, Amir Sepehri, Nicole Li, and Igor Perisic. 2020. “Fairness through Experimentation: Inequality in A/B testing as an Approach to Responsible Design.” (February). https://arxiv.org/abs/2002.05819.
“Topics on Twitter | Twitter Help.” n.d. Twitter Help Center. Accessed April 13, 2023. https://perma.cc/Q9SC-H8FK.
“Twitter Notifications Timeline and Quality Filters.” n.d. Twitter Help Center. Accessed April 13, 2023. https://perma.cc/RAD5-S42X.
Wang, Lequn, and Thorsten Joachims. 2023. “Uncertainty Quantification for Fairness in Two-Stage Recommender Systems.” (February). https://arxiv.org/abs/2205.15436.
“What is Linktree?” n.d. Linktree Help Center. Accessed April 13, 2023. https://intercom.help/linktree-ff524ba1864c/en/articles/5434130-what-is-linktree.
Whittaker, Joe, Seán Looney, Alastair Reed, and Fabio Votta. 2021. “Recommender Systems and the Amplification of Extremist Content.” Internet Policy Review 10 (2).
Yee, Kyra, Alice Schoenauer Sebag, Olivia Redfield, Emily Sheng, Matthias Eck, and Luca Belli. 2022. “A Keyword Based Approach to Understanding the Overpenalization of Marginalized Groups by English Marginal Abuse Models on Twitter.” (October). https://arxiv.org/abs/2210.06351.
This definition is based on one given by Dean Eckles in testimony to the U.S. Congress (Eckles 2021).
This definition is very similar to the one used by Huszár et al. (2021) to calculate political amplification for different politicians. In their analyses, x was either an individual politician or a group of politicians defined by their political party.
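The kind of calculation performed in Huszár et al. (2021) can be sketched in simplified form as a ratio of audience reach under the personalized timeline to reach under a reverse-chronological control group. This is a hedged illustration only: the function name, the simplification to a single ratio, and the counts below are hypothetical, not the paper's exact statistic.

```python
# Sketch: a simplified amplification ratio for a group x of accounts.
# Compares the fraction of users in the algorithmic-timeline (treatment)
# arm who were reached by x's content with the same fraction in a
# reverse-chronological (control) arm. All counts here are hypothetical.

def amplification_ratio(reached_treatment, n_treatment,
                        reached_control, n_control):
    """Ratio of treatment reach rate to control reach rate.
    1.0 means no amplification relative to the chronological baseline."""
    if n_treatment == 0 or n_control == 0 or reached_control == 0:
        raise ValueError("need nonzero control reach to form a ratio")
    return (reached_treatment / n_treatment) / (reached_control / n_control)

# Hypothetical experiment: 30% of treatment users vs. 20% of control
# users saw at least one post from group x.
print(round(amplification_ratio(300, 1000, 200, 1000), 3))
```

Framing amplification this way makes the essay's confounding point concrete: the “control” arm still conditions on follow graphs and policies shaped by the algorithm's past behavior, so even this baseline is not fully neutral.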
Kristian Lum is an associate research professor at the University of Chicago’s Data Science Institute.
Tomo Lazovich is a senior research scientist at the Institute for Experiential AI at Northeastern University.