Abstract

Divisiveness appears to be increasing in much of the world, leading to concern about political violence and a decreasing capacity to collaboratively address large-scale societal challenges. In this paper, we aim to articulate an interdisciplinary research and practice area focused on what we call bridging systems: systems that increase mutual understanding and trust across divides, creating space for productive conflict, deliberation, or cooperation. We give examples of bridging systems across three domains: recommender systems on social media, collective response systems, and human-facilitated group deliberation. We argue that these examples can be more meaningfully understood as processes for attention allocation (as opposed to “content distribution” or “amplification”) and develop a corresponding framework to explore similarities—and opportunities for bridging—across these seemingly disparate domains. We focus particularly on the potential of bridging-based ranking to bring the benefits of offline bridging into spaces that are already governed by algorithms. Throughout, we suggest research directions that could improve our capacity to incorporate bridging into a world increasingly mediated by algorithms and artificial intelligence.

1 Introduction

Imagine a platform that gave people status not for clever takedowns of political opponents but for producing content with bipartisan appeal. ... Instead of boosting content that is controversial or divisive, such a platform could improve the rank of messages that resonate with different audiences simultaneously.

— Chris Bail, Breaking the Social Media Prism. [10]

Division impacts cooperation and conflict. We face compounding global challenges including climate change, pandemics, and transformative artificial intelligence, all of which are likely to require significant cooperation to navigate. At the same time, there is significant public concern around increasing societal division [96, 63] and the resulting increase in destructive conflict, [35] which can increase the likelihood of large-scale political violence. [108] Destructive conflict and the violence that can result from it not only harm innumerable lives directly; they may also make addressing those global challenges exceedingly difficult. [71]

The systems that allocate our attention can impact division. Increases in societal division may be related to the incentives of the systems which guide people’s attention—what we call attention allocators. We are all attention allocators in that we have some agency over how we allocate our own limited attention. But before we can even choose among what to attend to, upstream attention allocators such as the recommender systems on social media platforms, search engines, news media, and even human facilitation have already done much of that allocation for us. From a firehose of potential information, they choose a much smaller set of items for us to attend to in our limited time.

In this paper, we focus on these upstream attention allocators as they shape the incentives of our attention economy by directing attention to some kinds of behavior over others. [110] Because attention can translate to money and power, such systems help determine what kinds of behaviors are rewarded in many spheres of life.

Systems that directly reward attention may have a “bias toward division.” Many attention allocators reward behavior that seeks to maximize attention toward themselves. Recommender systems, for example, largely seek to maximize measures of attention (what is commonly referred to as engagement-based ranking), and most news media entities need to attract attention for their work out of financial necessity. Similarly, politicians often aim to attract attention in order to win elections, and anecdotally, those who craft online messages for politicians report needing to use inflammatory language in order to be competitive. [94] Incentives such as these reward engagement-bait: content and ideas intended to generate engagement, which are often misleadingly sensational and hyperbolic. In practice, engagement-bait can crowd out good faith efforts to communicate across divides, decrease understanding and trust among people of diverse viewpoints, and thus increase division. [58, 19]

Two common ways of defining division are ideological polarization (differences in policy positions) and affective polarization (emotional dislike of those from the other party). Measurements of affective polarization, for example, show increases in many parts of the world. [18] This increase appears to be correlated with the increasing adoption of ubiquitous digital communication, but the extent to which this relationship is causal remains contested. [58, 47]

This paper explores how to incentivize bridging. The goal of bridging is to increase mutual understanding and trust across divides, creating space for productive conflict, deliberation, or cooperation. Every system that involves human attention—from social media recommendations, to search engine ranking, to governance processes—will, to some extent, reward or punish bridging. We explore two core questions: What do systems that reward bridging look like? How might they be designed?

Our goal is to clearly articulate an interdisciplinary research area that can inform the development and accelerate the adoption of bridging systems across domains. In other words, just as there are research areas devoted to other biases, we aim to articulate a research agenda focused on overcoming the “bias toward division.”

Bridging is not about eliminating conflict or creating homogeneity. The goal is to check the default tendencies of many environments (including much of social media), which can potentially push us toward extremes when combined with our psychological predilections. As articulated by Stray, [93] the intent is “conflict transformation” [57, 17]: not to remove divisions or interfere with the substance of civic debates, but to “[make] conflict better in some way.” In other words, to mitigate the risks of “high conflict” [85] while supporting “healthy” or “constructive” conflict. When we use the terms bridging, or reducing division, we are thus not referring to “making everyone believe the same things.” Those terms are instead shorthand for enabling mutual understanding and respect across divides—in other words, supporting pluralism. [109] Figure 1 provides a speculative causal loop diagram [30] illustrating the potential ideal impacts of bridging systems.

Figure 1 A causal loop diagram illustrating how bridging systems might impact society. The goal of this diagram is not to make strong, precise claims about causality but simply to provide intuition on how a proliferation of bridging systems could have important and beneficial societal consequences and reduce both deliberate and indirect harms. Significant work is required to determine which causal relations hold (including those not drawn on this diagram), and under what conditions.

Moving from engagement-based ranking to bridging-based ranking. By definition, optimizing more for bridging means optimizing less for engagement, but the extent to which these two goals are in tension is an open question. It may be possible to include bridging impacts within the objective function of a recommender system without undermining financial sustainability. Figure 2 gives a simple example of what such bridging-based ranking [70] might look like in the context of a recommender system on a social media platform.

Figure 2 A simple example of bridging-based ranking. Under engagement-based ranking, posts are ranked highly for a user if they are liked by similar users, regardless of the stance of dissimilar users. In contrast, under bridging-based ranking—formalized here using a diverse approval motif (Section 4.1)—the post ranked highest is liked by both parties, and the lowest-ranked post is the most divisive. This illustration conveys important intuition but note that it oversimplifies both the sophistication of status quo engagement-based ranking and the potential sophistication of bridging-based ranking.
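The contrast described in Figure 2 can be sketched in a few lines of Python. This is only an illustration: the like rates, the two group labels, and the use of the minimum approval across groups as a "diverse approval" score are all assumptions made here, not the formalization given in Section 4.1.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical data: fraction of each group that liked each post.
likes: Dict[str, Dict[str, float]] = {
    "post_a": {"left": 0.9, "right": 0.1},  # liked by one side only (divisive)
    "post_b": {"left": 0.6, "right": 0.7},  # liked by both sides
    "post_c": {"left": 0.2, "right": 0.3},  # liked by few on either side
}


def engagement_score(post: str, viewer_group: str) -> float:
    """Approval among users similar to the viewer, ignoring everyone else."""
    return likes[post][viewer_group]


def bridging_score(post: str) -> float:
    """Assumed diverse-approval score: the lowest approval across groups."""
    return min(likes[post].values())


def rank(posts: Iterable[str], score: Callable[[str], float]) -> List[str]:
    """Order posts from highest to lowest score."""
    return sorted(posts, key=score, reverse=True)


engagement_ranking = rank(likes, lambda p: engagement_score(p, "left"))
bridging_ranking = rank(likes, bridging_score)
```

With these made-up numbers, a viewer in the "left" group sees the one-sided post first under engagement-based ranking, while the bridging score places the post liked by both groups first and the one-sided post last.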

1.1 Background

Throughout this paper, we will use examples from three domains to illustrate the concept of a bridging system: (1) recommender systems on social media platforms, (2) software for supporting large-scale civic discussion, and (3) facilitated in-person deliberation and mediation. Each of these domains is briefly introduced below.

Example 1 (recommender system)

A recommender system is an algorithm that selects which items of content, from a large pool of available items, should be shown to a user. [2] Recommender systems are commonly used on social media platforms—prominent examples include the algorithms behind the Twitter (now X) timeline, the Facebook feed, and the YouTube homepage.

Their implementations vary, but most recommender systems on social media operate using the same basic logic: They “optimize for engagement,” selecting the items of content that are most likely to elicit clicks, likes, reactions, comments, reshares, and other behaviors that the platform can measure. [99] Such behaviors are often an effective proxy by which to select what people want [102] but can also incentivize the creation of content that is misleading, sensational, outrageous, or addictive. [15]

Example 2 (collective response system)

Governments, civil society organizations, and others that act on behalf of large groups of people often need to elicit the views of their constituents and stakeholders. A variety of tools for coordinating such large-scale dialogues have been created, including some that explicitly seek to elevate common ground. In this document, we call these collective response systems [69] and focus on two illustrative examples.

The first example, YourView, [112] was a not-for-profit collective response system that operated in Australia during the lead-up to the 2013 federal election. [91, 43, 44] YourView provided concise explanations of policy proposals, under which participants could contribute comments for or against the proposal, as well as vote on the comments of other participants, and on the issue overall. From this voting data, YourView derived two measures of public support for the proposal: The raw percentage of participants in favor and a “Public Wisdom” percentage in which the participants with the highest “credibility score” were given the most weight.
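To make the difference between the two YourView measures concrete, the following Python sketch computes a raw percentage and a credibility-weighted percentage from the same votes. The weighting rule (a simple weighted mean) and the credibility values are assumptions for illustration; the text above does not specify how YourView computed credibility scores or applied them.

```python
from typing import List, Tuple

# Each vote is (in_favor, credibility). Credibility values are hypothetical.
Vote = Tuple[bool, float]


def raw_support(votes: List[Vote]) -> float:
    """Percentage of participants in favor, each counted equally."""
    in_favor = sum(1 for favor, _ in votes if favor)
    return 100.0 * in_favor / len(votes)


def weighted_support(votes: List[Vote]) -> float:
    """Percentage in favor, with each vote weighted by credibility (assumed rule)."""
    total_weight = sum(cred for _, cred in votes)
    favor_weight = sum(cred for favor, cred in votes if favor)
    return 100.0 * favor_weight / total_weight


votes: List[Vote] = [(True, 1.0), (True, 1.0), (False, 4.0), (False, 2.0)]
raw = raw_support(votes)            # half of the participants are in favor
weighted = weighted_support(votes)  # but they hold a quarter of the weight
```

In this toy case the two measures diverge because the participants opposed to the proposal carry more of the (hypothetical) credibility weight.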

The second example, Polis, [77] is an open-source collective response system that has been used to conduct public consultations informing digital policy in Taiwan, among other applications. [50, 92] Polis instances are context-specific forums where participants can contribute comments and vote on the comments of others. From this voting data, Polis clusters participants according to their voting patterns and generates visualizations of the support for each comment among the participants in each cluster.

Example 3 (human facilitation)

While there is a very broad range of potential examples of human facilitation, here we focus specifically on mini-publics. [38] They involve the convening of a diverse group of people to deliberate over a particular policy issue, guided by impartial facilitators. The group is selected through sortition, providing a representative random sample (roughly analogous to the selection of jury candidates in the criminal justice system, though more similar in process to representative polling). Increasingly, such mini-publics are being used to make progress on contested political issues, including abortion laws in Ireland [33] and climate change policy in France. [46] There is an emerging profession of facilitators who are skilled at convening and coordinating such groups. [36, 26] Mini-public facilitation is of particular interest because, unlike many multistakeholder forums, participants are not just elites and represent a wide range of personal experiences and viewpoints in a single facilitation environment.

The term “bridging” has been used by others. For example, literature on social capital often draws distinctions between three kinds of social ties: bonding, bridging, and linking. [97] Of these, both bridging and linking are used to refer to connections between individuals who differ in some way (“horizontally” and “vertically,” respectively). We use bridging more broadly and generically to describe any process that increases mutual understanding and trust across divides (see Section 2.4), though we agree with the claim that bridging ties are a core part of a connected society. [3] Bridging has also been used by others to describe strategies for connecting across differences during offline, face-to-face interactions, [90, 79] the effects of which have been extensively studied in the research area of “intergroup contact theory” and found to be quite robustly positive. [76] Many of these strategies may form the basis for formal, quantitative signals used to identify bridging algorithmically (Section 4).

There are a number of existing research efforts that seek to use technology in ways that satisfy our definition of “bridging,” and on which our work builds. These include work on the use of recommender systems for depolarization, [53, 93, 7, 9, 8, 114, 27, 80] work that seeks to build insights from conflict mediation and peacebuilding communities into algorithmic settings, [53, 87, 88, 98, 94, 23] and work that seeks to quantify the deliberative quality of an argument or discussion. [42, 106] More broadly, the design of bridging systems draws on work from many disciplines that inform our understanding of sociotechnical systems. This includes work on:

articulating what “good” public discourse looks like for attention allocators generally, from domains including political theory, philosophy (e.g., epistemology), economics (e.g., social choice theory), communication, anthropology, sociology, rhetoric, and science and technology studies;
the collection of accurate social data, from domains across the behavioral and social sciences including quantitative history, measurement theory, and survey design;
modeling social phenomena, from domains including computational social science, opinion dynamics, game theory, and political science; and
the design of interfaces and environments, from domains including human-computer interaction (e.g., data visualization), economics (e.g., mechanism design, choice architecture), science and technology studies, organizational design, facilitation design, urban planning, and architecture.

Even this broad set of disciplines is far from comprehensive, and we do not attempt a thorough review of all relevant work here. However, in many cases, there are direct precursors to particular bridging system components. We will cite many of these in the examples discussed throughout the paper.

1.2 Contribution

Our goal in this paper is to articulate a research and practice direction around bridging systems. Section 2 provides a framework for thinking about attention allocators, illustrated using the examples of recommender systems, collective response systems, and human-facilitated deliberations, and defines bridging as a property of such systems (2.4). Section 3 describes how relationships in a population can be formally modelled, and sections 4 and 5 give concrete examples of signals and metrics that might be used to instantiate the bridging goal. Considerations for evaluating bridging systems are described in Section 5.2, and Section 6 discusses the challenges, limitations, and risks of bridging. Throughout, we propose intellectually compelling and societally impactful open problems, to help support both cross-domain collaboration and rapid beneficial experimentation and deployment.

Depending on their own goals, readers may wish to allocate their attention to particular sections.
Academics interested in helping progress our understanding of bridging might focus on the technical discussion in sections 2 to 5, and on the blue boxes throughout which list open questions.
Platform employees seeking to experiment with bridging should in the first instance focus on the discussion of signals and metrics in sections 4 and 5, and the examples given throughout in green boxes.
Regulators and civil society actors might benefit most from the attention allocation framework in Section 2, and the discussion of risks and limitations in Section 6.

We also provide an accompanying glossary at bridging.systems/definitions/, [75] containing the concepts we introduce and use throughout the paper.

2 Attention Allocation Systems

If a tree falls in a forest and no one is around to hear it, does it make a sound? If content is distributed and no attention is paid to it, does it even matter?

We all have control over how we allocate our personal attention, but we do not do this alone. Out of necessity, much of this allocation work is first partially delegated to upstream social and technological systems. These include our three recurring examples—recommender systems, collective response systems, and human-facilitated mini-publics—but also books, media organizations, search engines, group chats, large language models, and many other systems.

In this section, we formally define allocation and attention in order to provide a general model of attention allocation systems (or attention allocators). Our goal is to facilitate the translation of insights across domains (for example, from offline human facilitation into algorithmic recommender systems), and to help identify points of intervention, in order to improve attention allocators. Finally, we frame bridging as a property of attention allocation systems.

Why allocation? While it is common to speak of systems for “content distribution” or (algorithmic) “amplification,” we believe that “attention allocation” is often a more useful frame and term because it acknowledges that attention (as we will define it) is scarce. In particular, human attention is finite and any given individual will only be able to attend to a small fraction of the content to which they have access. In many situations, economic allocation of attention is quite literally what occurs, [51] and speaking in terms of allocation avoids the ambiguities of the term “amplification.” [103]

What is allocation? Allocation is the filling of slots with objects. A slot is a finite resource, and an object is anything that can consume that resource. In this way, the two terms (slot and object) are defined circularly, but it should be clear from context which is which. For example, an object might be an item of content on a social media platform, and the corresponding slots might be discrete positions within a recommender feed. Alternatively, slots might be continuous intervals of first-person experience, in which case the corresponding objects would be anything that can be attended to.

The simplest form of allocation occurs when one slot is filled with one object. We describe these simple allocations as atomic. Atomic allocations are represented as: (slot, object, properties).

The third element, properties, is a catchall for data that describe the nature, qualities, and context of the allocation, and which are formalized differently in different contexts. For example, if the allocation represents a post appearing in a ranked recommender feed, properties might contain the context in which the feed is viewed (time of day, type of device), engagement that resulted, and so on. More complex allocations can be described as sets of atomic allocations. For example, an allocation of attention where a thousand people attend to the same object would be represented as the set of atomic allocations describing each individual attending to that object.
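As one possible encoding of the (slot, object, properties) triple, the Python sketch below represents an atomic allocation as a small data class and a complex allocation as a collection of atomic ones. The class name, field names, and slot identifiers are hypothetical, and a list stands in for the set described above so that the properties can remain an ordinary dictionary.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AtomicAllocation:
    """One slot filled with one object, plus free-form contextual properties."""
    slot: str                      # e.g., a feed position or a person's attention
    obj: str                       # e.g., an item of content
    properties: Dict[str, Any] = field(default_factory=dict)


def attend_all(people: List[str], obj: str, **properties: Any) -> List[AtomicAllocation]:
    """Build a complex allocation: every listed person attends to the same object."""
    return [
        AtomicAllocation(slot=f"attention:{person}", obj=obj, properties=dict(properties))
        for person in people
    ]


# A thousand people attending to the same (hypothetical) article.
crowd = attend_all([f"person_{i}" for i in range(1000)], "article_42", device="mobile")
```

Each element of `crowd` carries its own copy of the properties, so per-person context (such as dwell time) could later be recorded on one allocation without affecting the others.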

Example 4 (recommender system)

Recommender systems (and more broadly, the user interfaces of social media platforms) create atomic allocations each time a particular item of content (an object) is used to fill a particular position within a recommender feed (a slot). The properties include engagement data such as how long the user subsequently paused on the item (dwell time), but also the social context such as reactions and comment counts which were shown alongside the content.

Example 5 (collective response system)

Collective response systems often have two kinds of allocations, both involving the allocation of attention. The first occurs when users are shown items for evaluation (e.g., dialogue, voting, etc.), and the second occurs when the most widely supported items (e.g., the results of the votes) are presented to everyone.

Example 6 (human facilitation)

In facilitated mini-publics, allocations of attention occur fluidly as people engage in dialogue. Here, the object might be an idea being expressed by another person, and the additional properties of the allocation include the tone, facial expressions, and body language with which it is communicated.

A potential allocation is one that has not yet happened (and may or may not happen), and a realized allocation is one that has already happened.

2.1 Allocation

An allocation system (or simply allocator) is a process that determines which of many potential allocations will actually occur, taking as input a set of potential allocations and outputting a set of realized allocations.

Allocators consist of an allocation process and, optionally, a learning process.

2.1.1 Allocation Process

The allocation process is the core of an allocation system. Its purpose is to determine which of a set of potential allocations to realize—that is, which objects to use to fill a finite set of available slots. The process takes as inputs a set of potential allocations and predicts how each allocation would, if realized, change the state of the world along several dimensions. These predicted impacts are aggregated into a measure of the “worthiness” or “utility” of each allocation using a normative value model. Because global predictions may not be possible or accurate, local heuristics or other signals are also often used for impact prediction. Finally, the most valuable allocations (according to the value model) are selected and realized.

Figure 3 The allocation process in an allocation system. A bridge icon indicates where bridging can be incorporated. Not shown are the ways in which the process itself is optimized.
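The steps of the allocation process (predict impacts, aggregate them with a value model, select the most valuable allocations) can be sketched as follows. The linear weighted sum used as the value model, the impact names, and all numbers are assumptions chosen for illustration; real value models may take other forms.

```python
from typing import Callable, Dict, List

Impacts = Dict[str, float]  # predicted impact name -> predicted magnitude


def allocate(
    candidates: List[str],
    predict: Callable[[str], Impacts],
    weights: Dict[str, float],
    n_slots: int,
) -> List[str]:
    """Realize the n_slots candidates with the highest value-model score."""

    def value(candidate: str) -> float:
        # Assumed value model: a weighted sum of predicted impacts.
        impacts = predict(candidate)
        return sum(weights.get(name, 0.0) * size for name, size in impacts.items())

    return sorted(candidates, key=value, reverse=True)[:n_slots]


# Hypothetical predicted impacts for three potential allocations.
predicted: Dict[str, Impacts] = {
    "post_a": {"engagement": 0.9, "bridging": 0.1},
    "post_b": {"engagement": 0.5, "bridging": 0.8},
    "post_c": {"engagement": 0.3, "bridging": 0.2},
}

engagement_only = allocate(list(predicted), predicted.__getitem__, {"engagement": 1.0}, 2)
with_bridging = allocate(
    list(predicted), predicted.__getitem__, {"engagement": 1.0, "bridging": 1.0}, 2
)
```

Changing only the weights in the value model changes which allocations are realized, which is one place where bridging could enter the process.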

2.1.2 Learning Process

The purpose of the learning process is to improve the predictions generated by the allocation process. When a trigger indicates that models need updating, relevant data is retrieved from storage, collected, or elicited from the people interacting with the system. This data is used to update (or fine-tune) both state models—static descriptions of the current state of the world—and predictive models—models that predict the impacts of allocations, optionally conditioned on the current state. These updated models are then substituted into the allocation process.

Figure 4 The learning process in an allocation system. A bridge icon indicates where bridging can be incorporated. The dashed lines indicate that many technical systems will update state models after each attention allocation but update predictive models less frequently. Human facilitators update their implicit state models as they facilitate (noticing how different events impact people) and also update their predictive models slightly as they work but might do a much larger “update” of their predictive models (improving their facilitation skills) during a post-facilitation retrospective. Not shown are implicit inputs such as the previous models, or how the act of data elicitation can itself change the state of the world.

2.2 Attention Allocation

We define attention as the selective processing of information by a system, such that the state of that system might materially change. We use the terms “attend to” or “pay attention to” to describe the enacting of that “potential material change” to the system; systems that can only create outputs (without any internal changes) are not attending. As an example, when a person (a kind of system) attends to a news article (a kind of information), their beliefs and attitudes may change as a result of them attending to that article. Because people (and other systems) face limitations in their access to information and have only a finite amount of processing capacity, they can benefit from external systems that help allocate that limited attention.

We can now define an attention allocation system (or attention allocator) as an allocation system that is involved in the allocation of attention. Crucially, an attention allocator does not itself need to attend (and many do not attend under our definition, for example a recommender system that has been trained once on existing data and is not updated).

Our three examples can be viewed as attention allocators, to which people (partially) delegate the allocation of their attention.

Example 7 (recommender system)

For a given social media account, every item of content corresponds to a potential atomic allocation or, more accurately, to multiple potential atomic allocations, one for each context and position in which the content could appear in the user interface. A set of impacts of each of these potential allocations is predicted. Commonly considered impacts include engagement behaviors (e.g., clicks, comments, shares), whether the user will be entertained (e.g., what they would rate the content), or whether they are likely to be harmed (e.g., whether the content is a financial scam). The predictions are then aggregated using a formal value model, [99] and the resulting scores are the primary factor that determines how items are ranked within the recommender feed, influencing which allocations take place. For the most part, recommenders currently direct attention to optimize engagement [99] but could target other goals. [93]

Example 8 (collective response system)

As mentioned, collective response systems often have two kinds of attention allocation. For example, Polis only presents one item at a time for voting, and it tries to choose which item to show in order to learn as much as possible about the overall structure of the views and perspectives present in a population. Thus the main “predicted impact” in the allocation process is the information about participants’ views provided by the allocation. Polis then uses that information to create a nonpersonalized ranking and visualization, showing which items are agreed with the most across those divides—which it calls “group-aware consensus.” The allocation process here allocates the most prominent attention slots to items that have the most agreement across divides, highlighting those items and encouraging people to riff on them, suggesting variations that attract yet broader support. [92]
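A ranking by agreement across clusters can be sketched as below. The scoring rule used here (the product, over clusters, of a smoothed probability of agreement) is an assumption made for this illustration and is not specified in the text above; the cluster names and vote tallies are likewise hypothetical.

```python
from math import prod
from typing import Dict, List, Tuple

# comment -> cluster -> (number of agree votes, number of votes cast)
Tallies = Dict[str, Dict[str, Tuple[int, int]]]


def group_agreement(agrees: int, votes: int) -> float:
    """Smoothed probability that a member of the cluster agrees (assumed rule)."""
    return (agrees + 1) / (votes + 2)


def consensus_score(per_cluster: Dict[str, Tuple[int, int]]) -> float:
    """High only when every cluster tends to agree with the comment."""
    return prod(group_agreement(a, n) for a, n in per_cluster.values())


def rank_comments(tallies: Tallies) -> List[str]:
    """Order comments from most to least agreement across clusters."""
    return sorted(tallies, key=lambda c: consensus_score(tallies[c]), reverse=True)


tallies: Tallies = {
    "comment_1": {"cluster_a": (18, 20), "cluster_b": (2, 20)},   # one-sided support
    "comment_2": {"cluster_a": (14, 20), "cluster_b": (13, 20)},  # support in both
}
ranking = rank_comments(tallies)
```

Because the per-cluster terms are multiplied, strong support in one cluster cannot compensate for weak support in another, so the comment with moderate support in both clusters ranks first.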

Example 9 (human facilitation)

Facilitators can promote or discourage certain kinds of attention allocation through the way they structure the deliberations—such as by giving certain people the floor at certain times—with goals (a qualitative value model) including the maintenance of baseline civility and ensuring the group delivers on its remit.

Often, multiple attention allocators are required to meaningfully model a given situation. For example, consider a person browsing a social media feed. In this scenario, there are at least two attention allocators involved. The recommender system is an upstream attention allocator, algorithmically determining which of many potential atomic allocations (that is, positions of content within the feed) to realize. But the positioning of content within a recommender feed does not wholly determine what the person pays attention to. The person has agency too, and acts as their own, personal attention allocator deciding which of the many possible ways of allocating their continuous, first-person experience to ultimately enact or realize. We can say that the individual partially delegates their attention allocation to the recommender. More formally, this means that the recommender (as an attention allocator) influences the set of potential allocations that are available as inputs for the person’s own, downstream attention allocator.

Figure 5 Example of two attention allocators chained together. The allocation processes of both recommender systems and individual humans play a role in how human attention is allocated, because people partially delegate the allocation of their attention to the recommender. For simplicity, we have not included the learning processes of either the human or the recommender on this diagram.
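The chaining shown in Figure 5 can be illustrated with two small functions, where the output of the upstream allocator becomes the input of the downstream one. The scores, interest values, and function names are hypothetical, and the learning processes are omitted as in the figure.

```python
from typing import Dict, List


def recommender(pool: List[str], scores: Dict[str, float], feed_size: int) -> List[str]:
    """Upstream allocator: fill the feed's slots with the highest-scoring items."""
    return sorted(pool, key=lambda item: scores[item], reverse=True)[:feed_size]


def person(feed: List[str], interest: Dict[str, float], budget: int) -> List[str]:
    """Downstream allocator: choose what to attend to, but only from the feed."""
    return sorted(feed, key=lambda item: interest.get(item, 0.0), reverse=True)[:budget]


pool = ["a", "b", "c", "d"]
recommender_scores = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.1}
personal_interest = {"a": 0.2, "b": 0.7, "c": 0.5, "d": 1.0}

feed = recommender(pool, recommender_scores, feed_size=2)
attended = person(feed, personal_interest, budget=1)
```

In this toy case the person would have been most interested in item "d", but because the recommender never placed it in the feed, it is not among the potential allocations available to the person's own allocator.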

While our focus here is human attention, under our definition, some algorithmic systems can also be said to attend to things. We chose a definition with this property because such systems can also significantly impact people depending on what information they process.

For example, consider a large language model (LLM) along with the infrastructure used to fine-tune that model. If a recommender system (in particular, the allocation process of a recommender system) is used to select content on which to fine-tune the LLM, then both the LLM and the recommender are attention allocators, with the LLM (in combination with its fine-tuning infrastructure) also “attending” to the recommender system. Because the recommender’s allocation system is not updated during the allocation (as is standard), it is not attending to the content it recommends (see note 11). This becomes increasingly relevant with LLM-based systems being deployed to directly execute goals in the world.

Note 11. However, separately, the recommender’s learning system may be used to train the allocation system; such a learning process would involve attending as it materially changes the system. There are some subtleties here, as our definition of attention is relative to the system boundary that is being examined, which can include multiple people, algorithmic systems, organizations, etc. As another example, if someone uses an LLM to summarize an article and then reads the summary, that summarizer is an attention allocator, but it is not attending; since its state is not changing, it is simply outputting. However, the combined system of the person and the summarizer together are attending to the article. Similarly, if a person attending to an article is an employee of an organization, that organization (system) can also be said to be paying attention to the article.

2.3 The Optimization Stack

To reason effectively about attention allocators (and allocators in general), it is important to recognize that they involve multiple levels of optimization (Figure 6)—what we call an optimization stack.

The combination of accuracy optimization (see note 12) during the learning process and value optimization during the allocation process is an example of bilevel optimization. [31] But in algorithmic attention allocators (such as recommender systems), there are at least two other levels of optimization. Downstream of value optimization is the fact that individuals, both producers and consumers of (potential) allocations, will strategically optimize their behavior to further their own goals. Upstream of accuracy optimization are decisions made about the design of the learning and allocation processes. What impacts should be predicted? How should they be weighted in the value model? What data should be collected? Each of these questions will be answered to “optimize” some (perhaps qualitative) measure of the value of the system as a whole.

Note 12. Note that we can only observe phenomena like bridging-ness or entertaining-ness indirectly via measurable outcomes, such as the diversity of people who comment on a post or the watch time on a YouTube video. Thus, whenever we use the word “accuracy,” we mean accuracy as measured against these measurable outcomes, which are proxies for the phenomena of interest. In many cases, there may not be a literal ground truth against which to measure accuracy, as it may not be possible to know the true impacts depending on what is predicted (e.g., we can’t know true bridging-ness or entertaining-ness; we rely on proxies for them).

Figure 6 The optimization stack. The four levels of optimization that take place within (or adjacent to) an algorithmic attention allocator, such as a recommender system.

At all levels of optimization, quantitative signals and metrics are used to quantify the degree to which optimization efforts are successful. As signals and metrics are a significant focus of the remainder of this paper, we have included a list below which summarizes the most common ways in which they are used for optimization. In short, signals are used for allocation or ranking, and metrics are used for the evaluation of a system.

1. Ranking. If the value of a signal (or set of signals) is predictable or known for each of a set of alternatives, the alternatives can be directly ranked from most favorable to least favorable according to those signals, and the most favorable alternative(s) chosen. For example, value optimization may consist of evaluating the value model for each potential atomic allocation, and then promoting those allocations which are deemed most valuable.

2. A/B Testing. If the value of the metric is not easily predictable, then experiments or A/B tests can be performed to produce estimates of the causal effects of each alternative intervention on the value of the metric. For example, system design often consists of conducting a large number of A/B tests to inform which design changes are implemented.

3. Machine Learning. If the process being optimized is a machine learning model, then signals and metrics can be included as part of the loss or reward function on which that model is trained. For example, a reinforcement learning-based value optimization process could be trained to maximize the value (according to the value model) of the allocations it realizes.

All of these have human facilitation analogs. A facilitator may directly rank topics to prioritize for discussion, may experiment with different methods of structured deliberation to see which work best in a given context, and will be constantly learning over time what facilitation strategies to use, according to their qualitative, internal “metrics” that measure the degree to which deliberation is successful.

2.4 Bridging as a Property of Attention Allocators

The bridging goal is an increase in mutual understanding and trust across divides, creating space for productive conflict, deliberation, or cooperation. We say that an attention allocator is bridging to the extent that it causally supports the bridging goal through its allocations.

Bridging is thus a system property, rather than a distinct kind of system, and falls on a spectrum—attention allocators may either facilitate or discourage such allocations to varying degrees (and are thus more or less bridging). For example, a recommender system that shows people content that leads to an overall increase in mutual understanding and trust could be considered bridging. If a recommender system has a parameter that determines how much to weigh “bridging signals” versus “pure engagement” signals, then depending on the setting of that parameter, the “same” recommender system might be significantly bridging or extremely polarizing.
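To make the role of such a parameter concrete, the following is a minimal sketch of a linear value model in which a single weight trades off a bridging signal against an engagement signal. The names `Candidate`, `value`, `rank`, and `bridging_weight`, as well as the example items and scores, are illustrative assumptions and do not describe any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A potential allocation with two hypothetical predicted signals in [0, 1]."""
    item_id: str
    engagement: float
    bridging: float


def value(candidate: Candidate, bridging_weight: float) -> float:
    """Linear value model: a convex combination of the two signals."""
    if not 0.0 <= bridging_weight <= 1.0:
        raise ValueError("bridging_weight must lie in [0, 1]")
    return (bridging_weight * candidate.bridging
            + (1.0 - bridging_weight) * candidate.engagement)


def rank(candidates: list[Candidate], bridging_weight: float) -> list[str]:
    """Return item ids ordered from most to least valuable."""
    ordered = sorted(candidates,
                     key=lambda c: value(c, bridging_weight),
                     reverse=True)
    return [c.item_id for c in ordered]


# Made-up scores, chosen only to show how the ordering can flip.
candidates = [
    Candidate("takedown", engagement=0.9, bridging=0.1),
    Candidate("shared_concern", engagement=0.5, bridging=0.8),
    Candidate("cat_meme", engagement=0.7, bridging=0.4),
]

engagement_first = rank(candidates, bridging_weight=0.0)
bridging_first = rank(candidates, bridging_weight=1.0)
```

With the weight at 0 the divisive item ranks first, and with the weight at 1 the ordering is reversed, which illustrates how the “same” system can sit at different points on the bridging spectrum.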

While this qualitative definition of bridging is not directly operationalizable, in the remainder of this paper, we provide formalisms that make an approximation of bridging actionable using models, signals, and metrics. In particular, there are several places in which bridging can be incorporated into attention allocators, at different parts of the optimization stack, some of which are indicated by bridge icons in Figures 3 and 4. For example, the system can model relations present in a population and can implicitly predict, in the impact prediction stage of the allocation process, whether divisions will increase or decrease. These predictions can then be included as signals in the value model and influence which allocations of attention are realized. [93] At the level of system design, bridging metrics could be considered and used to adjudicate which of a set of potential interventions should be implemented. In Sections 3 through 5, we give concrete examples of such signals and metrics and describe how attention allocators, particularly those that are algorithmic, can be designed to be bridging.

3 Data and Modeling

Above, we stated that an attention allocator is bridging to the extent that it supports the “bridging goal”: increasing mutual understanding and trust across divides, creating space for productive conflict, deliberation, or cooperation. This qualitative property must be formalized if we are to extend the insights from offline bridging practices that have been developed over millennia into digital attention allocators such as recommender systems and collective response systems.

In the next three sections, we describe how the notion of bridging can be formally represented and then quantified. In particular, we: introduce the concept of a relation model, a representation of the relationships or affinities between individuals in a given population (Section 3); describe signals which can be used as the basis for ranking and allocation (Section 4); and describe metrics which can be used to measure the impact of an entire attention allocation system (Section 5).

3.1 Data

To model bridging, you need relevant data. Different systems use different approaches for collecting or eliciting such information, and the data can take different forms. Examples are given below from our three example domains.

Example 10 (recommender system)

Users of online platforms generate engagement data such as likes, reactions, comments, shares, dwell time, tagging, direct messages, and so on. Recommender systems use this data to learn about the preferences and perspectives of users. Implicitly, this data reveals how users differ, and the degree of affinity between them.

Example 11 (collective response system)

Each instance of a collective response system is focused around an issue, prompt, or question. Participants can either provide new responses or vote on existing ones. The user interfaces generally differ from those of conventional forums such as Reddit. For example, in Polis, people are shown a succession of responses one by one, and can choose to agree, disagree, or pass in response to each statement. Polis’ algorithm selects which statement to show next using a number of criteria, one of which is to maximize learning about the relations between participants and responses and, implicitly, among the participants themselves. [92]

Example 12 (human facilitation)

Facilitators “reading a room” identify a host of subtle cues over the course of deliberation. Common examples include flared nostrils, changes in breathing, and hushed silences. Structured exercises may also be used to sort the room into groups of perspectives. [26]

This data contains information about people’s preferences, perspectives, opinions, identities, or worldviews. Implicitly, it provides information about the relationships, affiliations, and affinities that exist in a population. In many cases, this “data” would have been “collected” anyway. For example, an effective facilitator will intuitively gather such information in the course of interacting with a group. Similarly, a recommender system where users interact with each other will implicitly be eliciting data useful for modeling human relationships. The data elicited by such systems is often very contextual—it is highly dependent on system/process design, affordances, culture, the environment, existing divisiveness and many other factors.

Question Set 1 (eliciting data for bridging)

1.1 What affordances provide the most useful information about relationships in a population? (Taking into account that respect for privacy is valuable both ethically and for gathering accurate data.)

1.2 Can elicitation methods from one domain (say, mini-publics) be translated to another (say, recommender systems)?

1.3 How can we mitigate and account for the fact that the act of eliciting information about relationships can itself influence those relationships?

3.2 Relation Model

A relation model is a formal representation of the relationships between people in a population, which can be learned or inferred based on the data available. These relationships could be explicit (e.g., friendships) or implied (e.g., the affinities between people who have similar preferences or worldviews). The relation model is a “state model,” describing a snapshot of the world at a given point in time.

Formally, a relation model—as we define it—can be decomposed into three components: people (the people who interact with the system); items (the alternative objects to which people may attend, which may also be people); and relations (the one-to-one relations between people and items, intended to capture goodwill, agreement, affinity, reactions, or similar). This general framework highlights a common structure shared by our three examples, as summarized in Table 1.

Table 1 Recommender systems, collective response systems, and human facilitation all share a common structure.

The three components will be modeled differently in different contexts. Most comprehensively, the relation model may simply be the totality of all data available about the population. More structured models can also be used, two common examples of which are graph-based models and space-based models.

Graph-based models represent people as nodes in a (mathematical) graph or network. The edges in the graph characterize the relationship between people and can represent explicit, active communication channels or be weighted to represent more fine-grained and abstract types of affiliation. A simple example of a graph-based model is given in Figure 7(a).
Space-based models represent people as locations within an ambient “opinion space.” The similarity between people’s opinions, preferences, or viewpoints is characterized by how close to one another they are in this space. For example, people may be modeled as a location on the left-right political spectrum (a one-dimensional space) or assigned a position on a political compass (a two-dimensional space). A simple example of a space-based model is given in Figure 7(b). In general, the space might have hundreds or thousands of dimensions. The items may also be represented as points in the same space, in which case the proximity of a person to an item represents the degree of goodwill, agreement, or other “favorable” relationship between them.

These two approaches to relation modeling are not exhaustive. For example, it may make sense to model relations at a higher level of abstraction because it is not necessary to model individuals to represent useful information about divisions in a population. A very simple relation model might be: {70% of people like The Beatles, 50% like Adele, and 10% like Nickelback}. This tells us that there is overlap between people who like The Beatles and Adele, that both are fairly popular, and that Nickelback is comparatively not. Such aggregate models could incorporate item classifications (e.g., Nickelback might be labeled as nonbridging), or be broken down by subgroups. Note, however, that such models are likely less expressive than graph-based or space-based models, and depending on the percentages it will not always be possible to infer overlap.
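The point about inferring overlap from percentages can be made precise with a standard inclusion-exclusion bound: if a share a of people like one item and a share b like another, the share who like both lies between max(0, a + b − 1) and min(a, b). The sketch below applies this bound to the percentages above; the function name `overlap_bounds` is an illustrative assumption.

```python
def overlap_bounds(share_a: float, share_b: float) -> tuple[float, float]:
    """Bounds on the share of people who like both items, given only marginals."""
    lower = max(0.0, share_a + share_b - 1.0)  # overlap forced by pigeonhole
    upper = min(share_a, share_b)              # overlap cannot exceed either share
    return lower, upper


beatles, adele, nickelback = 0.70, 0.50, 0.10

# 0.70 + 0.50 exceeds 1, so some overlap (about 0.2 at minimum) is guaranteed.
beatles_adele = overlap_bounds(beatles, adele)

# 0.70 + 0.10 does not exceed 1, so the marginals alone guarantee no overlap.
beatles_nickelback = overlap_bounds(beatles, nickelback)
```

When the lower bound is zero, as for The Beatles and Nickelback, the aggregate model alone cannot tell us whether the two audiences overlap at all.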

Figure 7 Simple examples of (a) a graph-based model and (b) a space-based model.

Example 13 (recommender system)

The recommender systems on modern social media platforms are often built using deep neural networks that learn a numerical representation of each user and item. These vectors or “embeddings” are usually high-dimensional (hundreds or thousands of dimensions) and correspond to a position within a latent embedding space that characterizes each user’s history of behavior on the platform and, by implication, their opinions, preferences, and worldview. Thus, these embeddings constitute a space-based model. [34, 86]

Social networks also often have an underlying graph-based structure, such as graphs of friends on Facebook, or follow networks on Twitter. Such networks constitute graph-based models of people. There are other possible graph structures. For example, you could consider a single graph where both people and items are vertices, and edges are used to indicate interactions between people and items. In some cases, information from such graph-based models is translated to space-based models for use in recommender systems. [86]

Example 14 (collective response system)

Collective response systems like YourView or Polis involve people submitting items (called comments or responses) and voting on items shown via a recommendation algorithm. These votes are represented by a matrix where rows correspond to people, columns correspond to items, and the cell values indicate votes: “Agree” (encoded as +1), “Disagree” (−1), or “Pass” (0). The rows in the vote matrix can be viewed as the locations of people in a space-based model. So that it can be visualized, both YourView and Polis compress this relatively high-dimensional representation into a two-dimensional space-based model, positioning people closer together if their votes are more similar. Intuitively, this means that if two people voted the same way on every item, they will end up at the same point. [92] An example of the YourView space-based model is shown in Figure 8.

Figure 8 The YourView “Panorama”—a two-dimensional space-based model.
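As a rough illustration of how the vote matrix in Example 14 acts as a space-based model, the sketch below treats each row as a point and measures the Euclidean distance between rows. The participant names and votes are made up, and the two-dimensional compression used by YourView and Polis is deliberately not reproduced here.

```python
import math

# Rows are people, columns are items: +1 = agree, -1 = disagree, 0 = pass.
votes = {
    "ana":   [+1, +1, -1,  0],
    "ben":   [+1, +1, -1,  0],
    "chloe": [-1, -1, +1, +1],
    "dev":   [+1,  0, -1, +1],
}


def vote_distance(person_a: str, person_b: str) -> float:
    """Euclidean distance between two people's rows of the vote matrix."""
    row_a, row_b = votes[person_a], votes[person_b]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(row_a, row_b)))


def nearest_neighbor(person: str) -> str:
    """The other participant whose votes are most similar to `person`'s."""
    others = [name for name in votes if name != person]
    return min(others, key=lambda name: vote_distance(person, name))
```

Here `ana` and `ben` voted identically, so they sit at the same point (distance zero), while `chloe`, who voted the opposite way on most items, is far from both.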

Example 15 (human facilitation)

Facilitators pay attention to how people relate to each other, and where they fall along the salient axes of disagreement within the groups that they are facilitating, thus intuitively applying qualitative versions of both graph-based and space-based models.

By introducing the term “relation model,” we are not aiming to prescribe a particular kind of model or claiming to have invented one but merely aiming to describe a role that certain models can play within an attention allocator. Models from many existing fields (social choice, opinion dynamics, recommender systems, etc.) could be used as a relation model. The new term also helps when talking about the commonalities across recommender systems, collective response systems like YourView and Polis, and human-facilitated deliberations.

How can information about the quality of relations be extracted from space- or graph-based models? In both cases, the models can be used to identify clusters or groups of people who think similarly. Divisions can be thought of as the spaces between these groups: how far apart they are, how they think about each other, and how they interact. Annotating the models with such additional structure can provide insight into the nature and strength of divisions that exist in a population. Figure 9 presents examples of this sort of clustering in both graph-based and space-based models.

Figure 9 Examples of clustering in relation models. Panel (a) is a graph-based model of Twitter users, clustered by their stance regarding the legal status of abortion. Panel (b) is a space-based model generated by Polis, with which a New Zealand newspaper elicited a collective response on the topic of protecting biodiversity. Image credits: Clifton [28] and Scoop. [89]

Example 16 (collective response system)

In Polis, after the vote matrix is projected into two dimensions, people are clustered into two to five distinct groups, an example of which is shown in Figure 9(b). Clustering is performed directly on the two-dimensional projection using the k-means clustering algorithm, and a goodness-of-fit statistic is used to determine which number of clusters best fits the vote data. [92]
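The general recipe of clustering two-dimensional positions with k-means and choosing the number of clusters by a goodness-of-fit statistic can be sketched as follows. This is not Polis' implementation: the silhouette coefficient is used as the statistic, the farthest-point initialization is an assumption made to keep the example deterministic, and the points are synthetic.

```python
import math


def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def init_centers(points, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous center
                centers[c] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)   # cohesion
        b = min(sum(dist(p, q) for q in other) / len(other)  # separation
                for other_lab, other in clusters.items() if other_lab != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, candidates=range(2, 6)):
    """Pick the number of clusters (2 to 5) with the highest silhouette."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Three synthetic opinion groups in a 2D projection.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 0), (11, 0), (10, 1), (11, 1),
          (5, 9), (6, 9), (5, 10), (6, 10)]
chosen_k, scores = best_k(points)
```

On these well-separated synthetic groups the procedure selects three clusters; real vote data is rarely this clean, which is part of why the choice of representation in Question Set 2 matters.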

Question Set 2 (relation models)

2.1 What kinds of representations are appropriate for a given context? For example, discrete clusters (Polis) versus continuums (Twitter Community Notes).

2.2 How can these representations be best operationalized in that context? For example, Polis currently uses Principal Components Analysis to project the vote space into two dimensions, runs k-means clustering with k set to between 2 and 5 groups, and chooses the best k using the silhouette coefficient. [73]

2.3 Are the data collection processes and relation models of existing attention allocators, such as recommender systems, sufficient to perform bridging-based ranking?

4 Signals

Signals are numbers that are aggregated into an overall measure of value and used as the basis for value optimization during an allocation process.

To make attention allocators more bridging, we need signals which indicate the degree to which a potential allocation is likely to be bridging. For example, we want to be able to quantify the degree to which recommending a particular Facebook post, or presenting a particular claim in Polis, is likely to achieve the bridging goal.

To aid discussion, we draw two distinctions between different kinds of signals. The first distinction is that signals can be observed or predicted, depending on whether the information they represent is already known. The second distinction is that signals can be causal or heuristic, depending on whether allocating attention to the corresponding object is known to causally contribute to bridging, or is merely thought to do so.

4.1 Practical Examples

In this section, we describe three approaches to constructing signals for bridging systems and give concrete examples of each.

4.1.1 Motifs

Signals can be built on motifs, which are patterns of interaction thought to be associated with certain outcomes—in our case, bridging. Here we describe three previously proposed motifs: diverse approval, response bimodality, and exposure diversity. Visual intuition for each of these motifs is given in Figure 10.

Figure 10 Examples of motifs which could form bridging signals. Note that evidence for the validity of all three motifs is limited, and in particular, some evidence suggests that use of the exposure diversity motif may increase affective polarization (see main text).

The first motif, diverse approval, states that bridging occurs when a person creates an item that is approved of, supported, or otherwise validated by people who would normally disagree with them. Such “surprising validation” [95, 81] is thought to correlate with accuracy and good faith communication, while also prompting observers to consider more perspectives (e.g., “if someone like that disagrees with me, maybe I had better rethink.” [95]). A slightly stronger version of the heuristic states that bridging occurs when an item is endorsed by people from multiple (i.e., more than two) diverse viewpoints. Versions of the diverse approval heuristic have been operationalized with success in at least two bridging systems.

Example 17 (collective response system)

In YourView, each participant had a “credibility” rating—a type of reputation score that quantified, primarily, the degree to which the comments they contributed on any particular issue attracted support from people who disagreed with them about that issue. Thus grounded, the definition of credibility could be extended recursively: YourView assumed that people were credible if they were respected or trusted by credible people they disagreed with. This circular definition is akin to the concept of “eigentrust” [1] or the logic of the PageRank algorithm, which has been applied in the context of social media to measure different forms of reputation. [5, 4, 83, 84]

Example 18 (recommender system)

Community Notes (formerly known as Birdwatch) is a Twitter feature that allows users to “collaboratively add helpful notes to Tweets that might be misleading.” [111] In its current form, Community Notes contributors can also rate notes contributed by others as “helpful,” “somewhat helpful,” or “not helpful.” Notes are awarded a high helpfulness score (and hence displayed publicly) only if they have been “rated helpful by raters with a diversity of viewpoints.” This is achieved using a version of the matrix factorization algorithm that is commonly used in recommender systems. [104]
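To give a feel for the diverse approval motif, the sketch below scores an item by its lowest approval rate across groups, so that an item scores highly only if every group tends to approve of it. This is a deliberately simple stand-in and is neither the YourView credibility calculation nor the Community Notes matrix factorization; the function name, the group labels, and the ratings are all hypothetical.

```python
from collections import defaultdict


def diverse_approval(ratings: dict[str, bool], group_of: dict[str, str]) -> float:
    """Score an item by its lowest per-group approval rate.

    ratings:  rater id -> True (approve) or False (disapprove)
    group_of: rater id -> group label taken from some relation model
    """
    approvals = defaultdict(list)
    for rater, approved in ratings.items():
        approvals[group_of[rater]].append(approved)
    if len(approvals) < 2:
        return 0.0  # approval from a single group is not "diverse"
    return min(sum(votes) / len(votes) for votes in approvals.values())


group_of = {"a": "left", "b": "left", "c": "right", "d": "right"}

# Approved by one side only: popular overall (50%) but not bridging.
partisan_item = diverse_approval(
    {"a": True, "b": True, "c": False, "d": False}, group_of)

# Approved by at least half of each side.
bridging_item = diverse_approval(
    {"a": True, "b": False, "c": True, "d": True}, group_of)
```

Taking the minimum rather than the average is one possible design choice: it prevents strong approval in one group from compensating for rejection in another.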

Under the second motif, response bimodality, attention allocations are said to be (relatively) bridging if the distribution of ratings or reactions elicited by the relevant item is not polarized. Intuitively, a distribution is most polarized if it is markedly bimodal (that is, it has a “U” shape). One line of research on the use of recommender systems for depolarization relies on this heuristic. [7, 8, 9] This work quantifies the degree of polarization in a rating distribution by training a binary classifier that takes in features computed from a histogram of item ratings and outputs whether or not the distribution is polarized, as trained on “human expert” classifications.

The third motif, exposure diversity, states that bridging occurs when people attend to items from diverse sources, particularly from sources they don’t normally see. This motif implies that to facilitate bridging, people should be shown items from outside their “filter bubble” or “echo chamber.” However, a number of studies have failed to find evidence of algorithmic filter bubbles, [22] and other research suggests that indiscriminately showing people posts from their outgroup may cause relations to deteriorate. [11] For these reasons, the validity of the exposure diversity heuristic is questionable.

The interaction patterns captured by the above three motifs are conceptually simple. We can also define more complex and subtle motifs that, for example, model longer patterns of interaction, or incorporate more nuanced information about the people and items involved. These could be specified based on insights from bridging in offline settings [78, 90] or learned algorithmically. Below, we give an example of a learned class of motifs, positive interactions across divides, which generalizes the idea of “diverse approval.”

Example 19 (recommender system)

Consider a hypothetical social media platform named BridgeTok. BridgeTok built a dataset of posts with their subsequent social exchanges (reactions, comments, shares, etc.), along with human ratings of the degree to which each exchange was “positive.” They then trained a model to predict the positivity of any arbitrary interaction pattern. These predicted positivity scores, combined with existing user embeddings, are used to quantify the degree to which new interactions represent “positive interactions across divides.” The most bridging allocations of attention, according to this heuristic, are then selected for in the BridgeTok recommender system.

Note that there is a precedent for this kind of modeling at Facebook: the Facebook Papers provided by Frances Haugen include accounts of experiments using “diverse positive motifs” to identify positive “conversations” and civic interactions. [39, 40, 41]

A focus on patterns of interaction, without consideration of the meaning of natural language content, increases the extent to which motif-based signals can be implemented internationally, across languages. That said, the effectiveness of a given “structural” heuristic for identifying bridging behaviors may depend on cultural and social context, and in some settings, content-based signals (Section 4.1.3) may be useful or necessary.

4.1.2 Surveys

Signals can also be built on survey questions, the responses to which indicate the degree to which a given allocation of attention is bridging. An example of such survey questions is given in Figure 11.

Figure 11 Example of a survey that could form a bridging signal. This example is taken from Milli et al. [65]

While surveys are burdensome for users to complete, it is not necessary to ask users to respond to a survey about everything they see. Instead, each user’s response can be predicted using a model trained on a much sparser dataset, and these predictions used as signals.

There is a huge number of potentially relevant questions that could be asked. A useful starting point for possible questions is Stanford’s Strengthening Democracy Challenge, [107] which included many relevant survey questions as outcome variables, including questions about partisan animosity, support for partisan violence, support for undemocratic practices, opposition to bipartisan cooperation, social distrust, social distance, anger towards outpartisans, empathy with outpartisans, perceived threat of outpartisans, and strength of partisan identity. In particular, affective polarization or partisan animosity is commonly measured using a feeling thermometer.

The feeling thermometer is a short survey instrument commonly used to measure affective polarization. In its simplest form, it consists of asking people to rate their feelings towards their country’s two largest political parties on a scale from 0 (cold) to 100 (warm). The difference between these two scores indicates the degree of affective polarization. Stray [93] proposed incorporating this short survey instrument into social media platforms. Survey responses could be linked with items recommended to the user shortly prior to their completing the survey. In this way, the degree to which responses trend towards less affective polarization could be used to quantify the degree to which recently shown items of content were bridging.
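The arithmetic behind a thermometer-based signal can be sketched as follows. The function names, the pairing of a “before” and an “after” survey, and the example responses are illustrative assumptions rather than a description of the instrument as proposed by Stray. [93]

```python
from statistics import mean


def polarization(party_a: float, party_b: float) -> float:
    """Affective polarization as the gap between two 0-100 thermometer ratings."""
    for score in (party_a, party_b):
        if not 0 <= score <= 100:
            raise ValueError("thermometer ratings must lie in [0, 100]")
    return abs(party_a - party_b)


def thermometer_signal(responses) -> float:
    """Mean reduction in polarization across respondents.

    responses: list of (before, after) pairs, where each element is a
    (party_a, party_b) tuple of thermometer ratings. Positive values mean
    that polarization decreased between the two surveys.
    """
    return mean(polarization(*before) - polarization(*after)
                for before, after in responses)


# Two hypothetical respondents surveyed before and after seeing some items.
responses = [
    ((90, 10), (80, 20)),  # gap shrinks from 80 to 60
    ((70, 30), (70, 40)),  # gap shrinks from 40 to 30
]
signal = thermometer_signal(responses)
```

A change like this is only an association with the recently shown items, so on its own it would be a heuristic rather than a causal signal in the terminology introduced at the start of Section 4.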

4.1.3 Content

Finally, signals can be based on (semantic) properties of the content itself. For example, one can use a machine learning model to estimate the degree to which content is dehumanizing or contemptuous. Further examples of more complex properties relevant to bridging are given in Figure 12.

Figure 12 Examples of content labels that could form a bridging signal. These examples are taken from the “archetypes of polarization on social media” framework described in Hawke. [48]

As with surveys, there are a large number of semantic properties of content that are potentially relevant for bridging. A useful starting point for possible properties to target is the “archetypes of polarization on social media” framework described in Hawke. [48] In this framework, polarization is decomposed into five archetypes—attitudes, affiliation, interaction, interests, and norms—and for each of these archetypes, there are several examples given of concrete online behaviors that reflect that archetype, most of which depend on an understanding of the semantics of an item of content. For example, polarization of attitudes can be reflected by “negative reactions (laughing, anger, etc.) to content with a group signifier,” “keywords and hashtags that refer to the outgroup,” “use of plural 'you' and 'they' pronouns in reply threads,” “generalized blame and attribution,” and “interaction with bias-affirming content.” Automated classifiers could be used to identify these kinds of semantic properties in items of content, and these semantic labels could be used as signals to prioritize allocations of attention that are bridging (or conversely, deprioritize allocations of attention that are divisive).

Question Set 3 (signals)

3.1 How can we evaluate the quality of bridging signals? What are their relative strengths and weaknesses? Are there particular domains where one signal is more appropriate than another?

3.2 How common are examples of diverse approval in practice (e.g., on modern social media platforms)? Are there enough examples to make a meaningful impact if they are featured more prominently by a recommender system?

3.3 To what extent does giving bridging signals more weight in the value model actually incentivize bridging behavior? In economic terms, what is the “value model elasticity of bridging”?

3.4 How computationally complex are different bridging signals?

3.5 To what extent are particular motifs valid across international contexts?

3.6 How could bridging signals be gamed or abused? What would “bridging-bait” content look like?

3.7 Some promising signals, such as diverse approval, may have significant bridging benefits but also incentivize shallow content (e.g., cat memes). How might they be refined or complemented by additional signals to incentivize deeper content?

3.8 How can the best bridging signals be effectively used in more federated or decentralized systems? If there are challenges to such deployments, under what circumstances are they fundamentally intractable?

5 Metrics

Metrics provide quantitative assessments of attention allocation systems; they are used both for ongoing monitoring and guiding system development (e.g., via A/B tests).

To be useful, metrics need a clear normative interpretation. Within a given context, we should be able to say that higher metrics reflect success at achieving the bridging goal, or that certain configurations of multiple metrics are better than others.

We distinguish between two kinds of metrics: relation metrics and bridging metrics. Relation metrics capture the quality or “health” of relations at a given point in time (as represented by the relation model). In contrast, bridging metrics capture the degree to which attention allocators are associated with an improvement or deterioration in relations, over time. To use a mathematical analogy, bridging metrics can be thought of as the “derivatives” of relation metrics with respect to time.

5.1 Practical Examples

In this section, we describe two approaches to constructing metrics for bridging systems and give concrete examples of each.

5.1.1 Prevalence of Signals

Most simply, metrics can be built on top of signals (Section 4) that indicate good or improving relations in a population. For example, a relation metric could be defined as the overall prevalence of a given signal on a platform during a given window of time (e.g., “weekly average number of instances of dehumanization, per user”). Correspondingly, a bridging metric could be defined as the change in such a relation metric over time (e.g., “change in the weekly average number of instances of dehumanization, per user”), or as the overall prevalence of a motif thought to be bridging (“weekly average number of participations in a diverse approval motif, per user”).
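A minimal sketch of these two kinds of prevalence-based metrics is given below, using made-up weekly counts of a negative signal (instances of dehumanization). The function names and data are illustrative assumptions.

```python
def weekly_prevalence(counts_by_user: dict[str, int]) -> float:
    """Relation metric: average number of signal instances per user in one week."""
    if not counts_by_user:
        raise ValueError("need at least one user")
    return sum(counts_by_user.values()) / len(counts_by_user)


def bridging_metric(previous_week: dict[str, int],
                    current_week: dict[str, int]) -> float:
    """Bridging metric: week-over-week change in the relation metric."""
    return weekly_prevalence(current_week) - weekly_prevalence(previous_week)


# Hypothetical per-user counts of dehumanizing posts in two consecutive weeks.
week_1 = {"u1": 4, "u2": 0, "u3": 2}
week_2 = {"u1": 1, "u2": 0, "u3": 2}

change = bridging_metric(week_1, week_2)
```

Because the underlying signal here is negative, a negative change indicates improving relations; the normative direction of a metric has to be fixed by the choice of signal, as discussed at the start of Section 5.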

5.1.2 Formal Measures

Metrics can also be computed directly from a relation model, using summary statistics which are intended to capture normative judgments about whether the relations represented in the model are good, bad, improving, or deteriorating. A visual intuition for this is given in Figure 13.

Figure 13 Example of a formal measure that could be used as a metric. The formal measure used in this example is random walk controversy. [52]

Candidates for relation metrics include existing formal measures of polarization, which have been proposed for both space-based [37, 14, 59] and graph-based [67, 62] models. However, there are many different kinds of polarization, [20] and it is not clear whether existing polarization measures are appropriate targets for optimization. We believe the space of relation metrics is significantly underexplored and have begun a survey of potential metrics at bridging.systems/metrics. [100]

Example 20 (recommender system)

In the context of recommender systems and social media, relation metrics have mostly been proposed as summaries of graph-based models, such as follow networks on Twitter or the hyperlink networks of online news publications. These include metrics based on homophily (the degree to which nodes are connected to nodes that are similar to themselves), modularity (the number of intragroup connections relative to the number of intergroup connections), random walk controversy (a measure of how likely you are to cross between groups when randomly traversing the network), and balance theory (which measures how consistent the network is with properties such as “my friend’s friend is my friend” and “my friend’s enemy is my enemy”). See Interian et al. [52] for a comprehensive review.
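Two of these graph-based measures can be sketched on a toy network. The code below computes the share of edges that stay within a group, as a simple proxy for homophily, and the standard Newman modularity Q = Σ_c [L_c / m − (d_c / 2m)²], where m is the number of edges, L_c is the number of edges inside group c, and d_c is the total degree of the nodes in group c. The graph, group labels, and function names are made up for illustration.

```python
from collections import defaultdict


def within_group_edge_share(edges, group):
    """Share of edges whose two endpoints belong to the same group."""
    same = sum(1 for u, v in edges if group[u] == group[v])
    return same / len(edges)


def modularity(edges, group):
    """Newman modularity of a given partition of an undirected, unweighted graph."""
    m = len(edges)
    intra_edges = defaultdict(int)   # L_c: edges inside each group
    degree_sum = defaultdict(int)    # d_c: total degree of each group
    for u, v in edges:
        degree_sum[group[u]] += 1
        degree_sum[group[v]] += 1
        if group[u] == group[v]:
            intra_edges[group[u]] += 1
    return sum(intra_edges[g] / m - (degree_sum[g] / (2 * m)) ** 2
               for g in degree_sum)


# Two tightly knit triangles joined by a single cross-group edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
group = {"a": "left", "b": "left", "c": "left",
         "d": "right", "e": "right", "f": "right"}

share = within_group_edge_share(edges, group)
q = modularity(edges, group)
```

Adding more cross-group edges to this toy network lowers both numbers, which is the sense in which such measures could serve as relation metrics; whether lower values actually reflect healthier relations is a validity question.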

Question Set 4 (metrics)

4.1 How can we evaluate the quality of relation metrics and bridging metrics in a given context? What are their relative strengths and weaknesses?

4.2 Given the challenges of measuring the effects of online platforms, [101] how can we produce bridging metrics that can be interpreted as causal estimates?

4.3 How might bridging metrics be misaligned, or be affected by the different versions of Goodhart’s Law [60]?

4.4 What can be done to minimize negative side effects if relation metrics are used as targets for optimization?

4.5 What is a reasonable single gold standard metric for bridging, that can provide useful intuition—a “GDP of bridging”?

5.2 Evaluation

The metrics described above (and by extension, their component signals) are intended to be optimized via changes in the design of an attention allocator. For this reason, it is important that these formalisms are closely connected to the underlying bridging goal (an increase in mutual understanding and trust across divides, creating space for productive conflict, deliberation, or cooperation). If the metrics do not capture the aspects of reality we care about, then efforts to optimize them will be at best ineffective, and at worst harmful.

Fortunately, there is a mature literature in the field of measurement theory that provides a framework for thinking about the quality of relation metrics and bridging metrics. In this section, we discuss two properties that it is good for metrics to have—validity and reliability—and consider what they mean in the context of bridging systems.

5.2.1 Validity

Validity refers to whether a metric actually measures what it claims to measure. More precisely, validity is the “degree to which evidence and theory support the interpretations of [metrics] for proposed uses of [those metrics].” [6]

To see the importance of validity for relation and bridging metrics, consider the consequences of optimizing for a metric that evidence suggests has poor validity. For example, a platform with good intentions may decide to reward exposure diversity in their recommender system—that is, to specifically show their users content from political leanings they disagree with. They might be “successful” according to this measure and interpret the increasing diversity of content viewed as indicative of increasing mutual understanding and trust across divides. However, it is possible that increased exposure to contrary viewpoints would actually cause people to double down on their existing beliefs and hence cause relations to deteriorate. [11] The reason for the discrepancy between the platform’s good intentions and the regrettable outcome is that exposure diversity has poor validity as a measure of bridging.

Relation and bridging metrics may lack validity for several reasons. They may be based on misconceptions (as in the case of exposure diversity). They may track the quality of relationships in a narrow scope observable by the platform but not correlate with the quality of relationships in broader society. They may identify content that is bridging if shown to one person, but not bridging when shown to 10 million (due to second-order effects). They may be based on relation models that are not expressive enough to capture the underlying plurality of perspectives (this may be the case with space-based models [16, 103]). Or they may succumb to Goodhart’s Law [60]; that is, they may be inflated by actions that do not respect the original spirit of the metric. For example, promoting puppy videos may create more cases of diverse approval but not have any effect on the quality of political discourse.

Documenting design choices related to relation and bridging metrics, such as in reward reports, [45] might help monitor efforts to improve validity over time.

5.2.2 Reliability

Reliability is the degree to which a metric takes on similar values when calculated in similar contexts. A reliable relation metric will give similar values if calculated multiple times in quick succession, if independently calculated by different teams within a company, or if calculated on multiple random samples of a population of platform users.

Reliability is important because it is a prerequisite for validity. If a metric is too noisy or cannot be replicated, then it cannot be valid as an optimization target.

Relation and bridging metrics may lack reliability (of various sorts), for several reasons. They may be easily influenced by ephemeral current events such as political scandals. They may be affected by ongoing changes in the design or affordances of a platform, or in changes to the composition of its user base, that undermine the ability to interpret changes in the value of the metric across time. They may depend on an automated classifier which is iteratively being updated or improved, effectively changing the metric that is calculated. They may depend on survey questions which, when translated into different languages, lead to response distributions that are no longer comparable.
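One of the reliability checks described above, recalculating a metric on multiple random samples of users, can be sketched as follows. This is an illustrative Python sketch only: the function name, the per-user input format, and the default sample settings are assumptions, and the spread it reports covers sampling variability alone, not drift caused by classifier updates, platform changes, or translation effects.

```python
import random
import statistics


def subsample_spread(per_user_values, metric, n_samples=200, sample_frac=0.5, seed=0):
    """Recompute `metric` on random subsamples of users and summarize the spread.

    per_user_values: list with one numeric value per user (hypothetical input).
    metric: function mapping a list of per-user values to a single number.
    Returns (mean of estimates, standard deviation of estimates).
    """
    rng = random.Random(seed)  # fixed seed so the check itself is repeatable
    k = max(1, int(len(per_user_values) * sample_frac))
    estimates = [metric(rng.sample(per_user_values, k)) for _ in range(n_samples)]
    return statistics.mean(estimates), statistics.stdev(estimates)


# A metric computed on identical users has no sampling spread,
# while one computed on heterogeneous users does.
flat_mean, flat_sd = subsample_spread([2.0] * 50, statistics.mean)
varied_mean, varied_sd = subsample_spread(list(range(100)), statistics.mean)
```

A spread that is large relative to the changes one hopes to detect suggests the metric is too noisy to serve as an optimization target in that context.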

Question Set 5 (evaluation)

5.1 What are the limitations of different specific relation models (see e.g., [103]), signals, or metrics intended for bridging? How significant are they?

5.2 What gold standard measures or benchmarks should we use to evaluate relation models, signals, and metrics intended for bridging?

5.3 Which signals and metrics intended to promote bridging are more resistant to strategic manipulation?

6 Discussion

In this paper, we have emphasized connections and open questions across three domains—recommender systems, collective response systems, and human facilitation—to help formalize an interdisciplinary research area focused on bridging.

6.1 Implementation

Table 2 provides a summary of concrete steps that can be taken by practitioners to incorporate bridging into attention allocators.

Table 2 Examples of how to incorporate bridging into attention allocators.

6.2 Challenges, Limitations, and Risks

The idea of bridging systems and the interdisciplinary research we aim to articulate are not free from challenges or controversy and should not be considered a panacea. Here we explore these potential challenges and related limitations and risks. After grounding the discussion in terminology and common misunderstandings, we articulate challenges to the goal of bridging itself, to our formalization of bridging, and to the implementation of bridging.

6.2.1 Intuition versus formalism

We defined the (intuitive) bridging goal as follows: an increase in mutual understanding and trust across divides, creating space for productive conflict, deliberation, or cooperation. It is common when going from intuitions to formalisms that only a small part of the intuition is captured, sometimes with significant negative societal implications. We attempt to navigate that dilemma by making the intuition explicit and emphasizing it as the ultimate goal to measure success (and formalisms) against. We relatedly don’t attempt to make the formalism fully capture this intuition, as no single formalism understandable by a human is likely to do so. We expect that the best outcomes are likely to result from relying on a broad set of bridging metrics that address different aspects of the intuition.

6.2.2 Common misunderstandings

This notion of bridging is commonly misunderstood (and may be incorrectly formalized) in ways that would lead to outcomes that fall outside of the definition. Bridging does not mean bringing everyone to a homogeneous center or eliminating conflict. Pluralism and productive conflict are valuable. Bridging also does not mean just showing content from across divides; research suggests that this can sometimes increase division. [10] Additional common misunderstandings are listed in Figure 14.

Figure 14 Common misunderstandings. Five common misunderstandings about bridging and diverse approval.

6.2.3 Challenges to the goal of bridging

As the bridging intuition is imprecise, it can have different meanings at different levels of abstraction, formalism, and implementation. This makes the notion of bridging and the formalisms used to measure or apply it contestable. To give a powerful example, one study has found that there is an intervention which simultaneously reduces partisan animosity and increases support for partisan violence, as measured by different survey questions. [107] The decision of whether this constitutes a net improvement or deterioration—and more generally, the decision of how to trade off different interpretations or formalizations of bridging—may be both statistically technical and significantly normative.

Many would argue that there should not even be an attempt to bridge some kinds of divides due to the abhorrence of particular groups, or their lack of respect for human rights. [61] This implies a key question: When is bridging appropriate, and when is it not [79]? There are also arguments that destructive conflict may be an important driver of social change, so reducing destructive conflict could limit the potential for such change. [32, 66, 54] It has been argued that bridging could “interfere with important processes that may be necessary for a democracy to determine the best path forward.” [29]

It is also possible that bridging is too neutral a goal, in that bridges can form around ideologies or opinions that are considered harmful. For example, there are likely some people on the political left and political right who share a prejudice towards a particularly disadvantaged minority group. Such a prejudice bridges those on the left and right but is not a desirable form of bridging. For this reason, one may need to specify additional constraints on the kinds of bridging that should be facilitated. Relatedly, in contexts where other factors, such as the quality of information, are far more important than mutual understanding (and where these are not as correlated as they appear to be with Twitter’s Community Notes, [111]) the bridging goal may also be too neutral (at least on its own).

Finally, there may be differential benefits of bridging to particular groups. For example, some groups may be less willing or able to change than others, perhaps due to significant punishments for engaging outside of the group dogma or because some groups do much more “internal bridging work” than others.

6.2.4 Challenges to our formalization of bridging

Formalizing bridging around attention allocation does not account for material sources of division, such as physical violence, economic inequality, or historical injustices. Extensions of bridging to more general allocation problems (e.g., around physical resources) may perhaps be helpful in addressing some of these critiques, but bridging would still be far from a panacea. In a world with many divides, attention allocations that bridge some divides may exacerbate others, and formalisms may not be able to meaningfully balance those contradictory impacts.

Some may argue that any formalism or optimization of concepts as complex and interpersonal as bridging is inherently inhumane or invalid. Regardless of whether humane formalization of bridging is possible, it is true that any practical formalism will be a considerable simplification of the complex psychological and social changes that correspond to the bridging goal. Such simplifications may not reflect important aspects of bridging and would benefit from ongoing critique and improvement.

Specific choices of relation models, relation metrics, and bridging metrics may lack validity (see Section 5.2.1) or may not be aligned with other societal values and goals. Specific approaches might cause unintended side effects or consequences, some of which may be significant. Some argue that bridging constitutes an attempt to influence a population, and that it is an open question whether this can be done in a manner that respects human agency and avoids unwanted manipulation [12, 25, 24] (though a similar argument might hold for any system that allocates attention). At the very least, bridging can be viewed as a form of “intrinsic governance” [55] or “social mediation”: [82] It is an attempt to make some types of communicative acts easier and others more difficult. There are significant, unresolved questions regarding the contexts in which such governance decisions are ethically justified, [55, 56] and how they can be made in a way that appropriately reflects ideals of democracy and subsidiarity—especially in light of a status quo where such decisions are already frequently made with far less benevolent aims.

6.2.5 Challenges to a technological implementation of bridging

Adversaries are likely to try to use fake or manipulated accounts in order to influence bridging-based ranking algorithms. This is already true of existing algorithms and similar safeguards may apply, but it is possible that a bridging-based ranking approach introduces either more or fewer algorithmic vulnerabilities. We may also not have the technological capacity to predict societal outcomes with sufficient accuracy to implement important aspects of the intuition for bridging. [68] Instead, we may have to rely on heuristic signals like diverse approval (Section 4.1), which then need to be either validated or justified as having impacts “close enough” to the bridging goal (Section 5.2).

By including bridging in a value model, there is a trade-off between bridging and other goals, such as relevance or engagement. Including bridging may, in some cases, reduce the extent to which these other goals can be achieved. Bridging is less likely to be implemented by large social media and search engine companies unless it is economically beneficial or neutral, or there is significant and sustained external pressure (or regulation). While it is possible that bridging-based ranking may reduce moderation costs and increase long-term user retention, it may have short-term negative impacts on engagement and growth. Bridging may be particularly challenging to maintain in an environment with sustained competition between platforms for attention, and easiest to maintain in a monopoly. However, in spite of these challenges, there have now been publicly disclosed implementations of bridging across multiple major platforms, albeit in limited areas thus far, and we have been informed of others that have not yet been made public. [104, 40, 41, 39]
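The trade-off noted at the start of this paragraph can be illustrated with a toy value model. The sketch below assumes, purely for illustration, that each item already has a predicted engagement score and a predicted bridging score in [0, 1], and that the value model is a simple convex combination of the two; real value models typically combine many more terms, and the names used here are hypothetical.

```python
def rank_items(items, bridging_weight):
    """Rank items by a weighted sum of predicted engagement and predicted bridging.

    items: list of dicts with keys 'id', 'engagement', and 'bridging' (hypothetical scores).
    bridging_weight: value in [0, 1]; 0 ranks purely by engagement, 1 purely by bridging.
    """
    def value(item):
        return (1.0 - bridging_weight) * item["engagement"] + bridging_weight * item["bridging"]

    return [item["id"] for item in sorted(items, key=value, reverse=True)]


items = [
    {"id": "A", "engagement": 0.9, "bridging": 0.1},
    {"id": "B", "engagement": 0.5, "bridging": 0.8},
    {"id": "C", "engagement": 0.2, "bridging": 0.3},
]
engagement_only = rank_items(items, bridging_weight=0.0)
balanced = rank_items(items, bridging_weight=0.5)
bridging_only = rank_items(items, bridging_weight=1.0)
```

As the weight on bridging grows, the most engaging item (A) moves down the ranking, which is the kind of short-term engagement cost discussed above.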

There is considerable uncertainty over the impacts of a bridging implementation. It is possible that the creation of bridging content cannot be incentivized or that a particular implementation of bridging practically results in predominantly benign, widely acceptable content such as cat videos being promoted. Some may argue that algorithmic attention allocators are delegated too small a portion of human attention (relative to, say, interpersonal relationships or mainstream media organizations) for efforts incorporating bridging to have any meaningful impact. Bridging-based ranking may also be less impactful or less tractable to implement within distributed or decentralized social media environments, in models of social media and mass communication that are yet to emerge (e.g., those involving federation or middleware), and in models that may not be based on ordinal ranking (e.g., those built around summaries generated by artificial intelligence (AI)).

Finally, technologies designed to influence the types of conflict in society are inevitably dual use. If we can use technology to make conflict better, we can also make it worse. It is possible that the recommender systems currently used by large social platforms already cause conflict to deteriorate as a side effect, but none are explicitly designed to “increase polarization.” Developing technology for bridging may only be beneficial if the risks and impacts of such perverse misuse are sufficiently low.

6.2.6 Overcoming these challenges

A large part of the research required in this area is to better understand the extent to which these challenges can be avoided, mitigated, or managed. They must also be considered in the context of the status quo, key elements of which (e.g., optimizing for engagement in social media, political messaging, and advertising) are rarely subject to the same level of caution or scrutiny before implementation that we are applying here. In other words, the risks of intervention must be balanced with the risks of inaction, and as we argue in more detail in [70], the status quo for recommender systems leaves much to be desired.

Question Set 6 (challenges, limitations, and risks)

6.1 How can we best overcome the most critical challenges, limitations, and risks?

6.2 For what contexts and under what conditions is bridging appropriate, and where is incorporating bridging worse than the status quo?

6.3 Are there specific additional objectives that need to be incorporated in order for bridging to be helpful in some contexts, and if so, what are they and how can they be operationalized?

6.4 To what extent is the bridging goal in tension with other goals such as engagement or relevance?

6.5 Are there relatively underexplored approaches to bridging that might address many of these challenges? For example, is there a system that identifies signals of perception gaps [113] and dynamically elicits (or generates) content or activities that can help reduce those gaps, potentially all within an existing system such as a newsfeed recommender or a language model chatbot?

7 Conclusion

Our social spaces should not default to divisive. Bridging is a core part of healthy social fabrics and the systems that allocate attention within them, whether human or machine, implicit or explicit. Without sufficient bridging, destructive conflict may undermine our relationships and prevent us from cooperating effectively to respond to societal challenges.

While bridging may help counteract the “bias toward division,” it is unlikely to address the myriad of other biases that social spaces may entrench. We see this work as complementary to the lines of research around other biases, and it is important that we also keep in mind (and measure where appropriate) these other forms of bias when designing for bridging. We should also caveat that designing systems for meaningful positive interactions is part art, and in this paper we have framed it as a science. We encourage work that can go beyond “shallow bridging heuristics” to “deep bridging”—from bonds around cat videos to bonds about our very different struggles in living complex lives. In this paper we have also focused primarily on bridging as it relates to the selection of what to attend to among a set of options, but the intuitions and formalisms we have explored can also be adapted to the synthesis of such options—i.e., to generative systems and foundation models such as ChatGPT and GPT-4. [13, 21]

New kinds of social spaces, both online and offline, do not always incorporate hard-won lessons from the old. Modern computational and communication technologies have changed the structure of civilization, bringing billions of people onto the internet. [12] But as we moved online, bridging is also one of the elements of offline spaces that we have most underresourced.

Belatedly, some online spaces are catching up and setting an example. The deployment of the Community Notes feature on Twitter [105, 111] accelerated the adoption of bridging-based ranking and demonstrated that signals like diverse approval hold significant promise even at large scale. Since the publication of the original bridging-based ranking paper, [70] we have heard from people interested in bridging across organizations of all sizes and kinds, from the largest tech companies, to recently incorporated startups, to the technology teams at traditional media organizations. Through this paper, we have attempted to provide a framework for understanding and developing such bridging systems, and have laid out a set of research questions to support and accelerate their safe deployment.

There remains the normative question of to what extent bridging is appropriate for a given system, which is outside of the scope of this paper. Ideally, such decisions would be a collective choice of those impacted by the system. While such collective decision making might seem impractical for, say, platforms with billions of users, representative deliberations such as citizens’ assemblies, in combination with collective response systems, provide a potentially viable mechanism for such democratic decision making at global scale. [69, 72, 74]

Our omnipresent social media, search, and synthetic media systems were not designed to satisfy the bridging goal: to increase mutual understanding and trust across divides, creating space for productive conflict, deliberation, or cooperation. But these systems that help allocate our attention are not irredeemable—they have the potential to support a more deliberative, peaceful, and pluralistic future.

References

[1] Aaronson, Scott. “Eigenmorality.” Shtetl-Optimized blog. June 18, 2014. https://scottaaronson.blog/?p=1820. Accessed May 30, 2022.

[2] Aggarwal, Charu. Recommender Systems. New York: Springer, 2016, 1.

[3] Allen, Danielle. “Toward a Connected Society.” Our Compelling Interests. Princeton: Princeton University Press, 2016, 71-105.

[4] Allen, Jeff. Analysis of Facebook’s Widely Viewed Content Report (Q4 2021 - Q1 2022). The Integrity Institute, June 2022.

[5] Allen, Jeff. How Communities Are Exploited On Our Platforms: A Final Look At The “Troll Farm” Pages. Facebook, April 16, 2023. https://www.documentcloud.org/documents/21063547-oct-2019-facebook-troll-farms-report.

[6] American Educational Research Association. Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association, 2014.

[7] Badami, Mahsa. “Peeking into the Other Half of the Glass: Handling Polarization in Recommender Systems.” PhD diss., University of Louisville, 2017. doi: 10.18297/etd/2693. https://ir.library.louisville.edu/etd/2693. Accessed January 4, 2022.

[8] Badami, Mahsa and Olfa Nasraoui. “PaRIS: Polarization-aware Recommender Interactive System.” PhD diss., University of Louisville, 2021, 8.

[9] Badami, Mahsa, Olfa Nasraoui, and Patrick Shafto. “PrCP: Pre-recommendation Counter-Polarization.” Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management. 10th International Conference on Knowledge Discovery and Information Retrieval. Seville, Spain: SCITEPRESS - Science and Technology Publications, 2018, 282-289. doi: 10.5220/0006938702820289. Accessed January 4, 2022.

[10] Bail, Chris. Breaking the Social Media Prism: How to Make Our Platforms Less Polarizing. Princeton: Princeton University Press, 2021.

[11] Bail, Christopher A., Lisa P. Argyle, and Taylor W. Brown. “Exposure to Opposing Views on Social Media Can Increase Political Polarization.” Proceedings of the National Academy of Sciences 115 no. 37. Washington, DC: National Academy of Sciences, 2018, 9216-9221.

[12] Bak-Coleman, Joseph B. et al. “Stewardship of global collective behavior.” Proceedings of the National Academy of Sciences 118 no. 27, 2021, e2025764118. doi: 10.1073/pnas.2025764118.

[13] Bakker, Michiel A. et al. Fine-Tuning Language Models to Find Agreement among Humans with Diverse Preferences. arXiv:2211.15006. Ithaca, New York: Cornell University, arXiv, 2022. doi: 10.48550/arXiv.2211.15006.

[14] Baumann, Fabian et al. “Emergence of Polarized Ideological Opinions in Multidimensional Topic Spaces.” Physical Review X 11 no. 1, 2021, 11-12.

[15] Bengani, Priyanjana, Jonathan Stray, and Luke Thorburn. “What’s Right and What’s Wrong with Optimizing for Engagement.” Understanding Recommenders. April 27, 2022. https://medium.com/understanding-recommenders/whats-right-and-what-s-wrong-withoptimizing-for-engagement-5abaac021851.

[16] Bogomolnaia, Anna and Jean-François Laslier. “Euclidean Preferences.” Journal of Mathematical Economics 43 no. 2, 2007, 87-98. https://www.sciencedirect.com/science/article/abs/pii/S030440680600111X.

[17] Botes, Johannes. “Conflict Transformation: A Debate over Semantics or a Crucial Shift in the Theory and Practice of Peace and Conflict Studies?” International Journal of Peace Studies 8 no. 2, 2003, 127. https://www.jstor.org/stable/41852899.

[18] Boxell, Levi, Matthew Gentzkow, and Jesse M. Shapiro. “Cross-Country Trends in Affective Polarization.” working paper 26669, The Review of Economics and Statistics. Cambridge, Massachusetts: MIT Press, National Bureau of Economic Research, 2022. doi: 10.1162/rest_a_01160.

[19] Brady, William J. and Jay J. Van Bavel. Estimating the Effect Size of Moral Contagion in Online Networks: A Pre-registered Replication and Meta-analysis. New Haven: Yale University, 2021. doi: 10.31219/osf.io/s4w2x.

[20] Bramson, Aaron et al. “Understanding Polarization: Meanings, Measures, and Model Evaluation.” Philosophy of Science 84 no. 1, 2017, 115-159. doi: 10.1086/688938.

[21] Brown, Tom et al. “Language Models Are Few-Shot Learners.” Advances in Neural Information Processing Systems, 34th Conference on Neural Information Processing Systems 33. Red Hook, New York: Curran Associates, 2020, 1877-1901.

[22] Bruns, Axel. Are Filter Bubbles Real? Medford, Massachusetts: Polity Press, 2019.

[23] Burgess, Guy, Heidi Burgess, and Sanda Kaufman. “Applying Conflict Resolution Insights to the Hyper-polarized, Society-wide Conflicts Threatening Liberal Democracies.” Conflict Resolution Quarterly 39 no. 4, 2022, 355-369. https://onlinelibrary.wiley.com/doi/pdf/10.1002/crq.21334.

[24] Carroll, Micah et al. Characterizing Manipulation from AI Systems, arXiv:2303.09387. New York: Cornell University, arXiv, 2023. https://arxiv.org/abs/2303.09387.

[25] Carroll, Micah D. et al. “Estimating and Penalizing Induced Preference Shifts in Recommender Systems.” Proceedings of the 39th International Conference on Machine Learning, eds. Kamalika Chaudhuri et al. 162, July 2022, 2686-2708. https://proceedings.mlr.press/v162/carroll22a.html.

[26] Carson, Lyn. Facilitating Public Deliberations. newDemocracy Foundation, podcast series, 2020. https://facilitatingpublicdeliberation.libsyn.com/.

[27] Celis, L. Elisa et al. “Controlling Polarization in Personalization: An Algorithmic Framework.” Proceedings of the Conference on Fairness, Accountability, and Transparency. FAT* ’19: Conference on Fairness, Accountability, and Transparency. New York: Association of Computer Machinery, 2019, 160-169. doi: 10.1145/3287560.3287601. Accessed January 4, 2022.

[28] Clifton, Brian. “How To Tell Whether a Twitter User Is Pro-choice or Pro-life without Reading Any of Their Tweets.” Quartz. October 9, 2015.

[29] Clyde, Austin. Algorithmic Systems Designed to Reduce Polarization Could Hurt Democracy, Not Help It. Tech Policy Press, February 17, 2022. https://techpolicy.press/algorithmicsystems-designed-to-reduce-polarization-could-hurt-democracy-not-help-it/.

[30] Coleman, Peter T., Larry S. Liebovitch, and Joshua Fisher. “Taking Complex Systems Seriously: Visualizing and Modeling the Dynamics of Sustainable Peace.” Global Policy 10 no. S2, 2019, 84-92. https://onlinelibrary.wiley.com/doi/pdf/10.1111/1758-5899.12680.

[31] Colson, Benoit, Patrice Marcotte, and Gilles Savard. “An Overview of Bilevel Optimization.” Annals of Operations Research 153 no. 1, 2007, 235-256. doi: 10.1007/s10479-007-0176-2.

[32] Coser, Lewis A. The Functions of Social Conflict 9. London: Routledge, 1998.

[33] Courant, Dimitri. “Citizens’ Assemblies for Referendums and Constitutional Reforms: Is There an ‘Irish Model’ for Deliberative Democracy?” Frontiers in Political Science, January 8, 2021. doi: 10.3389/fpos.2020.591983.

[34] Covington, Paul, Jay Adams, and Emre Sargin. “Deep Neural Networks for YouTube Recommendations.” Proceedings of the 10th ACM Conference on Recommender Systems. RecSys ’16. Boston, Massachusetts: Association for Computing Machinery, 2016, 191-198. https://doi.org/10.1145/2959100.2959190.

[35] Deutsch, Morton. “Conflicts: Productive and Destructive*.” Journal of Social Issues 25 no. 1, 1969, 7-42. doi: 10.1111/j.1540-4560.1969.tb02576.x.

[36] Dryzek, John S. et al. “The Crisis of Democracy and the Science of Deliberation.” Science 363 no. 6432, 2019, 1144-1146. doi: 10.1126/science.aaw2694.

[37] Duclos, Jean-Yves, Joan Esteban, and Debraj Ray. “Polarization: Concepts, Measurement, Estimation.” Econometrica 72 no. 6, November 2004, 1737-1772.

[38] Escobar, Oliver and Stephen Elstub. Forms of Mini-publics: An Introduction to Deliberative Innovations in Democratic Practice. Research and Development Note. May 8, 2017. https://newdemocracy.com.au/wp-content/uploads/2017/05/docs_researchnotes_2017_May_nDF_RN_20170508_FormsOfmini-publics.pdf.

[39] Facebook. “2020 H2 in Review: Diverse Positive Motifs Can Improve Civic Conversations.” Facebook Papers about Bridging. Facebook, Bridging Systems blog, 2023. https://bridging.systems/facebook-papers/.

[40] Facebook. “MSI Metric Changes for 2020 H1.” Facebook Papers Directory. eds. Dell Cameron, Shoshana Wodinsky, and Mack DeGeurin. Gizmodo, 2022. https://www.documentcloud.org/documents/21601827-tier2_rank_ro_0120.

[41] Facebook. “News Feed Research: Looking Back on H2 2020.” Facebook Papers Directory. eds. Dell Cameron, Shoshana Wodinsky, and Mack DeGeurin. Gizmodo, 2022. https://www.documentcloud.org/documents/21748428-tier1_news_ir_0221.

[42] Fournier-Tombs, Eleonore and Michael K. MacKenzie. “Big Data and Democratic Speech: Predicting Deliberative Quality Using Machine Learning Techniques.” Methodological Innovations 14 no. 2, May 2021. doi: 10.1177/20597991211010416.

[43] van Gelder, Tim. “Cultivating Deliberation for Democracy.” Journal of Deliberative Democracy 8 no. 1, May 1, 2020.

[44] van Gelder, Tim. “Public Wisdom.” 2020: Vision for a Sustainable Society, ed. Craig Pearson. Melbourne, Australia: Melbourne Sustainable Society Institute, 2012, chapter 10, 79–86. https://sites.google.com/site/timvangelder/publications-1/public-wisdom.

[45] Gilbert, Thomas Krendl et al. “Reward Reports for Reinforcement Learning.” arXiv:2204.10817. Ithaca, New York: Cornell University, arXiv, 2022. https://arxiv.org/abs/2204.10817.

[46] Giraudet, Louis-Gaëtan et al. “‘Co-construction’ in Deliberative Democracy: Lessons from the French Citizens’ Convention for Climate.” Preprint. May 2022. https://hal-enpc.archives-ouvertes.fr/hal-03119539.

[47] Haidt, Jonathan and Christopher A. Bail. “Social Media and Political Dysfunction: A Collaborative Review.” Unpublished manuscript, New York University, November 2021.

[48] Hawke, Julie. “Archetypes of Polarization on Social Media.” Build Up blog, April 30, 2022. https://howtobuildup.medium.com/archetypes-of-polarization-on-social-mediad56d4374fb25. Accessed July 17, 2023.

[49] Helping Ensure News on Facebook Is From Trusted Sources. Facebook, January 2018. Accessed July 17, 2023.

[50] Hsiao, Yu-Tan et al. “vTaiwan: An Empirical Study of Open Consultation Process in Taiwan.” SocArXiv, July 4, 2018. doi: 10.31235/osf.io/xyhft.

[51] Hwang, Tim. Subprime Attention Crisis: Advertising and the Time Bomb at the Heart of the Internet. New York: FSG Originals, 2020.

[52] Interian, Ruben et al. Network Polarization, Filter Bubbles, and Echo Chambers: An Annotated Review of Measures, Models, and Case Studies, arXiv:2207.13799. Ithaca, New York: Cornell University, arXiv, 2022. doi: 10.48550/arXiv.2207.13799.

[53] de Keulenaar, Emillie. “Algorithmic Diplomacy: Implementing Mediation Techniques in YouTube’s Recommender System.” diss., University of Amsterdam, September 2018. https://scripties.uba.uva.nl/search?id=record_24357. Accessed April 9, 2022.

[54] Kreiss, Daniel and Shannon C. McGregor. “A Review and Provocation: On Polarization and Platforms.” New Media & Society, April 11, 2023. https://doi.org/10.1177/14614448231161880.

[55] Lazar, Seth. “Lecture I: Governing the Algorithmic City.” Obert C. Tanner Lecture on AI and Human Values, Stanford University. 2023. YouTube video, https://www.youtube.com/watch?v=MzRWdpB39qw.

[56] Lazar, Seth. “Lecture II: Communicative Justice and the Distribution of Attention.” Obert C. Tanner Lecture on AI and Human Values, Stanford University. 2023. YouTube video, https://www.youtube.com/watch?v=97U8BZAbJYo.

[57] Lederach, John Paul. The Little Book of Conflict Transformation. New York: Simon and Schuster, 2015.

[58] Lorenz-Spreen, Philipp et al. “A Systematic Review of Worldwide Causal and Correlational Evidence on Digital Media and Democracy.” Nature Human Behaviour 7, November 7, 2022, 74-101. https://doi.org/10.1038/s41562-022-01460-1.

[59] Macy, Michael W. et al. “Polarization and Tipping Points.” Proceedings of the National Academy of Sciences 118 no. 50, December 6, 2021. https://www.pnas.org/doi/pdf/10.1073/pnas.2102144118.

[60] Manheim, David and Scott Garrabrant. Categorizing Variants of Goodhart’s Law, arXiv:1803.04585. Ithaca, New York: Cornell University, arXiv. https://arxiv.org/abs/1803.04585.

[61] Margalit, Avishai. On Compromise and Rotten Compromises. Princeton: Princeton University Press, 2009. doi: 10.1515/9781400831210.

[62] Matakos, Antonis, Evimaria Terzi, and Panayiotis Tsaparas. “Measuring and Moderating Opinion Polarization in Social Networks.” Data Mining and Knowledge Discovery 31 no. 5, September 2017, 1480-1505. doi: 10.1007/s10618-017-0527-9. Accessed January 9, 2022.

[63] McCoy, Jennifer and Murat Somer. “Toward a Theory of Pernicious Polarization and How It Harms Democracies: Comparative Evidence and Possible Remedies.” The ANNALS of the American Academy of Political and Social Science 681 no. 1, January 1, 2019, 234-271. doi: 10.1177/0002716218818782.

[64] Meta. Facebook Feed AI System, June 29, 2023. https://transparency.fb.com/features/explaining-ranking/fb-feed/.

[65] Milli, Smitha et al. Twitter’s Algorithm: Amplifying Anger, Animosity, and Affective Polarization. Ithaca, New York: Cornell University, arXiv:2305.16941v1, 2023. arXiv 2305.16941v1.

[66] Mouffe, Chantal. “Deliberative Democracy or Agonistic Pluralism?” Social Research 66 no. 3, 1999, 745-758. https://www.jstor.org/stable/40971349.

[67] Musco, Christopher et al. How to Quantify Polarization in Models of Opinion Dynamics. Ithaca, New York: Cornell University, arXiv:2110.11981, 2021. https://arxiv.org/abs/2110.11981.

[68] Narayanan, Arvind. “How to Recognize AI Snake Oil.” Princeton University, 2019. https://www.cs.princeton.edu/~arvindn/talks/MIT-STS-AI-snakeoil.pdf.

[69] Ovadya, Aviv. ’Generative CI’ through Collective Response Systems, arXiv:2302.00672. Ithaca, New York: Cornell University, arXiv, 2023. https://arxiv.org/abs/2302.00672.

[70] Ovadya, Aviv. Harvard Kennedy School, Belfer Center for Science and International Affairs, May 17, 2022.

[71] Ovadya, Aviv. Holding Platforms Accountable Is Not Enough. We Need A ‘Compass’ For Social Technologies, November 9, 2021.

[72] Ovadya, Aviv. “Meta Ran a Giant Experiment in Governance. Now It’s Turning to AI.” Wired, July 18, 2023, 1059-1028. Accessed July 28, 2023.

[73] Ovadya, Aviv. Restricting Clustering to 2-5 Groups Impacts Group Aware/Informed Consensus and Comment Routing. GitHub, 2022. https://github.com/compdemocracy/polis/issues/1289.

[74] Ovadya, Aviv. Towards Platform Democracy: Policymaking Beyond Corporate CEOs and Partisan Pressure. Harvard Kennedy School, Belfer Center for Science and International Affairs, October 18, 2021. https://www.belfercenter.org/publication/towards-platform-democracy-policymaking-beyondcorporate-ceos-and-partisan-pressure.

[75] Ovadya, Aviv and Luke Thorburn. “Definitions.” Bridging Systems Research blog, 2023. https://bridging.systems/definitions/.

[76] Pettigrew, Thomas F. et al. “Recent Advances in Intergroup Contact Theory.” International Journal of Intercultural Relations 35 no. 3, May 2011, 271-280. https://www.sciencedirect.com/science/article/pii/S0147176711000332.

[77] Polis. https://pol.is/. Accessed May 27, 2022.

[78] Powell, John A. “Bridging or Breaking? The Stories We Tell Will Create the Future We Inhabit.” Nonprofit Quarterly, February 15, 2021. https://nonprofitquarterly.org/bridging-or-breakingthe-stories-we-tell-will-create-the-future-we-inhabit/.

[79] Powell, John A. “Overcoming Toxic Polarization: Lessons in Effective Bridging.” Minnesota Journal of Law and Inequality 40, no. 2, June 2022, 247-247.

[80] Rastegarpanah, Bashir, Krishna P. Gummadi, and Mark Crovella. “Fighting Fire with Fire: Using Antidote Data to Improve Polarization and Fairness of Recommender Systems.” Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. WSDM ’19: The Twelfth ACM International Conference on Web Search and Data Mining. Melbourne, Australia: ACM, January 30, 2019, 231-239. https://dl.acm.org/doi/10.1145/3289600.3291002.

[81] Reisman, Richard. Filtering for Serendipity Extremism, “Filter Bubbles” and “Surprising Validators.” October 8, 2012. https://ucm.teleshuttle.com/2012/10/filtering-for-serendipity-extremism.html.

[82] Reisman, Richard. “From Freedom of Speech and Reach to Freedom of Expression and Impression.” Tech Policy Press, February 14, 2023. https://techpolicy.press/fromfreedom-of-speech-and-reach-to-freedom-of-expression-and-impression/.

[83] Reisman, Richard. Method and Apparatus for an Idea Adoption Marketplace. United States Patent Office, no. 20040186738A1, September 2004. https://patents.google.com/patent/US20040186738.

[84] Reisman, Richard. The Augmented Wisdom of Crowds: Rate the Raters and Weight the Ratings, July 22, 2018. https://ucm.teleshuttle.com/2018/07/the-augmentedwisdom-of-crowds-rate.html.

[85] Ripley, Amanda. High Conflict: Why We Get Trapped and How We Get Out. New York: Simon and Schuster, 2021.

[86] Satuluri, Venu et al. “SimClusters: Community-Based Representations for Heterogeneous Recommendations at Twitter.” Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’20, virtual event: Association for Computing Machinery, 2020, 3183-3193. doi: 10.1145/3394486.3403370.

[87] Schirch, Lisa. A Roadmap for Collaboration on Technology and Social Cohesion. Policy Brief 154. Toda Peace Institute, 2023.

[88] Schirch, Lisa. “The Case for Designing Tech for Social Cohesion: The Limits of Content Moderation and Tech Regulation.” Yale Journal of Law and the Humanities, forthcoming, February 21, 2023.

[89] Scoop. Protecting and Restoring NZ’s Biodiversity. Polis, September 22, 2019. https://pol.is/3atycmhmer.

[90] Shigeoka, Scott et al. Bridging Differences Playbook. Greater Good Science Center, 2020. https://greatergood.berkeley.edu/images/uploads/Bridging_Differences_Playbook-Final.pdf.

[91] Short, Michael and Tim van Gelder. “Full Transcript: Tim van Gelder.” The Age, March 14, 2012. https://www.theage.com.au/national/full-transcript-tim-van-gelder20120313-1uxzr.html.

[92] Small, Christopher et al. “Polis: Scaling Deliberation by Mapping High Dimensional Opinion Spaces.” Recerca: Revista de Pensament i Anàlisi 26 no. 2, 2021, 126. https://www.e-revistes.uji.es/index.php/recerca/article/view/5516.

[93] Stray, Jonathan. “Designing Recommender Systems to Depolarize.” arXiv:2107.04953. Ithaca, New York: Cornell University, arXiv, 2021. arXiv: 2107.04953.

[94] Stray, Jonathan, Ravi Iyer, and Helena Puig Larrauri. “The Algorithmic Management of Polarization and Violence on Social Media.” Draft for the Knight First Amendment Institute, Columbia University, 2023.

[95] Sunstein, Cass. “Breaking Up the Echo.” The New York Times, September 17, 2012. https://archive.is/RgAX5.

[96] Svolik, Milan W. “Polarization Versus Democracy.” Journal of Democracy 30 no. 3, July 2019, 20-32.

[97] Szreter, Simon and Michael Woolcock. “Health by Association? Social Capital, Social Theory, and the Political Economy of Public Health.” International Journal of Epidemiology 33 no. 4, August 2004, 650-667.

[98] Council on Technology and Social Cohesion. The Case for Designing Tech for Social Cohesion. 2023. https://techandsocialcohesion.org/wp-content/uploads/2023/02/Digital_Tech_SocialCohesion_ExecSummary.pdf.

[99] Thorburn, Luke, Priyanjana Bengani, and Jonathan Stray. “How Platform Recommenders Work.” Understanding Recommenders, January 20, 2022. https://medium.com/understandingrecommenders/how-platform-recommenders-work-15e260d9a15a.

[100] Thorburn, Luke and Aviv Ovadya. “Relation Metrics.” Bridging Systems Research blog, 2023. https://bridging.systems/metrics/.

[101] Thorburn, Luke, Jonathan Stray, and Priyanjana Bengani. “How to Measure the Effects of Recommenders.” Understanding Recommenders, July 20, 2022. https://medium.com/understanding-recommenders/how-to-measure-the-causal-effects-of-recommenders5e89b7363d57.

[102] Thorburn, Luke, Jonathan Stray, and Priyanjana Bengani. “What Does it Mean to Give Someone What They Want? The Nature of Preferences in Recommender Systems.” Understanding Recommenders, March 11, 2022. https://medium.com/understanding-recommenders/whatdoes-it-mean-to-give-someone-what-they-want-the-nature-of-preferences-inrecommender-systems-82b5a1559157.

[103] Thorburn, Luke, Jonathan Stray, and Priyanjana Bengani. “What Will ‘Amplification’ Mean in Court?” Understanding Recommenders, May 23, 2022.

[104] Twitter. “Note Ranking Algorithm.” Community Notes Guide, 2022. https://twitter.github.io/communitynotes/ranking-notes/. Accessed January 9, 2023.

[105] Twitter. “Overview.” Community Notes Guide, 2022. https://twitter.github.io/communitynotes/. Accessed January 9, 2023.

[106] Vecchi, Eva Maria et al. “Towards Argument Mining for Social Good: A Survey.” Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, August 2021, 1338-1352. doi: 10.18653/v1/2021.acl-long.107.

[107] Voelkel, Jan G. et al. “Megastudy Identifying Successful Interventions to Strengthen Americans’ Democratic Attitudes.” Stanford University, Strengthening Democracy Challenge, 2022. https://www.strengtheningdemocracychallenge.org/paper/.

[108] Walter, Barbara F. How Civil Wars Start. New York: Crown, 2022.

[109] Weyl, Glen. “Why I Am a Pluralist.” RadicalxChange, February 10, 2022. https://www.radicalxchange.org/media/blog/why-i-am-a-pluralist/.

[110] Williams, James. Stand Out of Our Light: Freedom and Resistance in the Attention Economy. New York: Cambridge University Press, 2018.

[111] Wojcik, Stefan et al. Birdwatch: Crowd Wisdom and Bridging Algorithms can Inform Understanding and Reduce the Spread of Misinformation, arXiv:2210.15723. Ithaca, New York: Cornell University, arXiv, 2022. doi: 10.48550/ARXIV.2210.15723.

[112] YourView. Accessed February 26, 2014.

[113] Yudkin, Daniel A., Stephen Hawkins, and Tim Dixon. The Perception Gap: How False Impressions are Pulling Americans Apart. PsyArXiv, September 14, 2019. doi: 10.31234/osf.io/r3h5q. psyarxiv.com/r3h5q.

[114] Zhu, Liwang and Zhongzhi Zhang. “Minimizing Polarization and Disagreement in Social Networks via Link Recommendation” in Advances in Neural Information Processing Systems 34. Cambridge, Massachusetts: MIT Press, 2021, 13. https://proceedings.neurips.cc/paper/2021/hash/101951fe7ebe7bd8c77d14f75746b4bc-Abstract.html.

Acknowledgements

We thank Tim van Gelder and Colin Megill for sharing information and insights from their work on YourView and Polis, respectively, and members of the GETTING-Plurality Research Network at Harvard University (gettingplurality.org) for their ongoing feedback and support. We would additionally like to thank Danielle Allen, Natania Antler, Jay Baxter, Priyanjana Bengani, Leisel Bogan, Paul Bouchaud, Jason Burton, Micah Carroll, Alan Chan, Austin Clyde, Madeleine Daepp, Fernando Diaz, Joe Edelman, Davis Foote, Thomas Gilbert, Katy Glenn Bass, Sam Hinds, Ravi Iyer, Amritha Jayanti, Julia Kamin, Emillie van de Keulenaar, Andrew Konya, David Krueger, Stephen Larrick, Seth Lazar, Jesse McCrosky, James Mickens, Smitha Milli, Arvind Narayanan, Max Nickel, Kyle van Oosterum, Kathy Pham, Maria Polukarov, Helena Puig Larrauri, Joaquin Quiñonero Candela, Richard Reisman, Afsaneh Rigot, Bruce Schneier, Christopher Small, Mia Speier, Jonathan Stray, Ted Suzman, Carmine Ventre, Glen Weyl, Cathy Wu, and Jessica Yu, among others, for helpful discussions and feedback. Any errors or limitations of this work remain those of the authors.

Aviv Ovadya was supported in part by a Technology and Public Purpose Fellowship at the Belfer Center for Science and International Affairs, Harvard Kennedy School. Luke Thorburn was supported by UK Research and Innovation [grant number EP/S023356/1] in the UKRI Centre for Doctoral Training in Safe and Trusted Artificial Intelligence (safeandtrustedai.org), King’s College London. Graphics were developed in part with support from Tereza Flídrová through a grant from the Plurality Institute (plurality.institute).

Cite as: Aviv Ovadya and Luke Thorburn, Bridging Systems: Open Problems for Countering Destructive Divisiveness Across Ranking, Recommenders and Governance, 23-11 Knight First Amend. Inst. (Oct. 26, 2023), https://knightcolumbia.org/content/bridging-systems [https://perma.cc/6UQD-JA3Y].

Recent work has also moved towards using multiple measures of division, including “resistance to cross-partisan collaboration,” “resistance to interpersonal contact with outpartisans,” and “willingness to use violent tactics against outpartisans.” [107]

An economics framing of the ultimate goal might describe it as countering the “externality of incentives for divisive behavior” by “intentionally subsidizing bridging.”

For example, a system might bridge divides by facilitating understanding that one’s existing values and beliefs are much more similar than expected to those of other people, closing what has been called the “perception gap.” [113] In this way, beliefs about other people’s beliefs are changing, while personal beliefs are staying fixed.

There are also many associated nonacademic bodies of knowledge about attention allocators generally, and bridging specifically, across industry, government, and civil society that can be learned from.

The learning process is optional because some systems (e.g., chronological recommender systems) do not require learning.

The value model formally defines what it means for an allocation to be “worthy.” For example, a recommender system might define value to be a weighted sum of different measures of engagement, while a human facilitator will have a qualitative value model that balances the needs of group members to be heard with the need for the group to deliver on its remit.
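As a minimal sketch of the weighted-sum case described above, the following Python snippet scores items by a weighted sum of predicted engagement measures and ranks them by that score. The signal names, weights, and predicted values are hypothetical and chosen only for illustration; they are not taken from any real platform.

```python
from typing import Dict

# Hypothetical weights for illustration; real signals and weights are not given in the text.
WEIGHTS: Dict[str, float] = {"like": 1.0, "reply": 2.0, "reshare": 3.0}


def value(predicted: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weighted sum of predicted engagement measures for one item."""
    return sum(w * predicted.get(name, 0.0) for name, w in weights.items())


# Hypothetical predicted engagement probabilities for two items.
items = {
    "a": {"like": 0.5, "reply": 0.1, "reshare": 0.0},
    "b": {"like": 0.2, "reply": 0.3, "reshare": 0.1},
}

# Items with higher value are treated as more "worthy" of attention.
ranking = sorted(items, key=lambda k: value(items[k]), reverse=True)
```

A bridging-oriented value model could, under the same assumptions, add further terms to the sum (for example, a predicted bridging signal) without changing this structure.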

Going forward, we will simply use “data collection” as a catch-all term.

This is an intentionally broad definition, which may be more expansive than much of the current usage across machine learning and psychology.

In practice, most items will be removed from consideration during a candidate generation phase, to reduce computational cost. [99]
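The two-stage structure mentioned above can be sketched as follows. This is an illustrative outline only: the function `recommend`, the scoring functions, and the parameter values are assumptions for the example, not a description of any particular platform's pipeline.

```python
import heapq
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def recommend(
    items: Iterable[T],
    cheap_score: Callable[[T], float],
    full_score: Callable[[T], float],
    n_candidates: int,
    k: int,
) -> List[T]:
    """Two-stage selection: cheap candidate generation, then costly ranking."""
    # Stage 1: keep only the top n_candidates under an inexpensive score.
    candidates = heapq.nlargest(n_candidates, items, key=cheap_score)
    # Stage 2: apply the expensive value model only to the surviving candidates.
    return heapq.nlargest(k, candidates, key=full_score)


# Hypothetical example: items are integers, the cheap score prefers larger
# values, and the full score prefers values close to 6.6.
top = recommend(
    range(10),
    cheap_score=lambda x: float(x),
    full_score=lambda x: -abs(x - 6.6),
    n_candidates=4,
    k=2,
)
```

One consequence of this design is that an item filtered out in the first stage can never be selected, however highly the full value model would have scored it.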

This phenomenon of attention allocators feeding into each other can be called an attention stack or attention delegation network depending on the structure.

However, separately, the recommender’s learning system may be used to train the allocation system; such a learning process would involve attending, as it materially changes the system. There are some subtleties here, as our definition of attention is relative to the system boundary that is being examined, which can include multiple people, algorithmic systems, organizations, etc. As another example, if someone uses an LLM to summarize an article and then reads the summary, that summarizer is an attention allocator, but it is not attending: since its state is not changing, it is simply outputting. However, the combined system of the person and the summarizer is attending to the article. Similarly, if a person attending to an article is an employee of an organization, that organization (system) can also be said to be paying attention to the article.

Note that we can only observe phenomena like bridging-ness or entertaining-ness indirectly via measurable outcomes, such as the diversity of people who comment on a post or the watch time on a YouTube video. Thus, whenever we use the word “accuracy,” we mean accuracy as measured against these measurable outcomes, which are proxies for the phenomena of interest. In many cases, there may not be a literal ground truth against which to measure accuracy, as it may not be possible to know the true impacts depending on what is predicted (e.g., we can’t know true bridging-ness or entertaining-ness; we rely on proxies for them).

Reward Reports provide an approach for understanding the components and interactions within this optimization stack. [45]

In principle, one way to promote bridging would be to cause people to value bridging impacts more within their own, personal attention allocators. However, having people unilaterally change their behavior at a significant scale is a difficult collective action challenge and faces significant psychological friction; thus our focus on algorithmic and institutional bridging systems.

We use musical artists as an example, but these could equivalently be levels of support for different political positions. For example, Facebook’s identification of news sources that are seen as broadly trustworthy might be considered an example of such a model. [49]

We use the term “signals” because it is used by platforms that manage recommender systems (e.g., [64]). In our more general model of an attention allocator (Section 2), signals correspond to the predicted impacts.

When predicted, they can be included among the predicted impacts of an allocation in the allocation process of an attention allocator (Figure 3) and included in the value model according to which allocations are selected. When observed (retrospectively), they can be used for monitoring, and compiled into datasets to train the models used for impact prediction.

We expect that in practice most signals will be heuristic due to the impracticality of inferring, at scale, the causal impact of every individual item of content. However, such causal inference has been done in small settings. [107]

It is theoretically straightforward to determine when there is a divide between two people; this can be quantified by a metric between their respective vertices or positions in graph- or space-based relation models. Such divides are already inferred automatically by many existing attention allocators.
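To make this concrete, here is a small sketch of two such metrics: Euclidean distance between positions in a space-based relation model, and shortest-path length between vertices in a graph-based one. The function names, the example people, and their positions and connections are all hypothetical; other metrics could equally be used.

```python
import math
from collections import deque
from typing import Dict, Hashable, Iterable, Sequence


def space_divide(p: Sequence[float], q: Sequence[float]) -> float:
    """Divide in a space-based model: Euclidean distance between two positions."""
    return math.dist(p, q)


def graph_divide(
    adjacency: Dict[Hashable, Iterable[Hashable]], source: Hashable, target: Hashable
) -> float:
    """Divide in a graph-based model: number of hops on a shortest path (BFS)."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == target:
                return hops + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return math.inf  # No path: the two vertices are in disconnected components.


# Hypothetical positions in a two-dimensional opinion space.
positions = {"ana": (0.0, 0.0), "bo": (3.0, 4.0)}
d_space = space_divide(positions["ana"], positions["bo"])

# Hypothetical undirected interaction graph, stored as adjacency lists.
graph = {"ana": ["cy"], "cy": ["ana", "bo"], "bo": ["cy"], "dee": []}
d_graph = graph_divide(graph, "ana", "bo")
```

Whether a given distance counts as a "divide" still depends on a threshold or comparison chosen for the context, which this sketch leaves open.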

In most cases, though, their computational complexity remains an open question.

We also recognize that this level of investment will not always be feasible. Even creating a set of modular bridging metrics and algorithms appropriate to a large variety of contexts is a significant public goods challenge.

If AI advances enable simulation and automation of high-quality humanlike facilitation, that may allow us to move beyond formalisms, though at the risk of reduced understanding of what such systems are doing.

Aviv Ovadya is affiliated with Harvard's Berkman Klein Center, a visiting researcher at Cambridge University's Center for the Future of Intelligence, and consults for technology companies, civil society organizations, and funders.

Luke Thorburn is a Ph.D. student at King's College London and a research fellow at the AI & Democracy Foundation.
