Introduction
In recent years, risk mitigation has grown increasingly salient in the AI governance landscape. Across the world, both countries and multilateral organizations have moved beyond high-level statements about the risks posed by AI and toward adopting frameworks, laws, and policies to clarify the rights and values that AI developers and users ought to respect—and the practices they should adopt to do so. From the European Union Artificial Intelligence Act (Regulation (EU) 2024/1689, hereinafter EU AI Act), to the Biden-Harris Administration rules governing United States federal agencies’ use of AI (Young 2024), to a unanimous United Nations General Assembly resolution on trustworthy AI (UN G.A. Res. 78/265, hereinafter UNGA AI Resolution 2024), these policies articulate the public interest that needs to be protected against AI risks. Countries have also founded new AI safety institutes to develop the science and practices needed to design, evaluate, and use AI responsibly, and formed an international network to enhance global collaboration on the technical aspects of AI safety.
In parallel, many policymakers and stakeholders have become concerned about the increasing capabilities of AI models. As a result, many of the emerging AI risk management actions and practices focus on governing AI models through improved testing and evaluation, safeguards on model inputs and outputs, and limiting access to AI models’ weights. Supporters of this model-centric governance approach argue that interventions at the AI model training and release stages can reduce the risks posed by downstream uses, especially misuses, which may be particularly important given the increasingly pervasive use of generative AI models. Some also argue that certain model outputs are harmful and thus can be most efficiently and effectively mitigated at the model development stage.
Critics argue that model-centric governance is infeasible and ineffective, and entails collateral costs to innovation, scientific practice and progress, economic competition, and other approaches to risk mitigation. They claim that constraining models can stymie productive downstream AI applications whilst doing little to prevent AI harms. We find many of the critiques of model-centric governance (Narayanan & Kapoor 2024) compelling. However, our goal here isn’t to settle this debate but to structure it. Current AI risk management activities are hampered by the lack of a conceptual framework for reasoning about the strengths and weaknesses of various intervention sites (data, models, applications, policies, etc.), resulting in a constrained set of methods, tools, and enlisted expertise.
As AI risk management transitions from principles into law and practice, we must assess whether it is intervening at the optimal points in the sociotechnical system, imposing responsibilities on the right actors, deploying the right tools, and enlisting the right expertise to reduce the harms AI use can produce and exacerbate. Today, AI risk management activities are increasingly unmoored from the goal of protecting the public’s rights and safety, as well as public goods. Even well-intentioned and well-executed model evaluations are insufficient, on their own, to mitigate harms of AI deployed in context. Moreover, the expertise necessary to effectively and legitimately mitigate harms often lies outside the companies that develop AI models.
In this essay we seek to recenter the prevention of harms (i.e., realized negative impacts) on the ground as the goal of AI risk management. We argue that accomplishing this task requires AI risk assessments at the sociotechnical level. Only a sociotechnical systems orientation to risk assessment that accounts for the technical and organizational context (Dobbe 2022) can produce a full understanding of the ways in which system components—human, material, technical, and social—coproduce harms or exacerbate the possibility of their production. A sociotechnical approach clarifies the range of potential sites for risk mitigation activities that directly reduce harms or their probability. As we show, intervening at these various sites—data sets, models, organizational processes, professional training—requires an expanded set of mitigation methods and tools. These expanded methods and tools, introduced through the sociotechnical frame, elucidate the variety of competencies and expertise necessary to effectively and legitimately manage AI risks. Policy frameworks must therefore broaden the frame of assessment and potential interventions from technical to sociotechnical, and broaden the set of actors enlisted in those activities.
Left on its current path, AI risk management will drive a proliferation of technocratic practices that fit the workflows and expertise of powerful entities, such as large tech companies building AI models, and the specific, predominantly technical, experts they employ, but fail to sufficiently reduce harm from AI systems. Mitigating AI-related harms requires policymakers to reaffirm the goal of risk management as reducing harms, identify appropriate sites of intervention, and expand the tools and experts involved in AI risk management.
Part I describes and contrasts aspects of various AI risk management frameworks, including the EU AI Act, the U.S. guidance to federal agencies on the responsible use of AI (Young 2024), the United Kingdom’s AI Security Institute’s research agenda (AISI 2025), the U.S. National Institute of Standards and Technology AI Risk Management Framework (NIST 2023, hereinafter NIST AI RMF), and the voluntary AI commitments secured by the Biden-Harris Administration (White House 2023). We highlight what the AI risk management frameworks seek to protect against AI risks (e.g., rights, safety, democracy), who they task with managing risks, how (i.e., with what tools and methods) they direct risks be mitigated, and where (i.e., at what frame(s): data, model, system, etc.) they direct interventions. We identify and critique the narrow focus on technical systems in some of these approaches, often on models specifically, and the related emphasis on technical and computational testing and mitigation methods and practices, which on their own are insufficient to address real-world harms.
Part II presents our conceptual framework for approaching AI risk management. The framework proposes two important analytic shifts for AI risk management: a sociotechnical approach to risk management, and a preference for interventions, and collections of interventions, that prevent harms (realized negative impacts) over those that reduce particular component hazards (probability of future harm). These analytic shifts require policies that broaden the methods and tools of risk mitigation and disentangle ownership or control of AI models and systems from participation in risk management, making way for other entities that possess relevant expertise, operational capacity, and independence. These two steps will bring in the broader set of experts and accompanying methods and tools required to effectively and legitimately reduce AI-related harms and hazards.
In Part III, we apply this framework to a particular example: image-based sexual abuse exacerbated by AI capabilities, examined through the lens of our proposed reorientation of AI risk management.
In Part IV, we offer four recommendations to policymakers. First, policymakers should begin by developing a sociotechnical system map that identifies the technical and organizational system components related to the harm under exploration. Second, deployers of AI systems should be tasked with assessing and mitigating risks of AI use cases, not systemic risks. Third, regulatory frameworks should reduce reliance on the developers and deployers of AI systems to independently engage in risk mitigation activities. Policymakers should incentivize entities that develop and deploy systems to enlist external stakeholders with risk-relevant expertise in risk mitigation. This includes bringing external stakeholders into strategic decisions about where and how to mitigate risks, and, where relevant, directly into risk mitigation activities. Finally, governments and companies need to invest in the infrastructure and research to support sociotechnical evaluations and the richer set of technical and non-technical risk mitigation techniques required to reduce harms.
I. Limitations of AI Governance Frameworks’ Approaches to Risk Management
Policymakers in the United States and the European Union have introduced new governance efforts to manage the risks AI poses to a range of the public’s rights and safety, as well as public goods, including cybersecurity and biosecurity, among others (EU AI Act; Exec. Order No. 14110; Young 2024; Biden 2024; NIST 2023; International Organization for Standardization and International Electrotechnical Commission 2023, hereinafter ISO/IEC 2023). These risk-based (or management-based) regulatory approaches stand in contrast to regulatory approaches that establish ex ante rules requiring particular conduct or particular measurable outcomes. Both academics (Marcus 2023; Marchant & Stevens 2017; Matthew & Suzor 2017; Scherer 2016) and AI companies (U.S. Congress, Senate Committee on the Judiciary, Subcommittee on Privacy, Technology, and the Law 2023a; 2023b) have advocated for risk-regulation approaches that direct or encourage entities to evaluate the risks generated by their AI (this could be at the level of models or systems or use cases) and adopt mitigations to manage those risks.
AI regulations and governance efforts direct entities to mitigate risks to a wide range of objects, such as rights, safety, democracy, and the environment (White House OSTP 2022; Exec. Order No. 14110; Biden 2024; EU AI Act; UNGA AI Resolution 2024). Below we discuss four weaknesses of many current AI risk governance frameworks: insufficient attention paid to the relationality of risk, reliance on developers and deployers for risk management, emphasis on technocratic tools and methods, and over-reliance on model-centric mitigations.
A. Component-by-component hazard reduction may not reduce harms
While most of the AI governance frameworks identify the objects to be protected, they allow entities to direct mitigation efforts as they see fit. The resulting component-by-component analysis, with each entity focusing on the technical artifact it develops or deploys, lacks the coordination necessary to reduce harms that are largely produced by interactions in sociotechnical systems. These mitigation activities often target hazard (possibility of future harm) reduction and do not consider whether the actions undertaken collectively reduce actual harms.
This component orientation, combined with “if-then” logic, pervades AI safety discourse. It drives ex ante speculation about potential hazards, and mitigation practices that center models and other technical artifacts along with technical methods and expertise, producing risk mitigation efforts that over-emphasize component-level hazards and presume linear relationships between hazard reduction and harm reduction (Karnofsky 2024). Today’s frameworks drive AI risk mitigation approaches that tend to posit risk as emerging in a causal mechanistic chain flowing downward from particular components, typically models or model outputs. As Dobbe (2025) argues, the construction of safety as being “of” an AI system, in the sense of a “property,” and about “avoiding harmful outputs,” misunderstands safety as “model reliability.” Safety cases from the AI Security Institute illustrate Dobbe’s point: they aim to make “inability arguments” (Goemans et al. 2024) to assure that model capabilities would not “incur[] large-scale harm” (Clymer et al. 2025). By positing that the model capability itself leads to harm, structured in an if-then rationality, they derive mitigations that center the model (even if they do not necessarily target the model itself—for example, cybersecurity interventions). In other words, first something is posited as a hazard, then its pathways to harm are derived. Good governance ought to do the reverse.
Recent reports of mental health and physical harms among chatbot users reveal a particular limitation of the if-then approach to AI risk. An if-then orientation enframes interventions at the model output level, building out technical solutions such as content guardrails that prevent models from producing certain kinds of “disallowed” content (OpenAI 2023). Yet users engage with models in ways unanticipated by developers that overflow this framing, for instance by engaging in long-form conversations that erode the efficacy of safety training (Eliot 2025). Although developers may attempt to limit user interactions they consider ‘misuse’ through terms of service, science and technology studies scholars have long told us that designers cannot rationally predict or control all the ways in which users and stakeholders will interact with an artifact. Instead, research and engineering practices must be contextually situated and account for foreseeable (even if prohibited) uses (Suchman 1987; Dourish 2001). In the automobile industry, this perspective has meant that vehicle manufacturers must account for failures that arise from “ordinary abuse” regardless of whether the abuse is legally or contractually prohibited (Goldenfein et al. 2020). Determining what is “reasonably foreseeable” and what is “ordinary abuse” requires an approach to risk that is not merely technological but sociotechnical (Goldenfein et al. 2020). This view, that risks too are embodied, reveals that harms are not, or at least not only, in the capabilities of models; they are in the world.
The weakness of the component-by-component and if-then approach to mitigating risks is compounded by the failure to understand risks as relational and embedded in broader sociotechnical systems, not produced solely by a system output. Models and other technical components are individual parts of an assemblage that encompasses the social, political, and economic as well as the technological, and it is often the interactions between these components that produce the hazards that lead to harm. Recovery efforts following Hurricane Katrina illustrate the complex interactional nature of harms. The emphasis on the breaking levee as the hazard led to the construction of a $14.5 billion flood control system (Layne 2021), yet the harms of the flood were coproduced by the hurricane, the failure of the levee, hollowed-out public services, and other crumbling infrastructure (Knowles 2014). Focusing on the levee alone produces a partial picture of what contributed to the deaths of nearly 1,400 people and a limited view of what ought to be fixed to avoid similar future harms (Knabb et al. 2023). Similarly, the disastrous consequences of the 2025 Central Texas floods can be partially attributed to the social and political conditions that shaped Kerr County’s vulnerability to shocks (Colman et al. 2025). In both instances, the levees and the flash flood are parts of an assemblage—encompassing the social, political, and economic, as well as the technological—necessary to protect against harm. It is the interactions between these components that produced the hazards that led to loss of life, not the levee failure or flash flood alone.
While risk management should be concerned with potential hazard reductions across components, as we explain in Part II below, the failure of AI governance frameworks to take a systems approach undermines optimal risk management strategies. Such frameworks yield many efforts that reduce hazards, but those efforts may not compose to actually reduce harms. Harms emerge from systems, through the way that complex components interact and through everyday practices, not in the model artifact alone. To that end, in Part II we propose the handoff lens as an analytic approach compatible with this sociotechnical perspective on risk management.
B. Risk management in the hands of developers and deployers limits expertise and undermines legitimacy
AI governance frameworks generally task entities developing and deploying AI systems with risk management responsibility. The EU AI Act imposes a suite of testing and evaluation requirements on AI systems of varying risk levels. It creates opportunities for a broader range of stakeholders to participate in developing the code of practice and standards to support the risk assessments it requires developers and deployers to undertake, but largely leaves those providing and deploying models and systems to do so without external involvement. Even providers of high-risk AI systems and general-purpose AI (GPAI) models are trusted to independently evaluate the risks posed by their models. The exception is AI models and systems intended to be used as a safety component of a product, or to be a product themselves, which require external evaluations. While requirements on federal agencies imposed during the Biden-Harris Administration and left intact by the Trump Administration leave agencies to test and evaluate their own systems, they recommend some level of internal independence (i.e., employees testing or auditing systems should be distinct from those developing or deploying them) and identify a set of high-level risks that agencies must mitigate, rather than leaving that risk identification to agencies themselves (Young 2024, §5c IV C). The guidance also requires agencies to engage affected stakeholders in the design of AI systems, including risk mitigation choices; however, it does not prescribe how agencies should execute this obligation and instead offers a non-exhaustive list of potential methods (Young 2024, §5c IV C).
Scholars have critiqued AI governance frameworks for leaving regulated entities fully in charge of risk management processes and mitigations (Kaminski 2023; Wachter 2024) and called for more government involvement. Some researchers argue that AI evaluations produced under current frameworks reduce the public’s understanding of AI risks, acting as a form of “safetywashing” (Henshall 2024). These critiques align with those of regulatory scholars who have identified the risks to public goals and public trust of delegating decisions about how to achieve regulatory objectives to regulated entities. Kenneth Abbott and Duncan Snidal explain the inability of any single non-state actor to provide all the competencies required for “regulatory standard-setting” (RSS), a term they coin to capture the emerging transnational “non-state and public-private governance arrangements focused on setting and implementing standards for global production in the areas of labor rights, human rights and the environment” (Abbott and Snidal 2009). With respect to regulated entities, Abbott and Snidal note “firms lack independence…a weakness especially significant for monitoring…are relatively weak on normative expertise and commitment…[and] are not generally representative beyond their economic stakeholders” (emphasis added) (Abbott and Snidal 2009). Due to these deficiencies they conclude “firms are unlikely to produce regulatory standards and programs that serve common interests and may lack legitimacy and credibility in the eyes of the public—and certainly those of activists—even if they are sincere about self-regulation” (Abbott and Snidal 2009). Kenneth Bamberger’s work further explains how “corporate structures, mindsets, and routines developed to allow efficient firm behavior can skew compliance efforts by filtering out the very information about risk and change that regulation seeks to identify” (Bamberger 2006).
Other scholars place the responsibility for risk identification and mitigation activities on entities developing and using AI models and systems, seeing this as the only realistic regulatory strategy given the imbalance in expertise and resources between the public and private sectors (Coglianese and Crum 2025; Wasil et al. 2024). While the expertise and resources of model developers and deployers are necessary for successful AI risk management efforts, as in RSS, they are insufficient. AI risk management efforts, like RSS, aim to address “social and environmental externalities rather than demands for technical coordination,” and regulated entities lack the expertise to independently define and identify the relevant risks (Abbott and Snidal 2009). The coordinated activities undertaken to address child sexual abuse material (CSAM) enlist different institutions with different kinds of expertise, legitimacy, and capacity in a manner that increases corporate accountability for public goals and, through the provisioning of shared infrastructure, reduces the costs of addressing the harms caused by CSAM circulation (Mulligan & Bamberger 2021). Like RSS, AI risk management efforts pose significant challenges for “monitoring and enforcement (due to the strategic structure of those externalities)” (Abbott and Snidal 2009). They require visibility into risks that arise from the interactions of products, services, and use cases outside the purview of individual model developers or deployers, and indeed outside the sector, as well as mitigation activities that span model developers and deployers and other actors in the ecosystem. Furthermore, deeply entrenched firm processes produce what Bamberger calls “cognitive decisionmaking pathologies” that, if left unchallenged and unchecked, undermine AI risk management efforts, as they lead firms to interpret and embed new activities in ways that fit neatly into existing firm routines and personnel (Bamberger 2006).
The breadth of AI deployments and the variation among deployers—including small businesses and governments—suggest that shared infrastructure to support at least some risk management activities will be important to ensure the benefits of AI are broadly available and its risks are consistently mitigated across deployments. As Solow-Niederman (2020) warns, without reforms, current regulatory frameworks are ushering in “an era of private governance” that will prevent “public values [from] inform[ing] AI research, development, and deployment” and will undermine “the democratic accountability that is at the heart of public law.”
C. An overly narrow and technocratic set of risk management tools and practices
The AI governance frameworks direct entities toward an expanded set of risk management tools and practices. All feature some combination of traditional risk management tools such as impact assessments, post-market monitoring, audits, and registration. A few instruments take a more precautionary approach, for example precluding certain uses of AI in specific contexts due to risk (EU AI Act Art. 5; Biden 2024, Pillar I) and requiring pre-market risk assessments (EU AI Act Art. 40(1); Young 2024, §5). In addition to traditional risk management tools, the AI governance frameworks draw in a range of additional techniques: new forms of model, system, and data documentation; cybersecurity-inspired adversarial testing; public reporting of use cases; transparency when in use; requirements for explanations to impacted individuals; bias identification and mitigation efforts; human fallbacks; and consultation with affected communities and the public on the design, development, and use of the system as well as on risk mitigation choices (EU AI Act Art. 5, 16-27; Young 2024). But in practice, institutional arrangements and power diverge from these broad methods to produce technocratic practices and tools.
While AI risk management frameworks call for a range of risk management approaches and tools, key public sector efforts by the United States Center for AI Standards and Innovation (CAISI), formerly the U.S. AI Safety Institute, housed in the National Institute of Standards and Technology (NIST) at the Department of Commerce, and sister AI safety institutes around the world are increasingly focused on technocratic, and predominantly model-centric, evaluations and mitigation strategies. For example, while NIST’s seminal work, the Artificial Intelligence Risk Management Framework (2023), describes risk management as “coordinated activities to direct and control an organization with regard to risk” (emphasis added) and sets out a process for risk management that entails a wide range of technical and organizational practices designed to govern, map, measure, and manage AI risks and improve AI safety and trustworthiness, NIST and CAISI’s subsequent work has focused on technical issues, including guidance for AI developers in managing the evaluation of misuse of dual-use foundation models (NIST 2024b), frameworks on managing generative AI risks (NIST 2024a), and securely developing generative AI systems and dual-use foundation models (NIST 2025). The work of the AI Security Institute (AISI) is similarly highly model-focused; for example, in relation to the topic of “cyber misuse,” AISI emphasizes that its “research intends to understand, assess and research potential model developer mitigation strategies” (AISI 2025).
Rather than using and developing the infrastructure and approaches to support the broad range of risk management strategies and activities found in the frameworks, these key public sector entities lean heavily on risk identification and mitigation approaches developed in the private sector that focus on technical evaluations of, and risk mitigations within, AI models. In particular, the work of AISI and CAISI draws on private sector approaches in areas including risk assessments, testing, evaluations, and benchmarks focused on better understanding the capabilities and limitations of models (Anderljung 2023; Vidgen 2024), paired with efforts to determine acceptable or unacceptable levels of risk (METR 2023). For example, the published pre-release AISI and CAISI joint evaluation of OpenAI’s o1 and Anthropic’s Claude Sonnet 3.5 models frames the goal as “to better understand the capabilities and potential impacts of o1 considering the availability of several similar existing models” and relies primarily on industry-standard benchmarks (HarmBench, Cybench, LAB-bench) to evaluate the models’ chemical, biological, radiological, and nuclear risks; offensive cyber capabilities; and software engineering capabilities (CAISI and AISI 2024a, 2024b).
AISI and CAISI’s red-teaming work has similarly drifted toward model-centric and technocratic approaches developed within industry. For example, AISI’s principles highlight the importance of external partners (both experts and the public) in red-teaming activities; however, its research agenda emphasizes the production of “technical solutions” to risks generated from an if-then rationality, rather than being oriented around harms (AISI 2025). Methodologically, red-teaming features prominently among safeguards posited to directly mitigate risk, for example in inability arguments in safety cases, even though as a practice red-teaming is about the discovery of potential weaknesses, not the guarantee of their absence.
Mitigation approaches within AISI and CAISI similarly favor the model-centric focus of industry efforts. For example, AISI and CAISI work emphasizes changing the model itself to avoid undesired behaviors, such as by modifying the model’s weights, training data, or training code (Henderson 2023; Jones et al. 2020). Some of these interventions appear helpful for reducing risks and preventing harms. For example, recent research suggests how some child sexual abuse material might be fine-tuned out of an AI model directly (Gandikota et al. 2023; Thiel 2023). Those in the field of AI safety have recognized the limitations of existing technical interventions, highlighting a need for greater rigor in the field (Apollo 2024). New AI models can exhibit capabilities that are hard to predict, measure, or mitigate through technical interventions. In addition, many of the aforementioned technical interventions can easily be circumvented by a malicious actor or avoided by using a different model that lacks those interventions (Wei et al. 2024). Tunnel vision on the model also commits a kind of framing trap: recent work from AISI shows that, for certain categories of harms in open-weight models, mitigations at the data level may be highly effective (O'Brien et al. 2025).
Regardless of their utility, the emphasis on technocratic tools aimed at models, combined with the deference to private sector entities described above, will almost inevitably produce management strategies that are unresponsive or incomplete in their efforts to address public goals. The combination of private sector control, model-centricity, and emphasis on technical mechanisms sets the conditions for what Cohen and Waldman call “regulatory managerialism,” the importing of the “practices for organizing and overseeing private sector, capitalist economic production and…the logics and underlying ideologies in which those practices are rooted” into regulated activities (Cohen and Waldman 2023). Given the complexity and opacity of AI systems, the current mix of deference and model-centricity is poised to leave a small group of well-resourced private entities to shape the public’s and regulators’ understandings of compliance (Edelman 2016; Chi et al. 2021).
D. Constrained view on potential mitigation sites
AI governance frameworks and the emerging practices across the field over-emphasize model assessments and mitigation strategies. This orientation detracts from a fuller understanding of how rights and safety are put at risk, and in many instances yields mitigation efforts occluded from the full spectrum of hazards and their relations to particular harms.
Many AI governance frameworks emphasize the responsibility of model developers and deployers to mitigate risks posed by those models. The EU AI Act requires risk mitigation activities by providers of “general-purpose AI models,” with heightened requirements where they pose “systemic risks,” and deployers of High-Risk AI systems (EU AI Act Art. 53, 55, 26-27). Similarly, President Biden directed various reporting requirements and technical guidelines for AI systems deemed to be “dual-use foundation models” and directed a report on the risks and benefits of AI models with widely available weights and policy recommendations to maximize those benefits while mitigating the risks (Exec. Order No. 14110). Private commitments to mitigate risks to the public’s rights and safety similarly focus on “red-teaming of models or systems,” “sharing…information on advances in frontier [models’] capabilities and emerging risks and threats,” and “protect[ing] proprietary and unreleased model weights” (White House 2023).
Although providers and deployers both play crucial roles in mitigating risks from AI, efforts focused on engineering and design procedures and practices are not oriented toward potential harms on the ground. In combination with the emphasis on technical mitigation strategies and the deference to the private sector, the focus on model-level interventions may make it easier for engineers to address risks in their day-to-day practice, but collectively these governance approaches risk constructing a world in which corporations make progress on addressing AI risks without making a meaningful dent in negative real-world impacts from AI.
In contrast, binding guidance issued under the Biden-Harris Administration (Young 2024) and reissued under the Trump Administration (Vought 2025) directs risk mitigations at rights- and safety-impacting AI use cases—best understood as deployed sociotechnical systems including the technical, human, and organizational elements—or, as Dobbe describes it, accounting for AI system design and institutional context (Dobbe 2022). The Digital Services Act also takes a sociotechnical systems approach, requiring platforms and search engines of specific sizes (those with 45 million or more users per month in the EU) to “identify, analyse and assess any systemic risks from the design or functioning of their service and its related systems, including algorithmic systems, or from the use made of their services” and to adopt mitigation measures to address them (Regulation (EU) 2022/2065 Art. 34-35). The NIST AI RMF (2023), as well as the more recent NIST AI Risk Management profile on Generative AI (NIST 2024a) and the State Department Risk Management profile on AI and Human Rights (U.S. Department of State 2024), takes a sociotechnical systems perspective as well, explaining the need to consider risks at multiple levels of system abstraction. For example, the NIST AI RMF explains that it is designed to be used by “AI actors…who perform or manage the design, development, deployment, evaluation, and use of AI systems” including those who control the “Application Context, Data and Input, AI Model, and Task and Output” (NIST 2023). Similarly, the Generative AI profile explains, “Risks may exist at individual model or system levels, at the application or implementation levels (i.e., for a specific use case), or at the ecosystem level – that is, beyond a single system or organizational context” (NIST 2024a).
While the question of what risks can be evaluated and addressed at various levels of abstraction isn’t crisply set out in the proposals, the NIST AI RMF provides some guidance about the risk mitigation activities that can occur at different dimensions (“Application Context, Data and Input, AI Model, and Task and Output”), and categories of actors who can assess them (NIST 2023). This extends the site of action beyond where the engineering and design work happens, aiming to account for the full complexity of the world.
II. Effectively and Legitimately Managing AI Risks
AI governance frameworks direct risk management activities to protect public rights and interests, but effectively and legitimately reducing harms requires policy frameworks that advance a sociotechnical approach to risk assessment and mitigation; center the prevention of harm through coordinated action; and expand the practices and institutions involved in risk management activities, including those focused on models. Absent a reorientation, AI risk management will continue to devolve into an exercise in technocratic regulatory managerialism, divorced from the rights and public values it was created to serve. It will yield ineffective risk management approaches that will be viewed as illegitimate and captured. In its current form, AI governance, while grounded in a deep commitment to advance the interests and protect the rights and safety of the public, is at risk of contributing to “peoples’…alienation from public institutions, and the perspectives of the regulators who are supposed to be safeguarding their interests” (Ford 2023).
AI governance practices are still nascent. Policymakers and other stakeholders have a window of opportunity to shape them. Below we outline four practical ways to reorient risk management and realize its public benefits.
A. Establishing the sociotechnical frame
First, governance approaches should take a sociotechnical systems approach to assessing and mitigating risk. Rather than seeking to address risks component by component, a sociotechnical systems approach analyzes how system components collectively produce harms and hazards and identifies optimal points for risk mitigations across them. The goal is to systematically identify a set of mitigations, spanning technical and institutional components, that effectively reduces harms and hazards.
Advisory bodies, researchers, advocates, and some policymakers have emphasized the importance of taking a sociotechnical systems approach to managing AI risks (NAIAC 2023; NIST 2023 Appendix 3; Polemi et al. 2024; Dobbe 2022; Raji and Dobbe 2023; NASEM 2021; Bogen and Winecoff 2024; White House OSTP 2022; Prabhakar and Klein 2024). Researchers report that a key failing of current risk management methods is an overemphasis on “the technological artefact…in isolation” and an absence of attention to “the human factors and systemic structures that influence whether a harm actually manifests” (Weidinger et al. 2024). They find that an emphasis on technical components leaves key sources of risk unidentified and unaddressed (Dobbe 2022; Fox and Victores 2025).
Moving AI risk management beyond models and data aligns with lessons from safety science and risk management practices in other high-risk fields. As Dobbe (2022) explains, the field of system safety assumes that “systems cannot be safeguarded by technical design choices on the model or algorithm alone” and therefore takes an “end-to-end” approach to analyzing risks and a sociotechnical approach—including “the context of use, impacted stakeholders…and institutional environment”—to deploying mitigations. Governance models in other high-risk fields such as transportation, finance, and medicine reflect this holistic, systems approach to assessing and mitigating risk (NASEM 2021).
Driving a sociotechnical systems approach in the field requires policymakers to prioritize the mitigation of harms, rather than the evaluation of specific system components. Two practical ways to set risk management activities within the sociotechnical frame are establishing use cases rather than systems as the unit of analysis, and encouraging collaborative, ecosystem-level approaches to risk management. The first approach, pioneered in the guidance to federal agencies and building on the Blueprint for an AI Bill of Rights established during the Biden-Harris Administration, emphasizes the need to design and evaluate a specific context of use rather than a generic AI system (White House OSTP 2022; Young 2024). Federal agencies are directed to evaluate AI use cases defined as “the specific scenario in which AI is designed, developed, procured, or used to advance the execution of agencies’ missions and their delivery of programs and services, enhance decision making, or provide the public with a particular benefit” (OMB 2024). At the other end of the spectrum, yet accomplishing a similar objective, the Global Network Initiative and Business for Social Responsibility toolkit, Across the Stack Tool: Understanding Human Rights Due Diligence Under an Ecosystem Lens, offers a generalizable method for establishing a sociotechnical frame to assess rights (Global Network Initiative, n.d.a). This tool focuses on the collective action required by the entities and stakeholders that make up the technology ecosystem to protect, respect, and realize human rights, and reflects the fact that many human rights challenges are sociotechnical, span geographies, and are neither company nor technical system specific.
While these approaches are radically different, one narrowing to a specific use of AI in context and the other focusing on specific harms across an ecosystem, in practice both orient risk assessment and mitigation activities towards reducing negative impacts on individuals and groups rather than focusing on evaluating a specific technical component or system.
B. Reorienting harms and hazards relations
AI systems exist within complex assemblages of social and material components; their risks are coproduced and inseparable from their social worlds. Therefore, risk management should begin with anticipated harms within a sociotechnical system and then conduct systems analysis to identify and mitigate hazards, not the other way around. This reorientation increases the ability of various entities to tailor mitigations across the sociotechnical system to maximize their efficacy and efficiency at reducing harms while maintaining system components’ capacity to support other uses and outputs.
The handoff model (Mulligan and Nissenbaum 2020) provides an analytic that helps elucidate this sociotechnical and relational view of risk. In popular accounts of automation, a function performed by a human is imagined to be fungible with an algorithmic counterpart, leaving the overall system intact. Handoff challenges this account of delegating human decision-making to automated decisioning systems by examining how the function of the system (or assemblage) is transformed through automation. It attends to the states of all the interacting components in a system rather than to particular components in isolation. Understanding the shifting landscape of harms induced by AI systems in action requires us to articulate how functions come to be performed by distinct configurations of human and non-human components, and how those shifts alter, relocate, and redistribute harms in a socio-material assemblage. Automated decision-making systems, for example, are promoted as a means to reduce harms arising from intentional human discrimination, yet once in operation can bring about disparate impact and other forms of discrimination. Reducing the model-centric focus and unbundling harms and hazards in this way produce a more robust understanding of risk management opportunities across the sociotechnical system.
C. Expanding the methods and tools of risk mitigation
Most AI risk management tools are designed to support AI designers’ and developers’ efforts during the data-centric and statistical modeling stages of AI development (Kuehnert et al. 2025). As Kieslich et al. (2025) note in their criticism of the risk assessment obligations of the DSA and AI Act, current technically focused practices “are not inclusive, unable to take into account broader systemic factors, unsuitable to account for individual (group) differences, and do not offer guidance on how to balance value conflicts” and “have a tendency to invoke the power and legitimacy of objectifiable criteria, do not leave room to navigate value conflicts, tend to ignore group-specific perceptions of harms and the sociotechnical interplay of end-users with the technology, and invite bureaucratization and overreliance on the perception of one particular stakeholder (developers), while generally suffering from a democratic deficit to the extent that they fail to actively involve and engage the societal stakeholders that are affected.”
To address the limitations of current risk management methods, scholars and researchers have pressed for tools and methods that involve other stakeholders (Cohen and Waldman 2023; Metcalf et al. 2021). Researchers and practitioners are pioneering new methods to support sociotechnical risk assessments (Kieslich et al. 2025; Weidinger 2024; Weidinger et al. 2023; Gandhi et al. 2023; Ganguli et al. 2023). However, as Cohen and Waldman (2023) note, much more must be done to “introduce—or, in some cases, reintroduce—knowledge production methods that might compete with and ultimately dislodge managerialist epistemologies” and to ensure public values—specifically protecting the public’s rights and safety and public goods—drive AI risk management activities.
Recognizing the importance of moving beyond a narrow emphasis on the technical components of systems and quantitative measures of risk toward a holistic approach, NIST sponsored a workshop series convened by the National Academies of Sciences, Engineering, and Medicine to identify and explore approaches to addressing human and organizational risks in AI systems (NASEM 2021). The goal was to develop a complement to the NIST AI RMF that would address the human and organizational aspects of risk and risk management. During the Biden-Harris Administration, the U.S. National Science Foundation funded a new AI Institute for Trustworthy AI in Law and Society that is researching participatory approaches to AI development and AI governance, and, along with a pool of private philanthropies, launched the Responsible Design, Development, and Deployment of Technologies program, which aims to ensure ethical, legal, community, and societal considerations are embedded in the lifecycle of technology’s creation (NSF 2023, 2024).
New tools, research initiatives, and institutions are important efforts to broaden the methods and tools for addressing AI risks, but they are insufficient to realize the shift the field requires to advance holistic approaches to AI risk. Funding for interdisciplinary work—particularly efforts that include qualitative, interpretivist social scientists and are situated in specific domains—is essential to develop the methods and tools to support risk management activities across sociotechnical AI systems that center the public’s rights and safety and public values. Existing research has found that responsibly deploying AI systems requires significant investments in human processes, relationships, and training (Sendak et al. 2020). Such changes and investments in institutional processes and personnel must be considered part of the risk mitigation toolbox if AI is to deliver positive impacts in practice. Regulatory frameworks that demand sociotechnical approaches to risk management—like the requirement for U.S. agencies to assess AI use cases—are an important way to drive public and private investment in new methods and tools.
D. Broadening participation in risk management
Current AI governance frameworks largely rely on regulated entities for risk mitigation activities (Wachter 2024). Scholars and advocates argue these approaches leave too much discretion about which risks are in scope and how much residual risk is acceptable to the entities providing AI systems (Smuha et al. 2021; NIST 2023) and allow regulated entities to “emphasi[ze]…severity over probability” (Yang 2024). While a key rationale for risk-based regulation is to enlist the expertise, judgment, and privileged access of firm insiders, the extent to which outsiders have input and can observe firm choices influences how well private actors and their choices align with public goals. Regulators currently lack the resources, and are just beginning to demand the access and acquire the expertise, to more directly evaluate and shape firms’ approaches to AI risk management. The expertise of civil society, academia, and domain experts associated with specific risks is rarely acknowledged and even more rarely leveraged by existing governance approaches. For these reasons, the current governance approaches are unlikely to meet public risk management objectives and, at the same time, the deference to firms undermines public trust in efforts produced under the existing regimes regardless of their merit.
AI risk management needs new players. Solow-Niederman (2020) argues that a functional theory of AI governance must address the power and actions of the private entities and individuals controlling a vast quantity of AI infrastructure. Kuehnert et al. (2025) draw attention to the importance of “who within an organization…conduct[s] what task at different stages of the AI lifecycle, and how” for “…preventing and mitigating AI harms.” They emphasize that “understanding [AI harms’] source in the AI lifecycle—the process by which an AI model is imagined, designed, developed, evaluated, and integrated into broader decision-making processes and workflows” is essential to managing risk and that different individuals within the firm have different vantage points (Kuehnert et al. 2025). While Kuehnert et al. focus on who “within an organization” should be responsible for various risk management tasks, Weidinger et al. (2024) suggest that “the work of conducting safety evaluations can be meaningfully distributed across different actors, based on who is best placed to conduct different types of evaluations” (emphasis added). Weidinger et al.’s suggestion to enlist a wider set of actors beyond regulated firms is consistent with the perspective of regulatory scholars who recommend considering the strengths of various external actors who can be enlisted to support various governance goals. Abbott and Snidal (2009) suggest that multi-actor governance schemes should focus on allocating specific tasks to actors based on consideration of four competencies: independence, representativeness, expertise, and operational capacity. They note that different sets of competencies are essential to the efficacy and legitimacy of distinct governance activities, including distinct risk management activities, and vary with the substantive risk to be managed even within risk management activities.
Of particular relevance to risk mitigation, Abbott and Snidal identify competency gaps in regulated firms that counsel against making them wholly responsible for implementation and monitoring activities.
Researchers predict that enlisting external stakeholders in one area of AI risk management, audits, will yield better outcomes (Stein et al. 2024). Building on an analysis of how the public and private sector are enlisted in auditing activities in other high-risk industries, they conclude “that public bodies [should] become directly involved in safety critical, especially gray-and white-box, AI model evaluations” (Stein et al. 2024). Their conclusion rests on several factors. First, the “unpredictable but potentially critical and far-reaching impacts” of advanced AI systems. Second, the market concentration in the AI sector. Third, the high cost of testing and evaluation due to the absence of standards. Fourth, the “significant and specialized expertise” required, including “[d]omain-specific expertise…to develop threat models and red-team” and “research engineers and computational social scientists…to understand models and their impacts” (Stein et al. 2024). Fifth, the potential for audits to reveal pathways to misusing advanced AI. The combination of these factors requires auditors to have independence and significant expertise, and to be trusted to protect information that could lead to system misuse. In recommending that public bodies take on this role, the authors note that doing so requires providing public bodies with “extensive access to models and facilities” and could require “[hundreds] of employees for auditing in large jurisdictions like the EU or US” (Stein et al. 2024). Public bodies have a role to play in effective and legitimate testing, but they need additional investments in the resources, expertise, and access required to conduct rigorous AI model audits.
AI risk mitigation is composed of distinct elements including impact assessments, mitigation, red-teaming, audits, operational testing, and monitoring. The NIST AI RMF provides an overview of the range of actors across the public and private sector who can participate in risk management activities (NIST 2023 Figure 3 and Appendix A). Each AI actor has distinct capacities and competencies, and correctly deploying them requires task-specific assessments of the sort modeled by Stein et al. (2024). As Mulligan and Bamberger (2021) show, even within a common risk management task like content moderation, the actor who can best perform a subtask—identifying or removing objectionable content, for example—may vary depending upon the substance being moderated (e.g., child sexual abuse material, copyright, privacy). In addition, constraints, such as transparency reporting or limitations on forms of automation, can further improve the efficacy and legitimacy of an enlisted actor’s risk mitigation efforts.
Thoughtful tasking of AI risk management, as Stein et al. (2024) note, will be impossible without investments to resource and build the capacity of governments, civil society organizations, and academic institutions. The Biden-Harris Administration took steps toward this end. It increased government capacity by recruiting over 250 AI and AI-enabling experts into government and created processes to involve all stakeholders in the testing and evaluation work of the U.S. AI Safety Institute (now known as CAISI) (White House 2024b; CAISI 2024). It launched the National AI Research Resource pilot to support research teams outside of industry (White House 2024b). Private philanthropic investments in the United States are also building the capacity of civil society organizations to participate in AI risk management activities. Nonprofits like the Allen Institute for AI, backed by hundreds of millions of dollars in public and private funding, are building more open public alternatives to industry AI models (NSF 2025). Civil society organizations like the Center for Democracy and Technology, the Leadership Conference on Civil and Human Rights, and the AFL-CIO have all stood up suborganizations focused on technology issues that intersect with their core missions, and have brought in AI expertise to help mitigate the risks AI poses to those missions (CDT, n.d.; The Leadership Conference on Civil and Human Rights 2023; AFL-CIO 2021).
These efforts are nascent and not yet at the scale required to address the challenges Stein et al. (2024) note, but they are promising examples that bring the independence and representativeness of public and civil society actors into AI risk management tasks and bolster the sociotechnical expertise within government. If public AI initiatives continue to be prioritized, we can expect to see greater civil society capacity to make AI more trustworthy and aligned with the public interest (Marda 2024).
To end this section, we provide two examples that gather stakeholders with appropriate expertise, tools, and methods around technical and institutional processes. First, Data & Society’s AIMLab takes an on-the-ground participatory approach, working with communities, in tandem with government and industry stakeholders, to anticipate potential harms and integrate lived expertise, thereby expanding who participates in risk governance and altering the substance and process of typically purely technical evaluations (Data & Society, n.d.). Second, at the infrastructural level, consider the National Center for Missing & Exploited Children (NCMEC). NCMEC maintains a hash-sharing platform for CSAM, operated in conjunction with industry and non-profit actors, that allows platforms to take down CSAM at scale using NCMEC APIs. NCMEC’s independent domain experts identify violative content in accordance with public law, and the technical infrastructure scales their expertise, allowing it to be integrated into the engineering work that happens within numerous firms (Mulligan and Bamberger 2021).
Together, these four shifts will reorient AI risk management, ensuring that appropriate stakeholders are involved, the right tools and methods are used, and both technical and institutional processes are aligned to effectively and efficiently mitigate harms.
III. Applying the Conceptual Analysis: Image-Based Sexual Abuse
In this section, we sketch out how our framework can be used to guide risk management activities to address image-based sexual abuse. Advocates and policymakers frame the non-consensual creation and distribution of intimate images as “image-based sexual abuse” (McGlynn & Rackley, 2017). This framing recognizes both the diverse harms experienced by individual victim-survivors and the societal harms of its often-gendered nature. It positions these phenomena, as sociotechnical hybrids, within the landscape of information and communication technologies that scholars have shown contribute to the perpetuation of systemic injustice (Benjamin 2019; Noble 2018; Buolamwini and Gebru 2018; Browne 2015).
Both synthetic intimate images of real women generated by AI and real intimate images distributed without consent harm women. They harm the specific women depicted in the images, and they harm women as a group, because the creation and distribution of these images represents “a systematic tolerance of sexual violence against women” that “takes away from [women’s] autonomy and ability to move through the world” (Tran 2015). In this sense, the production of such synthetic images by generative AI coproduces harms against women within an existing world of gendered violence; these harms cannot be mitigated at the level of model output alone. Reflecting this sociotechnical lens, Danielle Citron and Mary Anne Franks coined the concept of “sexual digital forgeries,” building on Angela Chen’s definition of “digital forgeries” as “something that a reasonable person would think is real” (Chen 2019). The concept reveals that although generative AI proliferated new technical tools for creating synthetic intimate images, the problem, and its solutions, cannot be solely technological. For example, since not all violative synthetic intimate content can be identified through technical tools such as deepfake detectors, the standard for deciding whether a piece of content should be removed should ultimately be set by victims themselves.
This sociotechnical framing is reflected in a White House call to action from the Biden-Harris Administration addressing the creation and distribution of image-based sexual abuse (IBSA) (Prabhakar and Klein 2024). While the call to action brings AI into view, it also calls on “companies and other organizations to provide meaningful tools that will prevent and mitigate harms, and to limit websites and mobile apps whose primary business is to create, facilitate, monetize, or disseminate image-based sexual abuse” and “Congress to strengthen legal protections and provide critical resources for survivors and victims” (Prabhakar and Klein 2024). Reflecting this sociotechnical orientation towards managing the harms of IBSA, the Biden-Harris Administration garnered voluntary commitments from AI model developers and data providers to reduce AI-generated IBSA, including commitments from key model developers to:
responsibly source datasets and safeguard them from IBSA;
remove nude images from AI training datasets for certain models; and
use iterative stress-testing strategies in their development processes and feedback loops to guard against AI models outputting IBSA (White House 2024a).
In addition to commitments from AI model developers, payment service providers committed to help detect and limit payment services to companies producing, soliciting, or publishing IBSA; GitHub committed to prohibit the sharing of software tools that are designed for, encourage, promote, support, or suggest in any way the use of synthetic or manipulated media for the creation of non-consensual intimate imagery; and Microsoft committed to launch a pilot to detect and delist duplicates of survivor-reported non-consensual intimate imagery in Bing’s search results (White House 2024a).
The Biden-Harris Administration’s call to action and the voluntary commitments it garnered address the wide range of actors that create harms, by producing or providing access to non-consensual intimate images, and create hazards, either through AI models and other tools that have eased the creation of fake but convincing intimate images and videos of women or through search engines and payment platforms that make the distribution of such images easy and lucrative.
This sociotechnical framing animates federal laws too. New policies build on existing laws against the non-consensual creation of sexual images—enacted in response to so-called ‘upskirt and down-shirt’ photos and videos made possible by increasingly small cameras capturing sexual images in public places—to address sexual digital forgeries and the non-consensual distribution of such content. The Violence Against Women Act (VAWA) Reauthorization Act of 2022 created a federal civil cause of action for individuals whose identifiable intimate visual images are disclosed without their consent, allowing victims to recover damages and legal fees (U.S. Congress 2022). The 2025 Tools to Address Known Exploitation by Immobilizing Technological Deepfakes on Websites and Networks (TAKE IT DOWN) Act, which covers both “digital forgeries” and “authentic intimate visual depictions,” targets individuals who knowingly publish non-consensual intimate imagery and the platforms that host it, requiring platforms to set up a process to remove covered images upon notice from a victim (U.S. Congress 2025).
Together, these new laws and the voluntary actions spurred by the call to action approach IBSA at an ecosystem level, intervening at various points (creation, distribution, hosting, monetization) to minimize harm. These actions address three distinct harms: the inclusion of IBSA in datasets, the production of synthetic IBSA that identifies real women, and the circulation of IBSA (synthetic and non-synthetic). They target model outputs that directly inflict harm (i.e., non-consensually produced and circulated intimate images) and embed risk mitigations across the sociotechnical system, involving a wide swathe of actors to reduce the production and circulation of IBSA. We will explore the first two through our model.
The inclusion of real non-consensual intimate images in datasets can be categorized as a harm to privacy. The voluntary commitments made by AI model providers include a commitment to safeguarding datasets used in model production from IBSA. While this harm can be mitigated at the level of training data, satisfying the sociotechnical and relational orientations of our model, it is unclear whether model developers possess the capacity and competencies to perform this mitigation. AI model developers can use their resources to create contractual requirements on dataset providers to use existing databases such as StopNCII and TakeItDown to identify and remove IBSA from datasets. However, other actors may be better able to effectively and legitimately undertake this task.
With respect to expertise, operational capacity, and efficiency, dataset developers, as opposed to AI model developers, are arguably better positioned to ensure their datasets do not contain IBSA: they control the dataset development process, they make decisions about sourcing data, and they can filter data before including it in a dataset. Other parties may have superior expertise in the subject matter of IBSA and possess the independence and representativeness viewed as essential to legitimately undertake the work. For example, StopNCII.org and Take It Down are non-profit initiatives that work directly with affected individuals and communities and are trusted to advance and protect their interests. Their databases are increasingly used by other AI actors to identify IBSA for harm mitigation (White House 2024a; Gregoire 2024). Only a diverse collection of entities has the right set of capacities and competencies to mitigate this harm effectively and legitimately. In addition, requiring dataset developers to ensure IBSA is removed scales risk management across all models using a dataset, rather than having each AI model developer undertake the task. Ideally, a distributed system of risk mitigation will emerge that, like the current approach to identifying and removing CSAM, enlists subject matter experts and victim-survivors in identifying material to be removed and creates automated screening tools to reduce the costs and improve the pace of removing such images from datasets.
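To make the screening mechanism concrete, the sketch below shows the basic hash-blocklist pattern such tools rely on: victim-reported images are fingerprinted once, and dataset candidates are checked against that fingerprint set before inclusion. This is an illustrative sketch only; the function names are hypothetical, and a cryptographic hash stands in for the perceptual hashes (e.g., PDQ or PhotoDNA) that real deployments use so that near-duplicate and re-encoded images also match.

```python
import hashlib

def image_fingerprint(image_bytes: bytes) -> str:
    # Placeholder: real systems use a perceptual hash so that resized or
    # re-encoded copies of a reported image still match; an exact
    # cryptographic hash is used here only to keep the sketch runnable.
    return hashlib.sha256(image_bytes).hexdigest()

def screen_dataset(candidates, blocklist_hashes):
    """Partition candidate images into (kept, removed) by checking each
    fingerprint against a victim-reported blocklist, such as the hash
    sets maintained by StopNCII or Take It Down (API shape assumed)."""
    kept, removed = [], []
    for name, data in candidates:
        if image_fingerprint(data) in blocklist_hashes:
            removed.append(name)   # excluded before dataset release
        else:
            kept.append((name, data))
    return kept, removed

# Toy example: one victim-reported image, two dataset candidates.
reported = b"\x89PNG-reported-image-bytes"
blocklist = {image_fingerprint(reported)}
candidates = [("a.png", reported), ("b.png", b"\x89PNG-other-image-bytes")]
kept, removed = screen_dataset(candidates, blocklist)
```

Because the blocklist circulates as hashes rather than images, dataset developers never need access to the reported content itself, which is part of why this pattern scales across many actors.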
A second harm is AI model outputs that either reproduce a non-consensual intimate image or produce a synthetic intimate image of a real identifiable person—a “sexual digital forgery.” Leading AI model developers agreed to incorporate iterative stress-testing and feedback loops in their development processes to guard against the production of both kinds of images. This harm mitigation strategy appears well targeted to the operational capacity of AI model developers. The output of the AI models is itself the harm, and AI model developers can use various technical processes, from prompt engineering to filtering outputs, to prevent the production of IBSA. However, here too they likely need the expertise, independence, and representativeness of other actors to define what IBSA is in relation to synthetic intimate images and to identify existing non-consensual intimate images. StopNCII, TakeItDown, and other groups that represent affected parties may play a role in both activities. In addition, transparency reports and third-party audits of AI model outputs may be important constraints on these risk mitigation practices, enabling others to check the validity of the AI model developers’ work.
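The output-side filtering described above can be sketched as a thin wrapper around a generative model: every output passes through a check before release, and flagged outputs are suppressed and routed back into the developer's stress-testing feedback loop. Everything here is illustrative; the function names are hypothetical, and a simple substring check stands in for the trained classifiers, perceptual-hash lookups, and human review that production systems would combine.

```python
def flags_reported_content(output: str, reported_markers) -> bool:
    # Stand-in classifier: real systems combine ML classifiers,
    # perceptual-hash matching against victim-reported imagery, and
    # human review; substring matching keeps this sketch runnable.
    lowered = output.lower()
    return any(marker in lowered for marker in reported_markers)

def guarded_generate(prompt: str, model, reported_markers):
    """Wrap a generative model with an output-side filter: flagged
    outputs are suppressed rather than returned, so they can be
    logged for the developer's red-team feedback loop instead."""
    output = model(prompt)
    if flags_reported_content(output, reported_markers):
        return None  # suppressed; would also be queued for review
    return output

# Toy stand-in model for illustration only.
toy_model = lambda prompt: f"rendered: {prompt}"
blocked = guarded_generate("reported-item-7", toy_model, {"reported-item-7"})
allowed = guarded_generate("a landscape", toy_model, {"reported-item-7"})
```

The design point is that the wrapper, not the model weights, is the intervention site: the check can be maintained and audited by parties other than the model developer, which is what lets groups like StopNCII contribute definitions and hash sets without access to the model itself.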
Finally, the updates to federal law deter individuals from distributing real intimate images and sexual digital forgeries without consent and require websites, platforms, and mobile applications to remove both kinds of images when notified. The legal framework taps into the expertise of those harmed by IBSA and the capacity of entities that host it to curb its distribution and availability.
Examining efforts to address IBSA through our conceptual model highlights the sociotechnical approach, the focus on coordinated action to reduce harms and hazards, and the practical and normative benefits of enlisting a wider range of actors in risk mitigation activities.
IV. Implications for AI Policy
Policymakers must recenter harm reduction as the goal of risk management and require a sociotechnical orientation to achieve it. AI policy debates increasingly assume that the growing capabilities of a few AI models in a handful of specific areas generate the only—or at least the most significant—pathways to harm. This assumption has whittled down risk governance activities, limiting the sites where mitigations are deployed, constraining the tools and methods of mitigation, and narrowing the actors and expertise brought into these activities. The assumption that mitigations aimed directly at model capabilities are the key to harm reduction is misguided. While model evaluations and mitigations are necessary, they are insufficient to address harms, as they rarely meaningfully reduce the wide range of hazards and harms to which models contribute. The model-centricity of current risk mitigation practices will not reduce harms, and it has given AI model developers and deployers the discretion to operationalize and mitigate risks with little public participation and oversight. The resulting risk management governance system is flawed and occluded: it ignores many more effective ways to mitigate relevant risks and sidelines external players with core competencies and desirable independence from the economic interests of the AI industry. In simpler terms, it often amounts to self-regulation.
A sociotechnical approach to risk governance centers harms and examines systems to understand how hazards arise from configurations of system components. This sociotechnical approach requires new institutional arrangements and infrastructures to involve a wide range of actors with different kinds of expertise in risk mitigation of AI use cases. These rearrangements, as we describe above, alter both the process and the substance of evaluations, aligning them with public goals and building public trust.
Governments, nonprofits, and companies need to invest in the human and technical infrastructure and the research to support sociotechnical governance. This ranges from building the infrastructure needed for third-party actors to find and report flaws in AI systems (see, e.g., Longpre et al. 2025), to investing in AI safety tooling across the AI stack—not just at the model level (see, e.g., Marda et al. 2024; Surman and Bdeir 2025). It requires financial support for civil society, academia, and government actors who provide the expertise, capacity, and independence necessary for legitimate and accountable risk governance. Crucially, many of these interventions can leverage testing and regulatory processes set up for specific domains, such as using existing reporting frameworks and subject matter experts around medical device safety to identify risks in AI systems deployed in the healthcare system. Evaluations and mitigations at other frames (e.g., data, model, system), while important, do not capture the organized complexity that produces hazards and harms in the wild. Testing AI systems in deployed and operational contexts is vital, as is ongoing monitoring. Some policy documents, such as the Biden-Harris administration’s guidance to federal agencies on the use of AI (OMB 2024), recognize the importance of this. Other regulatory frameworks, such as those enforced by the Federal Trade Commission, also look at AI hazards and harms as they manifest in the real world to build a case against unfair or deceptive AI practices (Nguyen 2025).
We recommend four concrete steps for stakeholders in the AI ecosystem to better incorporate this perspective into their approach to AI risk mitigation:
First, policymakers should begin by developing a sociotechnical system map that identifies the technical and organizational system components related to the harm under exploration. In the related area of Human Rights Impact Assessments, the Global Network Initiative and Business for Social Responsibility take a ‘value-chain’ ecosystem approach that examines the interdependencies among many different actors (suppliers, providers, deployers) to map and mitigate risks relationally, going beyond responsibilities located within particular firms (Global Network Initiative, n.d.b). This approach calls for engagement with a broadened set of stakeholders, spanning the public and private sectors, academia, and civil society, including those that are not directly implicated in the value chain but nonetheless impact or are impacted by it.
Second, task the deployers of AI systems with assessing and mitigating the risks of harm from specific AI use cases, not the full range of potential risks. Currently, responsibility is often placed on AI deployers for a much broader set of risks than those manifested by particular use cases. Deployers should focus on mitigating the risks that arise from the specific use cases they deploy, working backwards from the harms associated with those use cases, rather than considering the whole range of possible AI risk mitigations. This strategy would result in a more holistic and targeted approach, as described throughout this paper.
Third, regulatory frameworks should reduce reliance on the developers and deployers of AI systems to independently engage in risk mitigation activities. As this paper argues, this often produces self-regulation. Policymakers should instead directly require, or at least incentivize, entities that develop and deploy systems to enlist external stakeholders with relevant expertise in risk assessment and mitigation processes. This includes bringing external stakeholders into strategic decisions about where and how to mitigate risks, and, where relevant, into risk mitigation activities. Mechanisms for sustainably funding the expertise of these external parties, and for building the expertise of government agencies, are essential to AI risk management. One author has suggested potential sustainable funding models (Mulligan and Bamberger 2018; Doty and Mulligan 2013). And civil society organizations such as the National Fair Housing Alliance are innovating funding models, beyond philanthropy, to support robust AI and domain-specific expertise (National Fair Housing Alliance, n.d.).
Finally, governments and companies need to invest in the infrastructure and research to support evaluations of sociotechnical systems and a richer set of technical and non-technical risk mitigation techniques. The National Artificial Intelligence Research Resource (NAIRR) Pilot (NSF, n.d.) and CalCompute (California 2025) are infrastructural efforts in this direction that provide not only the necessary cloud computing infrastructure but also the appropriate human expertise. Evaluations of models and datasets do not capture the organized complexity that produces harms and increases their probability in the wild, and model-centric mitigations cannot on their own prevent harms on the ground. As discussed in Part II, Data & Society’s AIMLab represents an innovation in evaluations that are not merely technical. Building a testing infrastructure—physical, technical, human, and methodological—for AI use cases is key to identifying and addressing harms in the wild. There are efforts in this direction within the U.S. government, for example the Department of Homeland Security’s biometric testing facility and, as discussed earlier in the paper, NCMEC; and in the private sector, the Health AI Partnership’s work to define the requirements for adequate organizational governance of AI systems in healthcare settings (Kim et al. 2023; Health AI Partnership 2024; The Maryland Test Facility, n.d.).
Together, these four steps would drive a sociotechnical systems approach to the study and mitigation of AI risks that focuses on reducing harms through coordinated action, is oriented around use cases that include social and technical components, targets mitigations toward appropriate sites across the sociotechnical system, and enlists a wide range of actors who possess the capacity and competencies necessary to effectively and legitimately perform the work. For AI systems to be trustworthy, risk mitigations must be performed by entities with the relevant expertise and operational capacity, including technical and human resources and access to the relevant system components. For systems to be trusted, risk mitigations should lie in the hands of entities viewed as legitimate due to their independence and representativeness. Our approach recenters risk management on the prevention of harms on the ground, situating it within a complex everyday world rather than in firms and research labs, and in doing so paves the path toward a vision of accountable AI governance in the public interest.
Citations
Abbott, Kenneth W., and Duncan Snidal. 2009. “The Governance Triangle: Regulatory Standards Institutions and the Shadow of the State.” In The Politics of Global Regulation, edited by Walter Mattli & Ngaire Woods. Princeton University Press.
AFL-CIO. 2021. “AFL-CIO Launches Technology Institute.” January 11. https://aflcio.org/press/releases/afl-cio-launches-technology-institute.
AISI (AI Security Institute). 2025. “Our Research Agenda.” May 6. https://www.aisi.gov.uk/research-agenda.
Allen, Danielle, Sarah Hubbard, Woojin Lim, et al. 2025. “A Roadmap for Governing AI: Technology Governance and Power Sharing Liberalism.” AI Ethics 5: 3355–77. https://doi.org/10.1007/s43681-024-00635-y.
Anderljung, Markus, Joslyn Barnhart, Anton Korinek, et al. 2023. “Frontier AI Regulation: Managing Emerging Risks to Public Safety.” Preprint, arXiv, July 6. arXiv:2307.03718.
Apollo Research. 2024. “We Need A Science of Evals.” January 22. https://www.apolloresearch.ai/blog/we-need-a-science-of-evals/.
Article 29 Data Protection Working Party. 2014. Statement On The Role Of A Risk-Based Approach In Data Protection Legal Frameworks (WP 218). May 30. https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2014/wp218_en.pdf.
Bamberger, Kenneth A. 2006. “Regulation as Delegation: Private Firms, Decisionmaking, and Accountability in the Administrative State.” Duke Law Journal 56(2): 377–468. https://scholarship.law.duke.edu/dlj/vol56/iss2/1.
Benjamin, Ruha. 2019. Race After Technology: Abolitionist Tools for the New Jim Code. Polity Press.
Biden, Joseph R. 2024. National Security Memorandum on Advancing the United States' Leadership in Artificial Intelligence; Harnessing Artificial Intelligence To Fulfill National Security Objectives; and Fostering the Safety, Security, and Trustworthiness of Artificial Intelligence. The White House. https://archive.is/hFQ8i.
Bogen, Miranda, and Amy Winecoff. 2024. “Applying Sociotechnical Approaches to AI Governance in Practice.” Center for Democracy and Technology, May 15. https://cdt.org/insights/applying-sociotechnical-approaches-to-ai-governance-in-practice/.
Browne, Simone. 2015. Dark Matters: On the Surveillance of Blackness. Duke University Press.
Buolamwini, Joy, and Timnit Gebru. 2018. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” Proceedings of Machine Learning Research 81: 1–15. https://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf.
Business for Social Responsibility. 2025. A Human Rights Assessment of the Generative AI Value Chain. https://www.bsr.org/files/BSR-A-Human-Rights-Assessment-of-the-Generative-AI-Value-Chain.pdf.
CAISI (Center for AI Standards and Innovation, formerly U.S. AI Safety Institute). 2024. The United States Artificial Intelligence Safety Institute: Vision, Mission, and Strategic Goals. https://www.nist.gov/system/files/documents/2024/05/21/AISI-vision-21May2024.pdf.
CAISI (Center for AI Standards and Innovation, formerly U.S. AI Safety Institute) and AISI (AI Security Institute, formerly UK AI Safety Institute). 2024a. US AISI and UK AISI Joint Pre-Deployment Test: OpenAI o1. https://cdn.prod.website-files.com/663bd486c5e4c81588db7a1d/6763fac97cd22a9484ac3c37_o1_uk_us_december_publication_final.pdf.
CAISI (Center for AI Standards and Innovation, formerly U.S. AI Safety Institute) and AISI (AI Security Institute, formerly UK AI Safety Institute). 2024b. US AISI and UK AISI Joint Pre-Deployment Test: Anthropic’s Claude 3.5 Sonnet. https://cdn.prod.website-files.com/663bd486c5e4c81588db7a1d/673b689ec926d8d32e889a8e_UK-US-Testing-Report-Nov-19.pdf.
California. 2025. Artificial Intelligence Models: Large Developers, Ch. 138, Statutes of 2025. Enacted September 29, 2025.
Casper, Stephen, Carson Ezell, Charlotte Siegmann, et al. 2024. “Black-Box Access is Insufficient for Rigorous AI Audits.” Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 2254–2272. https://doi.org/10.1145/3630106.3659037.
CDT (Center for Democracy and Technology). n.d. “AI Governance Lab.” Accessed February 25, 2026. https://cdt.org/cdt-ai-governance-lab/.
Chairs and the Vice-Chairs of the General-Purpose AI Code of Practice. 2025. Third Draft of the General-Purpose AI Code of Practice. European Commission. https://digital-strategy.ec.europa.eu/en/library/third-draft-general-purpose-ai-code-practice-published-written-independent-experts.
Chen, Angela. 2019. “Three Threats Posed by Deepfakes That Technology Won’t Solve.” MIT Technology Review, October 2. https://www.technologyreview.com/2019/10/02/75400/deepfake-technology-detection-disinformation-harassment-revenge-porn-law/.
Chi, Nicole, Emma Lurie, and Deirdre K. Mulligan. 2021. “Reconfiguring Diversity and Inclusion for AI Ethics.” Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 447–57. https://doi.org/10.1145/3461702.3462622.
Clymer, Joshua, Jonah Weinbaum, Robert Kirk, Kimberly Mai, Selena Zhang, and Xander Davies. 2025. “An Example Safety Case for Safeguards Against Misuse.” Preprint, arXiv, May 23. https://doi.org/10.48550/arXiv.2505.18003.
Coglianese, Cary, and Colton R. Crum. 2025. “Leashes, Not Guardrails: A Management-Based Approach to Artificial Intelligence Risk Regulation.” Risk Analysis, 45 (12): 4397–4407. https://doi.org/10.1111/risa.70020.
Cohen, Julie E., and Ari Azra Waldman. 2023. “Introduction: Framing Regulatory Managerialism as an Object of Study and Strategic Displacement.” Law & Contemp. Probs. 86 (3). https://ssrn.com/abstract=4661146.
Colman, Zack, Annie Snyder, and James Bikales. 2025. “Why Texas’ Floods Are a Warning for the Rest of the Country.” Politico, July 8. https://www.politico.com/news/2025/07/08/climate-change-makes-deadly-floods-more-likely-but-washington-is-responding-with-cuts-00441921.
Data & Society. n.d. “Algorithmic Impact Methods Lab.” Accessed January 15, 2026. https://datasociety.net/research/algorithmic-impact-methods-lab/?tab=About.
Data & Society. n.d. Pilot 1 Case Report: The City of San José, Object Detection. https://datasociety.net/wp-content/uploads/2025/10/Pilot-1-San-Jose.pdf.
Dobbe, Roel I. J. 2022. “System Safety and Artificial Intelligence.” In Oxford Handbook of AI Governance, edited by Justin B. Bullock, Yu-Che Chen, Johannes Himmelreich, et al. Oxford University Press.
Dobbe, Roel I. J. 2025. “AI Safety is Stuck in Technical Terms—A System Safety Response to the International AI Safety Report.” Preprint, arXiv, February 5. https://doi.org/10.48550/arXiv.2503.04743.
Doty, Nick, and Deirdre K. Mulligan. 2013. “Internet Multistakeholder Processes and Techno-Policy Standards: Initial Reflections on Privacy at the World Wide Web Consortium.” J. on Telecomm. & High Tech. L. 11 (135).
Dourish, Paul. 2001. Where the Action Is: The Foundations of Embodied Interaction. MIT Press.
Edelman, Lauren B. 2016. Working Law: Courts, Corporations, and Symbolic Civil Rights. University of Chicago Press.
Eliot, Lance. 2025. “OpenAI Acknowledges That Lengthy Conversations With ChatGPT And GPT-5 Might Regrettably Escape AI Guardrails.” Forbes, August 29. https://www.forbes.com/sites/lanceeliot/2025/08/29/openai-acknowledges-that-lengthy-conversations-with-chatgpt-and-gpt-5-might-regrettably-escape-ai-guardrails/.
Exec. Order No. 14110, 88 Fed. Reg. 75191 (Oct. 30, 2023).
Ford, Cristie. 2023. “Regulation as Respect.” Law & Contemp. Probs. 86: 133–55. https://scholarship.law.duke.edu/lcp/vol86/iss3/6.
Fox, Stephen, and Juan G. Victores. 2024. “Safety of Human–Artificial Intelligence Systems: Applying Safety Science to Analyze Loopholes in Interactions between Human Organizations, Artificial Intelligence, and Individual People.” Informatics 11 (2): 36. https://doi.org/10.3390/informatics11020036.
Gandhi, Kanishk, Jan-Philipp Fränken, Tobias Gerstenberg, and Noah D. Goodman. 2023. “Understanding Social Reasoning in Language Models with Language Models.” Preprint, arXiv, December 4. https://arxiv.org/abs/2306.15448.
Gandikota, Rohit, Hadas Orgad, Yonatan Belinkov, Joanna Materzyńska, and David Bau. 2024. “Unified Concept Editing in Diffusion Models.” Preprint, arXiv. arXiv:2308.14761.
Ganguli, Deep, Nicholas Schiefer, Marina Favaro, and Jack Clark. 2023. “Challenges in Evaluating AI Systems.” Anthropic, October 4. https://www.anthropic.com/index/evaluating-ai-systems.
G.A. Res. 78/265, Seizing the Opportunities of Safe, Secure and Trustworthy Artificial Intelligence Systems for Sustainable Development (Mar. 21, 2024), https://docs.un.org/en/A/res/78/265.
Global Network Initiative. n.d.a. “Human Rights Due Diligence Across the Technology Ecosystem.” https://eco.globalnetworkinitiative.org/.
Global Network Initiative n.d.b. “The Importance of an Ecosystem Approach.” https://eco.globalnetworkinitiative.org/ecosystem-approach/.
Goemans, Arthur, Marie Davidsen Buhl, Jonas Schuett, et al. 2024. “Safety Case Template for Frontier AI: A Cyber Inability Argument.” Preprint, arXiv, November 12. https://arxiv.org/pdf/2411.08088.
Goldenfein, Jake, Deirdre K. Mulligan, Helen Nissenbaum, and Wendy Ju. 2020. “Through The Handoff Lens: Competing Visions of Autonomous Futures.” Berkeley Technology Law Journal 35(3): 835–910. https://doi.org/10.15779/Z38CR5ND0J.
Gregoire, Courtney. 2024. “An Update on Our Approach to Tackling Intimate Image Abuse.” Microsoft, September 5. https://blogs.microsoft.com/on-the-issues/2024/09/05/an-update-on-our-approach-to-tackling-intimate-image-abuse/.
Guihot, Michael, Anne F. Matthew, and Nicolas P. Suzor. 2020. “Nudging Robots: Innovative Solutions to Regulate Artificial Intelligence.” Vanderbilt Journal of Entertainment and Technology Law 20 (2): 385. https://scholarship.law.vanderbilt.edu/jetlaw/vol20/iss2/2.
Health AI Partnership. 2024. Event Report: A Summit On AI Product Lifecycle Management in Healthcare. https://drive.google.com/file/d/14qL9MYctX76pd0W87p2lONZnasQ21ucB/view.
Heikkilä, Melissa. 2024. “AI Companies Promised to Self-Regulate One Year Ago. What’s Changed?” MIT Technology Review, July 22. https://www.technologyreview.com/2024/07/22/1095193/ai-companies-promised-the-white-house-to-self-regulate-one-year-ago-whats-changed/.
Henderson, Peter, Eric Mitchell, Christopher Manning, Dan Jurafsky, and Chelsea Finn. 2023. “Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models.” Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 287–96. https://dl.acm.org/doi/abs/10.1145/3600211.3604690.
Henshall, Will. 2024. “Nobody Knows How to Safety-Test AI.” TIME, March 21. https://time.com/6958868/artificial-intelligence-safety-evaluations-risks/.
Inan, Hakan, Kartikeya Upasani, Jianfeng Chi, et al. 2023. “Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations.” Preprint, arXiv, December 7. https://doi.org/10.48550/arXiv.2312.06674.
International Organization for Standardization and International Electrotechnical Commission. 2023. Information Technology – Artificial Intelligence – Guidance on Risk Management (ISO/IEC 23894). https://www.iso.org/standard/77304.html.
Kaminski, Margot E. 2023. “Regulating the Risks of AI.” Boston University Law Review 103 (5): 1347–1411. https://doi.org/10.2139/ssrn.4195066.
Karnofsky, Holden. 2024. “If-Then Commitments for AI Risk Reduction.” Carnegie Endowment for International Peace, September 13. https://carnegieendowment.org/research/2024/09/if-then-commitments-for-ai-risk-reduction?lang=en.
Kieslich, Kimon, Natali Helberger, and Nicholas Diakopoulos. 2025. “Scenario-Based Sociotechnical Envisioning (SSE): An Approach to Enhance Systemic Risk Assessments.” Preprint, SocArXiv, January 29. https://doi.org/10.31235/osf.io/ertsj_v1.
Kim, Jee Young, William Boag, Freya Gulamali, et al. 2023. “Organizational Governance of Emerging Technologies: AI Adoption in Healthcare.” FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 1396–417. https://doi.org/10.1145/3593013.3594089.
Knabb, Richard D., Jamie R. Rhome, and Daniel P. Brown. 2023. Tropical Cyclone Report: Hurricane Katrina. National Hurricane Center. https://www.nhc.noaa.gov/data/tcr/AL122005_Katrina.pdf.
Knowles, Scott Gabriel. 2014. “Learning from Disaster? The History of Technology and the Future of Disaster Research.” Technology and Culture 55 (4): 773–84. http://www.jstor.org/stable/24468470.
Kuehnert, Blaine, Rachel Kim, Jodi Forlizzi, and Hoda Heidari. 2025. “The ‘Who,’ ‘What,’ and ‘How’ of Responsible AI Governance: A Systematic Review and Meta-Analysis of (Actor, Stage)-Specific Tools.” FAccT '25: Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency, 2991–3005. https://doi.org/10.1145/3715275.3732191.
Jones, Erik, Robin Jia, Aditi Raghunathan, and Percy Liang. 2020. “Robust Encodings: A Framework for Combating Adversarial Typos.” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2752–65. https://aclanthology.org/2020.acl-main.245/.
Layne, Nathan. 2021. “New Orleans’ Levees Got a $14.5 Billion Upgrade. Will They Hold?” Reuters, August 29. https://www.reuters.com/world/us/new-orleans-levees-got-145-billion-upgrade-will-they-hold-2021-08-30/.
Leveson, Nancy. 2012. Engineering a Safer World: Systems Thinking Applied to Safety. MIT Press.
Longpre, Shayne, Kevin Klyman, Ruth E. Appel, et al. 2025. “In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI.” Preprint, arXiv, March 21. https://doi.org/10.48550/arXiv.2503.16861.
Marchant, Gary E., and Yvonne A. Stevens. 2017. “Resilience: A New Tool in the Risk Governance Toolbox for Emerging Technologies.” U.C. Davis L. Rev. 51: 233–36.
Marda, Nik, Jasmine Sun, and Mark Surman. 2024. “Public AI: Making AI Work For Everyone, By Everyone.” Mozilla. https://assets.mofoprod.net/network/documents/Public_AI_Mozilla.pdf.
McGlynn, Clare, and Erika Rackley. 2017. “Image-Based Sexual Abuse.” Oxford Journal of Legal Studies, 37 (3): 534–61. https://doi.org/10.1093/ojls/gqw033.
MdTF. n.d. “The Maryland Test Facility.” Accessed February 25, 2026. https://mdtf.org/.
Metcalf, Jacob, Emanuel Moss, Elizabeth Anne Watkins, Ranjit Singh, and Madeleine Clare Elish. 2021. “Algorithmic Impact Assessments and Accountability: The Co-Construction of Impacts.” FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 735–46. https://doi.org/10.1145/3442188.3445935.
METR. 2023. “Responsible Scaling Policies (RSPs).” METR, September 26. Accessed January 7, 2025. https://metr.org/blog/2023-09-26-rsp/.
Mittelstadt, Brent D. 2019. “Principles Alone Cannot Guarantee Ethical AI.” Nature Machine Intelligence 1: 501–7. https://doi.org/10.1038/s42256-019-0114-4.
Mulligan, Deirdre K., and Kenneth A. Bamberger. 2018. “Saving Governance-By-Design.” California Law Review 106 (3): 697–784. https://doi.org/10.15779/Z38QN5ZB5H.
Mulligan, Deirdre K., and Kenneth A. Bamberger. 2021. “Allocating Responsibility In Content Moderation: A Functional Framework.” Berkeley Technology Law Journal, 36 (3): 1091–172. https://doi.org/10.15779/Z383B5W872.
Mulligan, Deirdre K., and Helen Nissenbaum. 2020. “The Concept of Handoff as a Model for Ethical Analysis and Design.” In Oxford Handbook of Ethics of AI, edited by Markus D. Dubber, Frank Pasquale, and Sunit Das. Oxford University Press.
Narayanan, Arvind, and Sayash Kapoor. 2024. “AI Safety is Not a Model Property.” AI As Normal Technology, March 12. https://www.normaltech.ai/p/ai-safety-is-not-a-model-property.
NAIAC (National Artificial Intelligence Advisory Committee). 2023. National Artificial Intelligence Advisory Committee Year 1. https://web.archive.org/web/20230905003617/https://www.ai.gov/wp-content/uploads/2023/05/NAIAC-Report-Year1.pdf.
NASEM (National Academies of Sciences, Engineering, and Medicine). 2021. Assessing and Improving AI Trustworthiness: Current Contexts and Concerns: Proceedings of a Workshop–in Brief. National Academies Press. https://doi.org/10.17226/26208.
National Fair Housing Alliance. n.d. “Algorithmic Bias in Housing and Lending.” Accessed February 25, 2026. https://nationalfairhousing.org/issue/tech-equity-initiative/.
Nguyen, Stephanie. 2025. “AI-Related Programmatic Advances at the FTC (June 2021 - January 2025).” Federal Trade Commission, January 17. https://www.ftc.gov/news-events/news/public-statements/ai-related-programmatic-advances-ftc-june-2021-january-2025.
NIST (National Institute of Standards and Technology). 2023. Artificial Intelligence Risk Management Framework (NIST AI 100-1). https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf.
NIST (National Institute of Standards and Technology). 2024a. Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile (NIST AI 600-1). https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf.
NIST (National Institute of Standards and Technology). 2024b. Managing Misuse Risk for Dual-Use Foundation Models (NIST AI 800-1). https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.800-1.ipd.pdf.
NIST (National Institute of Standards and Technology). 2025. “CAISI Works with OpenAI and Anthropic to Promote Secure AI Innovation.” National Institute of Standards and Technology, September 25. https://www.nist.gov/news-events/news/2025/09/caisi-works-openai-and-anthropic-promote-secure-ai-innovation.
Noble, Safiya Umoja. 2018. Algorithms of Oppression: How Search Engines Reinforce Racism. New York University Press.
NSF (National Science Foundation). n.d. “National Artificial Intelligence Research Resource Pilot.” Accessed January 15, 2026. https://www.nsf.gov/focus-areas/ai/nairr.
NSF (National Science Foundation). 2023. “NSF Announces 7 New National Artificial Intelligence Research Institutes.” U.S. National Science Foundation, May 4. https://www.nsf.gov/news/nsf-announces-7-new-national-artificial.
NSF (National Science Foundation). 2024. “Responsible Design, Development, and Deployment of Technologies (ReDDDoT).” January 8. https://www.nsf.gov/funding/opportunities/redddot-responsible-design-development-deployment-technologies/506215/nsf24-524.
NSF (National Science Foundation). 2025. “NSF and NVIDIA Partnership Enables Ai2 to Develop Fully Open AI Models to Fuel U.S. Scientific Innovation.” August 14. https://www.nsf.gov/news/nsf-nvidia-partnership-enables-ai2-develop-fully-open-ai.
O’Brien, Kyle, Stephen Casper, Quentin Anthony, et al. 2025. “Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs.” Preprint, arXiv, August 8. https://arxiv.org/abs/2508.06601.
OMB (Office of Management and Budget). 2024. Guidance For Agency Artificial Intelligence Reporting per EO 14110. Office of Management and Budget, August 14. https://www.cio.gov/assets/resources/2024-Guidance-for-AI-Use-Case-Inventories.pdf.
OpenAI. 2025. GPT-5 System Card. August 13. https://cdn.openai.com/gpt-5-system-card.pdf.
Ordoñez, Franco. 2023. “These Tech Giants Are at The White House Today to Talk About the Risks of AI.” NPR, September 12. https://www.npr.org/2023/09/12/1198885516/these-tech-giants-are-at-the-white-house-today-to-talk-about-the-risks-of-ai.
Polemi, Nineta, Isabel Praça, Kitty Kioskli, and Adrien Bécue. 2024. “Challenges and Efforts in Managing AI Trustworthiness Risks: A State of Knowledge.” Frontiers in Big Data 7. https://doi.org/10.3389/fdata.2024.1381163.
Prabhakar, Arati, and Jennifer Klein. 2024. “A Call to Action to Combat Image-Based Sexual Abuse.” White House Office of Science and Technology Policy, May 23. https://bidenwhitehouse.archives.gov/ostp/news-updates/2024/05/23/a-call-to-action-to-combat-image-based-sexual-abuse/.
Raji, Inioluwa Deborah, and Robel Dobbe. 2023. “Concrete Problems in AI Safety, Revisited.” Preprint, arXiv, December 18. https://arxiv.org/abs/2401.10899.
Regulation (EU) 2022/2065 of the European Parliament and of the Council of 19 October 2022 on a Single Market for Digital Services and amending Directive 2000/31/EC (Digital Services Act), 2022 O.J. (L 277) 1.
Regulation (EU) 2024/1689, of the European Parliament and of the Council of 13 June 2024, laying down harmonised rules on artificial intelligence (Artificial Intelligence Act), 2024 O.J. (L 1689) 1.
Scherer, Matthew U. 2016. “Regulating Artificial Intelligence Systems: Risks, Challenges, Competencies, and Strategies.” Harv. J.L. & Tech. 29 (2): 353–6. http://dx.doi.org/10.2139/ssrn.2609777.
Sendak, Mark P., William Ratliff, Dina Sarro, et al. 2020. “Real-World Integration of a Sepsis Deep Learning Technology Into Routine Clinical Care: Implementation Study.” JMIR Medical Informatics 8 (7): e15182. doi:10.2196/15182.
Smuha, Nathalie A., Emma Rengers, Adam Harkens, et al. 2021. “How the EU Can Achieve Legally Trustworthy AI: A Response to the European Commission’s Proposal for an Artificial Intelligence Act.” SSRN. http://dx.doi.org/10.2139/ssrn.3899991.
Solow-Niederman, Alicia. 2020. “Administering Artificial Intelligence.” S. Cal. L. Rev. 93: 633–96. http://dx.doi.org/10.2139/ssrn.3495725.
Stein, Merlin, Milan Gandhi, Theresa Kriecherbauer, Amin Oueslati, and Robert Trager. 2024. “Public vs Private Bodies: Who Should Run Advanced AI Evaluations and Audits? A Three-Step Logic Based on Case Studies of High-Risk Industries.” Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7 (1): 1401–15. https://doi.org/10.1609/aies.v7i1.31733.
Suchman, Lucy A. 1987. Plans and Situated Actions: The Problem of Human-Machine Communication. Cambridge University Press.
Surman, Mark, and Ayah Bdeir. 2025. “ROOST: Open Source AI Safety for Everyone.” Mozilla, February 10. https://blog.mozilla.org/en/mozilla/ai/roost-launch-ai-safety-tools-nonprofit/.
The Leadership Conference on Civil and Human Rights. 2023. “The Leadership Conference Education Fund Announces Its ‘Center for Civil Rights and Technology,’ a First of Its Kind Research and Advocacy Hub.” September 7. https://civilrights.org/2023/09/07/the-leadership-conference-education-fund-announces-its-center-for-civil-rights-and-technology-a-first-of-its-kind-research-and-advocacy-hub/.
Thiel, David. 2023. Identifying and Eliminating CSAM in Generative ML Training Data and Models. Stanford Internet Observatory. https://www.congress.gov/118/meeting/house/116913/documents/HHRG-118-JU08-20240306-SD005-U5.pdf.
Tran, Marc. 2015. “Combatting Gender Privilege and Recognizing a Woman's Right to Privacy in Public Spaces: Arguments to Criminalize Catcalling and Creepshots.” Hastings J. Gender & L. 26 (2): 185–206. https://repository.uclawsf.edu/hwlj/vol26/iss2/1.
U.S. Congress. 2022. Violence Against Women Act Reauthorization Act of 2022. Pub. L. No. 117-103, 136 Stat. 840.
U.S. Congress. Senate. Committee on the Judiciary. Subcommittee on Privacy, Technology, and the Law. 2023a. Oversight of AI: Rules for Artificial Intelligence. Hearing, 118th Cong. (Testimony and Questions for the Record of Sam Altman, Chief Executive Officer, OpenAI. Testimony advocating for licensing or registration requirements that would ensure risk management practices are applied to “AI models above a crucial threshold of capabilities.” Questions for the Record “For future generations of the most highly capable foundation models, which are likely to prove more capable than models that have been previously shown to be safe, we support the development of registration, disclosure, and licensing requirements…. Licensees could be required to perform pre-deployment risk assessments and adopt state-of-the-art security and deployment safeguards.”)
U.S. Congress. Senate. Committee on the Judiciary. Subcommittee on Privacy, Technology, and the Law. 2023b. Oversight of AI: Rules for Artificial Intelligence. Hearing, 118th Cong. (Testimony and Questions for the Record of Gary Marcus, Professor Emeritus, New York University. Testimony arguing for “independent scientists access” to test AI systems “before they are widely released – as part of a clinical trial-like safety evaluation.” Questions for the Record advocating the creation of “an FDA-like regulatory regime for AI that evaluates large-scale deployment, balancing risks and benefits.”)
U.S. Congress. Senate. Committee on the Judiciary. Subcommittee on Privacy, Technology, and the Law. 2023c. Oversight of AI: Rules for Artificial Intelligence. Hearing, 118th Cong. (Testimony of Christina Montgomery, Chief Privacy and Trust Officer, IBM, advocating for “risk-based, use-case specific approach” to AI regulation.)
U.S. Congress. 2025. Tools to Address Known Exploitation by Immobilizing Technological Deepfakes on Websites and Networks Act (Take It Down Act). Pub. L. No. 119-12, 139 Stat. 55.
U.S. Department of State, Bureau of Cyberspace and Digital Policy. 2024. Risk Management Profile for Artificial Intelligence and Human Rights. https://2021-2025.state.gov/risk-management-profile-for-ai-and-human-rights/.
Vidgen, Bertie, Adarsh Agrawal, Ahmed M. Ahmed, et al. 2024. “Introducing v0.5 of the AI Safety Benchmark from MLCommons.” Preprint, arXiv, April 18. https://arxiv.org/abs/2404.12241.
Vought, Russell T. 2025. “M-25-21 Memorandum for the Heads of Executive Departments and Agencies: Accelerating Federal Use of AI through Innovation, Governance, and Public Trust.” Office of Management and Budget. https://www.whitehouse.gov/wp-content/uploads/2025/02/M-25-21-Accelerating-Federal-Use-of-AI-through-Innovation-Governance-and-Public-Trust.pdf.
Wachter, Sandra, Brent Mittelstadt, and Chris Russell. 2020. “Why Fairness Cannot Be Automated: Bridging the Gap Between EU Non-Discrimination Law and AI.” Computer Law & Security Review 41: 46–47. http://dx.doi.org/10.2139/ssrn.3547922.
Wachter, Sandra. 2024. “Limitations and Loopholes in the EU AI Act and AI Liability Directives: What This Means for the European Union, the United States, and Beyond.” Yale Journal of Law & Technology 26 (3): 671–718. http://dx.doi.org/10.2139/ssrn.4924553.
Wasil, Akash R., Joshua Clymer, David Krueger, Emily Dardaman, Simeon Campos, and Evan R. Murphy. 2024. “Affirmative Safety: An Approach to Risk Management for High-Risk AI.” Preprint, arXiv, April 14. https://doi.org/10.48550/arXiv.2406.15371.
Wei, Alexander, Nika Haghtalab, and Jacob Steinhardt. 2024. “Jailbroken: How Does LLM Safety Training Fail?” Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023). https://proceedings.neurips.cc/paper_files/paper/2023/file/fd6613131889a4b656206c50a8bd7790-Paper-Conference.pdf.
Weidinger, Laura, Maribeth Rauh, Nahema Marchal, et al. 2023. “Sociotechnical Safety Evaluation of Generative AI Systems.” Preprint, arXiv, October 31. https://doi.org/10.48550/arXiv.2310.11986.
Weidinger, Laura, Joslyn Barnhart, Jenny Brennan, et al. 2024. “Holistic Safety and Responsibility Evaluations of Advanced AI Models.” Preprint, arXiv, April 22. https://doi.org/10.48550/arXiv.2404.14068.
White House. 2023. Voluntary AI Commitments. https://bidenwhitehouse.archives.gov/wp-content/uploads/2023/09/Voluntary-AI-Commitments-September-2023.pdf.
White House. 2024a. “White House Announces New Private Sector Voluntary Commitments to Combat Image-Based Sexual Abuse.” September 12. https://bidenwhitehouse.archives.gov/ostp/news-updates/2024/09/12/white-house-announces-new-private-sector-voluntary-commitments-to-combat-image-based-sexual-abuse/.
White House. 2024b. “Fact Sheet: Key AI Accomplishments in the Year Since the Biden-Harris Administration's Landmark Executive Order.” October 30. https://bidenwhitehouse.archives.gov/briefing-room/statements-releases/2024/10/30/fact-sheet-key-ai-accomplishments-in-the-year-since-the-biden-harris-administrations-landmark-executive-order/.
White House OSTP (Office of Science and Technology Policy). 2022. Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People. The White House. https://bidenwhitehouse.archives.gov/ostp/ai-bill-of-rights/.
White House OSTP (Office of Science and Technology Policy). 2024. Framework for Nucleic Acid Synthesis Screening. The White House. https://bidenwhitehouse.archives.gov/wp-content/uploads/2024/04/Nucleic-Acid_Synthesis_Screening_Framework.pdf.
Yang, Stephen. 2024. “Beyond High-Risk Scenarios: Recentering the Everyday Risks of AI.” Center for Democracy and Technology, October 22. https://cdt.org/insights/beyond-high-risk-scenarios-recentering-the-everyday-risks-of-ai/. (Noting that “Anthropic’s Responsible Scaling Policy (RSP) and OpenAI’s Preparedness Framework focus on mitigating risks associated with doomsday scenarios where AI contributes to existential risks, such as pandemics and nuclear wars.”)
Young, Shalanda D. 2024. “M-24-10 Memorandum for the Heads of Executive Departments and Agencies: Advancing Governance, Innovation, and Risk Management for Agency Use of Artificial Intelligence.” Office of Management and Budget. https://www.whitehouse.gov/wp-content/uploads/2024/03/M-24-10-Advancing-Governance-Innovation-and-Risk-Management-for-Agency-Use-of-Artificial-Intelligence.pdf.
© 2026, Deirdre K. Mulligan, Nik Marda, and Victor Zhenyi Wang
Cite as: Deirdre K. Mulligan, Nik Marda, and Victor Zhenyi Wang, A Conceptual Model to Guide AI Risk Governance Strategies, 26-3 Knight First Amend. Inst. (Mar. 16, 2026), https://knightcolumbia.org/content/a-conceptual-model-to-guide-ai-risk-governance-strategies-1 [https://perma.cc/WRD4-ZPC4].
And outside the training of those focused on model safety.
Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI signed on to the AI Commitments in July 2023, and Adobe, Cohere, IBM, Nvidia, Palantir, Salesforce, Scale AI, and Stability signed on in September 2023 (Heikkilä 2024; Ordoñez 2023). The September version is nominally modified to address testing-relevant privacy concerns.
Nancy G. Leveson (2012) defines “hazard” as a “system state or set of conditions that, together with a particular set of worst-case environmental conditions, will lead to an accident (loss).”
“Reliability in engineering is defined as the probability that something satisfies its specified behavioral requirements over time and under given conditions—that is, it does not fail” (Leveson 2012).
For instance, GPT-5’s system card measures model refusals against a wide range of “unsafe” content (OpenAI 2025). Even when such benchmarks are saturated, this training cannot guarantee that all model interactions are safe along these dimensions. While better benchmarks are part of the picture in improving technical safety, we believe widening the aperture and scope of analysis to the broader sociotechnical system may result in safer systems.
The EU AI Act requires AI systems that are “intended to be used as a safety component of a product, or the AI system is itself a product” (for example, medical devices) to undergo a third-party conformity assessment (Regulation (EU) 2024/1689 Art. 6(1)(a) and Annex I). For other high-risk systems (identified in Annex III), including biometrics and workplace hiring and management, among others, “providers shall follow the conformity assessment procedure based on internal control as referred to in Annex VI, which does not provide for the involvement of a notified body” (Regulation (EU) 2024/1689 Art. 43(2)). For a useful overview of the limitations of the EU AI Act generally, and the conformity assessments in particular, see Wachter 2024.
Among other documentation and transparency requirements, for GPAI models posing systemic risks (presumptively models trained using more than 10^25 FLOPs, unless the provider demonstrates the contrary), the EU AI Act requires providers to “perform model evaluation in accordance with standardised protocols and tools reflecting the state of the art, including conducting and documenting adversarial testing of the model with a view to identifying and mitigating systemic risks”; however, the evaluations need not be conducted by an external entity (Regulation (EU) 2024/1689 Art. 55(1)). In addition, the Third Draft of the General-Purpose AI Code of Practice notes that “[B]efore placing a [GPAI model with systemic risk] on the market, Signatories commit to obtaining independent external systemic risk assessments, including model evaluations, unless the model can be deemed sufficiently safe, as specified in Measure II.11.1” (Chairs and the Vice-Chairs of the General-Purpose AI Code of Practice 2025).
For an overview and comparison of risk management practices in four different AI governance schemes, see Kaminski, Margot E. 2023. “Regulating the Risks of AI.” Boston University Law Review 103 (5): 1347–1411. For an overview of the variety of risk management practices that apply to systems and services carrying unacceptable, high, minimal, and no risk, as well as to general-purpose AI systems, see Wachter, Sandra. 2024. “Limitations and Loopholes in the EU AI Act and AI Liability Directives: What This Means for the European Union, the United States, and Beyond.” Yale Journal of Law & Technology 26 (3): 671–718. https://dx.doi.org/10.2139/ssrn.4924553.
Detailing how the ambiguity of civil rights law allowed regulated entities to create “symbolic structures” that, while more reflective of managerial interests than of the legal goals, came to inform courts’ understanding of compliance, illustrating the concept of “legal endogeneity.”
Those using greater than 10^25 floating point operations (FLOPs) in the computation used for training.
“Black-box evaluation techniques assess an AI model’s performance from an external (e.g., user) perspective, limiting analysis to the model’s inputs and outputs without accessing its internal workings (Casper et al. 2024). By contrast, white-box techniques involve analyzing the internal functioning of the model (Casper et al. 2024). Intermediate approaches are referred to as ‘gray-box.’” (Stein et al. 2024)
In particular, consider their pilot with the City of San Jose on a computer vision use case which ultimately led to a bake-off with a number of vendors (Data & Society 2025).
“Image-based sexual abuse—including synthetic content generated by artificial intelligence (AI) and real images distributed without consent—has skyrocketed in recent years, disproportionately targeting women, girls, and LGBTQI+ people. For survivors, this abuse can be devastating, upending their lives, disrupting their education and careers, and leading to depression, anxiety, post-traumatic stress disorder, and increased risk of suicide” (Prabhakar and Klein 2024).
For the purpose of this example, we are setting aside the question of whether fully synthetic intimate images cause harm. To the extent that “systematic tolerance of sexual violence against women” (Tran 2015) is considered a harm, then it is reasonable to frame systems that make fully synthetic intimate images easy to produce and circulate as producing both risk of harm and hazards.
Including Adobe, Anthropic, Cohere, Common Crawl, Microsoft, and OpenAI who made this commitment (White House 2024a).
For an example of such a mapping for generative AI, see Business for Social Responsibility (2025).
Deirdre K. Mulligan is a professor in the School of Information at the University of California, Berkeley.
Nik Marda is an adjunct lecturer at Northwestern University and Brown University, and a campaign manager for a congressional campaign in southern Minnesota.
Victor Zhenyi Wang is a Ph.D. student at the University of California, Berkeley School of Information.