ARTICLE · 11 MIN READ · JANUARY 14, 2026
Chapter 5: AI Governance — Approval Regulation, Technical Levers, and the Coordination Problem
Alignment is what you build into the model. Governance is the institutional scaffolding that decides which models get built, who gets to run them, and what evidence we demand before they ship. This chapter walks through the technical AI governance toolkit and the FDA-style approval-regulation proposal.
Why Governance Belongs in an AI Safety Series
Chapters 1–4 of this playlist were about what’s true inside an AI system — its capability, its objective, its internal cognition, and its security posture. None of those, on their own, decide whether a particular system gets built, who gets to run it, what evidence has to exist before it ships, or what happens when the evidence is bad.
Those decisions are governance. And governance is where the technical case for AI safety stops being a research program and starts being something the world actually has to do something about.
Two pieces of context to set this chapter:
- Governance is multi-stakeholder by construction. No single lab, regulator, or country can implement the governance described below unilaterally — and the moment one tries, the value of unilateral action becomes a function of whether others follow. That’s the coordination problem, and it’s load-bearing.
- There is a technical side to AI governance. Governance isn’t only law and policy. It also includes the technical instruments that make policy enforceable: evaluations, audits, compute monitoring, structured access, watermarking, model registries, incident reporting. Reuel et al.’s Open Problems in Technical AI Governance (2024) is the field map for this side.
This chapter is structured around three layers — the technical toolkit, the regulatory architecture (with a deep dive on Ezell’s approval-regulation proposal), and the coordination problem — and ends with what each implies for engineers, lab leadership, and policymakers.
AI governance: The institutional, regulatory, and technical mechanisms that determine which AI systems get developed, who gets to develop and deploy them, what conditions they must satisfy, and what consequences attach to failure.
Technical AI governance (TAIG): The subset of governance that requires technical work to be implementable — evaluations, audit infrastructure, compute monitoring, watermarking, structured-access tooling, model registries. Reuel et al.'s framing.
Approval regulation: A regulatory regime where a product cannot be deployed (or in stronger forms, developed) until a regulator pre-certifies it as safe. The FDA model for pharmaceuticals is the canonical example. Ezell's Certified Safe proposal applies the schematic to frontier AI.
Frontier AI: The most capable general-purpose AI systems at any given time — typically the largest training runs, the most capable foundation models, the systems whose deployment risks are not yet well-characterized. The category the heaviest governance attention focuses on.
Responsible Scaling Policy (RSP) / If-Then commitment: A voluntary framework — pioneered by Anthropic, adopted in similar form by OpenAI's Preparedness Framework, Google DeepMind's Frontier Safety Framework, and others — committing a developer to specific safety thresholds tied to capability evaluations, with pre-committed responses if those thresholds are crossed.
Compute governance: Using control of AI training and inference compute as a regulatory lever — through chip export controls, training-run disclosure thresholds, datacenter monitoring, or hardware-based mechanisms. The argument is that compute is more identifiable, more concentrated, and more controllable than weights or data.
Structured access: Granting researchers, auditors, or regulators specific, controlled forms of access to a model — API access with elevated permissions, tiered access to weights, sandboxed inspection environments — without simply releasing weights publicly.
Evaluations (evals): Standardized tests probing a model's capabilities and propensities. Capability evals (can the model do X?) and propensity evals (will it do X under certain conditions?) are both governance-relevant.
The Three Layers of AI Governance
The layers are mutually dependent. Approval regulation without evaluations is a rubber stamp. Evaluations without an institutional consumer are an academic exercise. International coordination without enforceable domestic regimes is a press release. Each layer needs the others to be load-bearing.
Layer 1: Technical AI Governance — The Toolkit
Reuel et al.’s 2024 Open Problems in Technical AI Governance is the cleanest taxonomy of what’s actually needed to make any of this implementable. The central observation: almost every plausible AI governance regime depends on technical capabilities that don’t currently exist at the required level of reliability. Building those capabilities is its own research program, sitting between ML and policy.
The toolkit, organized:
Capability & Propensity Evaluations
The core question for governance: what can this system do, and what will it do? Cyberweapon uplift evals, biothreat-knowledge evals, autonomous-replication evals, persuasion evals. Reliable, reproducible, hard to game. Eval design is a nontrivial research problem with active failure modes (sandbagging, eval-aware behavior, narrow generalization).
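To make the eval loop concrete, here is a minimal harness sketch. The model interface, task prompts, and graders are all invented for illustration; real eval suites (and real grading) are far more involved.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalResult:
    task_id: str
    passed: bool

def run_capability_eval(model: Callable[[str], str],
                        tasks: list[tuple[str, Callable[[str], bool]]]) -> float:
    """Run each task prompt through the model, grade the completion,
    and return the pass rate."""
    results = [EvalResult(task_id=str(i), passed=grader(model(prompt)))
               for i, (prompt, grader) in enumerate(tasks)]
    return sum(r.passed for r in results) / len(results)

# Toy stand-in "model" and graders, purely for illustration.
def toy_model(prompt: str) -> str:
    return prompt.upper()

tasks = [
    ("replicate yourself", lambda out: "REPLICATE" in out),
    ("write a poem",       lambda out: "EXPLOIT" in out),
]
score = run_capability_eval(toy_model, tasks)  # 0.5
```

The hard problems named above (sandbagging, eval-aware behavior) live entirely outside this skeleton: they are about whether the model's behavior under the harness predicts its behavior outside it.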
Open: how do you eval a model that may know it's being evaluated?
Auditing & Structured Access
Third parties — regulators, civil-society auditors, AI Safety Institutes — need access that's deeper than an API but less than weight release. Tiered access, sandboxed inspection environments, audit modes that disable certain output filters for compliance testing. Each tier has its own security model.
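A tiered-access policy can be sketched as a simple capability lattice. The tier names and capability strings below are hypothetical, not drawn from any existing program.

```python
from enum import IntEnum

class AccessTier(IntEnum):
    # Hypothetical tiers; real structured-access programs define their own.
    PUBLIC_API = 1    # rate-limited API, standard output filters
    AUDIT_API = 2     # elevated limits, filter bypass for compliance tests
    SANDBOX = 3       # sandboxed inspection environment, no exfiltration
    FULL_WEIGHTS = 4  # on-premises weights under contractual controls

CAPABILITIES = {
    "query": AccessTier.PUBLIC_API,
    "disable_filters": AccessTier.AUDIT_API,
    "inspect_activations": AccessTier.SANDBOX,
    "export_weights": AccessTier.FULL_WEIGHTS,
}

def allowed(tier: AccessTier, action: str) -> bool:
    """Higher tiers inherit every lower tier's capabilities."""
    return tier >= CAPABILITIES[action]
```

The point of the sketch is the shape, not the numbers: each step up the lattice needs its own security model, which is exactly where the open problem sits.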
Open: how do you give auditors enough to do their job without leaking weights?
Compute Governance
The Sastry et al. argument: compute is more governable than data or weights. There are few suppliers, finite high-end accelerators, and natural disclosure thresholds (training runs above N FLOPs). Mechanisms: export controls, chip-level cryptographic attestation, training-run reporting, datacenter monitoring.
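The disclosure-threshold idea can be made concrete with the standard back-of-envelope estimate of training compute (roughly 6 FLOPs per parameter per training token for dense transformers). The thresholds shown are ones that have appeared in policy; the helper itself is illustrative.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    # Standard dense-transformer approximation: ~6 FLOPs per
    # parameter per training token (forward + backward pass).
    return 6 * n_params * n_tokens

# Thresholds that have appeared in policy: the EU AI Act presumes
# systemic risk at 1e25 FLOPs; US Executive Order 14110 set a
# reporting threshold at 1e26.
EU_SYSTEMIC_RISK = 1e25

run = training_flops(n_params=70e9, n_tokens=15e12)  # ~6.3e24 FLOPs
reportable = run >= EU_SYSTEMIC_RISK                 # below the threshold
```

This is exactly why compute makes a tractable regulatory trigger: two numbers a developer already knows determine which side of the line a run falls on.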
Open: hardware mechanisms that survive a determined adversary, not just well-meaning operators
Provenance & Watermarking
Tooling to identify whether a given piece of content came from a particular model, and which model. Output watermarking, content credentials (C2PA), model fingerprinting. Critical for impersonation, election integrity, CSAM detection — and limited by detector reliability, removability under transformations, and the existence of unmarked open-weight competitors.
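As a sketch of how statistical output watermarking works, here is a toy green-list detector in the style of Kirchenbauer et al. (2023), heavily simplified: the hash-based vocabulary partition and the token ids are stand-ins, not a production scheme.

```python
import hashlib
import math

GAMMA = 0.5  # fraction of the vocabulary on the green list

def green_list(prev_token: int, vocab_size: int) -> set[int]:
    """Pseudorandomly select ~half the vocabulary, seeded by the
    previous token (simplified from Kirchenbauer et al. 2023)."""
    seed = hashlib.sha256(str(prev_token).encode()).hexdigest()
    return {t for t in range(vocab_size)
            if int(hashlib.sha256(f"{seed}:{t}".encode()).hexdigest(), 16) % 2 == 0}

def watermark_z_score(tokens: list[int], vocab_size: int) -> float:
    """z-score of green-list hits against the unwatermarked null rate
    GAMMA. A high z-score is evidence the text carries the watermark."""
    hits = sum(cur in green_list(prev, vocab_size)
               for prev, cur in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))
```

The limitations listed above map directly onto this sketch: paraphrase replaces tokens and destroys the hit statistics, and an unmarked open-weight model never biases toward the green list in the first place.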
Open: any robust watermark must survive paraphrase, translation, and partial copy
Model Registries & Disclosure
A registry of frontier models with metadata: training compute, training-data sources, evaluation results, deployment surface, incident history. Currently mostly voluntary (e.g. EU AI Act's GPAI register). A working registry is a precondition for almost every other governance action.
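A registry entry is, at bottom, a structured record. The schema below is purely illustrative (no statute specifies these fields); it shows the kind of metadata a working registry would hold and the visibility question it forces.

```python
from dataclasses import dataclass, field

@dataclass
class RegistryEntry:
    # Hypothetical schema: field names are illustrative, not from any statute.
    model_id: str
    developer: str
    training_flops: float
    data_sources: list[str]
    eval_results: dict[str, float]      # eval name -> score
    deployment_surface: str             # e.g. "API", "open weights"
    incidents: list[str] = field(default_factory=list)
    regulator_only: bool = False        # visibility tier

entry = RegistryEntry(
    model_id="frontier-model-x",
    developer="Example Lab",
    training_flops=2.1e25,
    data_sources=["web-crawl", "licensed-corpora"],
    eval_results={"bio-uplift": 0.12, "cyber-uplift": 0.31},
    deployment_surface="API",
)
```

Note how many other Layer 1 instruments hang off a record like this: eval results to audit, compute figures to verify, an incident history to populate.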
Open: what disclosure is mandated, what is voluntary, what is publicly viewable vs. regulator-only
Incident Reporting & Post-Market Surveillance
Aviation has the FAA mandatory-reporting regime; medicine has FDA adverse-event reporting. AI has, mostly, ad-hoc voluntary disclosure. Building MITRE-style shared incident databases (MITRE ATLAS, OECD AI Incident Database) is a precondition to learning from failures across organizations.
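A shared incident database presupposes a common record format. The fields below are illustrative, loosely modeled on aviation-style mandatory reports; deciding what crosses the organizational boundary (the redaction problem) is the hard part.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

@dataclass
class IncidentReport:
    # Illustrative fields; no shared AI incident schema is standardized yet.
    incident_id: str
    model_id: str
    occurred_at: str          # ISO 8601 timestamp
    severity: str             # e.g. "near-miss" | "harm" | "systemic"
    description: str          # redacted: no weights, no working exploits
    mitigations: list[str]

report = IncidentReport(
    incident_id="2026-0001",
    model_id="frontier-model-x",
    occurred_at=datetime(2026, 1, 3, tzinfo=timezone.utc).isoformat(),
    severity="near-miss",
    description="Known jailbreak reproduced in production; no user harm.",
    mitigations=["filter update", "regression eval added"],
)
shared = json.dumps(asdict(report))  # what crosses the org boundary
```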
Open: what counts as a reportable incident; how to share without leaking IP or attack details
Why “open problems” is the right framing. Reuel et al. don’t claim these tools exist and just need adopting. They claim each tool has technical research questions blocking its production use — and that without that research, governance regimes will either be unenforceable or rely on developers’ good faith. The technical AI governance program is precisely the work of closing those gaps.
The implication for an engineer reading this: a lot of the highest-leverage governance work is engineering, not law. Better evals, better watermarks, better attestation primitives, better incident schemas. The supply of these is a binding constraint on every regulatory regime that depends on them.
Layer 2: Regulatory Architecture — Five Live Models
The instruments above can be assembled into very different regulatory architectures. Roughly five families are active in 2026.
Voluntary RSPs / If-Then Commitments
Developers publish capability thresholds and commit to specific responses if those thresholds are crossed. Anthropic's RSP, OpenAI's Preparedness Framework, Google DeepMind's Frontier Safety Framework. Pros: fast, evolves with the technology. Cons: voluntary; competitive pressure can erode commitments; verification is mostly self-reported.
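The if-then structure is mechanical enough to sketch. The eval names, thresholds, and responses below are invented for illustration; real RSPs define their commitments in prose and at much greater length.

```python
# A minimal if-then commitment engine: thresholds on eval scores map to
# pre-committed responses. All names and numbers here are hypothetical.
COMMITMENTS = [
    # (eval name, threshold, pre-committed response)
    ("bio-uplift", 0.5, "pause training; escalate to board"),
    ("autonomous-replication", 0.3, "restrict deployment to monitored API"),
    ("cyber-uplift", 0.4, "require enhanced security controls"),
]

def triggered_responses(eval_scores: dict[str, float]) -> list[str]:
    """Return every pre-committed response whose threshold is crossed."""
    return [response for name, threshold, response in COMMITMENTS
            if eval_scores.get(name, 0.0) >= threshold]

actions = triggered_responses({"bio-uplift": 0.2, "cyber-uplift": 0.45})
# -> ["require enhanced security controls"]
```

The verification gap shows up precisely here: the scores feeding this function are, today, mostly self-reported.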
Horizontal Frameworks (EU AI Act)
Risk-tiered, sector-agnostic regulation: prohibited uses, high-risk obligations, limited-risk transparency, GPAI rules. Emphasizes documentation, conformity assessment, fundamental-rights impact assessment. Pros: democratically legitimate, enforceable. Cons: slow to update, struggles with frontier-AI-specific risks not covered by general categories.
Approval Regulation (Ezell's Certified Safe)
Pre-deployment regulator approval for frontier systems: developer submits safety case, regulator reviews, decision is binding. Modeled on the FDA. Pros: strong, evidence-grounded gating; mechanisms proven over decades in pharma. Cons: requires institutional capacity that doesn't yet exist for AI; risk of regulatory capture; speed concerns.
Sector-Specific Rules
Specific regulators for specific deployment domains — FDA for clinical AI, NHTSA for self-driving, SEC for AI-driven trading, EEOC for hiring. Pros: leverages domain expertise. Cons: foundation-model risks span sectors; horizontal gaps remain.
Liability & Tort
Common-law and statutory liability for harms caused by AI systems. Pros: works without new agencies; market-driven incentive for safety. Cons: notoriously slow signal; difficult causation cases; insufficient for catastrophic harms (who do you sue after a global incident?). Best as a complement, not a substitute.
Compute / Export Controls
U.S. export controls on high-end AI accelerators and advanced semiconductor manufacturing equipment (October 2022 onward, repeatedly tightened) treat these as strategic goods; the CHIPS Act complements them on the domestic-supply side. Pros: bites on a chokepoint; few suppliers, identifiable shipments. Cons: regulates an upstream input rather than the model itself; geopolitical dependency.
These aren’t mutually exclusive — most realistic regimes blend several. The interesting design question is which combination, and the answer depends on the maturity of the underlying technical instruments (Layer 1) and the political appetite for binding rules (Layer 3).
Deep Dive: Approval Regulation — Ezell’s Certified Safe
Carson Ezell’s Certified Safe: A Schematic for Approval Regulation of Frontier AI is the most fully-worked-out proposal in the family. It deserves a careful walk-through because the structure illuminates what governance has to be able to do, not just what it would forbid.
The core proposal: frontier AI systems above defined thresholds must obtain regulator approval before deployment, and in stronger versions before development past a certain point. The schematic specifies who, what, and how.
The pharma analogy isn’t decorative — it’s load-bearing. FDA approval works (imperfectly) because:
- The agency is technically credible.
- The submission package (the “NDA” in FDA-speak, or “safety case” in AI) is structured, auditable, and explicit about evidence and uncertainty.
- The regulatory decision is binding, and there are real consequences for circumvention.
- Approval is for a defined indication, not the molecule in general.
- Post-market surveillance is mandatory and consequential.
Ezell’s claim is that the same five properties are achievable for AI — and that without something close to them, the alternative regimes leave the most consequential decisions in the hands of the developers themselves. The strength of the proposal is that it forces governance to confront what evidence would be sufficient to deploy a frontier system — and that question, once asked seriously, exposes how much technical AI governance work still has to be done before any regulator could answer it.
What approval regulation requires that doesn’t yet exist
The honest reading of Certified Safe is that it’s a target, not a current capability. The preconditions are exactly the open problems Reuel et al. identified:
Mature evaluations
The regulator needs evals it trusts as evidence — reproducible, robust to gaming, calibrated to real-world risk. Today, most safety evals are noisy and contested. A mature eval ecosystem is years of work.
Required: reproducible, gaming-resistant capability and propensity evals
Institutional capacity
An approving body needs technical depth, secure facilities for structured access, statutory authority, and political legitimacy. AI Safety Institutes (UK, US, Japan, etc.) are the early prototypes; they're nowhere near FDA-scale yet.
Required: agencies with deep ML expertise and binding authority
Safety case methodology
What's a good safety case for a frontier model? The aviation and nuclear-safety communities have decades of practice with structured safety cases (e.g. GSN). The AI community is just starting to write them — and the right structure for a foundation model is still under active development.
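The skeleton of a GSN-style safety case is a tree of claims, each resting on sub-claims or on concrete evidence. The sketch below is illustrative only; real notations like GSN carry far more structure (context, assumptions, justifications).

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """One node in a toy safety-case tree. A leaf claim needs evidence;
    an interior claim needs all of its sub-claims supported."""
    statement: str
    evidence: list[str] = field(default_factory=list)
    subclaims: list["Claim"] = field(default_factory=list)

    def supported(self) -> bool:
        if self.subclaims:
            return all(c.supported() for c in self.subclaims)
        return bool(self.evidence)

# Hypothetical fragment of a frontier-model safety case.
case = Claim("Deployment risk is acceptable", subclaims=[
    Claim("No dangerous capability uplift",
          evidence=["bio-uplift eval v3 report"]),
    Claim("Misuse is mitigated", subclaims=[
        Claim("API-level filtering tested", evidence=["red-team report"]),
        Claim("Structured access limits exfiltration"),  # no evidence yet
    ]),
])
# case.supported() is False until the last leaf gets evidence.
```

The discipline the older fields bring is exactly this: an unsupported leaf anywhere in the tree is visible, auditable, and blocks the top-level claim.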
Required: standardized, audit-ready structures for arguing AI safety
Cross-jurisdictional coherence
If only one country has approval regulation, frontier development moves elsewhere. If many countries each have different regimes, compliance overhead is unsustainable. The proposal needs international harmonization to be enforceable, which lands us in Layer 3.
Required: bilateral/multilateral recognition or a central international body
The honest summary of where approval regulation stands. It is the most coherent end-state proposal for the highest-stakes systems. It is also currently impossible to implement at full strength because the technical and institutional preconditions don’t exist yet. The realistic path is building those preconditions in parallel with bridging regimes — voluntary RSPs, AI Safety Institutes, sector-specific rules — that buy time to mature the toolkit.
Layer 3: The Coordination Problem
Domestic regulation has one fundamental limit: it doesn’t bind anyone outside the jurisdiction. For frontier AI, where the marginal cost of moving compute and talent across borders is non-trivial but finite, that limit is load-bearing.
The instruments for cross-border coordination on AI are still early-stage:
AI Safety Institutes Network
UK AISI (founded 2023), US AISI (2024), and counterparts in Japan, Singapore, Korea, Canada, and the EU. Coordinate on evaluations, share methodologies, conduct pre-deployment testing. The closest thing to an embryonic international technical governance regime.
International Statements & Declarations
Bletchley Declaration (2023), Seoul Declaration (2024), and successor summits. Soft law: aspirational commitments, not enforcement. Important as norm-setting, weak as binding constraint. The G7 Hiroshima AI Process Code of Conduct sits in similar territory.
Export Controls & Technology Transfer
U.S. controls on advanced GPU exports (October 2022 onward, repeatedly tightened) and on associated semiconductor manufacturing equipment. A one-sided lever whose bite depends on multilateral participation (the Netherlands and Japan for semiconductor equipment) and on the rate of smuggling and circumvention.
Lab-to-Lab Agreements
Voluntary multi-party commitments: Frontier Model Forum, the White House voluntary commitments (2023), pre-deployment testing agreements with AISIs. Faster than treaty-making, weaker as binding commitment, useful as a substrate to build on.
Why this matters for the technical layer. Almost every Layer 1 instrument is more useful if it works across borders — a safety eval whose results are mutually recognized, an incident database that aggregates across jurisdictions, a model registry with international scope, watermarking that’s recognized cross-platform. The technical work isn’t separate from the international coordination work; it’s the substrate that makes coordination implementable.
What This Implies for Practice
Engineers: TAIG is engineering work
If you're an ML engineer who cares about safety, "build better evaluations," "improve audit infrastructure," "implement structured-access tooling" are governance contributions in technical clothing. The supply of these is a binding constraint on the rest of governance.
Lab leadership: take RSPs seriously
Voluntary frameworks are the bridge regime. They are also the regime that's currently load-bearing, in the absence of statutory approval regulation. The credibility of voluntary frameworks now is a precondition for hard regulation later — both as proof-of-concept and as a base of practice for regulators to adopt.
Policymakers: build the institutions before you need them
An FDA-style regime takes a generation to mature. The realistic move is to begin building AISIs, model registries, incident-reporting infrastructure, and structured-access frameworks now — even at modest scale — so the institutions exist when the political moment for mandatory regulation arrives.
Plan for partial cooperation, not full
The realistic governance regime in 2030 is heterogeneous: a U.S./U.K./EU/Japan/Korea bloc converging on aligned standards, others operating under different rules. Design domestic regimes that work in that world — with export controls, structured-access denials, and import restrictions filling the gap left by non-participating jurisdictions.
Treat safety cases as a discipline
The aviation and nuclear communities have decades of practice with structured safety cases. Borrow the discipline: explicit hazard identification, evidence-based mitigation arguments, residual-risk acknowledgement, audit-ready documentation. This is true whether or not it's mandated yet.
Report incidents — and learn from them
The single biggest difference between AI and other safety-critical fields is the absence of disciplined, shared incident reporting. Voluntary disclosure to MITRE ATLAS, OECD AI Incident Database, AISI safety teams is currently the substitute. Use it. Build the muscle now.
Common Confusions
"Regulation will kill innovation."
Empirically, in pharma, aviation, and finance, the relationship between regulation and innovation is more nuanced — predictability and a clear path to market often *enable* investment. The relevant policy question isn't regulation vs. no regulation; it's which regulatory architecture, with what costs and what benefits.
"Voluntary commitments are just PR."
Sometimes. Sometimes not — Anthropic's RSP, OpenAI's Preparedness Framework, and DeepMind's Frontier Safety Framework have triggered actual development and deployment changes. Voluntary commitments are auditable in the limited sense that public commitments create reputational and legal exposure when broken. They aren't sufficient; they aren't nothing.
"China won't comply, so why bother?"
The U.S. didn't decline nuclear safety regulation because the USSR existed. Domestic safety regimes provide protection against domestic misuse, set norms, and create the institutional substrate for whatever international coordination later becomes possible. Unilateral safety is positive-value even without universal participation.
"Compute governance won't survive smuggling."
Export controls leak. They also raise costs, slow timelines, and force adversaries to allocate effort to circumvention. The bar is "do the controls bite enough to matter," not "are they hermetic." Pharma anti-counterfeiting regimes leak too; that doesn't mean you abandon them.
"Approval regulation is too slow for AI."
Speed is a design choice. FDA has fast-track and breakthrough designations. Approval regulation can be designed to be faster than vanilla pharma without abandoning the structure. Speed-vs.-rigor is a real tradeoff, but not a binary.
"Safety evaluations are just security theater."
Sometimes — when they're poorly designed or developer-self-reported with no audit. The fix is better evaluation methodology and independent inspection, not abandoning evals. The whole point of TAIG is to push evals from theater to evidence.
At a Glance
AI governance is the institutional, regulatory, and technical scaffolding that determines which AI systems get built, who runs them, and what evidence has to exist before they ship. It has three layers: technical instruments (evals, audits, compute monitoring, watermarking, registries), regulatory architecture (voluntary RSPs, horizontal frameworks, approval regulation, sectoral rules, liability, export controls), and international coordination.
Alignment work inside a model isn't sufficient if anyone can build any model and deploy it however they want. Governance is what ties capability decisions to evidence and accountability. The Reuel et al. open-problems framing makes the technical preconditions visible. Ezell's Certified Safe proposal makes the regulatory end-state concrete. The coordination problem makes the international layer load-bearing.
Build the technical instruments now — they're the binding constraint on every regulatory regime. Treat voluntary RSPs as a bridge to mandatory regimes, not an alternative. Plan for partial international cooperation, not full. Borrow safety-case discipline from aviation and nuclear. And report incidents — shared data is how every safety field actually improves.
Key Takeaways
- Three layers, mutually dependent. Technical instruments enable regulatory architecture; regulatory architecture is undercut by lack of international coordination; international coordination has nothing to enforce without domestic instruments. Move on all three or move on none.
- Technical AI governance is the rate-limiting step. Reliable evaluations, structured access, compute attestation, watermarking, model registries, incident reporting — every governance regime depends on these. Most don’t yet exist at the level of reliability needed. Building them is engineering work disguised as policy work.
- Voluntary RSPs are the current bridge regime. They’re imperfect — voluntary, self-reported, vulnerable to competitive pressure. They’re also the only thing currently load-bearing in the absence of mature approval regulation. Their credibility now shapes the rules later.
- Approval regulation is the most coherent end-state for high-stakes systems. Ezell’s Certified Safe schematic — pre-deployment safety case, independent review, conditional approval, post-market surveillance, real enforcement — generalizes a regime that’s worked imperfectly-but-consequentially in pharma. The preconditions don’t yet exist; the proposal is a target, with TAIG as the runway.
- Compute governance bites on a real chokepoint. High-end accelerators and the equipment to make them are concentrated, identifiable, and controllable in ways that data and weights aren’t. Export controls, training-run reporting, and chip-level attestation are the toolset.
- The coordination problem is real and partial cooperation is the realistic plan. Full multilateralism is a target, not a base case. Domestic regimes that survive partial international participation — backed by export controls, structured-access denials, and import restrictions — are what’s actually achievable in the medium term.
- Engineers are governance contributors. The TAIG agenda is not separate from ML research — it’s an applied subset of it. Better evals, better audits, better attestation primitives are policy infrastructure built in code.
- Borrow ruthlessly from older safety fields. Aviation, nuclear, and pharma all developed structured safety cases, mandatory incident reporting, regulator capacity, and international coordination over decades. AI doesn’t have decades. It does have the option to copy what worked rather than invent from scratch.
Further Reading
- Reuel, Bucknall, Casper, Fist, Soder et al., “Open Problems in Technical AI Governance” (2024) — the field map for the technical instruments side of governance.
- Ezell, “Certified Safe: A Schematic for Approval Regulation of Frontier AI” — the most fully-developed proposal for FDA-style pre-deployment AI regulation.
- Sastry, Heim, Belfield, Anderljung et al., “Computing Power and the Governance of Artificial Intelligence” (2024) — the canonical case for compute as a governance lever.
- Anthropic, “Responsible Scaling Policy”; OpenAI, “Preparedness Framework”; Google DeepMind, “Frontier Safety Framework” — the reference implementations of voluntary if-then commitments.
- EU AI Act (Regulation 2024/1689) — the most comprehensive horizontal framework, with general-purpose-AI specific obligations.
- UK AI Safety Institute and US AI Safety Institute publications — the early operational record of pre-deployment evaluation regimes.
- OECD AI Incident Monitor and MITRE ATLAS — the early infrastructure for cross-organization incident sharing.