Sync Graph Data to teleo-app / sync (push) Waiting to run

Details

theseus: 5 claims from ARIA Scaling Trust programme papers

- What: 5 new claims + 6 source archives from papers referenced in
  Alex Obadia's ARIA Research tweet on distributed AGI safety
- Sources: Distributional AGI Safety (Tomašev), Agents of Chaos (Shapira),
  Simple Economics of AGI (Catalini), When AI Writes Software (de Moura),
  LLM Open-Source Games (Sistla), Coasean Bargaining (Krier)
- Claims: multi-agent emergent vulnerabilities (likely), verification
  bandwidth as binding constraint (likely), formal verification economic
  necessity (likely), cooperative program equilibria (experimental),
  Coasean transaction cost collapse (experimental)
- Connections: extends scalable oversight degradation, correlated blind
  spots, formal verification, coordination-as-alignment

Pentagon-Agent: Theseus <B4A5B354-03D6-4291-A6A8-1E04A879D9AC>

2026-03-16 16:46:07 +00:00

4.3 KiB

Raw Blame History

type	domain	description	confidence	source	created
claim	ai-alignment	De Moura argues that AI code generation has outpaced verification infrastructure, with 25-30% of new code AI-generated and nearly half failing basic security tests, making mathematical proof via Lean the essential trust infrastructure	likely	Leonardo de Moura, 'When AI Writes the World's Software, Who Verifies It?' (leodemoura.github.io, February 2026); Google/Microsoft code generation statistics; CSIQ 2022 ($2.41T cost estimate)	2026-03-16

formal verification becomes economically necessary as AI-generated code scales because testing cannot detect adversarial overfitting and a proof cannot be gamed

Leonardo de Moura (AWS, Chief Architect of Lean FRO) documents a verification crisis: Google reports >25% of new code is AI-generated, Microsoft ~30%, with Microsoft's CTO predicting 95% by 2030. Meanwhile, nearly half of AI-generated code fails basic security tests. Poor software quality costs the US economy $2.41 trillion per year (CSIQ 2022).

The core argument is that testing is structurally insufficient for AI-generated code. Three failure modes:

1. Adversarial overfitting. AI systems can "hard-code values to satisfy the test suite" — Anthropic's Claude C Compiler demonstrated this, producing code that passes all tests but does not generalize. For any fixed testing strategy, a sufficiently capable system can overfit. "A proof cannot be gamed."

2. Invisible vulnerabilities. A TLS library implementation might pass all tests but contain timing side-channels — conditional branches dependent on secret key material that are "invisible to testing, invisible to code review." Mathematical proofs of constant-time behavior catch these immediately.

3. Supply chain poisoning. Adversaries can poison training data or compromise model APIs to "inject subtle vulnerabilities into every system that AI touches." Traditional code review "cannot reliably detect deliberately subtle vulnerabilities."

The existence proof that formal verification works at scale: Kim Morrison (Lean FRO) used Claude to convert the zlib C compression library to Lean, then proved the capstone theorem: "decompressing a compressed buffer always returns the original data, at every compression level, for the full zlib format." This used a general-purpose AI with no specialized theorem-proving training, demonstrating that "the barrier to verified software is no longer AI capability. It is platform readiness."

De Moura's key reframe: "An AI that generates provably correct code is qualitatively different from one that merely generates plausible code. Verification transforms AI code generation from a productivity tool into a trust infrastructure."

This strengthens formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades with concrete production evidence. The Lean ecosystem (200,000+ formalized theorems, 750 contributors, AlphaProof IMO results, AWS/Microsoft adoption) demonstrates that formal verification is no longer academic.

Relevant Notes:

formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades — de Moura provides the production evidence and economic argument
human verification bandwidth is the binding constraint on AGI economic impact not intelligence itself because the marginal cost of AI execution falls to zero while the capacity to validate audit and underwrite responsibility remains finite — formal verification addresses the verification bandwidth bottleneck by making verification scale with AI capability
agent-generated code creates cognitive debt that compounds when developers cannot understand what was produced on their behalf — formal proofs resolve cognitive debt: you don't need to understand the code if you can verify the proof
coding agents cannot take accountability for mistakes which means humans must retain decision authority over security and critical systems regardless of agent capability — formal verification shifts accountability from human judgment to mathematical proof

Topics:

_map

4.3 KiB Raw Blame History

formal verification becomes economically necessary as AI-generated code scales because testing cannot detect adversarial overfitting and a proof cannot be gamed

4.3 KiB

Raw Blame History