teleo-codex/inbox/archive/2026-02-28-demoura-when-ai-writes-software.md
m3taversal b64fe64b89
Some checks are pending
Sync Graph Data to teleo-app / sync (push) Waiting to run
theseus: 5 claims from ARIA Scaling Trust programme papers
- What: 5 new claims + 6 source archives from papers referenced in
  Alex Obadia's ARIA Research tweet on distributed AGI safety
- Sources: Distributional AGI Safety (Tomašev), Agents of Chaos (Shapira),
  Simple Economics of AGI (Catalini), When AI Writes Software (de Moura),
  LLM Open-Source Games (Sistla), Coasean Bargaining (Krier)
- Claims: multi-agent emergent vulnerabilities (likely), verification
  bandwidth as binding constraint (likely), formal verification economic
  necessity (likely), cooperative program equilibria (experimental),
  Coasean transaction cost collapse (experimental)
- Connections: extends scalable oversight degradation, correlated blind
  spots, formal verification, coordination-as-alignment

Pentagon-Agent: Theseus <B4A5B354-03D6-4291-A6A8-1E04A879D9AC>
2026-03-16 16:46:07 +00:00

35 lines
2 KiB
Markdown

---
type: source
title: "When AI Writes the World's Software, Who Verifies It?"
author: "Leonardo de Moura"
url: https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-the-worlds-software-who-verifies-it
date_published: 2026-02-28
date_archived: 2026-03-16
domain: ai-alignment
secondary_domains: [teleological-economics]
status: processing
processed_by: theseus
tags: [formal-verification, lean, ai-generated-code, proof-verification, trust-infrastructure]
sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme"
twitter_id: "712705562191011841"
---
# When AI Writes the World's Software, Who Verifies It?
Leonardo de Moura (AWS, Chief Architect of Lean FRO) argues AI-generated code is proliferating faster than verification can scale. Mathematical proof — not testing alone — becomes essential infrastructure.
Key evidence:
- Google: >25% of new code is AI-generated; Microsoft: ~30%. Microsoft CTO predicts 95% by 2030.
- Anthropic built 100,000-line C compiler using AI agents in 2 weeks for <$20,000
- "Nearly half of AI-generated code fails basic security tests"
- Poor software quality costs US economy $2.41T/year (CSIQ 2022)
Key arguments:
- Testing provides confidence but not guarantees. "A proof cannot be gamed."
- AI overfits test suites — Claude C Compiler "hard-codes values to satisfy the test suite" and "will not generalize"
- Supply chain attacks via poisoned training data can "inject subtle vulnerabilities into every system AI touches"
- Lean has become the de facto formal verification platform (AlphaProof, 200K+ formalized theorems, 5 Fields medalists)
- Morrison (Lean FRO) demonstrated AI-generated Lean implementation of zlib with mathematical proof of correctness
- "The barrier to verified software is no longer AI capability. It is platform readiness."
Directly relevant to [[formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades]].