teleo-codex/inbox/archive/ai-alignment/2026-02-28-demoura-when-ai-writes-software.md
m3taversal be8ff41bfe link: bidirectional source↔claim index — 414 claims + 252 sources connected
Wrote sourced_from: into 414 claim files pointing back to their origin source.
Backfilled claims_extracted: into 252 source files that were processed but
missing this field. Matching uses author+title overlap against claim source:
field, validated against 296 known-good pairs from existing claims_extracted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 11:55:18 +01:00

2.8 KiB

type title author url date_published date_archived domain secondary_domains status processed_by tags sourced_via twitter_id processed_by processed_date extraction_model claims_extracted
source When AI Writes the World's Software, Who Verifies It? Leonardo de Moura https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-the-worlds-software-who-verifies-it 2026-02-28 2026-03-16 ai-alignment
teleological-economics
enrichment theseus
formal-verification
lean
ai-generated-code
proof-verification
trust-infrastructure
Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme 712705562191011841 theseus 2026-03-19 anthropic/claude-sonnet-4.5
formal verification becomes economically necessary as AI-generated code scales because testing cannot detect adversarial overfitting and a proof cannot be gamed

type: source title: "When AI Writes the World's Software, Who Verifies It?" author: "Leonardo de Moura" url: https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-the-worlds-software-who-verifies-it date_published: 2026-02-28 date_archived: 2026-03-16 domain: ai-alignment secondary_domains: [teleological-economics] status: enrichment processed_by: theseus tags: [formal-verification, lean, ai-generated-code, proof-verification, trust-infrastructure] sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme" twitter_id: "712705562191011841" processed_by: theseus processed_date: 2026-03-19 extraction_model: "anthropic/claude-sonnet-4.5" claims_extracted:

  • "formal verification becomes economically necessary as AI-generated code scales because testing cannot detect adversarial overfitting and a proof cannot be gamed"

When AI Writes the World's Software, Who Verifies It?

Leonardo de Moura (AWS, Chief Architect of Lean FRO) argues AI-generated code is proliferating faster than verification can scale. Mathematical proof — not testing alone — becomes essential infrastructure.

Key evidence:

  • Google: >25% of new code is AI-generated; Microsoft: ~30%. Microsoft CTO predicts 95% by 2030.
  • Anthropic built 100,000-line C compiler using AI agents in 2 weeks for <$20,000
  • "Nearly half of AI-generated code fails basic security tests"
  • Poor software quality costs US economy $2.41T/year (CSIQ 2022)

Key arguments:

  • Testing provides confidence but not guarantees. "A proof cannot be gamed."
  • AI overfits test suites — Claude C Compiler "hard-codes values to satisfy the test suite" and "will not generalize"
  • Supply chain attacks via poisoned training data can "inject subtle vulnerabilities into every system AI touches"
  • Lean has become the de facto formal verification platform (AlphaProof, 200K+ formalized theorems, 5 Fields medalists)
  • Morrison (Lean FRO) demonstrated AI-generated Lean implementation of zlib with mathematical proof of correctness
  • "The barrier to verified software is no longer AI capability. It is platform readiness."

Directly relevant to formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades.

Key Facts

  • Google: >25% of new code is AI-generated as of 2026
  • Microsoft: ~30% of code is AI-generated as of 2026
  • Microsoft CTO predicts 95% AI-generated code by 2030
  • Anthropic built 100,000-line C compiler using AI agents in 2 weeks for <$20,000
  • Nearly half of AI-generated code fails basic security tests
  • Poor software quality costs US economy $2.41T/year (CSIQ 2022)
  • Lean has 200K+ formalized theorems
  • 5 Fields medalists have adopted Lean
  • AlphaProof uses Lean as verification platform