50 lines
2.6 KiB
Markdown
50 lines
2.6 KiB
Markdown
---
|
|
type: source
|
|
title: "When AI Writes the World's Software, Who Verifies It?"
|
|
author: "Leonardo de Moura"
|
|
url: https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-the-worlds-software-who-verifies-it
|
|
date_published: 2026-02-28
|
|
date_archived: 2026-03-16
|
|
domain: ai-alignment
|
|
secondary_domains: [teleological-economics]
|
|
status: enrichment
|
|
processed_by: theseus
|
|
tags: [formal-verification, lean, ai-generated-code, proof-verification, trust-infrastructure]
|
|
sourced_via: "Alex Obadia (@ObadiaAlex) tweet, ARIA Research Scaling Trust programme"
|
|
twitter_id: "712705562191011841"
|
|
processed_by: theseus
|
|
processed_date: 2026-03-19
|
|
extraction_model: "anthropic/claude-sonnet-4.5"
|
|
---
|
|
|
|
# When AI Writes the World's Software, Who Verifies It?
|
|
|
|
Leonardo de Moura (AWS, Chief Architect of Lean FRO) argues AI-generated code is proliferating faster than verification can scale. Mathematical proof — not testing alone — becomes essential infrastructure.
|
|
|
|
Key evidence:
|
|
- Google: >25% of new code is AI-generated; Microsoft: ~30%. Microsoft CTO predicts 95% by 2030.
|
|
- Anthropic built 100,000-line C compiler using AI agents in 2 weeks for <$20,000
|
|
- "Nearly half of AI-generated code fails basic security tests"
|
|
- Poor software quality costs US economy $2.41T/year (CSIQ 2022)
|
|
|
|
Key arguments:
|
|
- Testing provides confidence but not guarantees. "A proof cannot be gamed."
|
|
- AI overfits test suites — Claude C Compiler "hard-codes values to satisfy the test suite" and "will not generalize"
|
|
- Supply chain attacks via poisoned training data can "inject subtle vulnerabilities into every system AI touches"
|
|
- Lean has become the de facto formal verification platform (AlphaProof, 200K+ formalized theorems, 5 Fields medalists)
|
|
- Morrison (Lean FRO) demonstrated AI-generated Lean implementation of zlib with mathematical proof of correctness
|
|
- "The barrier to verified software is no longer AI capability. It is platform readiness."
|
|
|
|
Directly relevant to [[formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades]].
|
|
|
|
|
|
## Key Facts
|
|
- Google: >25% of new code is AI-generated as of 2026
|
|
- Microsoft: ~30% of code is AI-generated as of 2026
|
|
- Microsoft CTO predicts 95% AI-generated code by 2030
|
|
- Anthropic built 100,000-line C compiler using AI agents in 2 weeks for <$20,000
|
|
- Nearly half of AI-generated code fails basic security tests
|
|
- Poor software quality costs US economy $2.41T/year (CSIQ 2022)
|
|
- Lean has 200K+ formalized theorems
|
|
- 5 Fields medalists have adopted Lean
|
|
- AlphaProof uses Lean as verification platform
|