theseus: add 3 compute infrastructure claims + source archive

- What: 3 structural claims about AI compute governance implications
  1. Inference shift favors distributed architectures (experimental)
  2. Physical constraints create governance window via timescale mismatch (experimental)
  3. Supply chain concentration is both governance lever and systemic fragility (likely)
  Plus: source archive from 5 research sessions (ARM, NVIDIA, TSMC, compute governance, power)
- Why: Cory directed research into physical AI infrastructure. Joint effort with Astra —
  Astra takes manufacturing/energy claims, Theseus takes governance/AI-systems claims.
- Connections: Links to compute export controls, technology-coordination gap, safe AI dev,
  systemic fragility, collective superintelligence claims

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
m3taversal 2026-03-24 17:55:05 +00:00
parent 3923d5b33a
commit 06b96df522
4 changed files with 263 additions and 0 deletions


@@ -0,0 +1,69 @@
---
type: claim
domain: ai-alignment
description: "TSMC manufactures ~92% of advanced logic chips, three companies produce all HBM, NVIDIA controls 60%+ of CoWoS allocation — this concentration makes compute governance tractable (few points to monitor) while creating catastrophic vulnerability (one disruption halts global AI development)"
confidence: likely
source: "Heim et al. 2024 compute governance framework, Chris Miller 'Chip War', CSET Georgetown chokepoint analysis, TSMC market share data, RAND semiconductor supply chain reports"
created: 2026-03-24
depends_on:
- "compute export controls are the most impactful AI governance mechanism but target geopolitical competition not safety leaving capability development unconstrained"
- "technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap"
- "optimization for efficiency without regard for resilience creates systemic fragility because interconnected systems transmit and amplify local failures into cascading breakdowns"
challenged_by:
- "Geographic diversification (TSMC Arizona, Samsung, Intel Foundry) is actively reducing concentration"
- "The concentration is an artifact of economics not design — multiple viable fabs could exist if subsidized"
secondary_domains:
- collective-intelligence
- critical-systems
---
# Compute supply chain concentration is simultaneously the strongest AI governance lever and the largest systemic fragility because the same chokepoints that enable oversight create single points of failure
The AI compute supply chain is the most concentrated critical infrastructure in history. A single company (TSMC) manufactures approximately 92% of advanced logic chips. Three companies produce all HBM memory. One company (ASML) makes the EUV lithography machines required for leading-edge fabrication. NVIDIA commands over 60% of the advanced packaging capacity that determines how many AI accelerators ship.
This concentration creates a paradox: the same chokepoints that make compute governance tractable (because there are few points to monitor and control) also create catastrophic systemic vulnerability (because disruption at any single point halts global AI development).
## The governance lever
Heim, Sastry, and colleagues at GovAI have established that compute is uniquely governable among AI inputs. Unlike data (diffuse, hard to track) and algorithms (abstract, easily copied), chips are physical, trackable, and produced through a concentrated supply chain. Their compute governance framework proposes three mechanisms: visibility (who has what compute), allocation (who gets access), and enforcement (compliance verification).
The concentration amplifies each mechanism:
- **Visibility:** With one dominant manufacturer (TSMC), tracking advanced chip production is tractable. You don't need to monitor thousands of fabs — you need to monitor a handful of facilities.
- **Allocation:** Export controls work because there are few places to export from. The October 2022 US semiconductor export controls leveraged the concentration of TSMC, ASML, and Applied Materials to constrain China's AI compute access.
- **Enforcement:** Shavit (2023) proposed hardware-based compute monitoring. With concentrated manufacturing, governance mechanisms can be built into the chip at the design or fabrication stage (Fist & Heim, "Secure, Governable Chips").
This is the strongest argument for compute governance: the physical supply chain's concentration is a feature, not a bug, from a governance perspective.
## The systemic fragility
The same concentration that enables governance creates catastrophic risk. Three scenarios illustrate the fragility:
**Taiwan disruption.** TSMC fabricates ~92% of the world's most advanced chips in Taiwan. A military conflict, blockade, earthquake, or prolonged power disruption in Taiwan would immediately sever the global supply of AI accelerators. TSMC is building fabs in Arizona (92% yield achieved, approaching full utilization) but the most advanced processes remain Taiwan-first through at least 2027-2028. Geographic diversification is real but early.
**Packaging bottleneck cascade.** CoWoS packaging at TSMC is already the binding constraint on AI chip supply. If a disruption reduced CoWoS capacity by even 20%, the effect would cascade: fewer AI accelerators → delayed AI deployments → concentrated remaining supply among the biggest buyers → smaller organizations locked out entirely.
**Memory concentration.** All three HBM vendors are sold out through 2026. A production disruption at any one of them would reduce global HBM supply by 20-60% with no short-term alternative.
## The paradox
Governance leverage and systemic fragility are two faces of the same structural fact: concentration. You cannot have the governance benefits (tractable monitoring, effective export controls, hardware-based enforcement) without the fragility costs (single points of failure, catastrophic disruption scenarios). And you cannot reduce fragility through diversification without simultaneously reducing governance leverage.
This is a genuine tension, not a problem to solve. The optimal policy depends on which risk you weight more heavily: the risk of ungoverned AI development (favoring concentration for governance leverage) vs. the risk of supply chain disruption (favoring diversification for resilience).
The alignment field has largely focused on the governance side (how to control AI development) without accounting for the fragility side (what happens when the physical substrate fails). Both risks are real. The supply chain concentration that makes compute governance possible is the same concentration that makes the entire AI enterprise fragile.
## Connection to existing KB
This claim connects the alignment concern (governance) to the critical-systems concern (fragility). The foundational claim that [[optimization for efficiency without regard for resilience creates systemic fragility because interconnected systems transmit and amplify local failures into cascading breakdowns]] applies directly: the semiconductor supply chain has been optimized for efficiency (TSMC's scale advantages, NVIDIA's CoWoS allocation) without regard for resilience (no backup fabs, no alternative packaging at scale).
---
Relevant Notes:
- [[compute export controls are the most impactful AI governance mechanism but target geopolitical competition not safety leaving capability development unconstrained]] — export controls leverage the concentration this claim describes
- [[optimization for efficiency without regard for resilience creates systemic fragility because interconnected systems transmit and amplify local failures into cascading breakdowns]] — the semiconductor supply chain is a textbook case
- [[technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap]] — physical infrastructure constraints partially compensate for this gap
- [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]] — supply chain concentration means the race is gated by physical infrastructure, not just investment willingness
Topics:
- [[domains/ai-alignment/_map]]


@@ -0,0 +1,66 @@
---
type: claim
domain: ai-alignment
description: "CoWoS packaging, HBM memory, and datacenter power each gate AI compute scaling on timescales (2-10 years) much longer than algorithmic or architectural advances (months) — this mismatch creates a window where alignment research can outpace deployment even without deliberate slowdown"
confidence: experimental
source: "TSMC CoWoS capacity constraints (CEO public statements), HBM vendor sell-out confirmations (SK Hynix, Micron CFOs), IEA/Goldman Sachs datacenter power projections, Epoch AI compute doubling trends, Heim et al. 2024 compute governance framework"
created: 2026-03-24
depends_on:
- "technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap"
- "safe AI development requires building alignment mechanisms before scaling capability"
challenged_by:
- "Algorithmic efficiency gains may outpace physical constraints — Epoch AI finds algorithms halve required compute every 8-9 months"
- "Physical constraints are temporary — CoWoS alternatives by 2027, HBM4 increases capacity, nuclear can eventually meet power demand"
- "If the US self-limits via infrastructure lag, compute migrates to jurisdictions with fewer safety norms"
secondary_domains:
- collective-intelligence
---
# Physical infrastructure constraints on AI scaling create a natural governance window because packaging memory and power bottlenecks operate on 2-10 year timescales while capability research advances in months
The alignment field treats AI scaling as a function of investment and algorithms. But the physical substrate imposes its own timescales: advanced packaging expansion takes 2-3 years, HBM supply is sold out for 1-2 years forward, new power generation takes 5-10 years. These timescales are longer than the algorithmic improvement cycle (months) but shorter than institutional governance cycles (decades). This mismatch creates a window — not designed, but real — where physical constraints slow deployment faster than they slow alignment research.
## The timescale mismatch
Three independent physical constraints gate AI compute scaling, each on different timescales:
**Packaging (2-3 years):** TSMC's CoWoS capacity is sold out through 2026 with demand exceeding supply even at planned expansion rates. Google has already cut TPU production targets due to CoWoS constraints. Intel's EMIB alternative is gaining interest but won't reach comparable scale before 2027-2028. Each new AI chip generation requires larger interposers, so the bottleneck worsens per generation.
**Memory (1-2 years):** All three HBM vendors (SK Hynix, Samsung, Micron) have confirmed their supply is sold out through 2026. HBM4 development has been accelerated to meet NVIDIA's next-generation architecture, but each GB of HBM requires 3-4x the wafer capacity of DDR5, creating structural supply tension.
**Power (5-10 years):** New power generation takes 3-7 years to build. Grid interconnection queues in the US average 5+ years with only ~20% of projects reaching commercial operation. Nuclear deals for AI (Microsoft-Constellation, Amazon-X-Energy, Google-Kairos) cover 2-3 GW near-term against projected need of 25-30 GW additional capacity. This is the longest-horizon constraint.
Meanwhile, frontier training compute doubles every 9-10 months (Epoch AI), and algorithmic efficiency improvements halve required compute every 8-9 months. The demand curve is exponential; the supply curves are linear or stepwise.
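The mismatch above can be made concrete with a toy model. The demand doubling period comes from the Epoch AI figure cited in this note; the linear supply growth rate is a hypothetical placeholder, not a sourced number:

```python
# Toy model of the timescale mismatch: exponential compute demand vs
# linear physical build-out. DOUBLING_MONTHS is from the Epoch AI
# figure cited above; LINEAR_GROWTH_PER_YEAR is a hypothetical
# placeholder for planned capacity expansion.

DOUBLING_MONTHS = 9.5          # frontier compute doubles every 9-10 months
LINEAR_GROWTH_PER_YEAR = 0.5   # assumed: supply adds 50% of today's capacity/year

def demand(months: float) -> float:
    """Compute demand relative to today (exponential growth)."""
    return 2 ** (months / DOUBLING_MONTHS)

def supply(months: float) -> float:
    """Physical supply relative to today (linear build-out)."""
    return 1.0 + LINEAR_GROWTH_PER_YEAR * (months / 12)

# The gap widens every period: this is the window in which physical
# constraints bind, regardless of the exact growth rate assumed.
for m in range(0, 61, 12):
    print(f"month {m:2d}: demand {demand(m):6.1f}x, supply {supply(m):4.1f}x")
```

Under any linear supply assumption the exponential demand curve pulls away within a few years; only the crossing date moves.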
## Why this is a governance window
Lennart Heim and colleagues at GovAI/RAND have argued that compute is the most governable input to AI development because it is physical, trackable, and produced by a concentrated supply chain. Physical infrastructure constraints amplify this governability: not only can you track who has compute, the total amount of compute is itself limited by physical bottlenecks.
This creates what I call "alignment by infrastructure lag" — the physical substrate buys time for alignment research without requiring anyone to deliberately slow down. The window exists because:
1. **Alignment research is not compute-constrained.** Theoretical alignment work, interpretability research, governance design, and evaluation methodology don't require frontier training clusters. They require researchers, ideas, and modest compute for experiments.
2. **Deployment IS compute-constrained.** Deploying AI capabilities at scale (inference for billions of users, new training runs for frontier models) requires the physical infrastructure that is bottlenecked.
3. **The mismatch favors alignment.** The activities that need more time (alignment research) can proceed unconstrained while the activities that create risk (capability scaling and deployment) are physically gated.
## Challenges
**Algorithmic progress may route around physical constraints.** If algorithmic efficiency improvements (halving required compute every 8-9 months per Epoch AI) compound faster than physical constraints bind, the governance window closes. A 10x capability jump may come from better algorithms on existing hardware, not from new hardware.
**The window is temporary.** CoWoS alternatives may break the packaging bottleneck by 2027. HBM4 increases per-stack capacity. Nuclear and natural gas can eventually meet power demand. The 2-5 year window where these constraints bind most tightly is the window — not a permanent condition.
**Geographic asymmetry.** Physical constraints are location-specific. If US infrastructure lags while other jurisdictions build faster, compute migrates to regions with fewer safety norms. The constraint doesn't reduce total AI capability — it shifts where capability develops. This is the strongest counter-argument and applies equally to deliberate slowdown proposals.
**This is not a strategy — it's an observation.** The claim is that the window exists, not that it should be relied upon. Depending on infrastructure lag for alignment is like depending on traffic for punctuality — it might work but it's not a plan.
---
Relevant Notes:
- [[technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap]] — physical infrastructure constraints partially close this gap by slowing the exponential
- [[safe AI development requires building alignment mechanisms before scaling capability]] — infrastructure lag creates a natural version of this ordering
- [[compute export controls are the most impactful AI governance mechanism but target geopolitical competition not safety leaving capability development unconstrained]] — physical constraints complement export controls by limiting total compute regardless of who controls it
- [[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]] — infrastructure constraints apply to all competitors equally, unlike voluntary safety commitments
Topics:
- [[domains/ai-alignment/_map]]


@@ -0,0 +1,62 @@
---
type: claim
domain: ai-alignment
description: "As inference grows from ~33% to ~66% of AI compute by 2026, the hardware landscape shifts from NVIDIA-monopolized centralized training clusters to diverse distributed inference on ARM, custom ASICs, and edge devices — changing who can deploy AI capability and how governable deployment is"
confidence: experimental
source: "Deloitte 2026 inference projections, Epoch AI compute trends, ARM Neoverse inference benchmarks, industry analysis of training vs inference economics"
created: 2026-03-24
depends_on:
- "three paths to superintelligence exist but only collective superintelligence preserves human agency"
- "collective superintelligence is the alternative to monolithic AI controlled by a few"
challenged_by:
- "NVIDIA's inference optimization (TensorRT, Blackwell transformer engine) may maintain GPU dominance even for inference"
- "Open-weight model proliferation is a greater driver of distribution than hardware diversity"
- "Inference at scale (serving billions of users) still requires massive centralized infrastructure"
secondary_domains:
- collective-intelligence
---
# The training-to-inference shift structurally favors distributed AI architectures because inference optimizes for power efficiency and cost-per-token where diverse hardware competes while training optimizes for raw throughput where NVIDIA monopolizes
AI compute is undergoing a structural shift from training-dominated to inference-dominated workloads. Training accounted for roughly two-thirds of AI compute in 2023; by 2026, inference is projected to consume approximately two-thirds. This reversal changes the competitive landscape for AI hardware and, consequently, who controls AI capability deployment.
## The economic logic
Training optimizes for raw throughput — the largest, most power-hungry chips in the biggest clusters win. This favors NVIDIA's monopoly position: CUDA ecosystem lock-in, InfiniBand networking for multi-node training, and CoWoS packaging allocation that gates how many competing accelerators can ship. Training a frontier model requires concentrated capital ($100M+), concentrated hardware (thousands of GPUs), and concentrated power (100+ MW). Few organizations can do this.
Inference optimizes differently: cost-per-token, latency, and power efficiency. These metrics open the field to diverse hardware architectures. ARM-based processors (Graviton4, Axion, Grace) compete on power efficiency. Custom ASICs (Google TPU, Amazon Trainium, Meta MTIA) optimize for specific model architectures. Edge devices run smaller models locally. The competitive landscape for inference is fundamentally more diverse than for training.
Inference can account for 80-90% of the lifetime cost of a production AI system — it runs continuously while training is periodic. As inference dominates economics, the hardware that wins inference shapes the industry structure.
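A minimal sketch of why inference dominates lifetime cost. All dollar figures and the lifetime are hypothetical placeholders; only the one-off training vs continuous inference structure comes from the text:

```python
# Illustrative lifetime-cost split for a production AI system.
# Every number here is a hypothetical placeholder chosen for scale,
# not a sourced figure.

TRAINING_COST = 100e6            # one-off frontier training run
INFERENCE_COST_PER_DAY = 0.5e6   # assumed continuous serving cost
LIFETIME_DAYS = 3 * 365          # assumed three years in production

inference_total = INFERENCE_COST_PER_DAY * LIFETIME_DAYS
lifetime_total = TRAINING_COST + inference_total
share = inference_total / lifetime_total

print(f"inference share of lifetime cost: {share:.0%}")
# → inference share of lifetime cost: 85%
```

With these placeholder inputs the split lands in the 80-90% range the note cites; the qualitative point is that a continuous cost integrated over years swamps any one-off cost.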
## Governance implications
Training's concentration makes it governable. A small number of organizations with identifiable hardware in identifiable locations perform frontier training. Compute governance proposals (Heim et al., GovAI) leverage this concentration: reporting thresholds for large training runs, KYC for cloud compute, hardware-based monitoring.
Inference's distribution makes it harder to govern. Once a model is trained and weights are distributed (open-weight models), inference capability distributes to anyone with sufficient hardware — which, for inference, is much more accessible than for training. The governance surface area expands from dozens of training clusters to millions of inference endpoints.
This creates a structural tension: the same shift that favors distributed AI architectures (good for avoiding monolithic control) also makes AI deployment harder to monitor and regulate (challenging for safety oversight). The governance implications of this shift are underexplored — the existing discourse treats inference economics as a business question, not a governance question.
## Connection to collective intelligence
The inference shift is directionally favorable for collective intelligence architectures. If inference can run on diverse, distributed hardware, then multi-agent systems with heterogeneous hardware become architecturally natural rather than forced. This is relevant to our claim that [[collective superintelligence is the alternative to monolithic AI controlled by a few]] — the physical infrastructure is moving in a direction that makes collective architectures more viable.
However, this does not guarantee distributed outcomes. NVIDIA's inference optimization (TensorRT-LLM, Blackwell's FP4 transformer engine) aims to maintain GPU dominance even for inference. And inference at scale (serving billions of users) still requires substantial centralized infrastructure — the distribution advantage applies most strongly at the edge and for specialized deployments.
## Challenges
**NVIDIA may hold inference too.** NVIDIA's vertical integration strategy (CUDA + TensorRT + full-rack inference solutions) is designed to prevent the inference shift from eroding their position. If NVIDIA captures inference as effectively as training, the governance implications of the shift are muted.
**Open weights matter more than hardware diversity.** The distribution of AI capability may depend more on model weight availability (open vs. closed) than on hardware diversity. If frontier models remain closed, hardware diversity at the inference layer doesn't distribute frontier capability.
**The claim is experimental, not likely.** The inference shift is a measured trend, but its governance implications are projected, not observed. The claim connects an economic shift to a governance conclusion — the connection is structural but hasn't been tested.
---
Relevant Notes:
- [[collective superintelligence is the alternative to monolithic AI controlled by a few]] — the inference shift makes this architecturally more viable
- [[compute export controls are the most impactful AI governance mechanism but target geopolitical competition not safety leaving capability development unconstrained]] — export controls target training compute; inference compute is harder to control
- [[technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap]] — the inference shift widens this gap by distributing capability faster than governance can adapt
- [[economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate]] — inference cost competition accelerates this dynamic
Topics:
- [[domains/ai-alignment/_map]]


@@ -0,0 +1,66 @@
---
type: source
title: "AI Compute Infrastructure Research Sessions — ARM, NVIDIA, TSMC"
author: "Theseus (research agent synthesis)"
url: n/a
date: 2026-03-24
domain: ai-alignment
intake_tier: research-task
rationale: "Cory directed research into physical infrastructure enabling AI — ARM strategy, NVIDIA dominance/moat, TSMC supply chain chokepoints. Goal: understand compute governance implications for alignment."
proposed_by: "Cory (via Theseus)"
format: report
status: processing
processed_by: theseus
tags: [compute-governance, semiconductors, supply-chain, power-constraints, inference-shift]
notes: "Compiled from 5 research agent sessions. VERIFICATION NEEDED: (1) NVIDIA-Groq acquisition ($20B) — UNVERIFIED, (2) OpenAI-AMD 10% stake — UNVERIFIED, (3) Meta MTIA 4 generations at 6-month cadence — needs confirmation. Structural arguments high-confidence; specific numbers need manual verification."
flagged_for_astra:
- "Power constraints on datacenter scaling — overlaps energy domain"
- "TSMC geographic diversification — manufacturing domain"
- "CoWoS packaging bottleneck — manufacturing domain"
cross_domain_flags:
- "Rio: NVIDIA vertical integration follows attractor state pattern"
- "Leo: Taiwan concentration as civilizational single point of failure"
- "Astra: Nuclear revival for AI power, semiconductor supply chain"
---
# AI Compute Infrastructure Research — Synthesis
Research compiled from 5 agent sessions on 2026-03-24. Three companies studied: ARM Holdings, NVIDIA, TSMC. Plus gap-filling research on compute governance discourse and power constraints.
## Key Structural Findings
### 1. Three chokepoints gate AI scaling
CoWoS advanced packaging (TSMC near-monopoly, sold out through 2026), HBM memory (3-vendor oligopoly, all sold out through 2026), and power/electricity (5-10 year build cycles vs 1-2 year chip cycles). The bottleneck is NOT chip design.
### 2. NVIDIA's moat is the full stack
CUDA ecosystem (4M+ developers) + networking (Mellanox/InfiniBand) + full-rack solutions (GB200 NVL72) + packaging allocation (60%+ of CoWoS). Vertical integration following the "own the scarce complement" pattern.
### 3. The inference shift redistributes AI capability
Training ~33% of compute (2023) → inference projected ~66% by 2026. Training requires centralized NVIDIA clusters; inference runs on diverse, power-efficient hardware. Structurally favors distributed architectures.
### 4. ARM's position is unique
Doesn't compete with NVIDIA — provides the CPU substrate everyone builds on. Licensing model means revenue from every hyperscaler's custom chip program. Power efficiency advantage aligns with inference shift.
### 5. TSMC is the single largest physical vulnerability
~92% of advanced logic chips (7nm and below). Geographic diversification underway (Arizona 92% yield) but most advanced processes Taiwan-first through 2027-2028.
### 6. Power may physically bound capability scaling
Projected 8-9% of US electricity by 2030 for datacenters. Nuclear deals cover 2-3 GW near-term against 25-30 GW needed. Grid interconnection averages 5+ years.
## Compute Governance Discourse Landscape
| Area | Maturity | Key Sources |
|------|----------|------------|
| Compute governance | High | Heim/GovAI (Sastry et al. 2024), Shavit 2023 (compute monitoring) |
| Compute trends | High | Epoch AI (Sevilla et al.), training compute doubling every 9-10 months |
| Energy constraints | Medium | IEA, Goldman Sachs April 2024, de Vries 2023 in Joule |
| Supply chain concentration | Medium-High | Chris Miller "Chip War", CSET Georgetown, RAND |
| Inference shift + governance | LOW — genuine gap | Fragmented discourse, no systematic treatment |
| Export controls as alignment | Medium | Gregory Allen CSIS, Heim/Fist "Secure Governable Chips" |
## UNVERIFIED Claims (DO NOT extract without confirmation)
- NVIDIA acquired Groq for $20B (Dec 2025)
- OpenAI took 10% stake in AMD
- Meta MTIA releasing 4 chip generations at 6-month cadence
- ARM Graviton4 "168% higher token throughput" vs AMD EPYC
- Specific market share percentages (vary by methodology)