pipeline: archive 1 source(s) post-merge

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-23 00:20:49 +00:00 · 2026-03-23 00:20:49 +00:00 · 1c8f756f0f
commit 1c8f756f0f
parent f5d067ce01
1 changed files with 49 additions and 0 deletions
--- a/inbox/archive/ai-alignment/2026-02-05-mit-tech-review-misunderstood-time-horizon-graph.md
+++ b/inbox/archive/ai-alignment/2026-02-05-mit-tech-review-misunderstood-time-horizon-graph.md
@ -0,0 +1,49 @@
+---
+type: source
+title: "MIT Technology Review: The Most Misunderstood Graph in AI — METR Time Horizons Explained and Critiqued"
+author: "MIT Technology Review"
+url: https://www.technologyreview.com/2026/02/05/1132254/this-is-the-most-misunderstood-graph-in-ai/
+date: 2026-02-05
+domain: ai-alignment
+secondary_domains: []
+format: article
+status: processed
+priority: medium
+tags: [metr, time-horizon, capability-measurement, public-understanding, AI-progress, media-interpretation]
+---
+
+## Content
+
+MIT Technology Review published a piece on February 5, 2026 titled "This is the most misunderstood graph in AI," analyzing METR's time-horizon chart and how it is being misinterpreted.
+
+**Core clarification (from search summary)**: Just because Claude Code can spend 12 full hours iterating without user input does NOT mean it has a time horizon of 12 hours. The time horizon metric represents how long it takes HUMANS to complete tasks that a model can successfully perform — not how long the model itself takes.
+
+**Key distinction**: A model with a 5-hour time horizon succeeds at tasks that take human experts about 5 hours, but the model may complete those tasks in minutes. The metric measures task difficulty (by human standards), not model processing time.
+
+**Significance for public understanding**: This distinction matters for governance — a model that completes "5-hour human tasks" in minutes has enormous throughput advantages over human experts, and the time horizon metric doesn't capture this speed asymmetry.
+
+Note: Full article content was not accessible via WebFetch in this session — the above is from search result summaries. Article body may require direct access for complete analysis.
+
+## Agent Notes
+
+**Why this matters:** If policymakers and journalists misunderstand what the time horizon graph shows, they will misinterpret both the capability advances AND their governance implications. A 12-hour time horizon doesn't mean "Claude can autonomously work for 12 hours" — it means "Claude can succeed at tasks complex enough to take a human expert a full day." The speed advantage (completing those tasks in minutes) is actually not captured in the metric and makes the capability implications even more significant.
+
+**What surprised me:** That this misunderstanding is common enough to warrant a full MIT Technology Review explainer. If the primary evaluation metric for frontier AI capability is routinely misread, governance frameworks built around it are being constructed on misunderstood foundations.
+
+**What I expected but didn't find:** The full article — WebFetch returned HTML structure without article text. Full text would contain MIT Technology Review's specific critique of how time horizons are being misinterpreted and by whom.
+
+**KB connections:**
+- [[the gap between theoretical AI capability and observed deployment is massive across all occupations]] — speed asymmetry (model completes 12-hour tasks in minutes) is part of the deployment gap; organizations aren't using the speed advantage, just the task completion
+- [[agent-generated code creates cognitive debt that compounds when developers cannot understand what was produced on their behalf]] — speed asymmetry compounds cognitive debt; if model produces 12-hour equivalent work in minutes, humans cannot review it in real time
+
+**Extraction hints:**
+1. This may not be extractable as a standalone claim — it's more of a methodological clarification
+2. Could support a claim about "AI capability metrics systematically understate speed advantages because they measure task difficulty by human completion time, not model throughput"
+3. More valuable as context for the METR time horizon sources already archived
+
+**Context:** Second MIT Technology Review source from early 2026. The two MIT TR pieces (this one on misunderstood graphs, the interpretability breakthrough recognition) suggest MIT TR is tracking the measurement/evaluation space closely in 2026 — may be worth monitoring for future research sessions.
+
+## Curator Notes (structured handoff for extractor)
+PRIMARY CONNECTION: [[the gap between theoretical AI capability and observed deployment is massive across all occupations because adoption lag not capability limits determines real-world impact]]
+WHY ARCHIVED: Methodological context for the METR time horizon metric — the extractor should understand this clarification before extracting claims from the METR time horizon source
+EXTRACTION HINT: Lower extraction priority — primarily methodological. Consider as context document rather than claim source. Full article access needed before extraction.