extract: 2026-01-01-metr-time-horizon-task-doubling-6months
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
This commit is contained in:
parent
9671a1bc42
commit
2c3a1b70ce
4 changed files with 49 additions and 1 deletion
@@ -29,6 +29,12 @@ Anthropic's own language in RSP documentation: commitments are 'very hard to mee
---

### Additional Evidence (extend)
*Source: [[2026-01-01-metr-time-horizon-task-doubling-6months]] | Added: 2026-03-21*
The 6-month capability doubling creates structural pressure on safety commitments: any pause or constraint that lasts longer than one doubling period (6 months) puts the constrained lab 2x behind competitors in autonomous task completion capability. This quantifies the competitive penalty that makes voluntary commitments unsustainable.
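The competitive penalty described above is plain exponential arithmetic. A minimal sketch, assuming the 6-month doubling rate from the source (the function name and pause lengths are illustrative, not from the note):

```python
# Competitive gap created by a safety pause, assuming an unconstrained
# competitor's autonomous task horizon doubles every 6 months (METR's
# estimated rate per the source note).
DOUBLING_PERIOD_MONTHS = 6.0

def capability_gap(pause_months: float) -> float:
    """Factor by which a competitor's task horizon exceeds a paused
    lab's after pausing for `pause_months`."""
    return 2.0 ** (pause_months / DOUBLING_PERIOD_MONTHS)

print(capability_gap(6))   # one doubling period -> 2.0x behind
print(capability_gap(18))  # three periods -> 8.0x behind
```

The penalty compounds: each additional doubling period multiplies the gap by two rather than adding to it.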
Relevant Notes:
- [[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]] — the RSP rollback is the empirical confirmation
- [[AI alignment is a coordination problem not a technical problem]] — voluntary commitments fail; coordination mechanisms might not
@@ -84,6 +84,12 @@ CTRL-ALT-DECEIT demonstrates that AI agents conducting R&D can sandbag their own
The governance pipeline failure extends beyond evaluation unreliability to evaluation selection: research evaluations for loss-of-control capabilities (RepliBench for self-replication, BashArena for monitoring evasion, CTRL-ALT-DECEIT for sandbagging) exist and find concerning capabilities, but compliance frameworks don't require them. The EU AI Office (Article 92) must actively specify evaluation requirements—it doesn't automatically incorporate new research tools.
### Additional Evidence (extend)
*Source: [[2026-01-01-metr-time-horizon-task-doubling-6months]] | Added: 2026-03-21*
The 6-month task horizon doubling rate explains WHY pre-deployment evaluations fail: by the time evaluation frameworks are designed and deployed, models have advanced 2-4 capability doublings beyond what the evaluation was calibrated for. BashArena's 13-month inversion (October 2024 'minimal mitigations' → December 2025 '26% evasion') empirically confirms this prediction.
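The calibration drift claimed here can be sketched with the same doubling arithmetic, assuming the 6-month rate; the 13-month lag mirrors the BashArena window, and the other lag values are illustrative:

```python
# Capability growth during an evaluation's design-to-deployment lag,
# assuming a 6-month task-horizon doubling time. By deployment, models
# sit at 2^(lag/6) times the capability the evaluation was calibrated for.
DOUBLING_PERIOD_MONTHS = 6.0

def doublings(lag_months: float) -> float:
    """Number of capability doublings elapsed during `lag_months`."""
    return lag_months / DOUBLING_PERIOD_MONTHS

for lag in (12, 13, 24):
    print(f"{lag} months -> {doublings(lag):.1f} doublings, "
          f"{2 ** doublings(lag):.1f}x the calibrated capability")
```

A 12-24 month pipeline therefore corresponds to the 2-4 doublings cited above, and the 13-month window to roughly 4.5x the calibrated capability.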
Relevant Notes:
@@ -0,0 +1,24 @@
{
  "rejected_claims": [
    {
      "filename": "frontier-ai-autonomous-task-completion-doubles-every-six-months-making-safety-evaluations-obsolete-within-one-model-generation.md",
      "issues": [
        "missing_attribution_extractor"
      ]
    }
  ],
  "validation_stats": {
    "total": 1,
    "kept": 0,
    "fixed": 1,
    "rejected": 1,
    "fixes_applied": [
      "frontier-ai-autonomous-task-completion-doubles-every-six-months-making-safety-evaluations-obsolete-within-one-model-generation.md:set_created:2026-03-21"
    ],
    "rejections": [
      "frontier-ai-autonomous-task-completion-doubles-every-six-months-making-safety-evaluations-obsolete-within-one-model-generation.md:missing_attribution_extractor"
    ]
  },
  "model": "anthropic/claude-sonnet-4.5",
  "date": "2026-03-21"
}
@@ -7,10 +7,14 @@ date: 2026-01-01
domain: ai-alignment
secondary_domains: [grand-strategy]
format: thread
status: unprocessed
status: enrichment
priority: high
tags: [METR, time-horizon, capability-growth, autonomous-tasks, exponential-growth, evaluation-obsolescence, grand-strategy]
flagged_for_leo: ["capability growth rate is the key grand-strategy input — doubling every 6 months means evaluation calibrated today is inadequate within 12 months; intersects with 13-month BashArena inversion finding"]
processed_by: theseus
processed_date: 2026-03-21
enrichments_applied: ["pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md", "Anthropics RSP rollback under commercial pressure is the first empirical confirmation that binding safety commitments cannot survive the competitive dynamics of frontier AI development.md"]
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content
@@ -48,3 +52,11 @@ The research measures the maximum length of tasks that frontier AI models can co
PRIMARY CONNECTION: [[technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap]]
WHY ARCHIVED: Provides specific quantified capability growth rate (6-month task horizon doubling) — the most precise estimate available for the technology side of Belief 1's technology-coordination gap
EXTRACTION HINT: Focus on the governance obsolescence implication — the doubling rate means evaluation infrastructure is structurally inadequate within roughly one model generation, which the BashArena 13-month inversion empirically confirms
## Key Facts
- METR's Time Horizon research was originally published in March 2025
- The research was updated in January 2026 with newer model performance data
- METR projects AI agents may match human researchers on months-long projects within approximately a decade from 2025-2026
- The metric tracks maximum length of autonomous tasks, not raw benchmark performance
- METR is Anthropic's external evaluation partner