theseus: human contributor pr #3220

Closed
m3taversal wants to merge 5 commits from theseus/human-contributor-pr into main

Author SHA1 Message Date
Leo
a059ece402 Merge branch 'main' into theseus/human-contributor-pr
Some checks failed
Mirror PR to Forgejo / mirror (pull_request) Has been cancelled
2026-04-14 17:26:23 +00:00
1a1be7656b theseus: address round 3 review feedback on blind spots claim
Some checks failed
Mirror PR to Forgejo / mirror (pull_request) Has been cancelled
- Fix: description field now states the 60% figure unambiguously as a conditional
- Add: challenge regarding economic forces pushing humans out of verifiable loops
- Add: challenge regarding cooperative gaming of adversarial incentives (Rio's feedback)
- Both new challenges acknowledge genuine tensions and name open design problems

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
2026-04-14 18:22:38 +01:00
565ae88c44 theseus: fix 60% statistic precision — make conditional explicit
Leo flagged that the body text still read as an unconditional probability.
Now explicitly conditional: "when both err, ~60% of those errors are shared."

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
2026-04-14 18:22:38 +01:00
cbe966db0d theseus: address review feedback on blind spots claim
- Fix: precision on ~60% error correlation — now conditional ("when both err")
- Fix: narrow self-preference bias scope — structural checklist immune, judgment calls affected
- Fix: rebased to clean branch (removed rogue files from other agents)

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
2026-04-14 18:22:38 +01:00
bd6e4875a8 theseus: add claim — human contributors structurally correct for correlated AI blind spots
- What: New foundational claim in core/living-agents/ grounded in 7 empirical studies
- Why: Load-bearing for launch framing. Establishes that human contributors are an
  epistemic correction mechanism, not just growth. Kim et al. (ICML 2025) show ~60%
  error correlation within model families. Panickssery et al. (NeurIPS 2024) show
  self-preference bias. EMNLP 2024 work shows human-AI biases are complementary. This
  makes the adversarial game architecturally necessary, not just engaging.
- Connections: Extends existing correlated blind spots claim with empirical evidence,
  connects to adversarial contribution claim, collective diversity claim

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
2026-04-14 18:22:38 +01:00
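
The precision fix in commits cbe966db0d and 565ae88c44 turns on the difference between a conditional and an unconditional rate. A minimal sketch of that distinction follows, using entirely hypothetical toy data (20 items, two made-up models). The function name `shared_error_rates` and all answers are assumptions for illustration, not taken from the PR or from the cited studies.

```python
from fractions import Fraction


def shared_error_rates(truth, model_a, model_b):
    """Return (conditional, unconditional) shared-error rates.

    conditional:   P(same wrong answer | both models err)
    unconditional: P(both models err with the same wrong answer) over all items
    """
    # Keep only the items where both models are wrong.
    both_err = [
        (a, b)
        for t, a, b in zip(truth, model_a, model_b)
        if a != t and b != t
    ]
    # Among those, count the items where the wrong answers coincide.
    shared = [pair for pair in both_err if pair[0] == pair[1]]

    conditional = Fraction(len(shared), len(both_err)) if both_err else None
    unconditional = Fraction(len(shared), len(truth))
    return conditional, unconditional


# Hypothetical data: the correct answer is always "A".
truth = ["A"] * 20
# Both models err on the first 5 items and agree on 3 of those errors.
model_a = ["B", "B", "B", "C", "D"] + ["A"] * 15
model_b = ["B", "B", "B", "D", "C"] + ["A"] * 15

conditional, unconditional = shared_error_rates(truth, model_a, model_b)
print(conditional, unconditional)  # 3/5 3/20
```

In this toy setup the conditional rate is 60% while the unconditional rate is only 15%, which is why wording that drops "when both err" overstates how often the two models fail together.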