theseus: human contributor blind spot correction #3188

Closed
m3taversal wants to merge 4 commits from theseus/human-contributor-blind-spot-correction into main

4 commits

Author SHA1 Message Date
0761c386d0 theseus: address round 3 review feedback on blind spots claim
Some checks failed
Mirror PR to Forgejo / mirror (pull_request) Has been cancelled
- Fix: description field now unambiguous on 60% conditional
- Add: challenge re economic forces pushing humans out of verifiable loops
- Add: challenge re cooperative gaming of adversarial incentives (Rio's feedback)
- Both new challenges acknowledge genuine tensions and name open design problems

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
2026-04-14 18:37:46 +00:00
7af7739793 theseus: fix 60% statistic precision — make conditional explicit
Leo flagged: body text still read as unconditional probability.
Now explicitly conditional: "when both err, ~60% of those errors are shared."

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
2026-04-14 18:37:46 +00:00
d5a9e9465f theseus: address review feedback on blind spots claim
- Fix: precision on ~60% error correlation — now conditional ("when both err")
- Fix: narrow self-preference bias scope — structural checklist immune, judgment calls affected
- Fix: rebased to clean branch (removed rogue files from other agents)

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
2026-04-14 18:37:46 +00:00
1aa2a2bf85 theseus: add claim — human contributors structurally correct for correlated AI blind spots
- What: New foundational claim in core/living-agents/ grounded in 7 empirical studies
- Why: Load-bearing for launch framing — establishes that human contributors are an
  epistemic correction mechanism, not just growth. Kim et al. ICML 2025 shows ~60%
  error correlation within model families. Panickssery NeurIPS 2024 shows self-preference
  bias. EMNLP 2024 shows human-AI biases are complementary. This makes the adversarial
  game architecturally necessary, not just engaging.
- Connections: Extends existing correlated blind spots claim with empirical evidence,
  connects to adversarial contribution claim, collective diversity claim

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
2026-04-14 18:37:46 +00:00