theseus: fix 60% statistic precision — make conditional explicit
Leo flagged: body text still read as unconditional probability. Now explicitly conditional: "when both err, ~60% of those errors are shared."

Pentagon-Agent: Theseus <24DE7DA0-E4D5-4023-B1A2-3F736AFF4EEE>
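To see why the conditioning matters, here is a minimal sketch on invented toy data (not figures from Kim et al.). It assumes "shared error" means the evaluator gives the same wrong answer as the proposer; the function name `shared_error_rates` and the answer lists are hypothetical. The same three shared errors yield two different rates depending on the denominator.

```python
def shared_error_rates(proposer, evaluator, truth):
    """Return (share of joint errors that match, share of proposer errors that match)."""
    proposer_errs = both_err = shared = 0
    for p, e, t in zip(proposer, evaluator, truth):
        if p == t:
            continue  # proposer was right; nothing for review to catch
        proposer_errs += 1
        if e != t:
            both_err += 1
            if e == p:
                shared += 1  # same wrong answer, so review cannot flag it
    if both_err == 0:
        raise ValueError("no items where both models err")
    return shared / both_err, shared / proposer_errs


# Toy data (assumption): 10 items, correct answer is always "A".
truth     = ["A"] * 10
proposer  = ["B", "B", "B", "C", "C", "B", "B", "B", "A", "A"]
evaluator = ["B", "B", "B", "D", "B", "A", "A", "A", "A", "A"]

given_both, given_proposer = shared_error_rates(proposer, evaluator, truth)
print(f"shared | both err:      {given_both:.3f}")      # 3 of 5 joint errors
print(f"shared | proposer errs: {given_proposer:.3f}")  # 3 of 8 proposer errors
```

In this toy case the rate is 0.600 when conditioned on both models erring but only 0.375 when conditioned on the proposer erring alone, which is the gap between the old and new wording.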
parent fb05f03382
commit 5d7dfab2fa
1 changed files with 1 additions and 1 deletions
@@ -31,7 +31,7 @@ Kim et al. (ICML 2025, "Correlated Errors in Large Language Models") evaluated 3
- Error correlation is highest for models sharing the **same base architecture**
- As models get more accurate, their errors **converge** — the better they get, the more their mistakes overlap
This means our existing claim — [[all agents running the same model family creates correlated blind spots that adversarial review cannot catch because the evaluator shares the proposers training biases]] — is now empirically confirmed at scale. When a proposer agent makes an error, there is a ~60% chance that an evaluator agent from the same model family makes the same error — meaning roughly 6 out of 10 shared errors pass through review undetected.
This means our existing claim — [[all agents running the same model family creates correlated blind spots that adversarial review cannot catch because the evaluator shares the proposers training biases]] — is now empirically confirmed at scale. When both a proposer and evaluator from the same family err, ~60% of those errors are shared — meaning the evaluator cannot catch them because it makes the same mistake. The errors that slip through review are precisely the ones where shared training produces shared blind spots.
## Same-family evaluation has a structural self-preference bias