theseus: add evaluator self-review prevention section
- What: Codifies that Leo cannot evaluate his own proposals
- Why: Leo flagged the gap; the integrity layer must be constrained by the same principle it enforces
- Details: Min 2 domain agent reviews, second-model pass still runs, Cory has veto authority

Pentagon-Agent: Theseus <46864DD4-DA71-4719-A1B4-68F7C55854D3>
parent f3bd2b396d
commit 334a319b91
1 changed file with 11 additions and 0 deletions
@@ -171,6 +171,17 @@ From NLAH paper (Pan et al.): verification layers can optimize for locally check
5. **Multi-model eval integration** — OpenRouter connection, rubric sharing, disagreement queue.
6. **Self-upgrade eval criteria** — codified in eval workflow, triggered by 3-strikes pattern.
## Evaluator Self-Review Prevention

When Leo proposes claims (cross-domain synthesis, foundations-level):
- Leo cannot be the evaluator on his own proposals
- Minimum 2 domain agent reviews required
- Every domain touched must have a reviewer from that domain
- The second-model eval pass still runs (provides the external check)
- Cory has veto (rollback) authority as final backstop

This closes the obvious gap: the spec defines the integrity layer but doesn't protect against the integrity layer's own blind spots. The constraint enforcement principle must apply to the constrainer too.
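
The review rules above could be checked mechanically, outside the proposing agent. The following is a minimal sketch and is not part of this commit: the `Proposal` and `Review` types, their field names, and the reading of "minimum 2 domain agent reviews" as two distinct reviewers other than the proposer are all assumptions made for illustration. The spec itself does not define a data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical data model: the spec names the rules but not these types.
MIN_DOMAIN_REVIEWS = 2


@dataclass(frozen=True)
class Review:
    reviewer: str  # name of the reviewing agent
    domain: str    # domain that reviewer represents


@dataclass
class Proposal:
    proposer: str
    domains: set[str]  # every domain the claim touches
    reviews: list[Review] = field(default_factory=list)
    second_model_pass_ran: bool = False
    vetoed: bool = False  # True once the veto holder has rolled it back


def review_violations(p: Proposal) -> list[str]:
    """Return rule violations; an empty list means the proposal may proceed."""
    problems: list[str] = []

    # Rule 1: the proposer cannot evaluate their own proposal.
    if any(r.reviewer == p.proposer for r in p.reviews):
        problems.append("proposer reviewed own proposal")

    # Self-reviews are ignored for the remaining counts.
    independent = [r for r in p.reviews if r.reviewer != p.proposer]

    # Rule 2: at least two distinct independent reviewers (assumed reading).
    if len({r.reviewer for r in independent}) < MIN_DOMAIN_REVIEWS:
        problems.append(f"fewer than {MIN_DOMAIN_REVIEWS} domain agent reviews")

    # Rule 3: every touched domain needs a reviewer from that domain.
    missing = sorted(p.domains - {r.domain for r in independent})
    if missing:
        problems.append("no reviewer for: " + ", ".join(missing))

    # Rule 4: the second-model eval pass must still have run.
    if not p.second_model_pass_ran:
        problems.append("second-model eval pass did not run")

    # Rule 5: a veto overrides everything else.
    if p.vetoed:
        problems.append("vetoed")

    return problems
```

As a usage example with placeholder reviewer and domain names, `review_violations(Proposal("Leo", {"d1", "d2"}, [Review("agent_a", "d1"), Review("agent_b", "d2")], second_model_pass_ran=True))` returns an empty list, while replacing `agent_a` with `"Leo"` reports the self-review, the shortfall in independent reviewers, and the uncovered domain `d1`.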
## Design Principle

The constraint enforcement layer must be **outside** the agent being constrained. That's why multi-model eval matters, why Leo shouldn't eval his own proposals, and why policy-as-code runs in CI, not in the agent's own process. As agents get more capable, the integrity layer gets more important, not less.