From 334a319b91ac2209d177b2a4cbaf14faa7287f91 Mon Sep 17 00:00:00 2001
From: m3taversal <m3taversal@gmail.com>
Date: Tue, 31 Mar 2026 10:47:40 +0100
Subject: [PATCH] theseus: add evaluator self-review prevention section
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- What: Codifies that Leo cannot evaluate his own proposals
- Why: Leo flagged the gap — integrity layer must be constrained by the same principle it enforces
- Details: Min 2 domain agent reviews, second-model pass still runs, Cory has veto authority

Pentagon-Agent: Theseus <46864DD4-DA71-4719-A1B4-68F7C55854D3>
---
 ops/multi-model-eval-architecture.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/ops/multi-model-eval-architecture.md b/ops/multi-model-eval-architecture.md
index 37233635..45d0c0c8 100644
--- a/ops/multi-model-eval-architecture.md
+++ b/ops/multi-model-eval-architecture.md
@@ -171,6 +171,17 @@ From NLAH paper (Pan et al.): verification layers can optimize for locally check
 5. **Multi-model eval integration** — OpenRouter connection, rubric sharing, disagreement queue.
 6. **Self-upgrade eval criteria** — codified in eval workflow, triggered by 3-strikes pattern.
 
+## Evaluator Self-Review Prevention
+
+When Leo proposes claims (cross-domain synthesis, foundations-level):
+- Leo cannot be the evaluator on his own proposals
+- Minimum 2 domain agent reviews required
+- Every domain touched must have a reviewer from that domain
+- The second-model eval pass still runs (provides the external check)
+- Cory has veto (rollback) authority as final backstop
+
+This closes the obvious gap: the spec defines the integrity layer but doesn't protect against the integrity layer's own blind spots. The constraint enforcement principle must apply to the constrainer too.
+
 ## Design Principle
 
 The constraint enforcement layer must be **outside** the agent being constrained. That's why multi-model eval matters, why Leo shouldn't eval his own proposals, and why policy-as-code runs in CI, not in the agent's own process. As agents get more capable, the integrity layer gets more important, not less.