teleo-codex/skills/evaluate.md
# Skill: Evaluate
Multi-agent evaluation of proposed claims before they enter the shared knowledge base.
## When to Use
When candidate claims exist in inbox/proposals/ awaiting review.
## Process
### Step 1: Leo assigns evaluation
Leo reviews the proposed claim and identifies:
- Primary domain (which agent is the lead evaluator?)
- Secondary domains (which other agents should weigh in?)
- Urgency (is it time-sensitive, or can it wait for a full review cycle?)
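Step 1's triage decision can be captured as a small record. A minimal sketch only: the `Assignment` fields, the agent roles, and the example claim path are illustrative assumptions, not part of the repo's actual schemas.

```python
from dataclasses import dataclass, field

# Hypothetical record of Leo's triage decision (Step 1).
# Field names are illustrative, not a confirmed schema.
@dataclass
class Assignment:
    claim_path: str                  # proposal file under inbox/proposals/
    lead_evaluator: str              # agent owning the primary domain
    secondary_evaluators: list[str] = field(default_factory=list)
    urgent: bool = False             # time-sensitive vs. full review cycle

assignment = Assignment(
    claim_path="inbox/proposals/example-claim.md",  # invented example path
    lead_evaluator="Rio",
    secondary_evaluators=["Clay"],
)
```

The record travels with the claim so Step 2 evaluators know why they were assigned.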
### Step 2: Domain agents evaluate
Each assigned agent reviews the claim against these criteria:

**Quality checks:**

1. Is this specific enough to disagree with?
2. Is the evidence traceable and verifiable?
3. Does the description add information beyond the title?
4. Is the confidence level appropriate for the evidence strength?

**Knowledge base checks:**

5. Does this duplicate an existing claim? (cite the existing one if so)
6. Does it contradict an existing claim? (if so, is the contradiction explicit and argued?)
7. Does it add genuine value the knowledge base doesn't already have?
8. Are wiki links pointing to real files?

**Domain-specific evaluation:**

9. Does this match the agent's understanding of the domain landscape?
10. Would this change any of the agent's current beliefs or positions?
11. Are there cross-domain implications other agents should know about?
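The mechanical criteria above can be run as a checklist that returns the failed checks by name. A sketch under assumed claim fields (`title`, `description`, `evidence`, `confidence`); the real claim schema may name these differently, and the predicates here are deliberately crude stand-ins for an agent's judgment.

```python
# Sketch of Step 2's quality checks as named predicates over a claim dict.
# The claim field names are assumptions, not the repo's actual schema.
QUALITY_CHECKS = {
    "specific_enough_to_disagree_with": lambda c: bool(c.get("title")),
    "evidence_traceable": lambda c: bool(c.get("evidence")),
    "description_adds_information":
        lambda c: c.get("description") not in ("", c.get("title")),
    "confidence_level_stated": lambda c: c.get("confidence") in ("low", "medium", "high"),
}

def evaluate_quality(claim):
    """Return the names of the checks this claim fails."""
    return [name for name, check in QUALITY_CHECKS.items() if not check(claim)]

claim = {"title": "X", "description": "", "evidence": [], "confidence": "medium"}
failed = evaluate_quality(claim)
```

Returning the failed criteria by name is what makes the Quality Gate ("every rejection explains which criteria failed") enforceable downstream.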
### Step 3: Agents vote
Each evaluating agent submits one of:
- **Accept** — claim meets all quality criteria, add to knowledge base
- **Accept with changes** — good claim but needs specific modifications (list them)
- **Reject** — fails quality criteria (explain which ones and why)
- **Request more evidence** — interesting claim but insufficient evidence to accept
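The four vote types could be modeled as an enum, with the Quality Gate rule that rejections must explain themselves enforced at vote time. A sketch: `cast_vote` and the returned dict shape are invented for illustration.

```python
from enum import Enum

# The four vote types from Step 3.
class Vote(Enum):
    ACCEPT = "accept"
    ACCEPT_WITH_CHANGES = "accept_with_changes"
    REJECT = "reject"
    REQUEST_MORE_EVIDENCE = "request_more_evidence"

def cast_vote(agent, vote, rationale=""):
    """Record one agent's vote; rejections must carry a rationale."""
    if vote is Vote.REJECT and not rationale:
        raise ValueError("a rejection must explain which criteria failed")
    return {"agent": agent, "vote": vote, "rationale": rationale}
```

A similar guard could require the "specific modifications" list on accept-with-changes votes.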
### Step 4: Leo synthesizes
- If consensus accept: merge into knowledge base
- If consensus reject: close with explanation
- If mixed: Leo synthesizes the disagreement
  - Factual disagreement → identify what evidence would resolve it
  - Perspective disagreement → both interpretations may be valid
  - Quality concerns → specific changes needed
- If request more evidence: assign research task to relevant agent
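Leo's decision rule in Step 4 reduces to a small dispatch on the set of votes. A sketch using plain strings; the precedence chosen here (evidence requests resolved before mixed-vote synthesis) is an assumption, not something the process specifies.

```python
# Sketch of Step 4: map the set of submitted votes to Leo's next action.
# Votes are plain strings; precedence of evidence requests is assumed.
def synthesize(votes):
    kinds = set(votes)
    if kinds == {"accept"}:
        return "merge"
    if kinds == {"reject"}:
        return "close_with_explanation"
    if "request_more_evidence" in kinds:
        return "assign_research_task"
    return "leo_synthesizes_disagreement"
```

The last branch is where Leo classifies the disagreement as factual, perspective-based, or a quality concern.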
### Step 5: Post-merge cascade check
After a claim is accepted:
- Does this affect any agent's beliefs? (check depends_on chains)
- Flag affected beliefs as needs_review
- Notify owning agents
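The cascade check amounts to a transitive walk over depends_on edges. A sketch, assuming a flat mapping from each belief/position id to the ids it depends on; the real depends_on representation in the belief schema may differ, and the example ids are invented.

```python
# Sketch of Step 5: walk depends_on chains outward from an accepted claim
# and collect every belief/position that transitively depends on it.
def cascade_check(accepted_claim, depends_on):
    """depends_on maps each belief/position id -> list of ids it depends on."""
    affected, frontier = set(), {accepted_claim}
    while frontier:
        frontier = {
            node for node, deps in depends_on.items()
            if frontier & set(deps) and node not in affected
        }
        affected |= frontier
    return affected  # flag these as needs_review and notify owning agents

graph = {
    "belief/rio-streaming": ["claim/streaming-margins"],
    "position/rio-bundling": ["belief/rio-streaming"],
}
flags = cascade_check("claim/streaming-margins", graph)
```

Note the walk is transitive: a position that depends on an affected belief is flagged too, even though it never references the new claim directly.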
## Output
- Evaluation record: which agents reviewed, how they voted, outcome
- Merged claim (if accepted) in domains/{domain}/
- Cascade flags (if applicable)
- Research tasks (if more evidence needed)
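Taken together, the outputs above might serialize into a single record kept alongside the merged claim. A sketch only: every field name and example value here is invented, not the repo's actual record format.

```python
# Illustrative evaluation record (Output section); all fields and values
# are assumptions for the sake of the example.
evaluation_record = {
    "claim": "inbox/proposals/example-claim.md",
    "reviewers": ["Rio", "Clay"],
    "votes": {"Rio": "accept", "Clay": "accept_with_changes"},
    "outcome": "merged_with_changes",
    "merged_to": "domains/entertainment/example-claim.md",
    "cascade_flags": ["belief/rio-streaming"],
    "research_tasks": [],
}
```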
## Quality Gate
- Every rejection explains which criteria failed
- Every mixed vote gets Leo synthesis
- Cascade checks run on every accepted claim
- Evaluation record is preserved for transparency