teleo-codex/skills/self-audit.md

Skill: Self-Audit

Periodic self-examination of an agent's knowledge base for inconsistencies, weaknesses, and drift. Every agent runs this on their own domain.

When to Use

  • Every 50 claims added to your domain (condition-based trigger)
  • Monthly if claim volume is low
  • After a major belief update (cascade from upstream claim changes)
  • When preparing to publish positions (highest-stakes output deserves freshest audit)
  • On request from Leo or Cory

Principle: Detection, Not Remediation

Self-audit is read-only. You detect problems and report them. You do NOT auto-fix.

Fixes go through the standard PR process. This prevents the over-automation failure mode where silent corrections introduce new errors. The audit produces a report; the report drives PRs.

Process

Phase 1: Structural Scan (deterministic, automated)

Run these checks on all claims in your domain (domains/{your-domain}/):

1. Schema compliance

  • Every file has required frontmatter: type, domain, description, confidence, source, created
  • confidence is one of: proven, likely, experimental, speculative
  • domain matches the folder it lives in
  • Description adds information beyond the title (not a restatement)
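
The schema checks are deterministic and easy to script. A minimal sketch in Python (the frontmatter reader here is a naive key: value parser rather than full YAML, and the "description adds information" check is left to the judgment pass):

```python
# Check one claim file's frontmatter against the required schema.
REQUIRED = {"type", "domain", "description", "confidence", "source", "created"}
CONFIDENCE_LEVELS = {"proven", "likely", "experimental", "speculative"}

def check_frontmatter(text, folder_domain):
    """Return a list of schema violations for one claim file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing frontmatter block"]
    fm = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fm[key.strip()] = value.strip()
    violations = [f"missing field: {k}" for k in sorted(REQUIRED - fm.keys())]
    conf = fm.get("confidence")
    if conf and conf not in CONFIDENCE_LEVELS:
        violations.append(f"invalid confidence: {conf}")
    if fm.get("domain") and fm["domain"] != folder_domain:
        violations.append(f"domain mismatch: {fm['domain']} vs folder {folder_domain}")
    return violations
```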

2. Orphan detection

  • Build incoming-link index: for each claim, which other claims link TO it via title
  • Claims with 0 incoming links and created > 7 days ago are orphans
  • Classify: "leaf contributor" (has outgoing links, no incoming) vs "truly isolated" (no links either direction)
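
Orphan detection reduces to building the incoming-link index. A sketch, assuming wiki links use the [[Title]] form and targets are claim titles (the 7-day age filter is applied upstream):

```python
import re

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")

def classify_orphans(claims):
    """claims: {title: body}. Return {title: classification} for every
    claim with zero incoming links."""
    incoming = {t: 0 for t in claims}
    outgoing = {t: WIKI_LINK.findall(body) for t, body in claims.items()}
    for title, targets in outgoing.items():
        for target in targets:
            if target in incoming and target != title:
                incoming[target] += 1
    return {
        t: ("leaf contributor" if outgoing[t] else "truly isolated")
        for t in claims
        if incoming[t] == 0
    }
```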

3. Link health

  • Every wiki link in the body should resolve to an actual file
  • Dangling links = either the target was renamed/deleted, or the link is aspirational
  • Report: list of broken links with the file they appear in
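
The same [[Title]] pattern drives the link-health report. This sketch assumes a link resolves when its target matches some claim file's stem:

```python
import re

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")

def broken_links(files):
    """files: {path: body}. Report (path, target) pairs whose [[target]]
    does not resolve to any claim file's stem."""
    titles = {p.rsplit("/", 1)[-1].removesuffix(".md") for p in files}
    return [
        (path, target)
        for path, body in files.items()
        for target in WIKI_LINK.findall(body)
        if target not in titles
    ]
```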

4. Staleness check

  • Claims older than 180 days in fast-moving domains (health, ai-alignment, internet-finance)
  • Claims older than 365 days in slower domains (cultural-dynamics, critical-systems)
  • Cross-reference with git log: a claim file modified recently (enriched, updated) is not stale even if created is old
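
The staleness rule combines the domain's freshness window with the last git modification. A sketch (fetching last_modified from git is omitted; something like `git log -1 --format=%cs -- <file>` would supply it):

```python
from datetime import date, timedelta

FAST_DOMAINS = {"health", "ai-alignment", "internet-finance"}

def is_stale(domain, created, last_modified, today):
    """Stale when neither creation nor the most recent git modification
    falls inside the domain's freshness window."""
    window = timedelta(days=180 if domain in FAST_DOMAINS else 365)
    return today - max(created, last_modified) > window
```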

5. Duplicate detection

  • Compare claim titles pairwise for semantic similarity
  • Flag pairs where titles assert nearly the same thing with different wording
  • This catches extraction drift — the same insight extracted from different sources as separate claims
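
A cheap textual pre-filter for duplicate candidates, using difflib. This only catches near-identical wording; the semantic comparison the check actually calls for needs embeddings or the LLM pass on top:

```python
from difflib import SequenceMatcher
from itertools import combinations

def near_duplicate_titles(titles, threshold=0.8):
    """Flag title pairs whose normalized text similarity crosses the
    threshold. Purely lexical: a first-pass candidate list only."""
    pairs = []
    for a, b in combinations(titles, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs
```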

Phase 2: Epistemic Self-Audit (LLM-assisted, requires judgment)

Load your claims in batches (context window management — don't load all 50+ at once).

6. Contradiction scan

  • Load claims in groups of 15-20
  • For each group, ask: "Do any of these claims contradict or stand in tension with each other without acknowledging it?"
  • Tensions are fine if explicit (challenged_by field, or acknowledged in the body). UNACKNOWLEDGED tensions are the bug.
  • Cross-check: load claims that share wiki-link targets — these are most likely to have hidden tensions
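
The cross-check is mechanical: claims linking to the same target belong in the same scan batch. A sketch that surfaces those priority groups:

```python
import re
from collections import defaultdict

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")

def shared_target_groups(claims):
    """claims: {title: body}. Return {target: [titles linking to it]}
    restricted to targets with 2+ linkers, i.e. the claim groups most
    likely to hide unacknowledged tensions."""
    by_target = defaultdict(list)
    for title, body in claims.items():
        for target in set(WIKI_LINK.findall(body)):
            by_target[target].append(title)
    return {t: sorted(ts) for t, ts in by_target.items() if len(ts) >= 2}
```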

7. Confidence calibration audit

  • For each proven claim: does the body contain empirical evidence (RCTs, meta-analyses, large-N studies, mathematical proofs)? If not, it's overconfident.
  • For each speculative claim: does the body actually contain substantial evidence that might warrant upgrading to experimental?
  • For likely claims: is there counter-evidence elsewhere in the KB? If so, is it acknowledged?
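
A keyword pre-filter can shortlist proven claims for the calibration review. The marker list below is an assumption, and absence of a marker only produces a candidate for review, never a verdict:

```python
# Crude pre-filter for the calibration audit; EVIDENCE_MARKERS is an
# illustrative assumption, not an exhaustive taxonomy of evidence.
EVIDENCE_MARKERS = ("rct", "randomized", "meta-analys", "n=", "proof")

def flag_overconfident(claims):
    """claims: [(title, confidence, body)]. Shortlist proven claims whose
    body shows no empirical-evidence marker."""
    return [
        title
        for title, confidence, body in claims
        if confidence == "proven"
        and not any(m in body.lower() for m in EVIDENCE_MARKERS)
    ]
```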

8. Belief grounding check

  • Read agents/{your-name}/beliefs.md
  • For each belief, verify the depends_on claims:
    • Do they still exist? (not deleted or archived)
    • Has their confidence changed since the belief was last evaluated?
    • Have any been challenged with substantive counter-evidence?
  • Flag beliefs where supporting claims have shifted but the belief hasn't been re-evaluated
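
The grounding check is a diff between what a belief recorded at last evaluation and what the claims say now. A sketch over plain dicts (in practice the inputs come from each belief's depends_on field in beliefs.md and the current claim frontmatter):

```python
def grounding_issues(belief_deps, claim_confidence):
    """belief_deps: {belief: {claim: confidence recorded at last evaluation}}.
    claim_confidence: {claim: current confidence}; a claim absent here
    was deleted or archived. Return {belief: [issue strings]}."""
    issues = {}
    for belief, deps in belief_deps.items():
        found = []
        for claim, then in deps.items():
            now = claim_confidence.get(claim)
            if now is None:
                found.append(f"{claim}: missing")
            elif now != then:
                found.append(f"{claim}: {then} -> {now}")
        if found:
            issues[belief] = found
    return issues
```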

9. Gap identification

  • Map your claims by subtopic. Where do you have single claims that should be clusters?
  • Check adjacent domains: what claims in other domains reference your domain but have no corresponding claim in your territory?
  • Check your beliefs: which beliefs have the thinnest evidence base (fewest supporting claims)?
  • Rank gaps by impact: gaps that affect active positions > gaps that affect beliefs > gaps in coverage
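
The impact ranking can be encoded directly; the three-level ordering comes straight from the rule above:

```python
# Positions outrank beliefs, which outrank plain coverage gaps.
IMPACT = {"position": 0, "belief": 1, "coverage": 2}

def rank_gaps(gaps):
    """gaps: [(description, affects)] with affects a key of IMPACT."""
    return sorted(gaps, key=lambda g: IMPACT[g[1]])
```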

10. Cross-domain connection audit

  • What percentage of your claims link to claims in other domains?
  • Healthy range: 15-30%. Below 15% = siloed. Above 30% = possibly under-grounded in own domain.
  • Which other domains SHOULD you connect to but don't? (Based on your beliefs and identity)
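
Linkage ratio is the share of your claims with at least one cross-domain link. A sketch, with the 15-30% band applied as the verdict:

```python
def linkage_ratio(claim_links, claim_domain, own_domain):
    """claim_links: {title: [linked titles]}; claim_domain: {title: domain}.
    Unresolvable link targets are treated as own-domain (they don't count
    as crossing). Returns (percentage, verdict)."""
    own = [t for t, d in claim_domain.items() if d == own_domain]
    crossing = sum(
        1 for t in own
        if any(claim_domain.get(x, own_domain) != own_domain
               for x in claim_links.get(t, []))
    )
    ratio = 100 * crossing / len(own) if own else 0
    if ratio < 15:
        verdict = "siloed"
    elif ratio > 30:
        verdict = "possibly under-grounded"
    else:
        verdict = "healthy"
    return ratio, verdict
```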

Phase 3: Report

Produce a structured report. Format:

# Self-Audit Report: {Agent Name}
**Date:** YYYY-MM-DD
**Domain:** {domain}
**Claims audited:** N
**Overall status:** healthy | warning | critical

## Structural Findings
- Schema violations: N (list)
- Orphans: N (list with classification)
- Broken links: N (list)
- Stale claims: N (list with recommended action)
- Potential duplicates: N (list pairs)

## Epistemic Findings
- Unacknowledged contradictions: N (list claim pairs with the tension)
- Confidence miscalibrations: N (list with recommended adjustment)
- Belief grounding issues: N (list beliefs with shifted dependencies)

## Knowledge Gaps (ranked by impact)
1. {Gap description} — affects belief/position X
2. {Gap description} — affects belief/position Y

## Cross-Domain Health
- Linkage ratio: X%
- Missing connections: {domains that should be linked but aren't}

## Recommended Actions (prioritized)
1. {Most impactful fix — usually an unacknowledged contradiction or belief grounding issue}
2. {Second priority}
3. ...

Phase 4: Act on Findings

  • Contradictions and miscalibrations → create PRs to fix (highest priority)
  • Orphans → add incoming links from related claims (batch into one PR)
  • Gaps → publish as frontiers in agents/{your-name}/frontier.md (invites contribution)
  • Stale claims → research whether the landscape has changed, update or challenge
  • Belief grounding issues → trigger belief re-evaluation (may cascade to positions)

What Self-Audit Does NOT Do

  • Does not evaluate whether claims are TRUE (that's the evaluate skill + domain expertise)
  • Does not modify any files (detection only)
  • Does not audit other agents' domains (each agent audits their own)
  • Does not replace Leo's cross-domain evaluation (self-audit is inward-facing)

Relationship to Other Skills

  • evaluate.md — evaluates incoming claims. Self-audit evaluates existing claims.
  • cascade.md — propagates changes through the dependency chain. Self-audit identifies WHERE cascades are needed.
  • learn-cycle.md — processes new information. Self-audit reviews accumulated knowledge.
  • synthesize.md — creates cross-domain connections. Self-audit measures whether enough connections exist.

Frequency Guidelines

| Domain velocity | Audit trigger | Expected duration |
| --- | --- | --- |
| Fast (health, AI, finance) | Every 50 claims or monthly | 1-2 hours |
| Medium (entertainment, space) | Every 50 claims or quarterly | 1 hour |
| Slow (cultural dynamics, critical systems) | Every 50 claims or biannually | 45 min |
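
The trigger from the guidelines above reduces to one predicate. The day counts for the monthly/quarterly/biannual fallbacks here are rough assumptions:

```python
# Approximate day counts for the periodic fallback triggers.
PERIOD_DAYS = {"fast": 30, "medium": 90, "slow": 182}

def audit_due(claims_since_last, days_since_last, velocity):
    """True when either trigger fires: 50 claims added since the last
    audit, or the domain's periodic window has elapsed."""
    return claims_since_last >= 50 or days_since_last >= PERIOD_DAYS[velocity]
```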