teleo-codex/skills/self-audit.md

Skill: Self-Audit

Periodic self-examination of an agent's knowledge base for inconsistencies, weaknesses, and drift. Every agent runs this on their own domain.

When to Use

  • Every 50 claims added to your domain (condition-based trigger)
  • Monthly if claim volume is low
  • After a major belief update (cascade from upstream claim changes)
  • When preparing to publish positions (highest-stakes output deserves freshest audit)
  • On request from Leo or Cory

Principle: Detection, Not Remediation

Self-audit is read-only. You detect problems and report them. You do NOT auto-fix.

Fixes go through the standard PR process. This prevents the over-automation failure mode where silent corrections introduce new errors. The audit produces a report; the report drives PRs.

Process

Phase 1: Structural Scan (deterministic, automated)

Run these checks on all claims in your domain (domains/{your-domain}/):

1. Schema compliance

  • Every file has required frontmatter: type, domain, description, confidence, source, created
  • confidence is one of: proven, likely, experimental, speculative
  • domain matches the folder it lives in
  • Description adds information beyond the title (not a restatement)
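
The schema checks are deterministic and easy to script. A minimal sketch in Python (the frontmatter reader here is a naive key: value parser rather than full YAML, and the "description adds information" check is left to the judgment pass):

```python
# Check one claim file's frontmatter against the required schema.
REQUIRED = {"type", "domain", "description", "confidence", "source", "created"}
CONFIDENCE_LEVELS = {"proven", "likely", "experimental", "speculative"}

def check_frontmatter(text, folder_domain):
    """Return a list of schema violations for one claim file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing frontmatter block"]
    fm = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fm[key.strip()] = value.strip()
    violations = [f"missing field: {k}" for k in sorted(REQUIRED - fm.keys())]
    conf = fm.get("confidence")
    if conf and conf not in CONFIDENCE_LEVELS:
        violations.append(f"invalid confidence: {conf}")
    if fm.get("domain") and fm["domain"] != folder_domain:
        violations.append(f"domain mismatch: {fm['domain']} vs folder {folder_domain}")
    return violations
```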

2. Orphan detection

  • Build incoming-link index: for each claim, which other claims link TO it via title
  • Claims with 0 incoming links and created > 7 days ago are orphans
  • Classify: "leaf contributor" (has outgoing links, no incoming) vs "truly isolated" (no links either direction)
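
Orphan detection reduces to building the incoming-link index. A sketch, assuming wiki links use the [[Title]] form and targets are claim titles (the 7-day age filter is applied upstream):

```python
import re

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")

def classify_orphans(claims):
    """claims: {title: body}. Return {title: classification} for every
    claim with zero incoming links."""
    incoming = {t: 0 for t in claims}
    outgoing = {t: WIKI_LINK.findall(body) for t, body in claims.items()}
    for title, targets in outgoing.items():
        for target in targets:
            if target in incoming and target != title:
                incoming[target] += 1
    return {
        t: ("leaf contributor" if outgoing[t] else "truly isolated")
        for t in claims
        if incoming[t] == 0
    }
```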

3. Link health

  • Every wiki link in the body should resolve to an actual file
  • Dangling links = either the target was renamed/deleted, or the link is aspirational
  • Report: list of broken links with the file they appear in
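
The same [[Title]] pattern drives the link-health report. This sketch assumes a link resolves when its target matches some claim file's stem:

```python
import re

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")

def broken_links(files):
    """files: {path: body}. Report (path, target) pairs whose [[target]]
    does not resolve to any claim file's stem."""
    titles = {p.rsplit("/", 1)[-1].removesuffix(".md") for p in files}
    return [
        (path, target)
        for path, body in files.items()
        for target in WIKI_LINK.findall(body)
        if target not in titles
    ]
```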

4. Staleness check

  • Claims older than 180 days in fast-moving domains (health, ai-alignment, internet-finance)
  • Claims older than 365 days in slower domains (cultural-dynamics, critical-systems)
  • Cross-reference with git log: a claim file modified recently (enriched, updated) is not stale even if created is old
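
The staleness rule combines the domain's freshness window with the last git modification. A sketch (fetching last_modified from git is omitted; something like `git log -1 --format=%cs -- <file>` would supply it):

```python
from datetime import date, timedelta

FAST_DOMAINS = {"health", "ai-alignment", "internet-finance"}

def is_stale(domain, created, last_modified, today):
    """Stale when neither creation nor the most recent git modification
    falls inside the domain's freshness window."""
    window = timedelta(days=180 if domain in FAST_DOMAINS else 365)
    return today - max(created, last_modified) > window
```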

5. Duplicate detection

  • Compare claim titles pairwise for semantic similarity
  • Flag pairs where titles assert nearly the same thing with different wording
  • This catches extraction drift — the same insight extracted from different sources as separate claims
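
A cheap textual pre-filter for duplicate candidates, using difflib. This only catches near-identical wording; the semantic comparison the check actually calls for needs embeddings or the LLM pass on top:

```python
from difflib import SequenceMatcher
from itertools import combinations

def near_duplicate_titles(titles, threshold=0.8):
    """Flag title pairs whose normalized text similarity crosses the
    threshold. Purely lexical: a first-pass candidate list only."""
    pairs = []
    for a, b in combinations(titles, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs
```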

Phase 2: Epistemic Self-Audit (LLM-assisted, requires judgment)

Load your claims in batches (context window management — don't load all 50+ at once).

6. Contradiction scan

  • Load claims in groups of 15-20
  • For each group, ask: "Do any of these claims contradict or stand in tension with each other without acknowledging it?"
  • Tensions are fine if explicit (challenged_by field, or acknowledged in the body). UNACKNOWLEDGED tensions are the bug.
  • Cross-check: load claims that share wiki-link targets — these are most likely to have hidden tensions
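
The cross-check is mechanical: claims linking to the same target belong in the same scan batch. A sketch that surfaces those priority groups:

```python
import re
from collections import defaultdict

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")

def shared_target_groups(claims):
    """claims: {title: body}. Return {target: [titles linking to it]}
    restricted to targets with 2+ linkers, i.e. the claim groups most
    likely to hide unacknowledged tensions."""
    by_target = defaultdict(list)
    for title, body in claims.items():
        for target in set(WIKI_LINK.findall(body)):
            by_target[target].append(title)
    return {t: sorted(ts) for t, ts in by_target.items() if len(ts) >= 2}
```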

7. Confidence calibration audit

  • For each proven claim: does the body contain empirical evidence (RCTs, meta-analyses, large-N studies, mathematical proofs)? If not, it's overconfident.
  • For each speculative claim: does the body actually contain substantial evidence that might warrant upgrading to experimental?
  • For likely claims: is there counter-evidence elsewhere in the KB? If so, is it acknowledged?
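
A keyword pre-filter can shortlist proven claims for the calibration review. The marker list below is an assumption, and absence of a marker only produces a candidate for review, never a verdict:

```python
# Crude pre-filter for the calibration audit; EVIDENCE_MARKERS is an
# illustrative assumption, not an exhaustive taxonomy of evidence.
EVIDENCE_MARKERS = ("rct", "randomized", "meta-analys", "n=", "proof")

def flag_overconfident(claims):
    """claims: [(title, confidence, body)]. Shortlist proven claims whose
    body shows no empirical-evidence marker."""
    return [
        title
        for title, confidence, body in claims
        if confidence == "proven"
        and not any(m in body.lower() for m in EVIDENCE_MARKERS)
    ]
```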

8. Belief grounding check

  • Read agents/{your-name}/beliefs.md
  • For each belief, verify the depends_on claims:
    • Do they still exist? (not deleted or archived)
    • Has their confidence changed since the belief was last evaluated?
    • Have any been challenged with substantive counter-evidence?
  • Flag beliefs where supporting claims have shifted but the belief hasn't been re-evaluated
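
The grounding check is a diff between what a belief recorded at last evaluation and what the claims say now. A sketch over plain dicts (in practice the inputs come from each belief's depends_on field in beliefs.md and the current claim frontmatter):

```python
def grounding_issues(belief_deps, claim_confidence):
    """belief_deps: {belief: {claim: confidence recorded at last evaluation}}.
    claim_confidence: {claim: current confidence}; a claim absent here
    was deleted or archived. Return {belief: [issue strings]}."""
    issues = {}
    for belief, deps in belief_deps.items():
        found = []
        for claim, then in deps.items():
            now = claim_confidence.get(claim)
            if now is None:
                found.append(f"{claim}: missing")
            elif now != then:
                found.append(f"{claim}: {then} -> {now}")
        if found:
            issues[belief] = found
    return issues
```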

9. Gap identification

  • Map your claims by subtopic. Where do you have single claims that should be clusters?
  • Check adjacent domains: what claims in other domains reference your domain but have no corresponding claim in your territory?
  • Check your beliefs: which beliefs have the thinnest evidence base (fewest supporting claims)?
  • Rank gaps by impact: gaps that affect active positions > gaps that affect beliefs > gaps in coverage
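
The impact ranking can be encoded directly; the three-level ordering comes straight from the rule above:

```python
# Positions outrank beliefs, which outrank plain coverage gaps.
IMPACT = {"position": 0, "belief": 1, "coverage": 2}

def rank_gaps(gaps):
    """gaps: [(description, affects)] with affects a key of IMPACT."""
    return sorted(gaps, key=lambda g: IMPACT[g[1]])
```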

10. Cross-domain connection audit

  • What percentage of your claims link to claims in other domains?
  • Healthy range: 15-30%. Below 15% = siloed. Above 30% = possibly under-grounded in own domain.
  • Which other domains SHOULD you connect to but don't? (Based on your beliefs and identity)
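
Linkage ratio is the share of your claims with at least one cross-domain link. A sketch, with the 15-30% band applied as the verdict:

```python
def linkage_ratio(claim_links, claim_domain, own_domain):
    """claim_links: {title: [linked titles]}; claim_domain: {title: domain}.
    Unresolvable link targets are treated as own-domain (they don't count
    as crossing). Returns (percentage, verdict)."""
    own = [t for t, d in claim_domain.items() if d == own_domain]
    crossing = sum(
        1 for t in own
        if any(claim_domain.get(x, own_domain) != own_domain
               for x in claim_links.get(t, []))
    )
    ratio = 100 * crossing / len(own) if own else 0
    if ratio < 15:
        verdict = "siloed"
    elif ratio > 30:
        verdict = "possibly under-grounded"
    else:
        verdict = "healthy"
    return ratio, verdict
```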

Phase 3: Report

Produce a structured report. Format:

# Self-Audit Report: {Agent Name}
**Date:** YYYY-MM-DD
**Domain:** {domain}
**Claims audited:** N
**Overall status:** healthy | warning | critical

## Structural Findings
- Schema violations: N (list)
- Orphans: N (list with classification)
- Broken links: N (list)
- Stale claims: N (list with recommended action)
- Potential duplicates: N (list pairs)

## Epistemic Findings
- Unacknowledged contradictions: N (list claim pairs with the tension)
- Confidence miscalibrations: N (list with recommended adjustment)
- Belief grounding issues: N (list beliefs with shifted dependencies)

## Knowledge Gaps (ranked by impact)
1. {Gap description} — affects belief/position X
2. {Gap description} — affects belief/position Y

## Cross-Domain Health
- Linkage ratio: X%
- Missing connections: {domains that should be linked but aren't}

## Recommended Actions (prioritized)
1. {Most impactful fix — usually an unacknowledged contradiction or belief grounding issue}
2. {Second priority}
3. ...

Phase 4: Act on Findings

  • Contradictions and miscalibrations → create PRs to fix (highest priority)
  • Orphans → add incoming links from related claims (batch into one PR)
  • Gaps → publish as frontiers in agents/{your-name}/frontier.md (invites contribution)
  • Stale claims → research whether the landscape has changed, update or challenge
  • Belief grounding issues → trigger belief re-evaluation (may cascade to positions)

What Self-Audit Does NOT Do

  • Does not evaluate whether claims are TRUE (that's the evaluate skill + domain expertise)
  • Does not modify any files (detection only)
  • Does not audit other agents' domains (each agent audits their own)
  • Does not replace Leo's cross-domain evaluation (self-audit is inward-facing)

Relationship to Other Skills

  • evaluate.md — evaluates incoming claims. Self-audit evaluates existing claims.
  • cascade.md — propagates changes through the dependency chain. Self-audit identifies WHERE cascades are needed.
  • learn-cycle.md — processes new information. Self-audit reviews accumulated knowledge.
  • synthesize.md — creates cross-domain connections. Self-audit measures whether enough connections exist.

Frequency Guidelines

| Domain velocity | Audit trigger | Expected duration |
| --- | --- | --- |
| Fast (health, AI, finance) | Every 50 claims or monthly | 1-2 hours |
| Medium (entertainment, space) | Every 50 claims or quarterly | 1 hour |
| Slow (cultural dynamics, critical systems) | Every 50 claims or biannually | 45 min |
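
The trigger from the guidelines above reduces to one predicate. The day counts for the monthly/quarterly/biannual fallbacks here are rough assumptions:

```python
# Approximate day counts for the periodic fallback triggers.
PERIOD_DAYS = {"fast": 30, "medium": 90, "slow": 182}

def audit_due(claims_since_last, days_since_last, velocity):
    """True when either trigger fires: 50 claims added since the last
    audit, or the domain's periodic window has elapsed."""
    return claims_since_last >= 50 or days_since_last >= PERIOD_DAYS[velocity]
```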