# Skill: Self-Audit

Periodic self-examination of an agent's knowledge base for inconsistencies, weaknesses, and drift. Every agent runs this on their own domain.

## When to Use

- Every 50 claims added to your domain (condition-based trigger)
- Monthly if claim volume is low
- After a major belief update (cascade from upstream claim changes)
- When preparing to publish positions (the highest-stakes output deserves the freshest audit)
- On request from Leo or Cory

## Principle: Detection, Not Remediation

Self-audit is read-only. You detect problems and report them. You do NOT auto-fix. Fixes go through the standard PR process. This prevents the over-automation failure mode where silent corrections introduce new errors. The audit produces a report; the report drives PRs.

## Process

### Phase 1: Structural Scan (deterministic, automated)

Run these checks on all claims in your domain (`domains/{your-domain}/`):

**1. Schema compliance**
- Every file has the required frontmatter: `type`, `domain`, `description`, `confidence`, `source`, `created`
- `confidence` is one of: `proven`, `likely`, `experimental`, `speculative`
- `domain` matches the folder the file lives in
- The description adds information beyond the title (not a restatement)

**2. Orphan detection**
- Build an incoming-link index: for each claim, which other claims link TO it via `title`
- Claims with 0 incoming links that were created more than 7 days ago are orphans
- Classify each orphan: "leaf contributor" (has outgoing links, no incoming) vs. "truly isolated" (no links in either direction)

**3. Link health**
- Every `wiki link` in the body should resolve to an actual file
- Dangling links mean either the target was renamed or deleted, or the link is aspirational
- Report: a list of broken links with the file each appears in

**4.
Staleness check**
- Claims older than 180 days in fast-moving domains (health, ai-alignment, internet-finance)
- Claims older than 365 days in slower domains (cultural-dynamics, critical-systems)
- Cross-reference with the git log: a claim file modified recently (enriched, updated) is not stale even if `created` is old

**5. Duplicate detection**
- Compare claim titles pairwise for semantic similarity
- Flag pairs whose titles assert nearly the same thing in different wording
- This catches extraction drift — the same insight extracted from different sources as separate claims

### Phase 2: Epistemic Self-Audit (LLM-assisted, requires judgment)

Load your claims in batches (context-window management — don't load all 50+ at once).

**6. Contradiction scan**
- Load claims in groups of 15-20
- For each group, ask: "Do any of these claims contradict or sit in tension with each other without acknowledging it?"
- Tensions are fine if explicit (a `challenged_by` field, or acknowledged in the body). UNACKNOWLEDGED tensions are the bug.
- Cross-check: load claims that share wiki-link targets — these are the most likely to harbor hidden tensions

**7. Confidence calibration audit**
- For each `proven` claim: does the body contain empirical evidence (RCTs, meta-analyses, large-N studies, mathematical proofs)? If not, it is overconfident.
- For each `speculative` claim: does the body actually contain substantial evidence that might warrant upgrading to `experimental`?
- For `likely` claims: is there counter-evidence elsewhere in the KB? If so, is it acknowledged?

**8. Belief grounding check**
- Read `agents/{your-name}/beliefs.md`
- For each belief, verify the `depends_on` claims:
  - Do they still exist? (not deleted or archived)
  - Has their confidence changed since the belief was last evaluated?
  - Have any been challenged with substantive counter-evidence?
- Flag beliefs where supporting claims have shifted but the belief hasn't been re-evaluated

**9.
Gap identification**
- Map your claims by subtopic. Where do you have single claims that should be clusters?
- Check adjacent domains: what claims in other domains reference your domain but have no corresponding claim in your territory?
- Check your beliefs: which beliefs have the thinnest evidence base (fewest supporting claims)?
- Rank gaps by impact: gaps that affect active positions > gaps that affect beliefs > gaps in coverage

**10. Cross-domain connection audit**
- What percentage of your claims link to claims in other domains?
- Healthy range: 15-30%. Below 15% = siloed. Above 30% = possibly under-grounded in own domain.
- Which other domains SHOULD you connect to but don't? (Based on your beliefs and identity)

### Phase 3: Report

Produce a structured report. Format:

```markdown
# Self-Audit Report: {Agent Name}

**Date:** YYYY-MM-DD
**Domain:** {domain}
**Claims audited:** N
**Overall status:** healthy | warning | critical

## Structural Findings
- Schema violations: N (list)
- Orphans: N (list with classification)
- Broken links: N (list)
- Stale claims: N (list with recommended action)
- Potential duplicates: N (list pairs)

## Epistemic Findings
- Unacknowledged contradictions: N (list claim pairs with the tension)
- Confidence miscalibrations: N (list with recommended adjustment)
- Belief grounding issues: N (list beliefs with shifted dependencies)

## Knowledge Gaps (ranked by impact)
1. {Gap description} — affects belief/position X
2. {Gap description} — affects belief/position Y

## Cross-Domain Health
- Linkage ratio: X%
- Missing connections: {domains that should be linked but aren't}

## Recommended Actions (prioritized)
1. {Most impactful fix — usually an unacknowledged contradiction or belief grounding issue}
2. {Second priority}
3. ...
```

### Phase 4: Act on Findings

- **Contradictions and miscalibrations** → create PRs to fix (highest priority)
- **Orphans** → add incoming links from related claims (batch into one PR)
- **Gaps** → publish as frontiers in `agents/{your-name}/frontier.md` (invites contribution)
- **Stale claims** → research whether the landscape has changed; update or challenge
- **Belief grounding issues** → trigger belief re-evaluation (may cascade to positions)

## What Self-Audit Does NOT Do

- Does not evaluate whether claims are TRUE (that's the evaluate skill plus domain expertise)
- Does not modify any files (detection only)
- Does not audit other agents' domains (each agent audits their own)
- Does not replace Leo's cross-domain evaluation (self-audit is inward-facing)

## Relationship to Other Skills

- **evaluate.md** — evaluates incoming claims. Self-audit evaluates existing claims.
- **cascade.md** — propagates changes through the dependency chain. Self-audit identifies WHERE cascades are needed.
- **learn-cycle.md** — processes new information. Self-audit reviews accumulated knowledge.
- **synthesize.md** — creates cross-domain connections. Self-audit measures whether enough connections exist.

## Frequency Guidelines

| Domain velocity | Audit trigger | Expected duration |
|-----------------|---------------|-------------------|
| Fast (health, AI, finance) | Every 50 claims or monthly | 1-2 hours |
| Medium (entertainment, space) | Every 50 claims or quarterly | 1 hour |
| Slow (cultural dynamics, critical systems) | Every 50 claims or biannually | 45 min |
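The deterministic heart of Phase 1 is the incoming-link index that drives both orphan detection (check 2) and link health (check 3). A minimal sketch, assuming claims are `.md` files named after their titles and that wiki links use `[[Title]]` syntax (both assumptions; the skill above doesn't fix either convention), and omitting the 7-day grace period for newly created claims:

```python
import re
from pathlib import Path

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")  # assumed [[Title]] link syntax

def scan_links(domain_dir):
    """One pass over a domain folder: report orphans (no incoming links)
    and dangling links (target file does not exist)."""
    files = list(Path(domain_dir).glob("*.md"))
    titles = {f.stem: f for f in files}  # assumes filename stem == claim title
    outgoing = {f.stem: WIKI_LINK.findall(f.read_text(encoding="utf-8"))
                for f in files}

    incoming = {t: [] for t in titles}
    dangling = []  # (source claim, broken target)
    for src, targets in outgoing.items():
        for tgt in targets:
            if tgt in incoming:
                incoming[tgt].append(src)
            else:
                dangling.append((src, tgt))

    orphans = {}
    for title in titles:
        if not incoming[title]:
            # "leaf contributor" still points outward; "truly isolated" links nowhere
            orphans[title] = "leaf contributor" if outgoing[title] else "truly isolated"
            # NOTE: a real audit would also skip claims created < 7 days ago
    return orphans, dangling
```

In keeping with detection-not-remediation, the function only returns findings; writing them into the Phase 3 report (and filtering orphans by the `created` frontmatter date) is left to the caller.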