From 3fbb9d1b61999e394ef4a20a58a3768563930547 Mon Sep 17 00:00:00 2001
From: m3taversal
Date: Mon, 16 Mar 2026 12:19:45 +0000
Subject: [PATCH] Auto: skills/self-audit.md

---
 skills/self-audit.md | 150 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)
 create mode 100644 skills/self-audit.md

diff --git a/skills/self-audit.md b/skills/self-audit.md
new file mode 100644
index 00000000..3f384537
--- /dev/null
+++ b/skills/self-audit.md
@@ -0,0 +1,150 @@

# Skill: Self-Audit

Periodic self-examination of an agent's knowledge base for inconsistencies, weaknesses, and drift. Every agent runs this on their own domain.

## When to Use

- Every 50 claims added to your domain (condition-based trigger)
- Monthly if claim volume is low
- After a major belief update (cascade from upstream claim changes)
- When preparing to publish positions (the highest-stakes output deserves the freshest audit)
- On request from Leo or Cory

## Principle: Detection, Not Remediation

Self-audit is read-only. You detect problems and report them. You do NOT auto-fix.

Fixes go through the standard PR process. This prevents the over-automation failure mode in which silent corrections introduce new errors. The audit produces a report; the report drives PRs.

## Process

### Phase 1: Structural Scan (deterministic, automated)

Run these checks on all claims in your domain (`domains/{your-domain}/`):

**1. Schema compliance**
- Every file has the required frontmatter: `type`, `domain`, `description`, `confidence`, `source`, `created`
- `confidence` is one of: `proven`, `likely`, `experimental`, `speculative`
- `domain` matches the folder the file lives in
- The description adds information beyond the title (not a restatement)

**2. Orphan detection**
- Build an incoming-link index: for each claim, which other claims link TO it via `[[title]]`
- Claims with 0 incoming links that were created more than 7 days ago are orphans
- Classify each: "leaf contributor" (has outgoing links, no incoming) vs. "truly isolated" (no links in either direction)

**3. Link health**
- Every `[[wiki link]]` in the body should resolve to an actual file
- A dangling link means either the target was renamed/deleted, or the link is aspirational
- Report: a list of broken links with the file each appears in

**4. Staleness check**
- Claims older than 180 days in fast-moving domains (health, ai-alignment, internet-finance)
- Claims older than 365 days in slower domains (cultural-dynamics, critical-systems)
- Cross-reference with the git log: a claim file modified recently (enriched, updated) is not stale even if `created` is old

**5. Duplicate detection**
- Compare claim titles pairwise for semantic similarity
- Flag pairs whose titles assert nearly the same thing in different wording
- This catches extraction drift — the same insight extracted from different sources as separate claims

### Phase 2: Epistemic Self-Audit (LLM-assisted, requires judgment)

Load your claims in batches (context-window management — don't load all 50+ at once).

**6. Contradiction scan**
- Load claims in groups of 15-20
- For each group, ask: "Do any of these claims contradict each other, or sit in tension, without acknowledging it?"
- Tensions are fine if explicit (a `challenged_by` field, or acknowledged in the body). UNACKNOWLEDGED tensions are the bug.
- Cross-check: load claims that share wiki-link targets — these are the most likely to have hidden tensions

**7. Confidence calibration audit**
- For each `proven` claim: does the body contain empirical evidence (RCTs, meta-analyses, large-N studies, mathematical proofs)? If not, it is overconfident.
- For each `speculative` claim: does the body actually contain substantial evidence that might warrant an upgrade to `experimental`?
- For `likely` claims: is there counter-evidence elsewhere in the KB? If so, is it acknowledged?

**8. Belief grounding check**
- Read `agents/{your-name}/beliefs.md`
- For each belief, verify its `depends_on` claims:
  - Do they still exist? (not deleted or archived)
  - Has their confidence changed since the belief was last evaluated?
  - Have any been challenged with substantive counter-evidence?
- Flag beliefs whose supporting claims have shifted but which have not been re-evaluated

**9. Gap identification**
- Map your claims by subtopic. Where do you have single claims that should be clusters?
- Check adjacent domains: which claims in other domains reference your domain but have no corresponding claim in your territory?
- Check your beliefs: which beliefs have the thinnest evidence base (fewest supporting claims)?
- Rank gaps by impact: gaps that affect active positions > gaps that affect beliefs > gaps in coverage

**10. Cross-domain connection audit**
- What percentage of your claims link to claims in other domains?
- Healthy range: 15-30%. Below 15% = siloed. Above 30% = possibly under-grounded in your own domain.
- Which other domains SHOULD you connect to but don't? (Based on your beliefs and identity)

### Phase 3: Report

Produce a structured report.
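Most of the structural findings that feed this report come from the deterministic Phase 1 scans. A minimal sketch of the orphan-detection and link-health checks (items 2-3), assuming claims are flat `.md` files whose titles equal their file stems; the function name is illustrative and the 7-day orphan grace period is omitted for brevity:

```python
import re
from pathlib import Path

WIKI_LINK = re.compile(r"\[\[([^\]]+)\]\]")  # matches [[title]] links

def scan_links(domain_dir):
    """Report broken wiki links and orphan claims in one domain folder.

    Assumes a claim's title is its file stem; adapt if your KB maps
    titles to files differently.
    """
    files = {p.stem: p.read_text(encoding="utf-8")
             for p in Path(domain_dir).glob("*.md")}
    outgoing = {t: set(WIKI_LINK.findall(body)) for t, body in files.items()}
    incoming = {t: set() for t in files}
    broken = []
    for src in sorted(outgoing):
        for target in sorted(outgoing[src]):
            if target in incoming:
                incoming[target].add(src)   # link resolves
            else:
                broken.append((src, target))  # dangling link
    # Orphan = no incoming links; classify by whether it links outward.
    orphans = {t: ("leaf contributor" if outgoing[t] else "truly isolated")
               for t in files if not incoming[t]}
    return {"broken": broken, "orphans": orphans}
```

The output maps directly onto the "Broken links" and "Orphans (with classification)" lines of the report below.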
Format:

```markdown
# Self-Audit Report: {Agent Name}
**Date:** YYYY-MM-DD
**Domain:** {domain}
**Claims audited:** N
**Overall status:** healthy | warning | critical

## Structural Findings
- Schema violations: N (list)
- Orphans: N (list with classification)
- Broken links: N (list)
- Stale claims: N (list with recommended action)
- Potential duplicates: N (list pairs)

## Epistemic Findings
- Unacknowledged contradictions: N (list claim pairs with the tension)
- Confidence miscalibrations: N (list with recommended adjustment)
- Belief grounding issues: N (list beliefs with shifted dependencies)

## Knowledge Gaps (ranked by impact)
1. {Gap description} — affects belief/position X
2. {Gap description} — affects belief/position Y

## Cross-Domain Health
- Linkage ratio: X%
- Missing connections: {domains that should be linked but aren't}

## Recommended Actions (prioritized)
1. {Most impactful fix — usually an unacknowledged contradiction or belief grounding issue}
2. {Second priority}
3. ...
```

### Phase 4: Act on Findings

- **Contradictions and miscalibrations** → create PRs to fix (highest priority)
- **Orphans** → add incoming links from related claims (batch into one PR)
- **Gaps** → publish as frontiers in `agents/{your-name}/frontier.md` (invites contribution)
- **Stale claims** → research whether the landscape has changed, then update or challenge
- **Belief grounding issues** → trigger a belief re-evaluation (may cascade to positions)

## What Self-Audit Does NOT Do

- Does not evaluate whether claims are TRUE (that's the evaluate skill plus domain expertise)
- Does not modify any files (detection only)
- Does not audit other agents' domains (each agent audits their own)
- Does not replace Leo's cross-domain evaluation (self-audit is inward-facing)

## Relationship to Other Skills

- **evaluate.md** — evaluates incoming claims. Self-audit evaluates existing claims.
- **cascade.md** — propagates changes through the dependency chain. Self-audit identifies WHERE cascades are needed.
- **learn-cycle.md** — processes new information. Self-audit reviews accumulated knowledge.
- **synthesize.md** — creates cross-domain connections. Self-audit measures whether enough connections exist.

## Frequency Guidelines

| Domain velocity | Audit trigger | Expected duration |
|----------------|---------------|-------------------|
| Fast (health, AI, finance) | Every 50 claims or monthly | 1-2 hours |
| Medium (entertainment, space) | Every 50 claims or quarterly | 1 hour |
| Slow (cultural dynamics, critical systems) | Every 50 claims or every six months | 45 min |
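Each trigger in the table combines the shared 50-claim condition with a calendar cadence. A minimal sketch of the due-check, where the day counts are illustrative readings of monthly/quarterly/six-monthly:

```python
from datetime import date

AUDIT_CLAIM_THRESHOLD = 50  # shared across all velocities
# Illustrative cadences: monthly, quarterly, every six months.
CADENCE_DAYS = {"fast": 30, "medium": 91, "slow": 182}

def audit_due(claims_since_last, last_audit, velocity, today=None):
    """True when either the claim-count or the calendar trigger has fired."""
    today = today or date.today()
    if claims_since_last >= AUDIT_CLAIM_THRESHOLD:
        return True
    return (today - last_audit).days >= CADENCE_DAYS[velocity]
```

Because the check is condition-based rather than scheduled, it can run cheaply at the end of every learn cycle and fire only when a threshold is actually crossed.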