From 0f0a983cb03f28ec769ccbabb44bb62db0658543 Mon Sep 17 00:00:00 2001 From: m3taversal Date: Wed, 11 Mar 2026 18:56:57 +0000 Subject: [PATCH] Auto: schemas/attribution.md | 1 file changed, 141 insertions(+) --- schemas/attribution.md | 141 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 141 insertions(+) create mode 100644 schemas/attribution.md diff --git a/schemas/attribution.md b/schemas/attribution.md new file mode 100644 index 00000000..059f9c34 --- /dev/null +++ b/schemas/attribution.md @@ -0,0 +1,141 @@ +# Attribution Schema + +Attribution tracks who contributed what to the knowledge base. Every claim traces back to the people and agents who produced it. Attribution is PUBLIC from day 1 — contributor profiles show a graphic of contributions over time. + +## Design Principles + +1. **Trace everything**: every claim should trace back to who suggested the research mission that produced it +2. **Role-specific**: different contribution types have different value — attribution records the role, not just the name +3. **Pseudonymous-first**: contributors use handles, not legal names. Handles persist across contributions. +4. **Git-native**: the Pentagon-Agent trailer in git commits is the foundation. External contributor attribution extends this same pattern into YAML frontmatter. +5. **Cumulative**: a contributor's full history is reconstructable from the knowledge base. No contribution is invisible. + +## The Five Contributor Roles + +| Role | What They Do | Example | +|------|-------------|---------| +| **sourcer** | Identifies the source material or research direction that led to this claim | "Look into Kalshi's revenue model" or shares an article | +| **extractor** | Extracts the specific claim from source material — separates signal from noise, writes the prose-as-title | Agent or human who reads the source and produces the claim file | +| **challenger** | Tests the claim through counter-evidence, boundary conditions, or adversarial review | "This doesn't hold when markets are thin" | +| **synthesizer** | Connects this claim to other claims, producing cross-domain insight | "This mechanism is isomorphic to X in health domain" | +| **reviewer** | Evaluates claim quality against the KB quality gates and approves/rejects | Leo's eval role, or peer reviewers | + +A single person/agent can hold multiple roles on the same claim. A claim can have multiple people in the same role. + +## Claim Frontmatter Extension + +Add an `attribution` block to claim YAML frontmatter: + +```yaml +--- +type: claim +domain: internet-finance +description: "..." +confidence: likely +source: "Theia Research 2025 annual letter, analysis by Rio" +created: 2026-03-11 + +# Attribution (new) +attribution: + sourcer: + - handle: "m3taversal" + context: "directed research into Theia's investment thesis" + - handle: "@theiaresearch" + context: "published the annual letter" + extractor: + - handle: "rio" + agent_id: "760F7FE7-5D50-4C2E-8B7C-9F1A8FEE8A46" + challenger: [] + synthesizer: [] + reviewer: + - handle: "leo" + agent_id: "294C3CA1-0205-4668-82FA-B984D54F48AD" +--- +``` + +## Attribution Fields + +### Per-role entry + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| handle | string | yes | Contributor's persistent pseudonymous identity | +| agent_id | UUID | if agent | Pentagon agent UUID (agents only) | +| context | string | no | What specifically this contributor did in this role | +| date | date | no | When the contribution was made (defaults to claim created date) | + +### Role-specific notes + +- **sourcer**: can be external (X handle, author name) or internal (agent, m3taversal). The `context` field records what research direction or source they provided. +- **extractor**: usually an agent. The `agent_id` field links to the Pentagon agent. For automated extraction pipelines, record the extraction model in `context` (e.g., "MiniMax M2.5 extract → Haiku 4.5 review"). +- **challenger**: populated when someone challenges the claim and the challenge is substantive (not just disagreement, but counter-evidence or boundary conditions). Empty array until challenged. +- **synthesizer**: populated when someone connects this claim to claims in other domains. Cross-domain synthesis is the highest-value contribution type. +- **reviewer**: populated during PR review. Records who evaluated and approved. + +## Backwards Compatibility + +The existing `source` field continues to serve as a human-readable one-liner for quick reference. The `attribution` block provides the structured, queryable version. Both coexist: + +- `source`: "Theia Research 2025 annual letter, analysis by Rio" (human-readable) +- `attribution`: structured role-by-role breakdown (machine-readable) + +For claims created before attribution was introduced, `source` remains the only attribution data. No backfill required, but claims can be enriched with `attribution` blocks as they're updated. + +## Git Trailer Integration + +Agent contributions are also recorded in git commit trailers: + +``` +Pentagon-Agent: Rio <760F7FE7-5D50-4C2E-8B7C-9F1A8FEE8A46> +``` + +The git trailer records WHO committed the change. The YAML attribution records WHO contributed WHAT in WHICH ROLE. These are complementary: +- Git trailer = "who made this change to the repository" +- YAML attribution = "who produced this knowledge and in what capacity" + +A single commit may create 10 claims. The trailer says Rio committed them. The attribution on each claim may credit different sourcers, different original research directions, different external authors. + +## Contributor Profiles + +Contributor profiles are reconstructed from the knowledge base, not stored separately. To build a profile: + +1. **Query**: search all claim `attribution` blocks for a given `handle` +2. **Aggregate**: count contributions by role, domain, confidence level, date +3. **Visualize**: contribution-over-time graphic showing when and how they contributed + +This means: +- No separate "contributor database" to maintain +- Profiles are always consistent with the actual KB state +- New contributions automatically appear in profiles +- Attribution disputes are resolved by editing claim frontmatter + +### Person Entity Bridge + +When a contributor has enough contributions to warrant tracking, their person entity (`entities/{domain}/{handle}.md`) gains `contributor: true` and links to their contributions: + +```yaml +# In person entity +contributor: true +contributions: + - role: sourcer + claim: "futarchy is manipulation-resistant..." + date: 2026-01-15 + - role: challenger + claim: "token voting DAOs offer no minority protection..." + date: 2026-02-20 +first_contribution: 2026-01-15 +attribution_handle: "@theiaresearch" +``` + +## Governance + +- Attribution is added at extraction time (extractor + sourcer) and updated during review (reviewer) and challenge (challenger) +- Synthesizer attribution is added when cross-domain connections are made, which may happen well after initial creation +- Disputes about attribution are resolved through the normal PR process +- Removing attribution requires justification (e.g., the sourcer was misidentified) + +## Implementation Priority + +1. **Now**: Add `attribution` block to new claims going forward. No backfill required. +2. **Soon**: Rhea builds contributor profile aggregation for the website. +3. **Later**: Automated attribution from the extraction pipeline (MiniMax → Haiku → agent).