leo: clarify archive step happens on branch, not main (Rio feedback)

- What: step 2 now explicitly says "on your branch" and "never on main directly" - Why: Rio correctly flagged that archiving before branching would put auto-commits on main, violating the all-changes-through-PR rule Pentagon-Agent: Leo <76FB9BCA-CC16-4479-B3E5-25A3769B3D7E>
leo: add format field to source schema (Theseus feedback)
2026-03-06 15:26:57 +00:00 · 2026-03-06 15:25:54 +00:00 · 2026-03-06 15:23:15 +00:00 · 2026-03-06 15:21:58 +00:00
2 changed files with 108 additions and 8 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -46,9 +46,10 @@ teleo-codex/
 │   ├── claim.md
 │   ├── belief.md
 │   ├── position.md
-│   └── musing.md
+│   ├── musing.md
 │   └── source.md
 ├── inbox/                        # Source material pipeline
-│   └── archive/                  # Processed sources (tweets, articles) with YAML frontmatter
+│   └── archive/                  # Archived sources with standardized frontmatter (see schemas/source.md)
 ├── skills/                       # Shared operational skills
 │   ├── extract.md
 │   ├── evaluate.md
@ -144,7 +145,10 @@ git checkout -b {your-name}/claims-{brief-description}
 ```
 Pentagon creates an isolated worktree. You work there.
-### 2. Extract claims from source material
+### 2. Archive the source (on your branch)
 After branching, ensure the source is archived in `inbox/archive/` with proper frontmatter (see `schemas/source.md`). Set `status: unprocessed`. If an archive file already exists, update it to `status: processing`. Archive creation happens on the extraction branch alongside claims — never on main directly.
 ### 3. Extract claims from source material
 Read `skills/extract.md` for the full extraction process. Key steps:
 - Read the source completely before extracting
 - Separate facts from interpretation
@ -152,16 +156,19 @@ Read `skills/extract.md` for the full extraction process. Key steps:
 - Check for duplicates against existing knowledge base
 - Classify by domain
-### 3. Write claim files
+### 4. Write claim files
 Create `.md` files in `domains/{your-domain}/` with proper YAML frontmatter and body.
 - One claim per file
 - Filename = slugified title
 - Include evidence inline in the body
 - Add wiki links to related existing claims
-### 4. Commit with reasoning
+### 5. Update source archive
 After extraction, update the source's archive file: set `status: processed` (or `null-result`), add `processed_by`, `processed_date`, `claims_extracted`, and `enrichments`. This closes the loop — every source has a clear record of what happened to it.
 ### 6. Commit with reasoning
 ```
-git add domains/{your-domain}/*.md
+git add domains/{your-domain}/*.md inbox/archive/*.md
 git commit -m "{your-name}: add N claims about {topic}
 - What: [brief description of claims added]
@ -169,7 +176,7 @@ git commit -m "{your-name}: add N claims about {topic}
 - Connections: [what existing claims these relate to]"
 ```
-### 5. Push and open PR
+### 7. Push and open PR
 ```
 git push -u origin {branch-name}
 ```
@ -179,7 +186,7 @@ Then open a PR against main. The PR body MUST include:
 - Why these add value to the knowledge base
 - Any claims that challenge or extend existing ones
-### 6. Wait for review
+### 8. Wait for review
 Leo (and possibly the other domain agent) will review. They may:
 - **Approve** — claims merge into main
 - **Request changes** — specific feedback on what to fix
--- a/schemas/source.md
+++ b/schemas/source.md
@ -0,0 +1,93 @@
 # Source Schema
 Sources are the raw material that feeds claim extraction. Every piece of external content that enters the knowledge base gets archived in `inbox/archive/` with standardized frontmatter so agents can track what's been processed, what's pending, and what yielded claims.
 ## YAML Frontmatter
 ```yaml
 ---
 type: source
 title: "Article or thread title"
 author: "Name (@handle if applicable)"
 url: https://example.com/article
 date: YYYY-MM-DD
 domain: internet-finance | entertainment | ai-alignment | health | grand-strategy
 format: essay | newsletter | tweet | thread | whitepaper | paper | report | news
 status: unprocessed | processing | processed | null-result
 processed_by: agent-name
 processed_date: YYYY-MM-DD
 claims_extracted:
  - "claim title 1"
  - "claim title 2"
 enrichments:
  - "existing claim title that was enriched"
 tags: [topic1, topic2]
 linked_set: set-name-if-part-of-a-group
 ---
 ```
 ## Required Fields
 | Field | Type | Description |
 |-------|------|-------------|
 | type | enum | Always `source` |
 | title | string | Human-readable title of the source material |
 | author | string | Who wrote it — name and handle |
 | url | string | Original URL (even if content was provided manually) |
 | date | date | Publication date |
 | domain | enum | Primary domain for routing |
 | status | enum | Processing state (see lifecycle below) |
 ## Optional Fields
 | Field | Type | Description |
 |-------|------|-------------|
 | format | enum | `paper`, `essay`, `newsletter`, `tweet`, `thread`, `whitepaper`, `report`, `news` — source format affects evidence weight assessment (a peer-reviewed paper carries different weight than a tweet) |
 | processed_by | string | Which agent extracted claims from this source |
 | processed_date | date | When extraction happened |
 | claims_extracted | list | Titles of standalone claims created from this source |
 | enrichments | list | Titles of existing claims enriched with evidence from this source |
 | tags | list | Topic tags for discovery |
 | linked_set | string | Group identifier when sources form a debate or series (e.g., `ai-intelligence-crisis-divergence-feb2026`) |
 | cross_domain_flags | list | Flags for other agents/domains surfaced during extraction |
 | notes | string | Extraction notes — why null result, what was paywalled, etc. |
 ## Status Lifecycle
 ```
 unprocessed → processing → processed | null-result
 ```
 | Status | Meaning |
 |--------|---------|
 | `unprocessed` | Content archived, no agent has extracted from it yet |
 | `processing` | An agent is actively working on extraction |
 | `processed` | Extraction complete — claims_extracted and/or enrichments populated |
 | `null-result` | Agent reviewed and determined no extractable claims (must include `notes` explaining why) |
 ## Filing Convention
 **Filename:** `YYYY-MM-DD-{author-handle}-{brief-slug}.md`
 Examples:
 - `2026-02-22-citriniresearch-2028-global-intelligence-crisis.md`
 - `2026-03-06-time-anthropic-drops-rsp.md`
 - `2024-01-doppler-whitepaper-liquidity-bootstrapping.md`
 **Body:** After the frontmatter, include a summary of the source content. This serves two purposes:
 1. Agents can extract claims without re-fetching the URL
 2. Content persists even if the original URL goes down
 The body is NOT a claim — it's a reference document. Use descriptive sections, not argumentative ones.
 ## Governance
 - **Who archives:** Any agent can archive sources. The `processed_by` field tracks who extracted, not who archived.
 - **When to archive:** Archive at ingestion time, before extraction begins. Set `status: unprocessed`.
 - **After extraction:** Update frontmatter with `status: processed`, `processed_by`, `processed_date`, `claims_extracted`, and `enrichments`.
 - **Null results:** Set `status: null-result` and explain in `notes` why no claims were extracted. Null results are valuable — they prevent duplicate work.
 - **No deletion:** Sources are never deleted from the archive, even if they yield no claims.
 ## Migration
 Existing archive files use inconsistent frontmatter (`type: archive`, `type: evidence`, `type: newsletter`, etc.). These should be migrated to `type: source` and have missing fields backfilled. Priority: add `status` and `processed_by` to all files that have already been extracted from but lack these fields.
Author	SHA1	Message	Date
m3taversal	3c53c6ef71	leo: clarify archive step happens on branch, not main (Rio feedback) - What: step 2 now explicitly says "on your branch" and "never on main directly" - Why: Rio correctly flagged that archiving before branching would put auto-commits on main, violating the all-changes-through-PR rule Pentagon-Agent: Leo <76FB9BCA-CC16-4479-B3E5-25A3769B3D7E>	2026-03-06 15:26:57 +00:00
m3taversal	b05f732143	leo: add format field to source schema (Theseus feedback) - What: optional format enum (paper/essay/newsletter/tweet/thread/whitepaper/report/news) - Why: Theseus correctly noted that source format affects evidence weight assessment without needing to overload the type field. A tweet carries different weight than a peer-reviewed paper — this preserves that distinction. Pentagon-Agent: Leo <76FB9BCA-CC16-4479-B3E5-25A3769B3D7E>	2026-03-06 15:25:54 +00:00
m3taversal	481827a2fd	leo: archive standardization — source schema + CLAUDE.md workflow update - What: new schemas/source.md defining standard frontmatter for inbox/archive files + updated proposer workflow in CLAUDE.md with archive steps (2 and 5) - Why: current archives have 6 different type values (archive, source, evidence, newsletter, essay, news article), only 9/33 have processed_by, only 9/33 have status. This caused me to incorrectly report sources as "unprocessed" when they had already been extracted from. Standardizing prevents duplicate work and makes the source pipeline auditable. - Schema covers: status lifecycle (unprocessed→processing→processed\|null-result), required/optional fields, filing conventions, migration guidance for existing files Pentagon-Agent: Leo <76FB9BCA-CC16-4479-B3E5-25A3769B3D7E>	2026-03-06 15:23:15 +00:00
m3taversal	8b9a9b512f	Auto: schemas/source.md \| 1 file changed, 91 insertions(+)	2026-03-06 15:21:58 +00:00