Auto: schemas/source.md | 1 file changed, 20 insertions(+)

2026-03-14 17:17:03 +00:00 · 2026-03-14 17:17:03 +00:00 · a74306f56c
commit a74306f56c
parent 9ee4163803
1 changed files with 20 additions and 0 deletions
--- a/schemas/source.md
+++ b/schemas/source.md
@ -2,6 +2,20 @@
 Sources are the raw material that feeds claim extraction. Every piece of external content that enters the knowledge base gets archived in `inbox/archive/` with standardized frontmatter so agents can track what's been processed, what's pending, and what yielded claims.
 ## Source Intake Tiers
 Every source is classified by how it enters the system. The tier determines extraction priority and process.
 | Tier | Label | Description | Extraction approach |
 |------|-------|-------------|-------------------|
 | 1 | **Directed** | Contributor provides a rationale — WHY this source matters, what question it answers, which claim it challenges | Agent extracts with the contributor's rationale as the directive. Highest priority. |
 | 2 | **Undirected** | Source submitted without rationale. Agent decides the lens. | Agent extracts open-ended. Lower priority than directed. |
 | 3 | **Research task** | Proactive — agents or team identify a gap and seek sources to fill it | The gap identification IS the rationale. Agent extracts against the research question. |
 **The rationale IS the contribution.** A contributor who says "this contradicts Rio's claim about launch pricing because the data shows Dutch auctions don't solve cold-start" has done the hardest intellectual work — identifying what's relevant and why. The agent's job is extraction and integration, not relevance judgment.
 **X intake flow:** Someone replies to a claim tweet with a source link and says why it matters. The reply IS the extraction directive.
 ## YAML Frontmatter
 ```yaml
@ -12,6 +26,9 @@ author: "Name (@handle if applicable)"
 url: https://example.com/article
 date: YYYY-MM-DD
 domain: internet-finance | entertainment | ai-alignment | health | grand-strategy
 intake_tier: directed | undirected | research-task
 rationale: "Why this source matters — what question it answers, which claim it challenges"
 proposed_by: "contributor name or handle"
 format: essay | newsletter | tweet | thread | whitepaper | paper | report | news
 status: unprocessed | processing | processed | null-result
 processed_by: agent-name
@ -36,12 +53,15 @@ linked_set: set-name-if-part-of-a-group
 | url | string | Original URL (even if content was provided manually) |
 | date | date | Publication date |
 | domain | enum | Primary domain for routing |
 | intake_tier | enum | `directed`, `undirected`, or `research-task` (see intake tiers above) |
 | status | enum | Processing state (see lifecycle below) |
 ## Optional Fields
 | Field | Type | Description |
 |-------|------|-------------|
 | rationale | string | WHY this source matters — what question it answers, which claim it challenges. Required for `directed` tier, serves as extraction directive. |
 | proposed_by | string | Who submitted this source (contributor name/handle). For attribution tracking. |
 | format | enum | `paper`, `essay`, `newsletter`, `tweet`, `thread`, `whitepaper`, `report`, `news` — source format affects evidence weight assessment (a peer-reviewed paper carries different weight than a tweet) |
 | processed_by | string | Which agent extracted claims from this source |
 | processed_date | date | When extraction happened |