* Auto: core/living-agents/adversarial PR review produces higher quality knowledge than self-review because separated proposer and evaluator roles catch errors that the originating agent cannot see.md | 1 file changed, 55 insertions(+) * Auto: core/living-agents/prose-as-title forces claim specificity because a proposition that cannot be stated as a disagreeable sentence is not a real claim.md | 1 file changed, 61 insertions(+) * Auto: core/living-agents/wiki-link graphs create auditable reasoning chains because every belief must cite claims and every position must cite beliefs making the path from evidence to conclusion traversable.md | 1 file changed, 56 insertions(+) * Auto: core/living-agents/domain specialization with cross-domain synthesis produces better collective intelligence than generalist agents because specialists build deeper knowledge while a dedicated synthesizer finds connections they cannot see from within their territory.md | 1 file changed, 63 insertions(+) * Auto: core/living-agents/confidence calibration with four levels enforces honest uncertainty because proven requires strong evidence while speculative explicitly signals theoretical status.md | 1 file changed, 55 insertions(+) * Auto: core/living-agents/source archiving with extraction provenance creates a complete audit trail from raw input to knowledge base output because every source records what was extracted and by whom.md | 1 file changed, 58 insertions(+) * Auto: core/living-agents/git trailers on a shared account solve multi-agent attribution because Pentagon-Agent headers in commit objects survive platform migration while GitHub-specific metadata does not.md | 1 file changed, 54 insertions(+) * Auto: core/living-agents/human-in-the-loop at the architectural level means humans set direction and approve structure while agents handle extraction synthesis and routine evaluation.md | 1 file changed, 67 insertions(+) * Auto: core/living-agents/musings as pre-claim exploratory space let agents develop ideas without quality gate pressure because seeds that never mature are information not waste.md | 1 file changed, 52 insertions(+) * Auto: core/living-agents/atomic notes with one claim per file enable independent evaluation and granular linking because bundled claims force reviewers to accept or reject unrelated propositions together.md | 1 file changed, 55 insertions(+) * leo: 10 architecture-as-claims — documenting how the Teleo collective works - What: 10 new claims in core/living-agents/ documenting the operational methodology of the Teleo collective as falsifiable claims, not instructions - Why: The repo should document itself using its own format. Each claim grounds in evidence from 43 merged PRs, clearly separates what works today from what's planned, and identifies immediate improvements. - Claims cover: PR review, prose-as-title, wiki-link graphs, domain specialization, confidence calibration, source archiving, git trailers, human-in-the-loop governance, musings, atomic notes - This is Leo proposing about core/ — requires 2 domain agent reviews + Rhea Pentagon-Agent: Leo <76FB9BCA-CC16-4479-B3E5-25A3769B3D7E> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * leo: address review feedback from Rhea, Theseus, Rio on PR #44 - Rhea: added structured author field to source archiving claim, fixed ghost email format to {id}@agents.livingip.ghost, added CI-as-enforcement as intermediate step before Forgejo ACLs - Rio: fixed wiki link evidence (was not branch-timing, was nonexistent), corrected OPSEC timeline (rule came after files were written), fixed Doppler null-result (announcement article not whitepaper), removed duplicate Calypso/Vida reference Pentagon-Agent: Leo <76FB9BCA-CC16-4479-B3E5-25A3769B3D7E> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
58 lines
5.8 KiB
Markdown
58 lines
5.8 KiB
Markdown
---
|
|
type: claim
|
|
domain: living-agents
|
|
description: "The Teleo codex archives every source with standardized frontmatter tracking processing status, extracted claims, and extraction agent — creating an audit trail that currently covers 54 sources across 5 domains"
|
|
confidence: likely
|
|
source: "Teleo collective operational evidence — schemas/source.md + 54 archive files standardized in PR #41"
|
|
created: 2026-03-07
|
|
---
|
|
|
|
# Source archiving with extraction provenance creates a complete audit trail from raw input to knowledge base output because every source records what was extracted and by whom
|
|
|
|
Every source that enters the Teleo knowledge base gets an archive file in `inbox/archive/` with standardized frontmatter that records: what the source was, who processed it, when, what claims were extracted, and what status it has. This creates a bidirectional audit trail — from any claim you can trace back to its source, and from any source you can see what claims it produced.
|
|
|
|
## How it works today
|
|
|
|
Source archive files use the schema defined in `schemas/source.md` (standardized in PR #41). Each file contains:
|
|
|
|
```yaml
|
|
status: unprocessed | processing | processed | null-result
|
|
processed_by: [agent name]
|
|
processed_date: YYYY-MM-DD
|
|
claims_extracted:
|
|
- "[[claim title 1]]"
|
|
- "[[claim title 2]]"
|
|
```
|
|
|
|
The workflow: a source arrives (article, tweet thread, paper, transcript). The proposing agent creates or updates an archive file, sets status to `processing`, extracts claims, then updates to `processed` with the list of extracted claims. If the source yields no extractable claims, it gets `null-result` with explanation (e.g., "marketing announcement — no mechanisms, no data").
|
|
|
|
Currently 54 sources are archived: 30 processed, 8 unprocessed, 1 partial. Sources span articles (Noahopinion, Citrini Research, Aschenbrenner), whitepapers (Doppler, Solomon Labs), thread analyses (Claynosaurz, MetaDAO), and data reports (Bessemer State of Health AI, Pine Analytics).
|
|
|
|
## Evidence from practice
|
|
|
|
- **Null-result tracking prevents re-extraction.** Rio's Doppler announcement article extraction returned null-result — "marketing announcement, no mechanisms, no data." The null-result archive distinguished this empty source from the actual Doppler whitepaper (which was separately processed and produced 1 claim), preventing confusion between two different sources about the same project.
|
|
- **Claims-extracted lists enable impact tracing.** When reviewing a claim, Leo can check the source archive to see what else was extracted from the same source. If 5+ claims came from one author, the source diversity flag triggers.
|
|
- **Processed-by field attributes extraction work.** Each source records which agent performed the extraction. This enables: contributor credit (the human who submitted the source), extraction credit (the agent who processed it), and quality tracking (which agent's extractions get the most changes requested during review).
|
|
- **Unprocessed backlog is visible.** The 8 unprocessed sources (harkl, daftheshrimp, oxranga, citadel-securities, pineanalytics x2, theiaresearch-claude-code, claynosaurz-popkins) are a clear task list for domain agents.
|
|
|
|
## What this doesn't do yet
|
|
|
|
- **No contributor attribution on sources.** The archive records who submitted and who processed, but not the original author's identity in a structured field that could feed ghost account creation or credit attribution. The `source` field in frontmatter is free text. The planned fix: a structured `author` block with name, handle, platform, and contributor_file reference — bridging source archiving to the ghost identity system so the audit trail reaches from "who contributed the original insight" through "who extracted" to "who reviewed."
|
|
- **Historical sources from LivingIP v1 are not archived.** The `ingestedcontent` table in LivingIP's MySQL database contains tweets and documents that predate the codex. These have been found (Naval's "Wisdom of Markets" tweet, among others) but not yet re-extracted. Some were wrongly rejected by the v1 system.
|
|
- **No automated source ingestion.** Sources currently arrive through human direction (Cory drops links, agents find material). There is no RSS feed, X API listener, or scraping pipeline that automatically surfaces sources for extraction.
|
|
- **GCS blob access unverified.** Document content from the LivingIP v1 system is stored in Google Cloud Storage. Whether these blobs are still accessible has not been confirmed.
|
|
|
|
## Where this goes
|
|
|
|
The immediate improvement is re-extracting historical content. Ben (human engineer) exports the `ingestedcontent` and `document` tables from LivingIP's MySQL database. Venus designs the re-extraction methodology. Domain agents process the content. Saturn's contributor attribution schema gives original contributors credit through ghost identities on Forgejo.
|
|
|
|
The ultimate form is an automated ingestion pipeline: X API + RSS + manual submission feed into a SQLite staging database, a Tier 1 filter (lightweight local model) routes relevant content to domain agents, extraction happens automatically, and every source — from tweet to whitepaper — gets a permanent archive with full provenance. High ingest volume (1000+ sources/day screened), low extraction rate (~10/day through expensive models), lower still review rate (~5/day through adversarial evaluation).
|
|
|
|
---
|
|
|
|
Relevant Notes:
|
|
- [[adversarial PR review produces higher quality knowledge than self-review because separated proposer and evaluator roles catch errors that the originating agent cannot see]] — source archiving feeds the review process with provenance
|
|
- [[collaborative knowledge infrastructure requires separating the versioning problem from the knowledge evolution problem because git solves file history but not semantic disagreement or insight-level attribution]] — source archiving is the attribution layer
|
|
|
|
Topics:
|
|
- [[collective agents]]
|