# Skill: Ingest

Research your domain, find source material, and archive it in inbox/ with context notes. Extraction happens separately on the VPS — your job is to find and archive good sources, not to extract claims.

**Archive everything.** The inbox is a library, not a filter. If it's relevant to any Teleo domain, archive it. Null-result sources (no extractable claims) are still valuable — they prevent duplicate work and build domain context.

## Usage

```
/ingest             # Research loop: pull tweets, find sources, archive with notes
/ingest @username   # Pull and archive a specific X account's content
/ingest url         # Archive a paper, article, or thread from URL
/ingest scan        # Scan your network for new content since last pull
```

## Prerequisites

- API key at `~/.pentagon/secrets/twitterapi-io-key`
- Your network file at `~/.pentagon/workspace/collective/x-ingestion/{your-name}-network.json`
- Forgejo token at `~/.pentagon/secrets/forgejo-{your-name}-token`

## The Loop

### Step 1: Research

Find source material relevant to your domain. Sources include:

- **X/Twitter** — tweets, threads, debates from your network accounts
- **Papers** — academic papers, preprints, whitepapers
- **Articles** — blog posts, newsletters, news coverage
- **Reports** — industry reports, data releases, government filings
- **Conversations** — podcast transcripts, interview notes, voicenote transcripts

For X accounts, use `/x-research pull @{username}` to pull tweets, then scan for anything worth archiving. Don't just archive the "best" tweets — archive anything substantive. A thread arguing a wrong position is as valuable as one arguing a right one.
### Step 2: Archive with notes

For each source, create an archive file on your branch:

**Filename:** `inbox/archive/YYYY-MM-DD-{author-handle}-{brief-slug}.md`

```yaml
---
type: source
title: "Descriptive title of the content"
author: "Display Name (@handle)"
twitter_id: "numeric_id_from_author_object"  # X sources only
url: https://original-url
date: YYYY-MM-DD
domain: internet-finance | entertainment | ai-alignment | health | space-development | grand-strategy
secondary_domains: [other-domain]  # if cross-domain
format: tweet | thread | essay | paper | whitepaper | report | newsletter | news | transcript
status: unprocessed
priority: high | medium | low
tags: [topic1, topic2]
flagged_for_rio: ["reason"]  # if relevant to another agent's domain
---
```

**Body:** Include the full source text, then your research notes.

```markdown
## Content

[Full text of tweet/thread/article. For long papers, include abstract + key sections.]

## Agent Notes

**Why this matters:** [1-2 sentences — what makes this worth archiving]
**KB connections:** [Which existing claims does this relate to, support, or challenge?]
**Extraction hints:** [What claims might the extractor pull from this? Flag specific passages.]
**Context:** [Anything the extractor needs to know — who the author is, what debate this is part of, etc.]
```

The "Agent Notes" section is where you add value. The VPS extractor is good at mechanical extraction but lacks your domain context. Your notes guide it.
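The filename and frontmatter conventions above can be sketched as a small scaffold script. This is a hypothetical helper, not part of the skill; the handle, slug, domain, and all field values are placeholders you would replace per source:

```shell
#!/bin/sh
# Sketch: scaffold an archive file following the Step 2 conventions.
# All values (handle, slug, title, domain) are illustrative placeholders.
handle="exampleuser"
slug="sample-thread"
date="$(date +%Y-%m-%d)"
file="inbox/archive/${date}-${handle}-${slug}.md"

mkdir -p inbox/archive
cat > "$file" <<EOF
---
type: source
title: "Placeholder title"
author: "Example Name (@${handle})"
url: https://example.com/original
date: ${date}
domain: ai-alignment
format: thread
status: unprocessed
priority: medium
tags: [placeholder]
---

## Content

[Paste full source text here.]

## Agent Notes

**Why this matters:** [Fill in.]
EOF
echo "$file"
```

The scaffold only saves typing the boilerplate; the notes sections still have to be filled in by hand before committing.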
### Step 3: Cross-domain flagging

When you find sources outside your domain:

- Archive them anyway (you're already reading them)
- Set the `domain` field to the correct domain, not yours
- Add `flagged_for_{agent}: ["brief reason"]` to frontmatter
- Set `priority: high` if it's urgent or challenges existing claims

### Step 4: Branch, commit, push

```bash
# Branch
git checkout -b {your-name}/sources-{date}-{brief-slug}

# Stage all archive files
git add inbox/archive/*.md

# Commit
git commit -m "{your-name}: archive {N} sources — {brief description}

- What: {N} sources from {list of authors/accounts}
- Domains: {which domains these cover}
- Priority: {any high-priority items flagged}

Pentagon-Agent: {Name} <{UUID}>"

# Push
FORGEJO_TOKEN=$(cat ~/.pentagon/secrets/forgejo-{your-name}-token)
git push -u https://{your-name}:${FORGEJO_TOKEN}@git.livingip.xyz/teleo/teleo-codex.git {branch-name}
```

Open a PR:

```bash
curl -s -X POST "https://git.livingip.xyz/api/v1/repos/teleo/teleo-codex/pulls" \
  -H "Authorization: token ${FORGEJO_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "{your-name}: archive {N} sources — {brief description}",
    "body": "## Sources archived\n{numbered list with titles and domains}\n\n## High priority\n{any flagged items}\n\n## Cross-domain flags\n{any items flagged for other agents}",
    "base": "main",
    "head": "{branch-name}"
  }'
```

Source-only PRs should merge fast — they don't change claims, just add to the library.

## What Happens After You Archive

A cron job on the VPS checks inbox/ for `status: unprocessed` sources every 15 minutes. For each one, the job:

1. Reads the source + your agent notes
2. Runs extraction (skills/extract.md) via Claude headless
3. Creates claim files in the correct domain
4. Opens a PR with the extracted claims
5. Updates the source to `status: processed`
6. Hands the extraction PR to the eval pipeline for review

**You don't need to wait for this.** Archive and move on. The VPS handles the rest.
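The scan half of that cron job can be sketched as follows. This is an illustration of the `status: unprocessed` convention only, not the real VPS script; the extraction call, PR creation, and status update are all elided, and the seeded sample file is a placeholder:

```shell
#!/bin/sh
# Sketch of the cron job's scan step (illustrative only; extraction,
# PR creation, and the status flip to processed are elided).
mkdir -p inbox/archive
# Seed one sample archive file so the loop has something to find.
printf -- '---\nstatus: unprocessed\n---\n' > inbox/archive/sample-source.md

for f in inbox/archive/*.md; do
  if grep -q '^status: unprocessed' "$f"; then
    echo "would extract: $f"
  fi
done
```

Because the cron job keys off the frontmatter field, leaving `status: unprocessed` in your archive files is what queues them for extraction.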
## Network Management

Your network file (`{your-name}-network.json`) lists X accounts to monitor:

```json
{
  "agent": "your-name",
  "domain": "your-domain",
  "accounts": [
    {"username": "example", "tier": "core", "why": "Reason this account matters"},
    {"username": "example2", "tier": "extended", "why": "Secondary but useful"}
  ]
}
```

**Tiers:**

- `core` — Pull every session. High signal-to-noise.
- `extended` — Pull weekly or when specifically relevant.
- `watch` — Pull once to evaluate, then promote or drop.

Agents without a network file should create one as their first task. Start with 5-10 seed accounts.

## Quality Controls

- **Archive everything substantive.** Don't self-censor. The extractor decides what yields claims.
- **Write good notes.** Your domain context is the difference between a useful source and a pile of text.
- **Check for duplicates.** Don't re-archive sources already in `inbox/archive/`.
- **Flag cross-domain.** If you see something relevant to another agent, flag it — don't assume they'll find it.
- **Log API costs.** Every X pull gets logged to `~/.pentagon/workspace/collective/x-ingestion/pull-log.jsonl`.
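The duplicate check above can be done with a one-line grep against the `url:` frontmatter field from Step 2. A minimal sketch, assuming every archive file records its source URL that way; the seeded file and URL are placeholders:

```shell
#!/bin/sh
# Sketch: check for an already-archived URL before creating a new file.
# Assumes archive files carry a "url:" frontmatter line (see Step 2).
mkdir -p inbox/archive
# Seed a placeholder existing archive to demonstrate the hit case.
printf -- '---\nurl: https://example.com/post\n---\n' > inbox/archive/2025-01-01-example-post.md

url="https://example.com/post"
if grep -rqF "url: $url" inbox/archive/; then
  echo "duplicate: already archived, skipping"
else
  echo "new source: safe to archive"
fi
```

`grep -F` matches the URL literally, so characters like `?` and `.` in query strings are not treated as regex metacharacters.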