Commit graph

109 commits

Author SHA1 Message Date
0d3fe95522 Add config.lock retry with jitter to both worktree-add sites
Parallel domain merges race on the bare repo's config file. The single
retry covered only one of the two worktree-add call sites and used a fixed
delay. Now both sites retry up to 3 times with increasing jitter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:13:32 +01:00
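The retry behaviour described in 0d3fe95522 could look roughly like the sketch below. It is only an illustration: `ConfigLockError`, `retry_on_config_lock`, and the 0.5 s base delay are hypothetical names and values, not taken from the repository.

```python
import random
import time


class ConfigLockError(RuntimeError):
    """Hypothetical error raised when git reports config.lock contention."""


def retry_on_config_lock(action, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Run `action`, retrying on ConfigLockError with a widening jitter window."""
    for attempt in range(max_retries + 1):
        try:
            return action()
        except ConfigLockError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the failure
            # Jitter window grows per attempt: up to 0.5s, then 1.0s, then 1.5s.
            sleep(random.uniform(0, base_delay * (attempt + 1)))
```

Wrapping both worktree-add call sites in the same helper keeps their retry policy identical, which is the gap the commit message describes.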
1755580b95 Harden already-merged detection to exact string match
Ganymede review nit: substring match on "already" could false-positive
on future return strings. Pin to the two known values from cherry_pick().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:06:20 +01:00
ad7ee0831e fix(evaluate): set domain + auto_merge on all 5 approval paths
Musings bypass and batch both_approve set status='approved' without
domain or auto_merge. Merge gate requires domain IS NOT NULL and
prefix match OR auto_merge=1. Result: agent PRs deadlocked for 20+ hours.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:03:42 +01:00
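One possible reading of the merge gate in ad7ee0831e, written as a Python predicate. The grouping of the AND/OR terms follows SQL precedence and is an assumption, as are the dict keys and the single `extract/` prefix (the real `PIPELINE_OWNED_PREFIXES` value is not shown in this log).

```python
# Assumption: only "extract/" is known from this log; the real
# PIPELINE_OWNED_PREFIXES tuple is not shown.
PIPELINE_OWNED_PREFIXES = ("extract/",)


def passes_merge_gate(pr: dict) -> bool:
    """Read the gate as: approved AND ((domain set AND prefix match) OR auto_merge)."""
    if pr["status"] != "approved":
        return False
    pipeline_owned = pr["domain"] is not None and pr["branch"].startswith(
        PIPELINE_OWNED_PREFIXES
    )
    return pipeline_owned or pr["auto_merge"] == 1
```

Under this reading, an approved agent PR with `domain` unset and `auto_merge` at 0 can never pass, which matches the deadlock the commit describes.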
10b4e27c28 fix: tighten output gate patterns to eliminate false positives on public content
5 patterns were too broad — matched common English words:
- "extraction" (concept) matched pipeline extraction pattern
- "class X" (English) matched Python class definition pattern
- ".md " (product name) matched file extension pattern
- "threshold" (concept) matched internal metrics pattern

Fixes:
- extraction: require pipeline context words (queue/PR/branch/cron)
- class/def/import: require line-start (actual code, not prose)
- .py/.yaml/.json: require path-like prefix (not bare .md)
- threshold: require pipeline context (cosine/vector/Qdrant)

All 3 Hermes dry-run drafts now pass. 18/18 tests pass.
11/11 system content regression tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:02:08 +01:00
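The tightening described in 10b4e27c28 can be illustrated with two regular expressions. These are hypothetical patterns written for this note; the actual patterns in output_gate.py are not shown in the log.

```python
import re

# Hypothetical patterns, not the ones used in output_gate.py.
# Code keywords only count at the start of a line (real code, not prose).
CODE_LINE = re.compile(r"^\s*(class|def|import)\s+\w+", re.MULTILINE)
# File extensions only count when preceded by a path-like prefix with a slash.
FILE_PATH = re.compile(r"[\w./-]*/[\w.-]+\.(py|yaml|json)\b")


def looks_internal(text: str) -> bool:
    """Return True if the text contains code-like lines or path-like file names."""
    return bool(CODE_LINE.search(text) or FILE_PATH.search(text))
```

With these patterns, prose such as "a world class team" or a bare "config.json" passes, while "lib/merge.py" or a line beginning with `class Foo` is flagged. A prose line that happens to start with a lowercase "import" would still match, so this sketch narrows false positives rather than eliminating them.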
2b58ffc765 Harden output gate: add missing filter patterns for agent names, coordination language, infrastructure domains, UUIDs
Patterns added per Hermes audit:
- All agent names (Epimetheus, Ganymede, Hermes, etc.) as standalone
- Leo/Rio with coordination context (avoids false positives on common words)
- Pentagon, m3ta references
- Coordination language (craft review, substance review, skill graph, eval rubric)
- Infrastructure domains (teleo-codex, livingip.xyz)
- UUID pattern (catches conversation IDs, agent IDs)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:02:08 +01:00
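A UUID filter of the kind 2b58ffc765 mentions might be written as follows. The function name and exact pattern are assumptions; the pattern matches the canonical 8-4-4-4-12 hexadecimal layout in either case.

```python
import re

# Canonical 8-4-4-4-12 hex layout, upper or lower case.
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def contains_uuid(text: str) -> bool:
    """Return True if the text contains something shaped like a UUID."""
    return UUID_RE.search(text) is not None
```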
50ef90e7d3 Add X content pipeline: output gates + tweet queue + pluggable approval
Output gate (output_gate.py): Deterministic classifier that blocks system/pipeline
messages from reaching public outputs. Pattern-based detection of PR numbers,
deploy logs, diagnostics, infrastructure references.

Tweet queue (x_publisher.py): Submit drafts through output gate + OPSEC filter,
enter approval_queue, auto-post to X via Twitter API v2 on Cory's approval.

Pluggable approval stages (approval_stages.py): Extensible architecture where
adding a new approval stage = implementing ApprovalStage.check(). Current stages:
OutputGate (stage 0), OPSEC (stage 1), Human (stage 10). Designed for future
agent voting, multi-human approval, and decision markets.

Also syncs approvals.py from VPS to local repo (was deployed but never committed).

18 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:02:08 +01:00
f38b1e3c01 fix: handle already-merged PRs + retry worktree config.lock
Two fixes for the 18-PR merge blockage:

1. When cherry-pick returns "already merged" (all commits empty because
   content is already on main), close the PR directly instead of trying
   to push the stale branch SHA to main. The branch ref points at old
   commits that aren't descendants of current main, so the push would
   always fail as non-fast-forward.

2. Retry worktree add once with jittered delay when config.lock
   contention occurs from parallel domain merges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:57:28 +01:00
ff357c4bbc fix: remove --force-with-lease from main push to unblock 16 PRs
Forgejo categorically blocks --force-with-lease on protected branches,
even for fast-forward pushes. The cherry-picked branch is already a
descendant of origin/main, so a regular push is a fast-forward by
definition. Non-ff is rejected by default — same safety guarantee.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:52:39 +01:00
25062cf130 Fix health check: accept HTTP 503 (stalled) as healthy
Pipeline /health returns 503 when idle/stalled, which is a valid
running state. Also increase post-restart wait from 15s to 30s
for pipeline HTTP server initialization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:26:03 +01:00
fe996c3299 feat: add auto-deploy script and systemd units for teleo-infrastructure
Auto-deploy watches teleo-infrastructure (not teleo-codex) and syncs to
VPS working directories. New checkout path: deploy-infra/ (parallel to
existing deploy/ for 48h rollback). Path mapping updated for reorganized
repo structure (lib/, diagnostics/, telegram/ etc.).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:27:23 +01:00
81afcd319f fix: sync all code from VPS — repo is now authoritative source of truth
24 files: 8 pipeline lib modules, 6 diagnostics updates, 4 new diagnostics
modules, telegram bot fix, 5 active operational scripts. Key changes:
- Security: SQL injection prevention (alerting.py), SSL verification
  (review_queue.py), path traversal guard (extract.py)
- Cost tracking: per-PR cost accumulation in evaluate.py
- Auto-recovery: watchdog tier0 reset with retry cap + cooldown
- Extraction: structured edge fields, post-write vector connection
- New modules: vitality, research_tracking, research_routes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:18:01 +01:00
d2aec7fee3 feat: reorganize repo with clear directory boundaries and agent ownership
Move scattered root-level files into categorized directories:
- deploy/ — deployment + mirror scripts (Ship)
- scripts/ — one-off backfills + migrations (Ship)
- research/ — nightly research + prompts (Ship)
- docs/ — all operational documentation (shared)

Delete 3 dead cron scripts replaced by pipeline daemon:
- batch-extract-50.sh, evaluate-trigger.sh, extract-cron.sh

Add CODEOWNERS mapping every path to its owning agent.
Add README with directory structure, ownership table, and VPS layout.
Update deploy.sh paths to match new structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:20:13 +01:00
681afad506 Consolidate pipeline code from teleo-codex + VPS into single repo
Some checks failed
CI / lint-and-test (push) Has been cancelled
Sources merged:
- teleo-codex/ops/pipeline-v2/ (11 newer lib files, 5 new lib modules)
- teleo-codex/ops/ (agent-state, diagnostics expansion, systemd units, ops scripts)
- VPS /opt/teleo-eval/telegram/ (10 new bot files, agent configs)
- VPS /opt/teleo-eval/pipeline/ops/ (vector-gc, backfill-descriptions)
- VPS /opt/teleo-eval/sync-mirror.sh (Bug 2 + Step 2.5 fixes)

Non-trivial merges:
- connect.py: kept codex threshold (0.65) + added infra domain parameter
- watchdog.py: kept infra version (stale_pr integration, superset of codex)
- deploy.sh: codex rsync version (interim, until VPS git clone migration)
- diagnostics/app.py: codex decomposed dashboard (14 new route modules)

81 files changed, +17105/-200 lines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:52:26 +01:00
95f637491e fix: Ganymede review — explicit staging, push after commit, challenged_by reciprocal
Some checks failed
CI / lint-and-test (push) Has been cancelled
Three fixes from Ganymede's review of extract-time-connection:
1. Replace git add -A with explicit file staging in _reciprocal_edges
2. Push to origin/main immediately after commit (survive batch-extract reset)
3. RECIPROCAL_EDGE_MAP: challenges→challenged_by (not symmetric)
   Added challenged_by to REWEAVE_EDGE_FIELDS, EDGE_FIELDS, EDGE_WEIGHTS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:46:47 +01:00
be010e666a feat: extract-time connection + post-merge reciprocal edges
Two-part fix for 58% orphan ratio:

1. Prompt-time prior art: Qdrant lookup before extraction injects
   existing claims as connection candidates. LLM classifies edges
   as supports/challenges/related. reconstruct_claim_content writes
   typed edges in frontmatter.

2. Post-merge reciprocal edges: _reciprocal_edges() runs after
   cherry-pick merge, reads new claims' outgoing edges, writes
   reciprocal edges on target files. Ensures every new claim has
   incoming links.

Files: lib/extraction_prompt.py, lib/merge.py, openrouter-extract-v2.py
Tests: 214 passed (3 failures + 3 errors pre-existing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:25:31 +01:00
84cb001dd6 fix: handle indented YAML list items in _serialize_edge_fields
The skip loop only matched `- ` (no indent) but YAML list items are
commonly written as `  - item` (2-space indent). This caused old list
items to persist alongside new ones, corrupting frontmatter on merge.

Fix: consume any line starting with a space or a dash as part of the
current field's value block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 14:01:34 +01:00
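The skip loop fixed in 84cb001dd6 can be pictured with the following sketch. `drop_field_block` is a hypothetical stand-in for the relevant part of `_serialize_edge_fields`, which is not shown in this log.

```python
def drop_field_block(lines: list[str], field: str) -> list[str]:
    """Remove `field:` and every following value line (indented or dash-prefixed)."""
    out = []
    skipping = False
    for line in lines:
        if skipping and line.startswith((" ", "-")):
            continue  # still inside the old field's value block
        skipping = False
        if line.startswith(f"{field}:"):
            skipping = True  # drop the key line and start consuming its values
            continue
        out.append(line)
    return out
```

Matching on a leading space as well as a leading dash is what lets the loop consume `  - item` entries, so old list items no longer survive next to the new ones.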
16e798f6a2 fix: eliminate dead code + add stale worktree pre-cleanup in _merge_reweave_pr
- Combined superset assertion and merge computation into single loop
  (removed duplicate scalar-to-list normalization)
- Added worktree remove --force before worktree add to handle prior
  crash leaving stale worktree (SIGKILL, OOM, power loss)
2026-04-04 13:50:28 +01:00
b091642146 fix: string-level edge splicing in reweave merge — no yaml.dump reformatting
Two fixes from Ganymede review:
1. CRITICAL: blank line before closing --- compounded on repeat reweaves.
   Body starts with \n---, so \n{body} created \n\n---. Fixed by checking
   body prefix.
2. Replaced yaml.dump round-trip with _serialize_edge_fields() that splices
   only edge arrays into raw frontmatter text. Non-edge fields (title,
   confidence, type, quotes, flow styles) stay byte-identical to main HEAD.

_parse_yaml_frontmatter now returns 3-tuple: (dict, raw_fm_text, body).
_serialize_frontmatter takes (raw_fm_text, merged_edges_dict, body).

26 tests pass including idempotency (5x serialize), formatting preservation,
and no-blank-line regression test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:48:44 +01:00
6b3a5833df feat: per-file frontmatter union for reweave PR merge
Reweave PRs modify existing files (appending YAML edges). Cherry-pick
fails ~75% when main moves between PR creation and merge.

_merge_reweave_pr() reads each changed file from both main HEAD and
branch HEAD, unions the edge arrays (order-preserving, main-first),
and writes the result. Eliminates merge conflicts structurally.

Key design decisions (Ganymede + Theseus approved):
- Order-preserving dedup: main's edges first, branch-new appended
- Superset assertion: logs warning if branch missing main edges
- Uses main's body text (reweave only touches frontmatter)
- Loud failure on parse errors (no cherry-pick fallback)
- Append-only contract: reweave adds edges, never removes

18 tests covering parse, union, serialize, superset, and full workflow.
2026-04-04 13:43:32 +01:00
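The order-preserving, main-first union from 6b3a5833df might be sketched like this. `union_edges` and its return shape are assumptions; the second return value models the superset check, which the commit says only logs a warning.

```python
def union_edges(main_edges: list[str], branch_edges: list[str]):
    """Union two edge lists, keeping main's order first and appending branch-only edges.

    Returns (merged, missing), where `missing` lists main edges absent from the
    branch (the superset-assertion case).
    """
    # dict.fromkeys deduplicates while preserving first-seen order.
    merged = list(dict.fromkeys([*main_edges, *branch_edges]))
    branch_set = set(branch_edges)
    missing = [edge for edge in main_edges if edge not in branch_set]
    return merged, missing
```

Because the result depends only on the two edge lists and not on textual diffs, rerunning the merge after main moves gives the same kind of answer instead of a conflict.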
2253f48993 fix: rename eval.py to eval_checks.py to avoid shadowing stdlib eval
Some checks failed
CI / lint-and-test (push) Has been cancelled
Also fixes _is_entity path check to use Path.parts instead of string
containment, preventing false positives on paths like "domains/entities-overview/".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:44:04 +01:00
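The `Path.parts` change in 2253f48993 can be illustrated as follows. `is_entity` here is a simplified stand-in for `_is_entity`, and the example file names are made up.

```python
from pathlib import PurePosixPath


def is_entity(path: str) -> bool:
    """True only when a whole path component is exactly 'entities'."""
    return "entities" in PurePosixPath(path).parts
```

A plain substring test such as `"entities" in path` would also accept `domains/entities-overview/x.md`, which is the false positive the commit describes; comparing whole path components avoids it.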
ff68ebc561 Remove extra blank line in _group_into_windows
Ganymede review cleanup — duplicate by_chat block was already resolved
during consolidation, this removes the leftover cosmetic blank line.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:36:06 +01:00
d89fb29c9e chore: commit untracked decomposition modules, docs, and ops scripts
- telegram/retrieval.py: RRF merge, query decomposition, vector search
- telegram/response.py: system prompt builder, response parser
- docs/tool-registry-spec.md: Ganymede's tool registry spec
- ops/nightly-reweave.sh: cron wrapper for nightly orphan reweave
- prompts/: changelog and rio system prompt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:22:09 +01:00
5e0cdfc63a feat: consolidate eval pipeline, reweave fixes, enrichment dedup, cherry-pick merge, TG batching
Merges all work from epimetheus/enrichment-dedup-fix and epimetheus/eval-and-reweave-fixes:

- Eval pipeline: _LLMResponse in call_openrouter, URL fabrication check, confidence floor, cost alerts
- Reweave fixes: _is_entity gate, _same_source filter, temp 0.3, blank line sanitization
- Enrichment dedup: three-layer fix (source-slug, PR-number, post-rebase scan)
- Cherry-pick merge: replaces rebase-retry, --ours entity conflict resolution
- TG batching: group by chat_id + time proximity, force-split on unparseable timestamps
- Schema migration v10: response_audit columns for cost/confidence/blocking

67 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:21:59 +01:00
9e42c34271 fix: TG message batching — group by chat_id + time proximity
Root cause: _group_into_windows never checked time gaps or chat_id.
All messages went into one stream, capped at 10 per window. 120 msgs
from one chat → 12 windows → 12 source files → 12 extraction branches.

Fix:
- Group by chat_id first (different chats = different windows always)
- Split on actual time gaps (>window_seconds between messages)
- Cap at 50 messages per window (not 10)
- Consolidate substantive windows from same chat into one source file
  at triage time (one source per chat per triage cycle)

6 tests in tests/test_tg_batching.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:19:35 +01:00
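The grouping rules listed in 9e42c34271 could be sketched as below. The message shape (`chat_id` plus a numeric `ts` in seconds) and the 300-second default are assumptions; only the 50-message cap and the three rules come from the commit message.

```python
from collections import defaultdict


def group_into_windows(messages, window_seconds=300, max_per_window=50):
    """Group messages per chat, splitting on time gaps and on the size cap."""
    by_chat = defaultdict(list)
    for msg in messages:
        by_chat[msg["chat_id"]].append(msg)  # different chats never share a window

    windows = []
    for chat_msgs in by_chat.values():
        chat_msgs.sort(key=lambda m: m["ts"])
        current = []
        for msg in chat_msgs:
            gap = bool(current) and msg["ts"] - current[-1]["ts"] > window_seconds
            if gap or len(current) >= max_per_window:
                windows.append(current)
                current = []
            current.append(msg)
        if current:
            windows.append(current)
    return windows
```

With these defaults, 120 closely spaced messages from one chat produce three windows (50, 50, 20) rather than the twelve described in the root cause.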
f25a4093c2 fix: replace broken _rebase_and_push call with cherry-pick in conflict retry
_retry_conflict_prs called _rebase_and_push which was never defined,
causing NameError on every conflict retry. Now uses _cherry_pick_onto_main
consistent with the primary merge path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:18:30 +01:00
686ef3fd7f Replace rebase-retry with cherry-pick merge mechanism
- _cherry_pick_onto_main replaces _rebase_and_push: creates fresh branch
  from origin/main, cherry-picks extraction commits, force-pushes
- Eliminates ~23% merge failure rate from rebase race conditions
- Agent branch protection: PIPELINE_OWNED_PREFIXES filter in SQL prevents
  auto-merge of agent-owned branches (theseus/*, rio/*, etc.)
- Empty-commit handling: skips already-merged content gracefully
- Entity conflict auto-resolution preserved for cherry-pick path
- Post-pick evidence dedup runs as safety net (same as post-rebase)
- Separate fetch calls for main and branch (fixes long branch name issue)

Fixes: PRs #2141, #157, #2142, #2180 (agent branch orphaning)
Fixes: ~23% merge failure rate (rebase race condition)
Related: PRs #1751, #1752 (enrichment dedup shares root cause)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:18:26 +01:00
f43f8f923f fix: enrichment idempotency — three-layer dedup prevents duplicate evidence blocks
Layer 1: Insertion-time dedup in openrouter-extract-v2.py — skip if source_slug
already appears in claim content.
Layer 2: Insertion-time dedup in entity_batch.py — skip if PR number already
enriched this claim.
Layer 3: Post-rebase dedup in merge.py — scan rebased files for duplicate
evidence blocks (same source reference) and remove them before force-push.

Root cause: multiple enrichment branches modify the same claim at the same
insertion point. When rebased sequentially, evidence blocks are duplicated.
(Leo: PRs #1751, #1752)

lib/dedup.py: standalone module — parses evidence headers, deduplicates by
source key, preserves trailing content (Relevant Notes, Topics sections).
9 tests covering all patterns including the real PR #1751 duplication case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 13:18:23 +01:00
ad48d7384e Merge pull request 'feat: two-pass retrieval with sort order and graph expansion' (#5) from epimetheus/two-pass-retrieval into main
2026-03-30 11:32:32 +00:00
b92d2af1ac Merge pull request 'feat: atomic extract-and-connect + stale PR monitor + response audit' (#4) from epimetheus/atomic-connect-and-stale-monitor into main
2026-03-30 11:03:34 +00:00
e17e6c25db feat: two-pass retrieval with sort order and graph expansion
Some checks failed
CI / lint-and-test (pull_request) Has been cancelled
lib/search.py — shared search library:
- Pass 1 (default): top 5 from Qdrant, score >= 0.70, no expansion
- Pass 2 (expand=True): next 5 via offset=5, score >= 0.60, plus
  graph expansion from YAML frontmatter edges. Hard cap 10 total.
- Sort order: cosine desc → challenged_by → other graph-expanded
- result_type internal tag for stable sort (direct/challenge/graph)
- Module-level constants for easy threshold tuning post-calibration
- Structural file exclusion (_map.md, _overview.md)
- Within-vector dedup via _dedup_hits()

Caller updates:
- kb_retrieval.py: retrieve_vector_context() calls search(expand=True)
- diagnostics/app.py: search endpoint passes expand query param
- Argus imports from lib/search.py via sys.path (no longer owns search)

Tests: 5 new tests covering pass1-only, pass2 expansion, hard cap,
sort order, challenges-before-other-expansion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:34:45 +00:00
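The sort order from e17e6c25db (direct hits by cosine descending, then challenges, then other graph-expanded results, capped at 10) might be expressed like this. The result dict shape and the `sort_results` name are assumptions; the `result_type` values come from the commit message.

```python
# Rank by result_type first, then by descending score within each type.
RESULT_RANK = {"direct": 0, "challenge": 1, "graph": 2}


def sort_results(results, hard_cap=10):
    """Order search results and enforce the hard cap on the total count."""
    ordered = sorted(
        results,
        key=lambda r: (RESULT_RANK[r["result_type"]], -r.get("score", 0.0)),
    )
    return ordered[:hard_cap]
```

Since Python's `sorted` is stable, graph-expanded results without a score keep their original relative order.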
5f554bc2de feat: atomic extract-and-connect + stale PR monitor + response audit
Some checks failed
CI / lint-and-test (pull_request) Has been cancelled
Atomic extract-and-connect (lib/connect.py):
- After extraction writes claim files, each new claim is embedded via
  OpenRouter, searched against Qdrant, and top-5 neighbors (cosine > 0.55)
  are added as `related` edges in the claim's frontmatter
- Edges written on NEW claim only — avoids merge conflicts
- Cross-domain connections enabled, non-fatal on Qdrant failure
- Wired into openrouter-extract-v2.py post-extraction step

Stale PR monitor (lib/stale_pr.py):
- Every watchdog cycle checks open extract/* PRs
- If open >30 min AND 0 claim files → auto-close with comment
- After 2 stale closures → marks source as extraction_failed
- Wired into watchdog.py as check #6

Response audit system:
- response_audit table (migration v8), persistent audit conn in bot.py
- 90-day retention cleanup, tool_calls JSON column
- Confidence tag stripping, systemd ReadWritePaths for pipeline.db

Supporting infrastructure:
- reweave.py: nightly edge reconnection for orphan claims
- reconcile-sources.py: source status reconciliation
- backfill-domains.py: domain classification backfill
- ops/reconcile-source-status.sh: operational reconciliation script
- Attribution improvements, post-extract enrichments, merge improvements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:34:20 +00:00
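The stale-PR rule in 5f554bc2de (open for more than 30 minutes with zero claim files) reduces to a small predicate. The function name and argument shapes are assumptions, not the actual interface of lib/stale_pr.py.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=30)


def is_stale(opened_at: datetime, claim_file_count: int, now: datetime) -> bool:
    """True when a PR has been open longer than 30 minutes and holds no claim files."""
    return now - opened_at > STALE_AFTER and claim_file_count == 0
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside, keeps the check easy to test against fixed timestamps.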
0457c49094 fix: zombie retry loop + cost tracking
Gate 3 in batch-extract-50.sh: query pipeline.db for closed PRs before
re-extracting. Sources with >=3 closed PRs are skipped (zombie protection).

Cost tracking: openrouter_call() now returns (text, usage) tuple with
prompt_tokens and completion_tokens from the OpenRouter API response.
All callers updated to unpack and pass tokens to costs.record_usage().
Added missing triage cost recording. Fixed batch domain review recording
cost once per batch instead of once per PR.

Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:29:58 +00:00
89692fda2d feat: embed-on-merge — auto-index new claims into Qdrant after PR merge
After a PR merges successfully, _embed_merged_claims() diffs the merged SHA
against its parent to find new/changed .md files in knowledge directories
(domains/, core/, foundations/, decisions/, entities/). Each file is embedded
via embed-claims.py --file (OpenRouter, text-embedding-3-small).

Non-fatal: embedding failure logs a warning but does not block the merge
pipeline. This keeps vector search current without requiring manual re-embeds.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 17:53:18 +00:00
f5b27ccd73 feat: Qdrant vector search — bulk embed script + OpenRouter embeddings
- embed-claims.py: bulk embeds all claims/decisions/entities into Qdrant
  via OpenRouter (openai/text-embedding-3-small, 1536 dims)
- diagnostics/app.py: search endpoint switched from OpenAI direct to
  OpenRouter (same key as LLM calls, no new credentials)
- Qdrant running on VPS (Docker, port 6333, persistent storage)
- Collection: teleo-claims, cosine distance, 1536 dims

854 files to embed. Bulk backfill running.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 17:44:34 +00:00
47fa33fd53 feat: source author backfill — credits intellectual foundations of KB
Parses source: frontmatter across 616 claims, matches against entity
files + manual author map, credits sourcer_count. 33 authors matched,
8 new contributor entries created.

Bostrom (9), Shapiro (8), Hanson (6), Conitzer (7) etc. now visible
on the leaderboard as sourcers.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 15:26:04 +00:00
2b49b17eb2 doc: label backfill as one-shot, not cron (Ganymede review)
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 15:09:47 +00:00
305445b164 feat: domain breakdown on dashboard — contributions by domain with top contributors
New _domain_breakdown() function cross-references merged PRs with
contributor principals. Dashboard shows per-domain knowledge PR counts
and top 3 contributors for each domain. API: GET /api/domains returns
full breakdown.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 15:05:48 +00:00
ae1cce730c feat: CI backfill script — reclassifies 614 PRs, attributes sourcer to m3taversal
484 knowledge PRs, 130 pipeline PRs (excluded from CI).
m3taversal credited as sourcer for all knowledge PRs.
Principal roll-up: 540 claims, CI 75.4.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 15:02:27 +00:00
4b5c5841ce doc: mixed PR classification priority note (Ganymede review)
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 14:57:11 +00:00
cfb80d3496 feat: CI scoring overhaul — principal roll-up, commit-type filter, new weights
Step 1: principal column + commit_type column in pipeline.db. Static map
populates principal for local agents (rio→m3taversal etc.). VPS agents
(epimetheus, argus) have no principal.

Step 2: _classify_commit_type in merge.py. Pipeline commits (inbox/,
entities/, agents/) get commit_type='pipeline' and skip CI attribution
entirely. Knowledge commits (domains/, core/, foundations/, decisions/)
get full attribution.

Step 3 (Argus): Dashboard has dual view — by-principal (default,
governance) and by-agent (drill-down). Already implemented by Argus.

CI weights updated (Cory-approved):
- Challenger: 0.35 (was 0.20)
- Synthesizer: 0.25 (was 0.15)
- Reviewer: 0.20 (was 0.10)
- Sourcer: 0.15 (unchanged)
- Extractor: 0.05 (was 0.40)

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 14:53:54 +00:00
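The new weights in cfb80d3496 sum to 1.00. How they are combined into a score is not stated in this log; the sketch below assumes a simple weighted sum over per-role contribution counts, and `ci_score` is a hypothetical name.

```python
# Weights as listed in the commit message.
CI_WEIGHTS = {
    "challenger": 0.35,
    "synthesizer": 0.25,
    "reviewer": 0.20,
    "sourcer": 0.15,
    "extractor": 0.05,
}


def ci_score(role_counts: dict[str, int]) -> float:
    """Assumed scoring rule: weighted sum of contribution counts per role."""
    return sum(CI_WEIGHTS[role] * count for role, count in role_counts.items())
```

Under this assumption, two challenges now outweigh a dozen extractions (0.70 against 0.60), which reflects the shift away from rewarding extraction volume.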
1dfc6dcc5c feat: author handle domain signal + conversation skip at source (Ganymede)
1. Author handle map: known X accounts (MetaDAO, Anthropic, SpaceX etc.)
   count as 1 keyword match toward domain routing threshold. Lightweight,
   no URL parsing.

2. Conversation archives now write to conversations/ subdir instead of
   top-level staging dir. The cron only moves top-level *.md to queue,
   so conversations never enter the extraction pipeline. Skip happens
   at write time, not at batch-extract read time — eliminates wasted I/O
   every 15 minutes.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 14:39:15 +00:00
b5aabe0364 feat: content classification — domain routing + sub-tags for sources
All source creation functions now classify content by domain and
sub-topic instead of hardcoding internet-finance.

Domain routing: keyword matching (2+ hits) routes to ai-alignment,
health, space-development, entertainment. Default: internet-finance.

Sub-tags for internet-finance: futarchy, ownership-coins, defi,
governance, market-analysis, crypto-infra. Added to source frontmatter
tags array for granular filtering.

Applied to: standalone sources, inline SOURCE:/CLAIM:, conversation
archives, research archives.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 14:34:33 +00:00
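The routing rule from b5aabe0364 (two or more keyword hits, otherwise the default domain) could be sketched as follows. The keyword sets are invented for illustration; only the domain names, the 2-hit threshold, and the default come from the commit message.

```python
import re

# Hypothetical keyword sets; the real lists are not shown in this log.
DOMAIN_KEYWORDS = {
    "ai-alignment": {"alignment", "interpretability", "rlhf"},
    "health": {"clinical", "patients", "healthcare"},
    "space-development": {"orbital", "launch", "lunar"},
    "entertainment": {"streaming", "studio", "box-office"},
}
DEFAULT_DOMAIN = "internet-finance"
MIN_HITS = 2


def route_domain(text: str) -> str:
    """Pick the domain with the most keyword hits, if it reaches MIN_HITS."""
    words = set(re.findall(r"[a-z0-9-]+", text.lower()))
    best, best_hits = DEFAULT_DOMAIN, 0
    for domain, keywords in DOMAIN_KEYWORDS.items():
        hits = len(words & keywords)
        if hits >= MIN_HITS and hits > best_hits:
            best, best_hits = domain, hits
    return best
```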
0854375fd0 fix: skip format: conversation in extraction — archive directly instead
Conversation archives produce low-quality claims (26x schema failures,
22x near-duplicates in 24h). Valuable content from conversations now
enters through three other paths:
1. Standalone sources (URLs shared → x-article/x-tweet files)
2. Inline tags (SOURCE:/CLAIM: → curated source files)
3. Transcript review (1-hour JSONL dumps → periodic safety net)

Conversations moved to inbox/archive/telegram/ for provenance without
burning extraction cycles.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-26 12:02:57 +00:00
1019602eec fix: transcript dump uses append-only JSONL, not full rewrite (Ganymede)
Each dump was rewriting the full accumulated history — growing unbounded.
Now: append-only JSONL (one line per message), only new entries since
last dump. One file per chat per day. No dedup needed downstream.

Also verified ARCHIVE_DIR path is correct (staging dir, not worktree).

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-25 13:39:43 +00:00
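The append-only dump in 1019602eec can be sketched as below. `append_new_entries` and the integer cursor are assumptions about how "only new entries since last dump" might be tracked.

```python
import json
from pathlib import Path


def append_new_entries(path, entries: list[dict], last_dumped: int) -> int:
    """Append entries[last_dumped:] as JSON lines and return the new cursor."""
    with Path(path).open("a", encoding="utf-8") as fh:
        for entry in entries[last_dumped:]:
            fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return len(entries)
```

Each dump writes only the tail of the list, so file growth is proportional to new messages rather than to the whole accumulated history.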
66bc742979 feat: full transcript archival + SOURCE:/CLAIM: inline tags
Transcript system:
- All messages in all chats captured to chat_transcripts store
- 1-hour dump job writes per-chat JSON to /opt/teleo-eval/transcripts/
- Includes internal reasoning (KB matches, searches, learnings)
- Transcripts accumulate over session (no clear on dump)
- Per-chat directories: transcripts/{chat-slug}/{date-hour}.json

Inline contribution tags:
- SOURCE: creates inbox source file with verbatim user content
- CLAIM: creates draft claim file attributed to contributor
- Both strip tag from displayed response
- Full user message preserved verbatim (Rio decides context, can't alter)

Also: multi-URL processing (up to 5 per message)

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-25 13:35:10 +00:00
0759655688 fix: process all URLs in a message, not just the first
When a user shared two X links in one message (sjdedic + knimkar),
only the first got a standalone source. Now processes up to 5 URLs
per message, each getting its own standalone source file.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-25 13:21:26 +00:00
102d97859c fix: auto-research sends follow-up message with findings
When Opus triggers RESEARCH: tag, the search ran silently and archived
results but never sent a follow-up. User saw "let me look into it" then
nothing. Now: searches, sends concise summary of top 5 results back to
the chat, then archives for pipeline.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-25 13:14:38 +00:00
e4d7ca42ac fix: Gate 2 PR lookup — Forgejo head= filter returns wrong PR
Forgejo API head=teleo:$BRANCH filter is unreliable — returns unrelated
PRs. All 13 queued sources were matching PR #1838 (Leo's research) instead
of their own PRs. Fixed: fetch all open PRs and filter locally by
head.ref match.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-25 11:09:24 +00:00
02c86e9050 fix: split long messages for Telegram 4096 char limit
Bot crashed with "Message is too long" when sending full DP-00002 text
(8K+ chars). Now splits on paragraph boundaries. Also prevents silent
message drops from unhandled BadRequest exceptions.

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-24 16:22:53 +00:00
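Splitting on paragraph boundaries, as 02c86e9050 describes, might look like this sketch. `split_message` is a hypothetical helper; the hard split for a single over-long paragraph is an added assumption, since the commit does not say how that case is handled.

```python
TELEGRAM_LIMIT = 4096  # maximum characters per Telegram message


def split_message(text: str, limit: int = TELEGRAM_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` chars, preferring paragraph breaks."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        while len(para) > limit:  # assumption: hard-split oversized paragraphs
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:limit])
            para = para[limit:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```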
458cd7dfda fix: Opus now knows research results are from a live search it ran
Bot said "I don't have the ability to run live X searches" despite Haiku
finding 10 tweets. Two issues: (1) prompt section header didn't make clear
these were LIVE results, (2) learnings taught deflection ("say drop links
here" instead of acknowledging search capability).

Fixed: section header now says "LIVE X Search Results (you just searched
for X — cite these directly)". Learnings updated to acknowledge search
capability. Stale Robin Hanson learning removed again (re-synced from git).

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-24 16:19:52 +00:00