Commit graph

8 commits

Author SHA1 Message Date
12078c8707 Reduce near-duplicate and frontmatter schema rejections
Near-duplicate (159+ rejections):
- Add extract-time dedup gate: SequenceMatcher check before file write ($0)
- Strengthen extraction prompt: high-similarity matches (>=0.75) get explicit
  "DO NOT extract, use enrichment instead" warning
- Strip [[wiki link]] brackets from related_claims field

Frontmatter schema (129+ rejections):
- Normalize LLM confidence aliases (high→likely, medium→experimental, etc.)
  in both _build_claim_content and validate_schema
- Strip code fences (```markdown/```yaml) from entity content in extract.py
  and from diff content in validate.py tier0.5 check
- Code fences were root cause of "no_frontmatter" failures: parser sees
  ```markdown as first line, not ---

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 18:03:26 +01:00
687f3d3151 fix: prevent broken wiki links in extraction (226 rejections)
Some checks are pending
CI / lint-and-test (push) Waiting to run
Two changes to address the #1 rejection reason:

1. extraction_prompt.py: Explicitly tell LLM NOT to use [[wiki links]]
   in body text — use connections/related_claims JSON fields instead.
   Remove misleading "post-processor handles wiki links" language.

2. extract.py _get_kb_index(): Expand KB index to include entity stems
   from entities/{domain}/ so the LLM knows what entities exist when
   building connections. Previously only showed domain claims.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:28:58 +01:00
716cc43890 extraction quality: trust hierarchy + verified tagging + telegram review endpoint
Some checks are pending
CI / lint-and-test (push) Waiting to run
Three fixes for conversation-sourced claim quality:

1. Trust hierarchy in extraction prompt: bot-generated numbers are
   flagged as unverified context, not evidence. Directional claims
   are extractable but specific figures require external verification.
   Prevents laundering bot guesses into the KB as evidence.

2. Conversation-sourced claims tagged with verified: false and
   source_type: conversation in frontmatter. Downstream consumers
   (Leo, dashboard) can filter/flag these for verification.

3. GET /api/telegram-extractions endpoint for daily spot-checking.
   Shows recent Telegram-sourced PRs with claim titles, status,
   merge rate, and eval issues. Quick review surface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:38:39 +01:00
1e0c1cd788 Write enrichments as file modifications; strengthen correction extraction
Some checks are pending
CI / lint-and-test (push) Waiting to run
Two changes:
1. extract.py: Enrichments now modify existing claim files by appending
   evidence sections. Previously enrichment-only extractions were
   discarded as null-result even when they contained valuable challenges.
2. extraction_prompt.py: Corrections should produce BOTH a claim (the
   corrected knowledge) AND an enrichment (linking to what it corrects).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:12:29 +01:00
d073e22e8d Add conversation-aware extraction for Telegram sources
Some checks are pending
CI / lint-and-test (push) Waiting to run
When source format is "conversation", inject specialized extraction
rules that prioritize human corrections/pushback as highest-value
content. Fixes null-result on short but high-signal correction
messages. Maps corrections to existing KB claims as challenges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:05:51 +01:00
681afad506 Consolidate pipeline code from teleo-codex + VPS into single repo
Some checks failed
CI / lint-and-test (push) Has been cancelled
Sources merged:
- teleo-codex/ops/pipeline-v2/ (11 newer lib files, 5 new lib modules)
- teleo-codex/ops/ (agent-state, diagnostics expansion, systemd units, ops scripts)
- VPS /opt/teleo-eval/telegram/ (10 new bot files, agent configs)
- VPS /opt/teleo-eval/pipeline/ops/ (vector-gc, backfill-descriptions)
- VPS /opt/teleo-eval/sync-mirror.sh (Bug 2 + Step 2.5 fixes)

Non-trivial merges:
- connect.py: kept codex threshold (0.65) + added infra domain parameter
- watchdog.py: kept infra version (stale_pr integration, superset of codex)
- deploy.sh: codex rsync version (interim, until VPS git clone migration)
- diagnostics/app.py: codex decomposed dashboard (14 new route modules)

81 files changed, +17105/-200 lines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:52:26 +01:00
be010e666a feat: extract-time connection + post-merge reciprocal edges
Some checks are pending
CI / lint-and-test (push) Waiting to run
Two-part fix for 58% orphan ratio:

1. Prompt-time prior art: Qdrant lookup before extraction injects
   existing claims as connection candidates. LLM classifies edges
   as supports/challenges/related. reconstruct_claim_content writes
   typed edges in frontmatter.

2. Post-merge reciprocal edges: _reciprocal_edges() runs after
   cherry-pick merge, reads new claims' outgoing edges, writes
   reciprocal edges on target files. Ensures every new claim has
   incoming links.

Files: lib/extraction_prompt.py, lib/merge.py, openrouter-extract-v2.py
Tests: 214 passed (3 failures + 3 errors pre-existing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:25:31 +01:00
d79ff60689 epimetheus: sync VPS-deployed code to repo — Mar 18-20 reliability + features
Pipeline reliability (8 fixes, reviewed by Ganymede+Rhea+Leo+Rio):
1. Merge API recovery — pre-flight approval check, transient/permanent distinction, jitter
2. Ghost PR detection — ls-remote branch check in reconciliation, network guard
3. Source status contract — directory IS status, no code change needed
4. Batch-state markers eliminated — two-gate skip (archive-check + batched branch-check)
5. Branch SHA tracking — batched ls-remote, auto-reset verdicts, dismiss stale reviews
6. Mirror pre-flight permissions — chown check in sync-mirror.sh
7. Telegram archive commit-after-write — git add/commit/push with rebase --abort fallback
8. Post-merge source archiving — queue/ → archive/{domain}/ after merge

Pipeline fixes:
- merge_cycled flag — eval attempts preserved during merge-failure cycling (Ganymede+Rhea)
- merge_failures diagnostic counter
- Startup recovery preserves eval_attempts (was incorrectly resetting to 0)
- No-diff PRs auto-closed by eval (root cause of 17 zombie PRs)
- GC threshold aligned with substantive fixer budget (was 2, now 4)
- Conflict retry with 3-attempt budget + permanent conflict handler
- Local ff-merge fallback for Forgejo 405 errors

Telegram bot:
- KB retrieval: 3-layer (entity resolution → claim search → agent context)
- Reply-to-bot handler (context.bot.id check)
- Tag regex: @teleo|@futairdbot
- Prompt rewrite for natural analyst voice
- Market data API integration (Ben's token price endpoint)
- Conversation windows (5-message unanswered counter, per-user-per-chat)
- Conversation history in prompt (last 5 exchanges)
- Worktree file lock for archive writes

Infrastructure:
- worktree_lock.py — file-based lock (flock) for main worktree coordination
- backfill-sources.py — source DB registration for Argus funnel
- batch-extract-50.sh v3 — two-gate skip, batched ls-remote, network guard
- sync-mirror.sh — auto-PR creation for mirrored GitHub branches, permission pre-flight
- Argus dashboard — conflicts + reviewing in backlog, queue count in funnel
- Enrichment-inside-frontmatter bug fix (regex anchor, not --- split)

Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-20 20:17:27 +00:00