Commit graph

263 commits

Author SHA1 Message Date
twentyOne2x
1a71efcde2
Add Teleo research eval schema
Adds graph schema prerequisite plus research-eval schema/docs/tests for Leo tool-use benchmarks and x402 research telemetry. Validated by full local pytest and green CI.
2026-06-24 14:21:03 +02:00
twentyOne2x
533295d38c
Gate Telegram market context for Leo research (#18) 2026-06-23 19:16:24 +02:00
twentyOne2x
bfc28e084b
Wire Leo Telegram x402 smart research (#17)
* Wire Leo Telegram x402 smart research

* Suppress token-bearing Telegram HTTP logs

* Keep Telegram typing visible during Leo proxy calls

* Allow Leo Telegram social research spend cap

* Route contextual Leo research prompts to smart research

* Generalize Leo smart research intent routing

* Resume Leo smart research from paid work orders
2026-06-23 18:37:33 +02:00
twentyOne2x
30544dce05
Route Telegram smart research commands (#16) 2026-06-23 11:56:06 +02:00
twentyOne2x
d4f2530284
Merge pull request #15 from living-ip/codex/leo-wallet-test-existing-token-20260622
Use existing Leo wallet-test Telegram bot
2026-06-23 11:18:19 +02:00
twentyOne2x
b593dda1cf Use existing Leo wallet-test Telegram bot 2026-06-22 21:53:37 +02:00
twentyOne2x
754f5aeee7
Add Leo wallet-test Telegram runtime verifier (#14) 2026-06-22 21:41:17 +02:00
twentyOne2x
595f977a94
Add Telegram smart research gate installer (#13) 2026-06-22 21:33:56 +02:00
twentyOne2x
dba8a21e74
Allow per-agent Telegram env files (#12) 2026-06-22 21:27:32 +02:00
twentyOne2x
d0a4f518d5
Add Leo Telegram smart research bridge (#11) 2026-06-22 21:24:00 +02:00
twentyOne2x
2433ed2d8a
Merge pull request #10 from living-ip/codex/leo-wallet-test-token-installer-20260622
Add safe Telegram agent token installer
2026-06-22 20:49:06 +02:00
twentyOne2x
84e1269900 Add safe Telegram agent token installer 2026-06-22 20:47:44 +02:00
twentyOne2x
988e8581d6
Merge pull request #9 from living-ip/codex/teleo-ci-pyyaml-20260622
Declare PyYAML dependency for CI
2026-06-22 20:34:16 +02:00
twentyOne2x
9c29322972 Fix optional Telegram approval imports in tests 2026-06-22 20:33:02 +02:00
twentyOne2x
e4c0621538 Declare PyYAML dependency for CI 2026-06-22 20:31:39 +02:00
twentyOne2x
e1f2834c23
Merge pull request #8 from living-ip/codex/leo-wallet-test-telegram-20260622
Add Leo wallet-test Telegram agent config
2026-06-22 20:30:25 +02:00
twentyOne2x
adbdf4dbba Add Leo wallet test Telegram agent config 2026-06-22 20:29:52 +02:00
twentyOne2x
f3c63e2f8d Restart Leo agent after Telegram deploy changes 2026-06-19 23:39:23 +02:00
twentyOne2x
2e7d4e7450 Add Leo Telegram x402 bridge 2026-06-19 19:27:12 +02:00
twentyOne2x
71ea7a625c Add decision engine replay harness
- Add source-linked model discovery registry for bakeoff candidates
- Add Rio, Theseus, and KB interop fixtures with deterministic replay proof
- Gate CI on replay output; verify with 424-test suite

`.crabbox.yaml`
`.github/workflows/ci.yml`
`docs/llm-refinement-decision-engine.md`
`docs/model-discovery-registry.md`
`fixtures/decision-engine-eval/kb_interop_propose_only.json`
`fixtures/decision-engine-eval/rio_meteora_lp_incentives.json`
`fixtures/decision-engine-eval/theseus_live_model_switch_reject.json`
`scripts/check_llm_refinement_contract.py`
`scripts/replay_decision_engine_eval.py`
`tests/test_decision_engine_replay.py`
2026-06-01 17:37:38 +02:00
twentyOne2x
27e48f3e16 Add KB interop from transcript
- Encode transcript requirements for model discovery and Pentagon boundary
- Add KB read/propose skill for Hermes, OpenClaw, and Claude-style agents
- Extend LLM contract checks; verify with 422-test suite

`.agents/skills/living-ip-kb-interop/SKILL.md`
`.agents/skills/nousresearch-hermes-agent/SKILL.md`
`.agents/skills/openclaw-agent/SKILL.md`
`docs/llm-refinement-decision-engine.md`
`scripts/check_llm_refinement_contract.py`
2026-06-01 17:16:46 +02:00
twentyOne2x
aee534e686 Add decision engine refinement contracts
- Define Rio and Theseus as economics and model-integrity evaluators
- Add DB, Hermes, and OpenClaw skills with no-secret defaults
- Gate CI on LLM refinement contracts; verify with 422-test suite

`.agents/skills/decision-engine-refinement/SKILL.md`
`.agents/skills/nousresearch-hermes-agent/SKILL.md`
`.agents/skills/openclaw-agent/SKILL.md`
`.agents/skills/teleo-db-operator/SKILL.md`
`.crabbox.yaml`
`.github/workflows/ci.yml`
`docs/llm-refinement-decision-engine.md`
`scripts/check_llm_refinement_contract.py`
2026-06-01 15:50:48 +02:00
twentyOne2x
a2620c1f19 Add Crabbox CI contract gate 2026-06-01 15:36:03 +02:00
twentyOne2x
69b4987415 Add Crabbox remote proof layer 2026-05-30 01:53:10 +02:00
twentyOne2x
59951346b2 Prove phase 1b local e2e 2026-05-29 15:08:09 +02:00
twentyOne2x
cdb0b1498d Add phase 1b local review guide 2026-05-29 14:17:28 +02:00
twentyOne2x
ca96f5f8e3 Harden local phase 1b review path 2026-05-29 14:16:12 +02:00
twentyOne2x
b9cb965591 Record phase 1b staging blocker 2026-05-29 14:01:21 +02:00
twentyOne2x
7390e1e843 Implement phase 1b agent routing 2026-05-29 14:00:13 +02:00
Fawaz
377924dabe
feat(phase1-step3): rewire critical scripts Forgejo -> GitHub (decision-engine)
Phase 1 Step 3 — migrate research-session.sh and pipeline-health-check.py off Forgejo onto GitHub living-ip/decision-engine. eval-dispatcher.sh / eval-worker.sh documented as dead code (replaced by daemon).
2026-05-22 21:43:08 -04:00
aaab659900 Merge pull request 'fix(activity-feed): emit Forgejo pr_url fallback so every event has a clickthrough' (#12) from fix/forgejo-pr-url-fallback into main
Reviewed-on: #12
2026-05-13 04:40:38 +00:00
Teleo Agents
e78308862a fix(activity-feed): emit Forgejo pr_url fallback so every event has a clickthrough
Previously _github_pr_url() only returned a URL when prs.github_pr was
populated. That field is set on only 3 of 4094 merged PRs (the rare cases
mirrored to the public GitHub repo), so pr_url was null for ~100% of the
feed. The frontend whole-row PR overlay (livingip-web PR #30) renders
only when pr_url is non-null, so until now no rows had the overlay.

Pipeline-attributed events (reweave/*, ingestion/*) are the most visible
victim: their /contributors/pipeline link lands on a sparse stub, with
no way to reach the actual commit/PR they refer to.

Fix: rename _github_pr_url -> _pr_url and fall back to the canonical
Forgejo URL (git.livingip.xyz/teleo/teleo-codex/pulls/{number}) when no
GitHub mirror exists. Verified 200 OK against a sample (#10568). GitHub
URL still wins when available.

Result: 1972/1972 events in _build_events now carry a pr_url. Whole-row
overlay starts working for everything including pipeline events.
2026-05-13 04:29:54 +00:00
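The fallback described in e78308862a can be sketched roughly as below. This is a minimal illustration, not the repository's `_pr_url`: the function signature, the `https://` scheme, and the GitHub base URL are assumptions; only the Forgejo path `git.livingip.xyz/teleo/teleo-codex/pulls/{number}` comes from the commit message.

```python
from typing import Optional

# Forgejo path is quoted from the commit message; the https scheme is assumed.
FORGEJO_PULLS = "https://git.livingip.xyz/teleo/teleo-codex/pulls"
# Hypothetical placeholder: the commit does not name the GitHub mirror URL.
GITHUB_PULLS = "https://github.com/example-org/example-repo/pull"


def pr_url(number: Optional[int], github_pr: Optional[int] = None) -> Optional[str]:
    """Return a clickthrough URL for a PR, preferring the GitHub mirror."""
    if github_pr is not None:
        # The GitHub URL still wins when a mirror exists.
        return f"{GITHUB_PULLS}/{github_pr}"
    if number is not None:
        # Otherwise fall back to the canonical Forgejo pull-request URL.
        return f"{FORGEJO_PULLS}/{number}"
    return None
```

With this shape, an event only ends up without a `pr_url` when it has no PR number at all, which is consistent with the 1972/1972 coverage reported in the commit.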
a54f52234a Merge pull request 'fix(attribution): classify submitted_by by branch prefix at PR discovery' (#11) from fix/reattribute-by-branch-prefix into main
Reviewed-on: #11
2026-05-13 03:57:04 +00:00
Teleo Agents
c9515c770a fix(attribution): classify submitted_by by branch prefix at PR discovery
reweave.py and ingestion run as the operator Forgejo token, so the prior
opener-based classifier set submitted_by=m3taversal for every system
maintenance PR. backfill_submitted_by.py never overrides non-NULL rows,
so this misattribution accumulated: ~2,748 reweave/ingestion PRs and
~3,706 <agent>/ research/entity PRs were credited to the operator on
the leaderboard and contribution_events table.

Two parts:

1. lib/merge.py: at PR discovery, classify by branch prefix first.
     reweave/, ingestion/             -> submitted_by = 'pipeline'
     <agent>/ (per _AGENT_NAMES)      -> submitted_by = '<agent>'
     otherwise human                  -> submitted_by = author.lower()
     otherwise pipeline               -> submitted_by = None
                                         (extract.py sets from proposed_by)
   Origin flag updated so domain detection and priority still fire for
   branch-classified pipeline PRs. Human PRs lowercased to maintain the
   canonical-handle contract enforced in PR #9.

2. scripts/reattribute-by-branch-prefix.py: historical cleanup.
   Per affected PR (atomic):
     - UPDATE prs.submitted_by  -> target
     - UPDATE sources.submitted_by where source_path matches
     - UPDATE contribution_events handle ('m3taversal',role='author')
       -> target, kind='agent'. Collision (target already has author
       event for PR) deletes the m3ta row; target wins.

   Scope is deliberately conservative: extract/ branches stay attributed
   to m3taversal because proposed_by-missing legitimately defaults to the
   operator (telegram drops). Only reweave/, ingestion/, and <agent>/.

   Dry-run shows 6,454 PRs + 284 events to move. Pre-flight collision
   query returns 0; pre-flight kind check confirms m3ta has only role=author
   events on this set (no challenger/synthesizer/evaluator).

   Idempotent. Dry-run by default. Run with --apply after deploy + DB
   snapshot.
2026-05-13 03:49:10 +00:00
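The branch-prefix classification from part 1 of c9515c770a might look something like the sketch below. It is illustrative only: `AGENT_NAMES` is a stand-in for `_AGENT_NAMES` (populated with agent names that appear elsewhere in this log), and the `is_human` flag is an assumed input, since the commit does not say how lib/merge.py distinguishes human openers.

```python
from typing import Optional

# Stand-in for _AGENT_NAMES; the real set lives in the codebase.
AGENT_NAMES = {"theseus", "vida", "rio", "leo", "astra", "clay"}
PIPELINE_PREFIXES = ("reweave/", "ingestion/")


def classify_submitted_by(branch: str, author: str, is_human: bool) -> Optional[str]:
    """Classify a PR by branch prefix first, then by opener."""
    if branch.startswith(PIPELINE_PREFIXES):
        return "pipeline"
    prefix, sep, _ = branch.partition("/")
    if sep and prefix in AGENT_NAMES:
        return prefix
    if is_human:
        # Lowercased to keep the canonical-handle contract.
        return author.lower()
    # Left unset so a later stage (extract.py, per the commit) can fill it in.
    return None
```

Checking the branch before the opener is what prevents every system maintenance PR from being credited to the operator whose token opened it.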
3dca3aab5f Merge pull request 'docs: rewrite public README' (#8) from ship/readme-public-rewrite into main
Reviewed-on: #8
2026-05-13 03:20:02 +00:00
2ee9dd5150 Merge pull request 'fix(activity-feed): canonicalize contributor handle so profile links resolve' (#9) from fix/activity-feed-canonical-handle into main
Reviewed-on: #9
2026-05-13 03:19:41 +00:00
b29ec95dd8 Merge pull request 'fix(attribution): canonicalize submitted_by at write time + historical normalizer' (#10) from fix/canonicalize-submitted-by into main
Reviewed-on: #10
2026-05-13 03:19:27 +00:00
Teleo Agents
74bf0461e8 fix(attribution): canonicalize submitted_by at write time + historical normalizer
Companion / write-side fix to fix/activity-feed-canonical-handle.

The activity-feed canonicalization was a read-side guard. The bug at the
source is that extract.py and two backfill scripts write decorated
strings (Vida (self-directed), pipeline (reweave), @m3taversal) into
prs.submitted_by and sources.submitted_by. Downstream readers
(lib.contributor.insert_contribution_event, scripts/scoring_digest,
diagnostics/activity_feed_api) all strip the decorator on read — but
anything that reads the column verbatim (like /api/activity-feed before
the read-side fix) 404s on /contributors/{decorated-handle}.

Stop writing the decorator. The self-directed signal is already carried
by intake_tier == research-task plus the prs.agent column; the suffix
is redundant string noise that costs us correctness at every consumer
that forgets to strip.

Changes:

- lib/extract.py:690 — write canonical handle via attribution.normalize_handle.
  Direct elif for intake_tier == research-task now stores just agent_name.
  @m3taversal -> m3taversal.

- diagnostics/backfill_submitted_by.py — same fix in two branches plus
  the reweave branch (pipeline (reweave) -> pipeline).

- scripts/backfill-research-session-attribution.py — UPDATE prs sets
  agent handle alone, no suffix. Docstring + log line updated.

- scripts/normalize-submitted-by.py (new) — one-time backfill that
  canonicalizes existing prs.submitted_by and sources.submitted_by rows.
  Strips trailing parenthetical decorators, lowercases, drops @. Defaults
  to dry-run; --apply to commit. Skips rows that would normalize to
  invalid handles (no garbage falls through silently).

Dry-run against live pipeline.db:
  prs:     3008 rows need normalization (clean mappings, 0 invalid)
  sources: 730 rows need normalization (clean mappings, 0 invalid)
  Total:   3738 rows. All map to existing handle column values.

After this lands + auto-deploys, the operator should run
  python3 scripts/normalize-submitted-by.py --apply
once to clean historical rows. The read-side canonicalization in
diagnostics/activity_feed_api.py (fix/activity-feed-canonical-handle)
becomes redundant defense-in-depth instead of load-bearing.

No KB writes.
2026-05-13 02:56:50 +00:00
Teleo Agents
01097da22c fix(activity-feed): canonicalize contributor handle so profile links resolve
The activity feed was returning decorated strings like "Vida (self-directed)"
and "@m3taversal" in the contributor field. The frontend uses that field as
both display label and routing handle, so /contributors/Vida%20(self-directed)
404s — Next fires notFound() in [handle]/page.tsx.

Root cause: _normalize_contributor only stripped @ and whitespace; it did not
lowercase or strip the " (self-directed)" suffix that extract.py and the
older backfill_submitted_by.py wrote into prs.submitted_by. Mixed-case
agent names (Vida vs vida) and pipeline decorators ("pipeline (reweave)")
both fell through.

Fix: lowercase + strip any trailing parenthetical decorator. Valid handles
match ^[a-z0-9][a-z0-9_-]{0,38}$ per attribution._HANDLE_RE and cannot
contain parens, so the strip is lossless.

DB simulation against 3612 merged-PR events: 0 orphan handles after
normalization (was 12 orphan label-variants before).

No KB writes — pure read-side normalization in the API layer.
2026-05-13 02:39:18 +00:00
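The normalization in 01097da22c (lowercase, drop `@`, strip a trailing parenthetical decorator, then validate) can be sketched as follows. The handle regex is the one quoted in the commit; the function name and the decision to return `None` for invalid results are assumptions for illustration, not the actual `_normalize_contributor`.

```python
import re
from typing import Optional

# Pattern quoted in the commit message (attribution._HANDLE_RE).
HANDLE_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,38}$")
# Matches one trailing "(...)" decorator plus surrounding whitespace.
TRAILING_PAREN = re.compile(r"\s*\([^()]*\)\s*$")


def normalize_contributor(raw: str) -> Optional[str]:
    """Return the canonical handle, or None if the result is not a valid handle."""
    handle = TRAILING_PAREN.sub("", raw.strip())
    handle = handle.lstrip("@").strip().lower()
    return handle if HANDLE_RE.match(handle) else None
```

Because valid handles cannot contain parentheses, stripping the decorator cannot destroy part of a legitimate handle.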
6c66da33e4 feat(activity-feed): add pr_url field for GitHub PR clickthrough
2026-05-11 20:58:36 -04:00
c3f2010a42 feat(activity-feed): add kind + target_url, fix research-session pseudo-slugs
The /api/activity-feed event shape didn't give the frontend a reliable
clickability signal. Two failure modes:

1. Source-archive events (extract/* PRs that filed a paper into
   inbox/archive/ but didn't extract a claim) returned claim_slug="".
   Frontend rendered <Link href="/claims/"> which Next normalized to
   /claims and redirected to /knowledge-base. Wrong page.

2. Research/entity session commits (e.g. astra/research-2026-05-11)
   with empty descriptions fell through to "create" classification with
   a pseudo-slug like research-2026-05-11. Frontend rendered
   /claims/research-2026-05-11 -> 404.

Fix:

- Add `kind` enum (canonical): claim_merged | claim_enriched |
  claim_challenged | source_archived | session_digest. Replaces the
  internal `type` for downstream consumers; `type` kept populated for
  in-flight callers during migration.

- Add `target_url`: explicit clickability signal. Frontend renders
  <Link> when non-null, <span> when null. No special-casing needed.
    * claim_* events -> /claims/{slug}
    * source_archived -> Forgejo blob URL at inbox/archive/{domain}/{slug}.md
    * session_digest -> null (no clickthrough surface yet)

- Detect research/entity commits with empty descriptions as
  session_digest in _classify_event, instead of synthesizing a phantom
  create event with a date-shaped pseudo-slug.

- type filter accepts both legacy `type` and new `kind` values so
  callers migrate at their own pace.

Verified live: source events resolve to inbox/archive/{domain}/...
Forgejo URLs, session-digest rows return target_url=null,
claim_merged events keep /claims/{slug} unchanged.
2026-05-11 12:36:25 +01:00
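The `kind` to `target_url` mapping from c3f2010a42 could be expressed along these lines. This is a sketch under assumptions: the function signature is hypothetical, and `ARCHIVE_BLOB_BASE` is a placeholder because the commit only says "Forgejo blob URL" without giving the full base.

```python
from typing import Optional

CLAIM_KINDS = {"claim_merged", "claim_enriched", "claim_challenged"}
# Hypothetical base; the commit does not spell out the Forgejo blob prefix.
ARCHIVE_BLOB_BASE = "https://forgejo.example/org/repo/src/branch/main/inbox/archive"


def target_url(kind: str, slug: str, domain: Optional[str] = None) -> Optional[str]:
    """Return the link target for an event, or None when it is not clickable."""
    if kind in CLAIM_KINDS and slug:
        return f"/claims/{slug}"
    if kind == "source_archived" and slug and domain:
        return f"{ARCHIVE_BLOB_BASE}/{domain}/{slug}.md"
    # session_digest (and anything unrecognised) has no clickthrough surface.
    return None
```

Returning `None` rather than an empty string is what lets a frontend choose between a link and plain text without special cases.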
ed4893e837 fix(claims): unwrap ```markdown code fences + 404 for fragments
Two issues Ship hit on the Montreal Protocol claim:

1. 500 on canonical stem lookup. File starts with ```markdown wrapper
   instead of bare --- frontmatter delimiter. _split_frontmatter checked
   startswith("---") and bailed, returning "frontmatter parse failed".
   Same wrapper exists on 6 other claim files (audit grep). Now strip
   the wrapper before frontmatter detection.

2. 404 on long activity-feed slug. Same root cause — _build_indexes
   couldn't read the file's title from frontmatter, so by_title never
   indexed it, so title-fallback resolution had nothing to match against.
   Both bugs collapse once we unwrap.

Also: switched "file exists but has no frontmatter" from 500 to 404 with
reason=file_no_frontmatter. These are stray enrichment fragments living
in domains/ that never got merged into a parent claim. From the API
caller's perspective there's no claim at that slug — 500 implied
"server bug, retry later" which isn't actionable.

Verified: 3/3 wrapped claims (montreal, medicare, dod) now return 200
warm-cache ~13ms. Long-slug repro (montreal) resolves via title fallback
to canonical stem. Negative test (nonsense slug) still 404.
2026-05-11 12:02:54 +01:00
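The unwrap step from ed4893e837 can be illustrated with the sketch below. It is not the repository's `_split_frontmatter`; the helper names are hypothetical, and the fence string is built at runtime only so that the example contains no literal fence of its own.

```python
FENCE = "`" * 3  # three backticks, built at runtime


def unwrap_markdown_fence(text: str) -> str:
    """Strip a leading markdown code-fence wrapper and its closing fence, if present."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].strip().lower().startswith(FENCE + "markdown"):
        return text  # not wrapped: leave untouched
    body = lines[1:]
    if body and body[-1].strip() == FENCE:
        body = body[:-1]
    return "\n".join(body)


def has_frontmatter(text: str) -> bool:
    """Check for the '---' delimiter only after unwrapping."""
    return unwrap_markdown_fence(text).lstrip().startswith("---")
```

Running the frontmatter check after the unwrap addresses both failure modes at once, since the title index depends on the same parse.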
73880e138d fix(claims): resolve long activity-feed slugs to canonical file stems
Activity feed emits slugs derived from PR description (the slugified claim
title), which can be longer than the on-disk file stem (agents pick shorter
hand-chosen filenames). Pure exact-stem lookup 404s on those.

Three-tier resolution in handle_claim_detail:
1. Exact stem match (existing behavior)
2. Title fallback: normalize requested slug, look up via by_title index
   (already populated from frontmatter title during _build_indexes)
3. Prefix fallback: longest common prefix among stems, anchored at 32 chars
   to prevent spurious hits

Response slug returns the canonical on-disk stem so frontend share-links
and caches converge to one form.

Repro: GET /api/claims/spacex-and-amazon-kuiper-non-endorsement-of-wef-debris-
guidelines-demonstrates-systemic-voluntary-governance-failure-at-the-scale-
where-it-matters-most was 404; now 200, returns shorter on-disk slug
'...-governance-failure'. Negative case (nonsense slug) still 404s.

Reported by Ship — Cory-facing demo path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:51:41 +01:00
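One way to read the three-tier resolution in 73880e138d is sketched below. The function and helper names are hypothetical, and the prefix tier reflects an interpretation of "longest common prefix among stems, anchored at 32 chars": pick the stem sharing the longest prefix with the requested slug, provided at least 32 characters match.

```python
import os
from typing import Dict, Iterable, Optional

MIN_PREFIX = 32  # anchor length quoted in the commit


def normalize(text: str) -> str:
    """Lowercase, treat hyphens as spaces, collapse whitespace."""
    return " ".join(text.lower().replace("-", " ").split())


def resolve_slug(requested: str, stems: Iterable[str],
                 by_title: Dict[str, str]) -> Optional[str]:
    stems = list(stems)
    if requested in stems:                      # 1. exact stem match
        return requested
    hit = by_title.get(normalize(requested))    # 2. title fallback
    if hit is not None:
        return hit
    best, best_len = None, 0                    # 3. prefix fallback
    for stem in stems:
        shared = len(os.path.commonprefix([requested, stem]))
        if shared >= MIN_PREFIX and shared > best_len:
            best, best_len = stem, shared
    return best
```

In every tier the return value is the on-disk stem, so callers converge on one canonical slug regardless of which form they requested.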
1bc541ac93 fix(reaper): tighten research-session pattern to literal YYYY-MM-DD shape
Apply Ganymede review of 50b888a:

MUST-FIX — pattern %/research-2% was broader than the comment claimed.
Matched anything/research-2[anything] including agent-named branches like
theseus/research-2nd-attempt-on-X or vida/research-2024-revisited. The
documented invariant said "date suffix only" but the SQL didn't enforce
it. Defense-in-depth was the framing; pattern needed to match the
framing.

Fix uses SQLite `_` single-char wildcards: research-20__-__-__ requires
exactly research-20[2-char][-][2-char][-][2-char], i.e. literal
YYYY-MM-DD shape. Threads the needle:
  - theseus/research-2026-04-30  ✓ (catches all 15 currently stuck)
  - rio/research-2099-12-31      ✓ (good through 2099)
  - theseus/research-2nd-attempt ✗ (correctly excluded)
  - vida/research-2024-revisited ✗ (correctly excluded — no -MM-DD shape)
  - rio/research-batch-agents-... ✗ (no date prefix at all)

NIT — comment said "Three classes qualify" then listed four. Off-by-one
fixed; comment now correctly says "Four classes."

Pre-deploy verified: tighter pattern catches all 15 currently-stuck
research PRs (clay/leo/astra/theseus/vida/rio research-2026-{04-28
through 05-02}). Zero false-positive risk on current branch namespace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:10:49 +01:00
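The pattern change in 1bc541ac93 can be checked against the branch names listed in the message using an in-memory SQLite table. The table layout and the leading `%/` in the pattern are assumptions; the commit quotes only the `research-20__-__-__` tail.

```python
import sqlite3

# Assumed full pattern: any agent prefix, then the date-shaped suffix.
PATTERN = "%/research-20__-__-__"

branches = [
    "theseus/research-2026-04-30",
    "rio/research-2099-12-31",
    "theseus/research-2nd-attempt",
    "vida/research-2024-revisited",
    "rio/research-batch-agents-memory-harnesses",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prs (branch TEXT)")  # hypothetical minimal schema
conn.executemany("INSERT INTO prs VALUES (?)", [(b,) for b in branches])

# Each "_" matches exactly one character, so the suffix must be 8 chars long.
matched = [row[0] for row in conn.execute(
    "SELECT branch FROM prs WHERE branch LIKE ? ORDER BY branch", (PATTERN,))]
print(matched)
```

Note that `_` matches any single character, not only digits, so the pattern enforces the shape of a date rather than its validity.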
50b888a751 fix(reaper): extend allowlist to */research-2* daily-cron sessions
Apply Step 1 of stuck-PR triage. The May 7 reaper allowlist (extract/,
reweave/, fix/) deliberately excluded all agent-prefix branches per
Ganymede's review nit #3 — the rationale being that agent branches are
WIP feature work owned by the agent and shouldn't be auto-closed.

That decision was correct for theseus/feature-foo style branches.
It's wrong for {agent}/research-{YYYY-MM-DD} branches: those are daily
cron output, categorically disposable, regenerated by tomorrow's session.
Same shape as extract/ — content the pipeline-cron created and can
recreate, not feature work owned by the agent.

Production impact: 15 of 16 currently-stuck PRs are research-session
verdict-deadlocks aged 8-12 days. Without this change they sit forever
because the substantive_fixer can't classify (eval_issues=[] or
mechanical-only) and the reaper allowlist excludes them. Once live, next
hourly reaper cycle picks them up under the standard 24h-deadlock gate.

Pattern choice: %/research-2* (date-suffix) over %/research-% (loose).
Verified 15/15 stuck PRs match the tight pattern; sanity-check found
rio/research-batch-agents-memory-harnesses (manually-named, not date-
suffixed) which the loose pattern would catch and the tight pattern
correctly excludes. Closed-status today, but a future hand-named research
thesis branch sitting in request_changes for 24h would have been at risk.
The date prefix '2' threads the needle until 2030 and ages naturally.

Documented as an allowlist invariant ("disposable pipeline-generated
branches") rather than a list, per Step 3 of the plan — future additions
should match the invariant or update it explicitly.

Verified live before pushing:
- 15/15 currently stuck research PRs match the new pattern
- Zero false positives on existing branch namespace (closed branches
  excluded by status='open' guard regardless)
- Existing extract/ reweave/ fix/ allowlist members unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:00:48 +01:00
0eb26327fc feat(claims): /api/claims/{slug} canonical detail endpoint
Implements Ship's claim detail contract — one round-trip, all data
resolved server-side. Replaces thin domain-only stub with full tree walk
(domains/ + foundations/ + core/), DB joins for PRs and reviews, and
server-side wikilink resolution to eliminate frontend N+1 cascades.

Response shape (Ship brief 2026-04-29):
  slug, title, domain, secondary_domains, confidence, description,
  created, last_review, body (raw markdown), sourced_from, reviews,
  prs, edges {supports,challenges,related,depends_on}, wikilinks

Wikilink resolution:
- Builds title→stem index from frontmatter title field, fallback to
  filename stem normalized via _normalize_for_match
- Returns flat {link_text: slug_or_null} map; unresolved → null so
  frontend can render plain text
- Inline normalization (lowercase, hyphen↔space, collapse whitespace,
  strip punctuation). Note: lib/attribution.py exposes only
  normalize_handle today, not the title normalizer Ship referenced.
  If a canonical helper lands later, point at it.

Caches:
- title→slug index: 60s TTL (warm cache <20ms p50 verified)
- list endpoint: 5min TTL (preserved from prior)
- Cold: ~3.3s for tree walk of 1,866 files; warm: 13-17ms

Bug fixed in second pass:
- _resolve_sourced_from defaulted title="" which leaked LIKE '%%'
  matching every PR. Now requires non-empty title+stem; handler falls
  back to slug.replace("-"," ") when frontmatter title is missing.

Verified live on VPS:
- AI diagnostic triage claim (no fm.title): sourced_from=1, prs=0
  (correct — Feb claim, pre-description-tracking)
- Recent extract PR claim: sourced_from=1 with URL, prs=1, reviews=1,
  last_review populated, edges 3 supports + 7 related, wikilinks 0
- 404 on missing slug: correct
- Claim with [[maps/...]] wikilink: 5/6 resolved (correct null on map)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 17:37:26 +01:00
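The wikilink resolution described in 0eb26327fc (normalize the link text, look it up in a title index, return `null` for misses) might be sketched as below. The regex, the helper names, and the exact normalization order are assumptions; the commit names `_normalize_for_match` but does not show it.

```python
import re
import string
from typing import Dict, Optional

WIKILINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def normalize_for_match(text: str) -> str:
    """Lowercase, hyphens to spaces, strip punctuation, collapse whitespace."""
    text = text.lower().replace("-", " ").translate(_STRIP_PUNCT)
    return " ".join(text.split())


def resolve_wikilinks(body: str, title_index: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return a flat {link_text: slug_or_None} map for every [[link]] in body."""
    return {link.strip(): title_index.get(normalize_for_match(link))
            for link in WIKILINK_RE.findall(body)}
```

Unresolved links map to `None`, which serializes to `null` so a frontend can fall back to plain text.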
fc002354d4 fix(substantive_fixer): json_valid guard in front of json_each
Ganymede review of 5db6a02 (msg 2 of 3): json_each(invalid_json) throws
'malformed JSON' and propagates up through EXISTS, failing the SELECT.
The fix-cycle call site at teleo-pipeline.py:104 isn't try/except wrapped
(the reaper at line 109-116 is, the substantive cycle isn't), so a single
corrupt eval_issues row would trip the fix-stage breaker after 5 occurrences.

Fix is one line — AND json_valid(eval_issues) before the EXISTS clause.
json_valid(NULL) returns NULL (false in WHERE), json_valid(invalid) returns 0,
json_valid(valid) returns 1. SQLite 3.9+, predates VPS 3.45.1.

WARN-on-corrupt-JSON path kept per Ganymede's Q3 — json_valid and json.loads
use technically distinct parsers, cost is ~3 rows × parse-empty-string per
cycle, journal entry names the failure mode if SQLite ever surfaces a row
that passes both SQL guards but fails json.loads.

Comment updated to reflect new guard ordering.
2026-05-08 13:12:25 -04:00
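The guard from fc002354d4 can be reproduced on a toy table. The schema and the tag value `tag_a` are hypothetical, and the sketch assumes a SQLite build with the JSON functions available, as in most current Python distributions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prs (number INTEGER, eval_issues TEXT)")
conn.executemany("INSERT INTO prs VALUES (?, ?)", [
    (1, '["tag_a"]'),   # valid JSON with an actionable tag
    (2, "[]"),          # valid JSON, nothing actionable
    (3, "{not json"),   # corrupt row that json_each would reject
    (4, None),          # NULL: json_valid returns NULL, treated as false
])

# json_valid filters the row out before json_each is asked to parse it.
GUARDED = """
SELECT number FROM prs
WHERE json_valid(eval_issues)
  AND EXISTS (SELECT 1 FROM json_each(prs.eval_issues) WHERE value = ?)
ORDER BY number
"""
actionable = [row[0] for row in conn.execute(GUARDED, ("tag_a",))]
print(actionable)
```

The corrupt and NULL rows are skipped silently by the SELECT, which is why the commit keeps a separate WARN path to surface them.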
5db6a0248c fix(substantive_fixer): SQL-side actionable-tag filter, eliminate head-of-line
Step 4 of the stuck-PR triage. Push the FIXABLE/CONVERTIBLE/UNFIXABLE_TAGS
intersection from a post-fetch Python loop into the SELECT WHERE clause via
json_each + EXISTS. LIMIT 3 now always returns 3 actionable rows (or fewer if
that's all there are), eliminating the head-of-line block where 3 oldest
empty-eval_issues PRs occupied the slots forever.

Background: 11 hours of post-deploy logs showed substantive_fix_cycle stuck
emitting "0 actionable from 3 candidate(s) — head-of-line: [(3922, []), (3926,
[]), (3940, [])]" every cycle. Reaper closed those three on schedule, then a
new triple of empty-eval_issues PRs took their place. Reaper-as-primary-clearance
worked but is defense-in-depth, not the right architecture. Source of the block
is upstream in this SELECT.

Implementation choice: json_each + EXISTS over LIKE. Robust against tag-name
substring overlap, future-proof against tag renames, and SQLite 3.45.1 on VPS
fully supports it. Verified live: returns 13 of 28 currently-stuck PRs as
actionable, 15 fall through to reaper as before.

Tag list builds from the routing constants at runtime so adding a new tag
auto-updates the SELECT filter — no two-place edit footgun.

WARN-on-corrupt-JSON path retained as defense-in-depth (json_each and
json.loads use different parsers; technically possible for a row to pass one
but not the other).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 12:52:12 -04:00
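A rough sketch of the SQL-side filter in 5db6a0248c follows, showing how the tag list can be built from the routing constants at runtime. The tag values are hypothetical (only the three constant names appear in the commit), the schema is a toy, and ordering by PR number stands in for "oldest first".

```python
import sqlite3

# Hypothetical tag values; the real ones live in the routing constants.
FIXABLE_TAGS = {"fixable_example"}
CONVERTIBLE_TAGS = {"convertible_example"}
UNFIXABLE_TAGS = {"unfixable_example"}


def build_query(limit: int = 3):
    """Build the SELECT and its parameters from the tag constants."""
    tags = sorted(FIXABLE_TAGS | CONVERTIBLE_TAGS | UNFIXABLE_TAGS)
    placeholders = ", ".join("?" for _ in tags)
    sql = f"""
        SELECT number FROM prs
        WHERE EXISTS (SELECT 1 FROM json_each(prs.eval_issues)
                      WHERE value IN ({placeholders}))
        ORDER BY number
        LIMIT {int(limit)}
    """
    return sql, tags


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prs (number INTEGER, eval_issues TEXT)")
conn.executemany("INSERT INTO prs VALUES (?, ?)", [
    (3922, "[]"), (3926, "[]"), (3940, "[]"),   # head-of-line rows from the log
    (3950, '["fixable_example"]'),
    (3951, '["unrelated"]'),
    (3952, '["unfixable_example", "other"]'),
])
sql, params = build_query()
rows = [r[0] for r in conn.execute(sql, params)]
print(rows)
```

Because the filter runs before `LIMIT`, the three empty-`eval_issues` rows no longer occupy the slots, and adding a tag to any constant changes the query without a second edit.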
4b2b59b184 fix(reaper): branch allowlist for disposable pipeline-managed branches
Apply Ganymede review nit #3 from f97dd15 review (the deferred close_on_forgejo
fix already landed in e14b5f2 — Ganymede was reviewing the older commit).

SQL gate previously had no branch filter — empirically all 92 candidates were
extract/* but structurally any agent branch in the deadlock shape was a
candidate. Positive allowlist for extract/, reweave/, fix/ scopes the reaper
to disposable pipeline-managed branches that the pipeline created and can
recreate. Agent branches (theseus/, vida/, epimetheus/, etc.) are WIP feature
work and must not be reaped — owners review their own PRs on their own cadence.

Cheap target-class lock complementing the LIMIT 50 blast-radius cap.
Same scoping principle as PIPELINE_OWNED_PREFIXES, but tighter — epimetheus/
review branches are pipeline-owned for merge purposes but NOT disposable.

Items 2-4 from this review:
- WARNING #2 (audit_log idx_audit_event_ts): defer to followup branch alongside
  sync-mirror migration cleanup, as Ganymede suggested.
- NIT #3 (this commit): branch allowlist applied.
- NIT #4 (token asymmetry comment=admin/close=leo): confirmed established
  codebase pattern. merge.py:946-948 does the same — comment system-toned,
  close attributed to Leo for verdict-source UI clarity. Not accidental.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 23:43:53 -04:00
ba234ec4b3 fix(reaper): apply Ganymede review — dual-PATCH drift, breaker isolation, env config
Followup to f97dd15. Four fixes from review:

MUST-FIX #1 — Forgejo double-PATCH drift
  reaper closes PR via forgejo_api PATCH at line 689, then close_pr() at
  line 700 issued a second PATCH (default close_on_forgejo=True). On
  transient failure of the second PATCH, close_pr returns False without
  updating the DB → status='open' even though Forgejo is closed. Pass
  close_on_forgejo=False so DB close is unconditional after the explicit
  Forgejo PATCH succeeds.

MUST-FIX #2 — reaper exception trips fix breaker
  Unhandled exception in verdict_deadlock_reaper_cycle propagated to
  stage_loop, recording fix-stage failures. After 5 reaper failures the
  fix breaker would open and block mechanical+substantive for 15 min.
  Wrap reaper call in try/except in fix_cycle (same exception-isolation
  pattern as ingest_cycle's extract_cycle wrapper). Defense-in-depth
  must never block primary paths.

WARNING #1 — throttle SQL full-scan
  audit_log only has idx_audit_stage. Filtering on event alone caused
  full-table scans every 60s. Added stage='reaper' so the planner uses
  the existing index — reaper writes audit rows under stage='reaper'
  already so the filter is correct.

WARNING #2 — REAPER_DRY_RUN as code constant
  Flipping dry-run → live required edit + commit + push + deploy +
  restart. Moved REAPER_DRY_RUN, REAPER_DEADLOCK_AGE_HOURS,
  REAPER_INTERVAL_SECONDS, REAPER_MAX_PER_RUN to lib/config.py with
  os.environ.get() overrides. Operator now flips via systemctl edit
  teleo-pipeline.service (Environment=REAPER_DRY_RUN=false) + restart.
  Defaults remain safe: dry-run, 24h age, hourly throttle, 50/run cap.

NIT — dry-run counter naming
  Renamed local `closed` counter in dry-run path to `would_close` so the
  heartbeat audit ("X closed, Y would-close") and journal log are
  unambiguous. Function still returns closed + would_close so callers
  see total work done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 23:43:53 -04:00
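The environment-override approach in WARNING #2 of ba234ec4b3 could look roughly like this. The four variable names and their safe defaults come from the commit; the parsing helpers, the accepted truthy strings, and the dict return shape are assumptions rather than the contents of lib/config.py.

```python
import os


def _env_bool(name: str, default: bool) -> bool:
    """Read a boolean override; the accepted spellings are an assumption."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def _env_int(name: str, default: int) -> int:
    raw = os.environ.get(name)
    return int(raw) if raw is not None else default


def load_reaper_config() -> dict:
    # Defaults stay safe: dry-run, 24h age, hourly throttle, 50 per run.
    return {
        "dry_run": _env_bool("REAPER_DRY_RUN", True),
        "deadlock_age_hours": _env_int("REAPER_DEADLOCK_AGE_HOURS", 24),
        "interval_seconds": _env_int("REAPER_INTERVAL_SECONDS", 3600),
        "max_per_run": _env_int("REAPER_MAX_PER_RUN", 50),
    }
```

With this layout, setting `Environment=REAPER_DRY_RUN=false` in the service unit and restarting is enough to go live, with no code change.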