- Encode transcript requirements for model discovery and Pentagon boundary
- Add KB read/propose skill for Hermes, OpenClaw, and Claude-style agents
- Extend LLM contract checks; verify with 422-test suite
`.agents/skills/living-ip-kb-interop/SKILL.md`
`.agents/skills/nousresearch-hermes-agent/SKILL.md`
`.agents/skills/openclaw-agent/SKILL.md`
`docs/llm-refinement-decision-engine.md`
`scripts/check_llm_refinement_contract.py`
- Define Rio and Theseus as economics and model-integrity evaluators
- Add DB, Hermes, and OpenClaw skills with no-secret defaults
- Gate CI on LLM refinement contracts; verify with 422-test suite
`.agents/skills/decision-engine-refinement/SKILL.md`
`.agents/skills/nousresearch-hermes-agent/SKILL.md`
`.agents/skills/openclaw-agent/SKILL.md`
`.agents/skills/teleo-db-operator/SKILL.md`
`.crabbox.yaml`
`.github/workflows/ci.yml`
`docs/llm-refinement-decision-engine.md`
`scripts/check_llm_refinement_contract.py`
Phase 1 Step 3 — migrate research-session.sh and pipeline-health-check.py off Forgejo onto GitHub living-ip/decision-engine. eval-dispatcher.sh / eval-worker.sh documented as dead code (replaced by daemon).
Previously _github_pr_url() only returned a URL when prs.github_pr was
populated. That field is set on only 3 of 4094 merged PRs (the rare cases
mirrored to the public GitHub repo), so pr_url was null for ~100% of the
feed. The frontend whole-row PR overlay (livingip-web PR #30) renders
only when pr_url is non-null, so until now no rows had the overlay.
Pipeline-attributed events (reweave/*, ingestion/*) are the most visible
victim: their /contributors/pipeline link lands on a sparse stub, with
no way to reach the actual commit/PR they refer to.
Fix: rename _github_pr_url -> _pr_url and fall back to the canonical
Forgejo URL (git.livingip.xyz/teleo/teleo-codex/pulls/{number}) when no
GitHub mirror exists. Verified 200 OK against a sample (#10568). GitHub
URL still wins when available.
Result: 1972/1972 events in _build_events now carry a pr_url. Whole-row
overlay starts working for everything including pipeline events.
reweave.py and ingestion run as the operator Forgejo token, so the prior
opener-based classifier set submitted_by=m3taversal for every system
maintenance PR. backfill_submitted_by.py never overrides non-NULL rows,
so this misattribution accumulated: ~2,748 reweave/ingestion PRs and
~3,706 <agent>/ research/entity PRs were credited to the operator on
the leaderboard and contribution_events table.
Two parts:
1. lib/merge.py: at PR discovery, classify by branch prefix first.
reweave/, ingestion/ -> submitted_by = 'pipeline'
<agent>/ (per _AGENT_NAMES) -> submitted_by = '<agent>'
otherwise human -> submitted_by = author.lower()
otherwise pipeline -> submitted_by = None
(extract.py sets from proposed_by)
Origin flag updated so domain detection and priority still fire for
branch-classified pipeline PRs. Human PRs lowercased to maintain the
canonical-handle contract enforced in PR #9.
2. scripts/reattribute-by-branch-prefix.py: historical cleanup.
Per affected PR (atomic):
- UPDATE prs.submitted_by -> target
- UPDATE sources.submitted_by where source_path matches
- UPDATE contribution_events handle ('m3taversal',role='author')
-> target, kind='agent'. Collision (target already has author
event for PR) deletes the m3ta row; target wins.
Scope is deliberately conservative: extract/ branches stay attributed
to m3taversal because proposed_by-missing legitimately defaults to the
operator (telegram drops). Only reweave/, ingestion/, and <agent>/.
Dry-run shows 6,454 PRs + 284 events to move. Pre-flight collision
query returns 0; pre-flight kind check confirms m3ta has only role=author
events on this set (no challenger/synthesizer/evaluator).
Idempotent. Dry-run by default. Run with --apply after deploy + DB
snapshot.
Companion / write-side fix to fix/activity-feed-canonical-handle.
The activity-feed canonicalization was a read-side guard. The bug at the
source is that extract.py and two backfill scripts write decorated
strings (Vida (self-directed), pipeline (reweave), @m3taversal) into
prs.submitted_by and sources.submitted_by. Downstream readers
(lib.contributor.insert_contribution_event, scripts/scoring_digest,
diagnostics/activity_feed_api) all strip the decorator on read — but
anything that reads the column verbatim (like /api/activity-feed before
the read-side fix) 404s on /contributors/{decorated-handle}.
Stop writing the decorator. The self-directed signal is already carried
by intake_tier == research-task plus the prs.agent column; the suffix
is redundant string noise that costs us correctness at every consumer
that forgets to strip.
Changes:
- lib/extract.py:690 — write canonical handle via attribution.normalize_handle.
Direct elif for intake_tier == research-task now stores just agent_name.
@m3taversal -> m3taversal.
- diagnostics/backfill_submitted_by.py — same fix in two branches plus
the reweave branch (pipeline (reweave) -> pipeline).
- scripts/backfill-research-session-attribution.py — UPDATE prs sets
agent handle alone, no suffix. Docstring + log line updated.
- scripts/normalize-submitted-by.py (new) — one-time backfill that
canonicalizes existing prs.submitted_by and sources.submitted_by rows.
Strips trailing parenthetical decorators, lowercases, drops @. Defaults
to dry-run; --apply to commit. Skips rows that would normalize to
invalid handles (no garbage falls through silently).
Dry-run against live pipeline.db:
prs: 3008 rows need normalization (clean mappings, 0 invalid)
sources: 730 rows need normalization (clean mappings, 0 invalid)
Total: 3738 rows. All map to existing handle column values.
After this lands + auto-deploys, the operator should run
python3 scripts/normalize-submitted-by.py --apply
once to clean historical rows. The read-side canonicalization in
diagnostics/activity_feed_api.py (fix/activity-feed-canonical-handle)
becomes redundant defense-in-depth instead of load-bearing.
No KB writes.
The activity feed was returning decorated strings like "Vida (self-directed)"
and "@m3taversal" in the contributor field. The frontend uses that field as
both display label and routing handle, so /contributors/Vida%20(self-directed)
404s — Next fires notFound() in [handle]/page.tsx.
Root cause: _normalize_contributor only stripped @ and whitespace; it did not
lowercase or strip the " (self-directed)" suffix that extract.py and the
older backfill_submitted_by.py wrote into prs.submitted_by. Mixed-case
agent names (Vida vs vida) and pipeline decorators ("pipeline (reweave)")
both fell through.
Fix: lowercase + strip any trailing parenthetical decorator. Valid handles
match ^[a-z0-9][a-z0-9_-]{0,38}$ per attribution._HANDLE_RE and cannot
contain parens, so the strip is lossless.
DB simulation against 3612 merged-PR events: 0 orphan handles after
normalization (was 12 orphan label-variants before).
No KB writes — pure read-side normalization in the API layer.
The /api/activity-feed event shape didn't give the frontend a reliable
clickability signal. Two failure modes:
1. Source-archive events (extract/* PRs that filed a paper into
inbox/archive/ but didn't extract a claim) returned claim_slug="".
Frontend rendered <Link href="/claims/"> which Next normalized to
/claims and redirected to /knowledge-base. Wrong page.
2. Research/entity session commits (e.g. astra/research-2026-05-11)
with empty descriptions fell through to "create" classification with
a pseudo-slug like research-2026-05-11. Frontend rendered
/claims/research-2026-05-11 -> 404.
Fix:
- Add `kind` enum (canonical): claim_merged | claim_enriched |
claim_challenged | source_archived | session_digest. Replaces the
internal `type` for downstream consumers; `type` kept populated for
in-flight callers during migration.
- Add `target_url`: explicit clickability signal. Frontend renders
<Link> when non-null, <span> when null. No special-casing needed.
* claim_* events -> /claims/{slug}
* source_archived -> Forgejo blob URL at inbox/archive/{domain}/{slug}.md
* session_digest -> null (no clickthrough surface yet)
- Detect research/entity commits with empty descriptions as
session_digest in _classify_event, instead of synthesizing a phantom
create event with a date-shaped pseudo-slug.
- type filter accepts both legacy `type` and new `kind` values so
callers migrate at their own pace.
Verified live: source events resolve to inbox/archive/{domain}/...
Forgejo URLs, session-digest rows return target_url=null,
claim_merged events keep /claims/{slug} unchanged.
Two issues Ship hit on the Montreal Protocol claim:
1. 500 on canonical stem lookup. File starts with ```markdown wrapper
instead of bare --- frontmatter delimiter. _split_frontmatter checked
startswith("---") and bailed, returning "frontmatter parse failed".
Same wrapper exists on 6 other claim files (audit grep). Now strip
the wrapper before frontmatter detection.
2. 404 on long activity-feed slug. Same root cause — _build_indexes
couldn't read the file's title from frontmatter, so by_title never
indexed it, so title-fallback resolution had nothing to match against.
Both bugs collapse once we unwrap.
Also: switched "file exists but has no frontmatter" from 500 to 404 with
reason=file_no_frontmatter. These are stray enrichment fragments living
in domains/ that never got merged into a parent claim. From the API
caller's perspective there's no claim at that slug — 500 implied
"server bug, retry later" which isn't actionable.
Verified: 3/3 wrapped claims (montreal, medicare, dod) now return 200
warm-cache ~13ms. Long-slug repro (montreal) resolves via title fallback
to canonical stem. Negative test (nonsense slug) still 404.
Activity feed emits slugs derived from PR description (the slugified claim
title), which can be longer than the on-disk file stem (agents pick shorter
hand-chosen filenames). Pure exact-stem lookup 404s on those.
Three-tier resolution in handle_claim_detail:
1. Exact stem match (existing behavior)
2. Title fallback: normalize requested slug, look up via by_title index
(already populated from frontmatter title during _build_indexes)
3. Prefix fallback: longest common prefix among stems, anchored at 32 chars
to prevent spurious hits
Response slug returns the canonical on-disk stem so frontend share-links
and caches converge to one form.
Repro: GET /api/claims/spacex-and-amazon-kuiper-non-endorsement-of-wef-debris-
guidelines-demonstrates-systemic-voluntary-governance-failure-at-the-scale-
where-it-matters-most was 404; now 200, returns shorter on-disk slug
'...-governance-failure'. Negative case (nonsense slug) still 404s.
Reported by Ship — Cory-facing demo path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply Ganymede review of 50b888a:
MUST-FIX — pattern %/research-2% was broader than the comment claimed.
Matched anything/research-2[anything] including agent-named branches like
theseus/research-2nd-attempt-on-X or vida/research-2024-revisited. The
documented invariant said "date suffix only" but the SQL didn't enforce
it. Defense-in-depth was the framing; pattern needed to match the
framing.
Fix uses SQLite `_` single-char wildcards: research-20__-__-__ requires
exactly research-20[2-char][-][2-char][-][2-char], i.e. literal
YYYY-MM-DD shape. Threads the needle:
- theseus/research-2026-04-30 ✓ (catches all 15 currently stuck)
- rio/research-2099-12-31 ✓ (good through 2099)
- theseus/research-2nd-attempt ✗ (correctly excluded)
- vida/research-2024-revisited ✗ (correctly excluded — no -MM-DD shape)
- rio/research-batch-agents-... ✗ (no date prefix at all)
NIT — comment said "Three classes qualify" then listed four. Off-by-one
fixed; comment now correctly says "Four classes."
Pre-deploy verified: tighter pattern catches all 15 currently-stuck
research PRs (clay/leo/astra/theseus/vida/rio research-2026-{04-28
through 05-02}). Zero false-positive risk on current branch namespace.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply Step 1 of stuck-PR triage. The May 7 reaper allowlist (extract/,
reweave/, fix/) deliberately excluded all agent-prefix branches per
Ganymede's review nit #3 — the rationale being that agent branches are
WIP feature work owned by the agent and shouldn't be auto-closed.
That decision was correct for theseus/feature-foo style branches.
It's wrong for {agent}/research-{YYYY-MM-DD} branches: those are daily
cron output, categorically disposable, regenerated by tomorrow's session.
Same shape as extract/ — content the pipeline-cron created and can
recreate, not feature work owned by the agent.
Production impact: 15 of 16 currently-stuck PRs are research-session
verdict-deadlocks aged 8-12 days. Without this change they sit forever
because the substantive_fixer can't classify (eval_issues=[] or
mechanical-only) and the reaper allowlist excludes them. Once live, next
hourly reaper cycle picks them up under the standard 24h-deadlock gate.
Pattern choice: %/research-2* (date-suffix) over %/research-% (loose).
Verified 15/15 stuck PRs match the tight pattern; sanity-check found
rio/research-batch-agents-memory-harnesses (manually-named, not date-
suffixed) which the loose pattern would catch and the tight pattern
correctly excludes. Closed-status today, but a future hand-named research
thesis branch sitting in request_changes for 24h would have been at risk.
The date prefix '2' threads the needle until 2030 and ages naturally.
Documented as an allowlist invariant ("disposable pipeline-generated
branches") rather than a list, per Step 3 of the plan — future additions
should match the invariant or update it explicitly.
Verified live before pushing:
- 15/15 currently stuck research PRs match the new pattern
- Zero false positives on existing branch namespace (closed branches
excluded by status='open' guard regardless)
- Existing extract/ reweave/ fix/ allowlist members unchanged
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements Ship's claim detail contract — one round-trip, all data
resolved server-side. Replaces thin domain-only stub with full tree walk
(domains/ + foundations/ + core/), DB joins for PRs and reviews, and
server-side wikilink resolution to eliminate frontend N+1 cascades.
Response shape (Ship brief 2026-04-29):
slug, title, domain, secondary_domains, confidence, description,
created, last_review, body (raw markdown), sourced_from, reviews,
prs, edges {supports,challenges,related,depends_on}, wikilinks
Wikilink resolution:
- Builds title→stem index from frontmatter title field, fallback to
filename stem normalized via _normalize_for_match
- Returns flat {link_text: slug_or_null} map; unresolved → null so
frontend can render plain text
- Inline normalization (lowercase, hyphen↔space, collapse whitespace,
strip punctuation). Note: lib/attribution.py exposes only
normalize_handle today, not the title normalizer Ship referenced.
If a canonical helper lands later, point at it.
Caches:
- title→slug index: 60s TTL (warm cache <20ms p50 verified)
- list endpoint: 5min TTL (preserved from prior)
- Cold: ~3.3s for tree walk of 1,866 files; warm: 13-17ms
Bug fixed in second pass:
- _resolve_sourced_from defaulted title="" which leaked LIKE '%%'
matching every PR. Now requires non-empty title+stem; handler falls
back to slug.replace("-"," ") when frontmatter title is missing.
Verified live on VPS:
- AI diagnostic triage claim (no fm.title): sourced_from=1, prs=0
(correct — Feb claim, pre-description-tracking)
- Recent extract PR claim: sourced_from=1 with URL, prs=1, reviews=1,
last_review populated, edges 3 supports + 7 related, wikilinks 0
- 404 on missing slug: correct
- Claim with [[maps/...]] wikilink: 5/6 resolved (correct null on map)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ganymede review of 5db6a02 (msg 2 of 3): json_each(invalid_json) throws
'malformed JSON' and propagates up through EXISTS, failing the SELECT.
The fix-cycle call site at teleo-pipeline.py:104 isn't try/except wrapped
(the reaper at line 109-116 is, the substantive cycle isn't), so a single
corrupt eval_issues row would trip the fix-stage breaker after 5 occurrences.
Fix is one line — AND json_valid(eval_issues) before the EXISTS clause.
json_valid(NULL) returns NULL (false in WHERE), json_valid(invalid) returns 0,
json_valid(valid) returns 1. SQLite 3.9+, predates VPS 3.45.1.
WARN-on-corrupt-JSON path kept per Ganymede's Q3 — json_valid and json.loads
use technically distinct parsers, cost is ~3 rows × parse-empty-string per
cycle, journal entry names the failure mode if SQLite ever surfaces a row
that passes both SQL guards but fails json.loads.
Comment updated to reflect new guard ordering.
Step 4 of the stuck-PR triage. Push the FIXABLE/CONVERTIBLE/UNFIXABLE_TAGS
intersection from a post-fetch Python loop into the SELECT WHERE clause via
json_each + EXISTS. LIMIT 3 now always returns 3 actionable rows (or fewer if
that's all there are), eliminating the head-of-line block where 3 oldest
empty-eval_issues PRs occupied the slots forever.
Background: 11 hours of post-deploy logs showed substantive_fix_cycle stuck
emitting "0 actionable from 3 candidate(s) — head-of-line: [(3922, []), (3926,
[]), (3940, [])]" every cycle. Reaper closed those three on schedule, then a
new triple of empty-eval_issues PRs took their place. Reaper-as-primary-clearance
worked but is defense-in-depth, not the right architecture. Source of the block
is upstream in this SELECT.
Implementation choice: json_each + EXISTS over LIKE. Robust against tag-name
substring overlap, future-proof against tag renames, and SQLite 3.45.1 on VPS
fully supports it. Verified live: returns 13 of 28 currently-stuck PRs as
actionable, 15 fall through to reaper as before.
Tag list builds from the routing constants at runtime so adding a new tag
auto-updates the SELECT filter — no two-place edit footgun.
WARN-on-corrupt-JSON path retained as defense-in-depth (json_each and
json.loads use different parsers; technically possible for a row to pass one
but not the other).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply Ganymede review nit #3 from f97dd15 review (the deferred close_on_forgejo
fix already landed in e14b5f2 — Ganymede was reviewing the older commit).
SQL gate previously had no branch filter — empirically all 92 candidates were
extract/* but structurally any agent branch in the deadlock shape was a
candidate. Positive allowlist for extract/, reweave/, fix/ scopes the reaper
to disposable pipeline-managed branches that the pipeline created and can
recreate. Agent branches (theseus/, vida/, epimetheus/, etc.) are WIP feature
work and must not be reaped — owners review their own PRs on their own cadence.
Cheap target-class lock complementing the LIMIT 50 blast-radius cap.
Same scoping principle as PIPELINE_OWNED_PREFIXES, but tighter — epimetheus/
review branches are pipeline-owned for merge purposes but NOT disposable.
Items 2-4 from this review:
- WARNING #2 (audit_log idx_audit_event_ts): defer to followup branch alongside
sync-mirror migration cleanup, as Ganymede suggested.
- NIT #3 (this commit): branch allowlist applied.
- NIT #4 (token asymmetry comment=admin/close=leo): confirmed established
codebase pattern. merge.py:946-948 does the same — comment system-toned,
close attributed to Leo for verdict-source UI clarity. Not accidental.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Followup to f97dd15. Four fixes from review:
MUST-FIX #1 — Forgejo double-PATCH drift
reaper closes PR via forgejo_api PATCH at line 689, then close_pr() at
line 700 issued a second PATCH (default close_on_forgejo=True). On
transient failure of the second PATCH, close_pr returns False without
updating the DB → status='open' even though Forgejo is closed. Pass
close_on_forgejo=False so DB close is unconditional after the explicit
Forgejo PATCH succeeds.
MUST-FIX #2 — reaper exception trips fix breaker
Unhandled exception in verdict_deadlock_reaper_cycle propagated to
stage_loop, recording fix-stage failures. After 5 reaper failures the
fix breaker would open and block mechanical+substantive for 15 min.
Wrap reaper call in try/except in fix_cycle (same exception-isolation
pattern as ingest_cycle's extract_cycle wrapper). Defense-in-depth
must never block primary paths.
WARNING #1 — throttle SQL full-scan
audit_log only has idx_audit_stage. Filtering on event alone caused
full-table scans every 60s. Added stage='reaper' so the planner uses
the existing index — reaper writes audit rows under stage='reaper'
already so the filter is correct.
WARNING #2 — REAPER_DRY_RUN as code constant
Flipping dry-run → live required edit + commit + push + deploy +
restart. Moved REAPER_DRY_RUN, REAPER_DEADLOCK_AGE_HOURS,
REAPER_INTERVAL_SECONDS, REAPER_MAX_PER_RUN to lib/config.py with
os.environ.get() overrides. Operator now flips via systemctl edit
teleo-pipeline.service (Environment=REAPER_DRY_RUN=false) + restart.
Defaults remain safe: dry-run, 24h age, hourly throttle, 50/run cap.
NIT — dry-run counter naming
Renamed local `closed` counter in dry-run path to `would_close` so the
heartbeat audit ("X closed, Y would-close") and journal log are
unambiguous. Function still returns closed + would_close so callers
see total work done.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Defense-in-depth for PRs that substantive_fixer can't make progress on.
Targets two stuck-verdict shapes empirically observed in production:
1. leo:request_changes + domain:approve
Leo asked for substantive fix; fixer either failed silently
(no_claim_files / no_review_comments / etc.) or the issue tag isn't
in FIXABLE | CONVERTIBLE | UNFIXABLE.
2. leo:skipped + domain:request_changes
Eval bypassed Leo (eval_attempts >= MAX). Domain rejected with no
structured eval_issues. fixer can't classify the issue.
92 PRs match this gate today, oldest at 2026-04-24 (13d stuck).
Behavior:
- Hourly throttle via audit_log sentinel ('verdict_deadlock_reaper_run').
- REAPER_DRY_RUN=True default — first deploy emits 'would_close' audit
events only. No DB writes. No Forgejo writes. (Ship Apr 24 directive.)
- 24h cooldown, oldest-first, capped at 50 per run.
- Heartbeat audit fires whether dry-run or live, so throttle works.
- Live mode: posts comment + closes Forgejo PR + close_pr() in DB.
Audits 'verdict_deadlock_closed' per PR.
- Forgejo PATCH None → skip DB close (avoid drift).
Wired into fix_cycle() in teleo-pipeline.py. Runs after mechanical
and substantive fixes, never blocks them.
Followup (post first-run audit verification):
- Operator inspects 'verdict_deadlock_would_close' audit rows
- Flips REAPER_DRY_RUN to False, redeploys
- Reaper actually closes on next hourly tick
Third silent return path in substantive_fix_cycle — JSON-decode except
at the eval_issues parse drops rows that don't reach skipped_no_tags
or substantive_rows. If all 3 LIMIT-3 candidates have corrupt JSON,
cycle returns 0,0 with no log entry.
WARN level (not INFO): corrupt JSON is abnormal (post-merge column
drift, hand-edited DB row, partial write during crash). If this fires,
ops want to chase the upstream column-write path. If it never fires,
baseline noise stays at zero.
Closes the visibility gap on ALL silent returns in this function, not
just the two patched in 3f8666e.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two silent paths in substantive_fix_cycle masked a 13-day stall:
1. Filter strips all candidates → return 0,0 with no log. With LIMIT 3
ordered created_at ASC, if the oldest 3 have no fixer-actionable tags
(e.g. eval_issues=[] from leo:skipped+domain:request_changes), the
cycle silently picks the same head-of-line every tick.
2. _fix_pr early-returns logged at DEBUG only — invisible without
fleet-wide DEBUG. Skip reasons (no_claim_files, no_review_comments,
not_open lock, worktree_failed, etc.) never surfaced in journalctl.
Patch: log skipped candidate eval_issues when no actionable rows
found (path 1); promote DEBUG→INFO for per-PR skip reasons (path 2).
Zero behavior change — observability only.
Diagnosis context: 98 PRs stuck >3d, last successful substantive_fixer
event 2026-04-24. Need journal evidence to choose between (a) one-line
fix to the cycle, (b) larger _fix_pr regression. (Ship Step 2 directive.)
Per Ganymede review: silent fall-through with no log entry is the
failure mode that bites. SELECT redirects stderr to $LOG, falls back
to empty string on failure. INSERT wrapped in if-not branch with WARN
log naming the (branch, sha, pr_number) so duplicate auto-create
possibility is visible.
Matches the Step 0/0b/4.5 observability pattern from prior reviews.
Behavior unchanged on the success path; failures now greppable.
Diagnosis (per Ganymede pushback): the original mechanism story was wrong.
Vida and Leo show 100+ PRs at 0 merge failures — luck doesn't produce
that. Real cause is sync-mirror's auto-create loop, not session spawning.
Verified data:
- vida/research-2026-04-30: 1 commit on branch, 303 PRs in DB
- reweave/2026-04-29: 1 commit on branch, 840 PRs in DB
- Cron fires once/day per agent; reweave fires once/day at 01:00 UTC
- Forgejo currently has 0 PRs for vida (all merged/closed); 3 distinct
SHAs total across reweave's history (PRs replay same SHA repeatedly)
Mechanism (confirmed in /opt/teleo-eval/logs/sync.log):
1. Pipeline merges PR → calls _delete_remote_branch on Forgejo
2. Next sync cycle: git fetch forgejo --prune drops the local Forgejo
ref; refs/remotes/origin still has it (GitHub copy untouched)
3. comm sees branch GitHub-only → re-pushes to Forgejo at original SHA
4. HAS_PR check uses ?state=closed&limit=50 — closed PR for this branch
scrolled out of pagination window long ago → returns "no"
5. Auto-create POST → fresh Forgejo PR (e.g. #7295 created at 21:46 for
branch SHA from 04:12)
6. Pipeline merges (cherry-pick is empty no-op since content's on main;
reweave union produces "already up to date" via the empty-diff guard
shipped in 923454c) → _delete_remote_branch → loop
Fix (per Ganymede design point #2: "right place is discovery, not
_claim_next_pr"): SHA-based tracker in pipeline.db. Records (branch, sha)
after every successful auto-create. Subsequent cycles see the same
(branch, sha) → skip the entire push+create sequence. Cheap O(1) sqlite
lookup per branch per cycle.
Why SHA, not branch: research-session.sh and nightly-reweave.sh both use
--force push, so a branch can legitimately get new commits over time.
Tracker keys on SHA so genuine new commits produce a tracker miss → PR
creation proceeds normally. No regression on legitimate branch reuse.
Why pipeline.db, not flat file: shared with discover_external_prs +
audit_log + the agent's own tooling; survives sync-mirror restarts;
ACID-safe under the cron's 2-min cadence. CREATE IF NOT EXISTS is
inline (no migration needed) because this table is private to
sync-mirror — pipeline daemon doesn't read it.
Validated against /tmp/pipeline-test.db copy: gate fires on known
(branch, sha), misses on new SHA (correctly allows new content).
Defense-in-depth — leaves existing HAS_PR check in place. Tracker is
the durable signal; HAS_PR is best-effort and may catch cases the
tracker hasn't seen yet (e.g. PR created out-of-band).
Reweave numbers (Ganymede point #3): same shape, same fix. Both research
and reweave loops killed by the same gate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two nits from Ganymede review of ed4af4d:
1. Archive-basename filter depends on basename-uniqueness across queue+archive.
Current naming (date-prefix + topic-slug) makes collisions rare, but if
short generic names like "notes.md" enter the queue, the filter silently
false-positives. Comment block names the assumption.
2. Archive walk now skips _-prefixed files, matching the standing convention
everywhere else (search.py STRUCTURAL_FILES, reweave wiki-link skip, Layer
0 entity exclusion). Defensive — no _*.md exists under inbox/archive/
today, but consistent with codebase convention if a future operator drops
_README.md to document the directory.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Daemon re-extracted same source every ~4h cycle when research-session
commits on agent branches re-introduced already-archived queue files.
Existing daemon filters (DB-status, open-PR, 4h cooldown) all missed
this pattern because the queue file gets a fresh sources row at
status='unprocessed' on each re-add, the cooldown lapses exactly at
the cycle interval, and the open-PR filter only catches in-flight
extractions.
Add an archive-basename filter immediately after the queue scan: if
a file with this basename exists anywhere under inbox/archive/, skip.
Archive copy is the source of truth — once extracted, the queue copy
is stale by definition.
Validation against pipeline.db (last 7d):
78 sources had multiple extract PRs (32% duplicate rate)
73/78 (94%) carry an archive copy and would have been caught.
Current queue: 35/99 sources (35%) have archive duplicates today.
Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
Two nits from Ganymede line-level review of 7741c1e:
1. Comment at lines 562-565 said --force-with-lease but code is plain
--force. Comment now describes the actual behavior: bot-owned per-PR
audit ref, intentional overwrite on stale refs from prior aborted
attempts, no concurrent writer to lease against.
2. Sentinel-regex extraction in _merge_domain_queue dispatch had no
graceful-failure log. If the _merge_no_ff_external success-message
contract drifts and any of the three regexes (M, audit_ref, external
PR #) miss, dispatch silently builds a comment with None values and
writes audit_log JSON with null fields. Added a warning log when any
regex misses — signal-only, doesn't gate the close path since the
merge already succeeded.
Branch: epimetheus/external-merge-flow-bug1
Parent: 7741c1e (Ship Msg 3 architecture review close)
Diff: +11/-3, single file lib/merge.py
Ganymede: 3-message protocol Msg 3 (nits applied, ball returned).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 review fix#1 (architectural pushback): replace force-push of
contributor's gh-pr-N/* branch with a three-step synthetic-branch flow:
1. Worktree on local branch _merged-{slug} from origin/main
2. git merge --no-ff origin/{branch} into the local branch
3. Push merge commit to origin/_merged/{branch} (synthetic audit ref)
4. Function ff-pushes merge_sha → origin/main directly
Contributor's gh-pr-N/* branch on Forgejo is now never touched.
Force-pushing it would have rewritten the tip with a merge commit the
contributor didn't author — confusing bot force-push in Forgejo PR UI.
Mirrors the _clean/* synthetic branch pattern in cherry-pick.
Function now owns the push to main (was dispatch's job for cherry-pick
and reweave). Returns sentinel "merged --no-ff (external PR #N, M=<sha>,
audit_ref=...)" that dispatch detects to skip its ff-push and route
directly to PR-close + mark_merged + audit. Audit detail JSON now
includes merge_commit_sha + audit_ref + github_pr (Ship review #5).
Smoke-tested in scratch repo end-to-end:
- contributor branch tip unchanged ✓
- audit ref _merged/gh-pr-90/... carries merge SHA ✓
- main tip equals merge SHA (ff-push, no force) ✓
- contributor SHA ancestor of main → GitHub badge fires ✓
Sentinel return parsed via 3 regexes in dispatch (full 40-char SHA in
return string for durability). Branch comment in dispatch explicitly
notes contributor branch is left in place — sync-mirror keeps the
GitHub PR <-> Forgejo PR link observable through it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
External GitHub fork PRs need their contributor commit SHA in main's history
for GitHub's "merged" badge to fire. Cherry-pick rewrites the SHA, breaking
that detection. New _merge_no_ff_external function preserves the SHA via a
true merge commit.
Mechanics (mirrors _cherry_pick_onto_main shape):
1. Fetch origin/main + origin/{branch}
2. Detached worktree at origin/main, git merge --no-ff origin/{branch}
with verbose message: "Merge external GitHub PR #{N}: {branch_slug}"
3. Force-push merge commit M as origin/{branch}, replacing branch tip
4. Dispatch's existing ff-push origin/{branch} → main propagates M to main
M has parents [main_sha, contributor_sha]. M is a fast-forward descendant
of main_sha (first-parent chain), so the ff-push to main is valid without
--force. Contributor SHA reachable from main → GitHub recognizes merged.
Conflict handling: same auto-resolve as cherry-pick — entity-only conflicts
take main's version (--ours = current worktree HEAD = main), other conflicts
abort with detail.
Backout: config.EXTERNAL_PR_NO_FF_MERGE = True (default). Set False to fall
back to cherry-pick if no-ff destabilizes throughput one week pre-Accelerate.
Branch dispatch in _merge_domain_queue:
- reweave/* → _merge_reweave_pr (existing)
- gh-pr-N/* AND config.EXTERNAL_PR_NO_FF_MERGE → _merge_no_ff_external (new)
- everything else → _cherry_pick_onto_main (existing default)
Verified end-to-end in scratch repo:
- merge commit M has [main_sha, contributor_sha] as parents
- contributor SHA is ancestor of M
- after ff-push, contributor SHA is in main's history (GitHub badge fires)
- regex parses 8 cases correctly (real fork PR + edge cases reject cleanly)
Architecture per Ship Msg 3 / doc v3 (537cfd5 on epimetheus/external-merge-flow-design).
Phase 1 (sync-mirror self-heal) deployed yesterday. Phase 3 (FwazB PR #90 cleanup)
queued behind this deploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ganymede review nit on commit 1eb259d:
- Regex changed from [0-9]* (zero-or-more) to [0-9][0-9]* (one-or-more,
portable BRE form of [0-9]+ that works on both GNU and BSD sed).
- Empty/non-numeric branches now fail at parse, not just at the empty-guard
below — SQL-integer-safety load-bearing on the regex alone.
- Comment above the UPDATE notes the integer-validation invariants
(INTEGER `number` column + regex-validated gh_pr_num) since bash sqlite3
has no parametric binding.
Smoke tested: gh-pr-/foo, gh-pr-abc/foo no longer parse to non-empty.
gh-pr-90/main, gh-pr-4066/contrib/x, gh-pr-1/x all parse correctly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 0 (new): runs once per cron tick before per-repo work. Selects PR rows
where branch matches gh-pr-% but github_pr IS NULL, parses the PR number
from the branch name, and updates github_pr + source_channel='github'.
Recovers from races and transient failures in the existing Step 4.5 link
UPDATE — no retry path before. The sweep IS the backfill: same SELECT/UPDATE
heals historical orphans (FwazB PR 4066 picked up on first cron tick) AND
future races on subsequent ticks. No separate one-shot script needed.
Properties:
- Idempotent: SELECT empty when clean, zero work
- No API calls: branch name encodes the GitHub PR number deterministically
- Bounded log volume: one line per actually-healed row
- Runs before any sync_repo work, ahead of branch-mirror loop and the
auto-create-PR block in Step 4 — same-cycle convergence on fresh races
Closes the Bug #2 path that left FwazB's PR 4066 with github_pr=NULL,
preventing on_merged() from posting comment + closing the GitHub PR.
Verified end-to-end on live DB snapshot:
- before: 4066 had github_pr=NULL
- after sweep: 4066 has github_pr=90, source_channel='github'
- second run: zero output (idempotent)
Phase 1 of docs/external-contributor-merge-flow.md (v2, sweep-only).
Ship architecturally approved Msg 2/2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the directory-listing format with one that explains what the
pipeline does and shows production scale. Verified all numbers against
production (1,546 claims, 13 domains, 1,975 merged PRs, 508 last-7d
throughput, 94% approval, ~\$0.10/merged claim incl. all stages).
Removes the VPS layout section (IP + paths + username) per Epimetheus
review — that detail moves to the private teleo-ops repo. Generalizes
deploy targets without naming the host.
Adds two Mermaid diagrams (pipeline flow + review tier matrix), both
syntactically safe across GitHub and Forgejo 9 / Gitea 1.22.
Drops the per-directory ownership table — CODEOWNERS is the single
source of truth on review authority. Keeps the high-level role map
for orientation only.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Initial setup-infra-mirror.sh did `git push origin --all`, which contradicted
the main_only mode protection landed in b9c4947 — agent review branches
(epimetheus/*, ganymede/*) ended up publicly visible on the new GitHub
teleo-infrastructure mirror until I deleted them.
Initial push now mirrors the recurring sync's main_only path: refs/heads/main
+ tags only. Re-running the setup script is now idempotent at branch level —
won't redo the agent-branch leak.
Cleanup applied to live GitHub teleo-infrastructure: 18 stale agent review
branches deleted via single batched push (epimetheus/* x14, ganymede/* x3,
ship/metadao-scraper). Only main remains. Codex bidirectional mirror unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Finding #1 (recommendation, applied): infra-mode now pushes only main + tags
to GitHub. Agent review branches (epimetheus/*, ganymede/*) stay Forgejo-only.
Public GitHub history reflects merged work, not pre-review WIP with internal
agent context.
Bidirectional mode unchanged — codex still mirrors all branches so external
contributors can fork from any branch.
Nit #4: setup script m3taversal username has a comment explaining it's a
placeholder for fine-grained PAT auth, mirrors the existing teleo-codex remote.
Two pre-existing nits filed for follow-up branch:
- hardcoded `living-ip:` in GH_PR_NUM head filter (line 273)
- spurious CRITICAL log on GH→forgejo→GH cycles (re-fetch forgejo after Step 2.5)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps the per-repo body in sync_repo() and loops over MIRROR_REPOS at the
bottom. teleo-codex stays bidirectional (full PR roundtrip + pipeline.db
linking). teleo-infrastructure runs main_only: branch+tag sync Forgejo→
GitHub, ff-only GitHub→Forgejo on main, divergence alerting per-repo.
Steps 2.1 (fork PR refs) and 4 (Forgejo PR auto-create + DB link) gated
on MODE=bidirectional.
Setup script (deploy/setup-infra-mirror.sh) initializes the bare repo at
/opt/teleo-eval/mirror/teleo-infrastructure.git, configures remotes,
performs initial Forgejo→GitHub push. Idempotent. Pre-flight checks both
GitHub repo (must be created manually first — fine-grained PAT can't
create repos in the org) and Forgejo repo are accessible.
Per-repo divergence state file (.divergence-count.<repo>) so each repo
has independent counter + alert state. Also pulls in the source_channel
update from Apr 6 that lived only on VPS (line 215 added 'github').
Not deployed yet — pending Ganymede review and GitHub repo creation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>