Commit graph

230 commits

Author SHA1 Message Date
Teleo Agents
c9515c770a fix(attribution): classify submitted_by by branch prefix at PR discovery
reweave.py and ingestion run under the operator's Forgejo token, so the prior
opener-based classifier set submitted_by=m3taversal for every system
maintenance PR. backfill_submitted_by.py never overrides non-NULL rows,
so this misattribution accumulated: ~2,748 reweave/ingestion PRs and
~3,706 <agent>/ research/entity PRs were credited to the operator on
the leaderboard and contribution_events table.

Two parts:

1. lib/merge.py: at PR discovery, classify by branch prefix first
   (sketch at the end of this message).
     reweave/, ingestion/             -> submitted_by = 'pipeline'
     <agent>/ (per _AGENT_NAMES)      -> submitted_by = '<agent>'
     otherwise, human origin          -> submitted_by = author.lower()
     otherwise, pipeline origin       -> submitted_by = None
                                         (extract.py sets from proposed_by)
   Origin flag updated so domain detection and priority still fire for
   branch-classified pipeline PRs. Human PRs lowercased to maintain the
   canonical-handle contract enforced in PR #9.

2. scripts/reattribute-by-branch-prefix.py: historical cleanup.
   Per affected PR (atomic):
     - UPDATE prs.submitted_by  -> target
     - UPDATE sources.submitted_by where source_path matches
     - UPDATE contribution_events handle ('m3taversal', role='author')
       -> target, kind='agent'. Collision (target already has author
       event for PR) deletes the m3ta row; target wins.

   Scope is deliberately conservative: extract/ branches stay attributed
   to m3taversal because PRs missing proposed_by legitimately default to
   the operator (Telegram drops). Only reweave/, ingestion/, and <agent>/
   branches are touched.

   Dry-run shows 6,454 PRs + 284 events to move. Pre-flight collision
   query returns 0; pre-flight kind check confirms m3ta has only role=author
   events on this set (no challenger/synthesizer/evaluator).

   Idempotent. Dry-run by default. Run with --apply after deploy + DB
   snapshot.
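
A minimal sketch of the part-1 classifier (names like _AGENT_NAMES come
from this message; how the human-vs-pipeline fallthrough is detected is
an assumption — the real code lives in lib/merge.py):

    # Sketch, not the shipped implementation.
    _AGENT_NAMES = {"leo", "rio", "clay", "astra", "vida", "theseus"}  # assumed roster

    def classify_submitted_by(branch: str, author: str,
                              pipeline_opener: bool) -> str | None:
        prefix = branch.split("/", 1)[0]
        if prefix in ("reweave", "ingestion"):
            return "pipeline"            # system-maintenance PRs
        if prefix in _AGENT_NAMES:
            return prefix                # e.g. "vida/research-..." -> "vida"
        if not pipeline_opener:
            return author.lower()        # canonical-handle contract (PR #9)
        return None                      # extract.py fills from proposed_by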
2026-05-13 03:49:10 +00:00
3dca3aab5f Merge pull request 'docs: rewrite public README' (#8) from ship/readme-public-rewrite into main
Reviewed-on: #8
2026-05-13 03:20:02 +00:00
2ee9dd5150 Merge pull request 'fix(activity-feed): canonicalize contributor handle so profile links resolve' (#9) from fix/activity-feed-canonical-handle into main
Reviewed-on: #9
2026-05-13 03:19:41 +00:00
b29ec95dd8 Merge pull request 'fix(attribution): canonicalize submitted_by at write time + historical normalizer' (#10) from fix/canonicalize-submitted-by into main
Reviewed-on: #10
2026-05-13 03:19:27 +00:00
Teleo Agents
74bf0461e8 fix(attribution): canonicalize submitted_by at write time + historical normalizer
Companion / write-side fix to fix/activity-feed-canonical-handle.

The activity-feed canonicalization was a read-side guard. The bug at the
source is that extract.py and two backfill scripts write decorated
strings (Vida (self-directed), pipeline (reweave), @m3taversal) into
prs.submitted_by and sources.submitted_by. Downstream readers
(lib.contributor.insert_contribution_event, scripts/scoring_digest,
diagnostics/activity_feed_api) all strip the decorator on read — but
anything that reads the column verbatim (like /api/activity-feed before
the read-side fix) 404s on /contributors/{decorated-handle}.

Stop writing the decorator. The self-directed signal is already carried
by intake_tier == research-task plus the prs.agent column; the suffix
is redundant string noise that costs us correctness at every consumer
that forgets to strip.

Changes:

- lib/extract.py:690 — write canonical handle via attribution.normalize_handle.
  Direct elif for intake_tier == research-task now stores just agent_name.
  @m3taversal -> m3taversal.

- diagnostics/backfill_submitted_by.py — same fix in two branches plus
  the reweave branch (pipeline (reweave) -> pipeline).

- scripts/backfill-research-session-attribution.py — UPDATE prs sets
  agent handle alone, no suffix. Docstring + log line updated.

- scripts/normalize-submitted-by.py (new) — one-time backfill that
  canonicalizes existing prs.submitted_by and sources.submitted_by rows.
  Strips trailing parenthetical decorators, lowercases, drops @. Defaults
  to dry-run; --apply to commit. Skips rows that would normalize to
  invalid handles (no garbage falls through silently).
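
A minimal sketch of the per-row canonicalization (the helper name is
hypothetical; the handle regex is the one enforced by lib/attribution.py):

    import re

    _HANDLE_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,38}$")

    def canonicalize(raw: str) -> str | None:
        h = re.sub(r"\s*\([^)]*\)\s*$", "", raw)   # "Vida (self-directed)" -> "Vida"
        h = h.strip().lstrip("@").lower()          # "@m3taversal" -> "m3taversal"
        return h if _HANDLE_RE.match(h) else None  # invalid -> skipped, never silent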

Dry-run against live pipeline.db:
  prs:     3008 rows need normalization (clean mappings, 0 invalid)
  sources: 730 rows need normalization (clean mappings, 0 invalid)
  Total:   3738 rows. All map to existing handle column values.

After this lands + auto-deploys, the operator should run
  python3 scripts/normalize-submitted-by.py --apply
once to clean historical rows. The read-side canonicalization in
diagnostics/activity_feed_api.py (fix/activity-feed-canonical-handle)
becomes redundant defense-in-depth instead of load-bearing.

No KB writes.
2026-05-13 02:56:50 +00:00
Teleo Agents
01097da22c fix(activity-feed): canonicalize contributor handle so profile links resolve
The activity feed was returning decorated strings like "Vida (self-directed)"
and "@m3taversal" in the contributor field. The frontend uses that field as
both display label and routing handle, so /contributors/Vida%20(self-directed)
404s — Next fires notFound() in [handle]/page.tsx.

Root cause: _normalize_contributor only stripped @ and whitespace; it did not
lowercase or strip the " (self-directed)" suffix that extract.py and the
older backfill_submitted_by.py wrote into prs.submitted_by. Mixed-case
agent names (Vida vs vida) and pipeline decorators ("pipeline (reweave)")
both fell through.

Fix: lowercase + strip any trailing parenthetical decorator. Valid handles
match ^[a-z0-9][a-z0-9_-]{0,38}$ per attribution._HANDLE_RE and cannot
contain parens, so the strip is lossless.

DB simulation against 3612 merged-PR events: 0 orphan handles after
normalization (was 12 orphan label-variants before).

No KB writes — pure read-side normalization in the API layer.
2026-05-13 02:39:18 +00:00
6c66da33e4 feat(activity-feed): add pr_url field for GitHub PR clickthrough
2026-05-11 20:58:36 -04:00
c3f2010a42 feat(activity-feed): add kind + target_url, fix research-session pseudo-slugs
The /api/activity-feed event shape didn't give the frontend a reliable
clickability signal. Two failure modes:

1. Source-archive events (extract/* PRs that filed a paper into
   inbox/archive/ but didn't extract a claim) returned claim_slug="".
   Frontend rendered <Link href="/claims/"> which Next normalized to
   /claims and redirected to /knowledge-base. Wrong page.

2. Research/entity session commits (e.g. astra/research-2026-05-11)
   with empty descriptions fell through to "create" classification with
   a pseudo-slug like research-2026-05-11. Frontend rendered
   /claims/research-2026-05-11 -> 404.

Fix:

- Add `kind` enum (canonical): claim_merged | claim_enriched |
  claim_challenged | source_archived | session_digest. Replaces the
  internal `type` for downstream consumers; `type` kept populated for
  in-flight callers during migration.

- Add `target_url`: explicit clickability signal. Frontend renders
  <Link> when non-null, <span> when null. No special-casing needed;
  see the sketch after this list.
    * claim_* events -> /claims/{slug}
    * source_archived -> Forgejo blob URL at inbox/archive/{domain}/{slug}.md
    * session_digest -> null (no clickthrough surface yet)

- Detect research/entity commits with empty descriptions as
  session_digest in _classify_event, instead of synthesizing a phantom
  create event with a date-shaped pseudo-slug.

- type filter accepts both legacy `type` and new `kind` values so
  callers migrate at their own pace.
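
A minimal sketch of the mapping (FORGEJO_BASE and the blob-URL shape are
hypothetical; the contract is only that non-null means clickable):

    FORGEJO_BASE = "https://forgejo.example/teleo/teleo-codex"  # hypothetical URL

    def target_url(kind: str, slug: str, domain: str) -> str | None:
        if kind in ("claim_merged", "claim_enriched", "claim_challenged"):
            return f"/claims/{slug}"
        if kind == "source_archived":
            return f"{FORGEJO_BASE}/inbox/archive/{domain}/{slug}.md"
        return None  # session_digest: no clickthrough surface yet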

Verified live: source events resolve to inbox/archive/{domain}/...
Forgejo URLs, session-digest rows return target_url=null,
claim_merged events keep /claims/{slug} unchanged.
2026-05-11 12:36:25 +01:00
ed4893e837 fix(claims): unwrap ```markdown code fences + 404 for fragments
Two issues Ship hit on the Montreal Protocol claim:

1. 500 on canonical stem lookup. File starts with ```markdown wrapper
   instead of bare --- frontmatter delimiter. _split_frontmatter checked
   startswith("---") and bailed, returning "frontmatter parse failed".
   Same wrapper exists on 6 other claim files (audit grep). Now strip
   the wrapper before frontmatter detection.

2. 404 on long activity-feed slug. Same root cause — _build_indexes
   couldn't read the file's title from frontmatter, so by_title never
   indexed it, so title-fallback resolution had nothing to match against.
   Both bugs collapse once we unwrap.
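
A minimal sketch of the unwrap step, assuming the wrapper is a leading
```markdown fence with a matching trailing fence:

    def unwrap_markdown_fence(text: str) -> str:
        stripped = text.lstrip()
        if not stripped.startswith("```"):
            return text
        lines = stripped.splitlines()
        if lines and lines[-1].strip() == "```":
            lines = lines[1:-1]   # drop opening ```markdown and closing ```
        else:
            lines = lines[1:]     # unterminated fence: drop the opener only
        return "\n".join(lines)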

Also: switched "file exists but has no frontmatter" from 500 to 404 with
reason=file_no_frontmatter. These are stray enrichment fragments living
in domains/ that never got merged into a parent claim. From the API
caller's perspective there's no claim at that slug — 500 implied
"server bug, retry later" which isn't actionable.

Verified: 3/3 wrapped claims (montreal, medicare, dod) now return 200
warm-cache ~13ms. Long-slug repro (montreal) resolves via title fallback
to canonical stem. Negative test (nonsense slug) still 404.
2026-05-11 12:02:54 +01:00
73880e138d fix(claims): resolve long activity-feed slugs to canonical file stems
Activity feed emits slugs derived from PR description (the slugified claim
title), which can be longer than the on-disk file stem (agents pick shorter
hand-chosen filenames). Pure exact-stem lookup 404s on those.

Three-tier resolution in handle_claim_detail:
1. Exact stem match (existing behavior)
2. Title fallback: normalize requested slug, look up via by_title index
   (already populated from frontmatter title during _build_indexes)
3. Prefix fallback: longest common prefix among stems, anchored at 32 chars
   to prevent spurious hits

Response slug returns the canonical on-disk stem so frontend share-links
and caches converge to one form.
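
A minimal sketch of the three-tier lookup (the 32-char anchoring detail is
an assumption about where the threshold applies; _normalize_for_match
stands in for the module's real normalizer):

    _normalize_for_match = lambda s: " ".join(s.lower().replace("-", " ").split())

    def resolve_slug(requested: str, stems: set, by_title: dict) -> str | None:
        if requested in stems:                          # 1. exact stem match
            return requested
        hit = by_title.get(_normalize_for_match(requested))
        if hit:                                         # 2. title fallback
            return hit
        prefixed = [s for s in stems                    # 3. prefix fallback
                    if requested.startswith(s) and len(s) >= 32]
        if prefixed:
            return max(prefixed, key=len)               # longest prefix wins
        return None                                     # caller 404s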

Repro: GET /api/claims/spacex-and-amazon-kuiper-non-endorsement-of-wef-debris-
guidelines-demonstrates-systemic-voluntary-governance-failure-at-the-scale-
where-it-matters-most was 404; now 200, returns shorter on-disk slug
'...-governance-failure'. Negative case (nonsense slug) still 404s.

Reported by Ship — Cory-facing demo path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:51:41 +01:00
1bc541ac93 fix(reaper): tighten research-session pattern to literal YYYY-MM-DD shape
Apply Ganymede review of 50b888a:

MUST-FIX — pattern %/research-2% was broader than the comment claimed.
Matched anything/research-2[anything] including agent-named branches like
theseus/research-2nd-attempt-on-X or vida/research-2024-revisited. The
documented invariant said "date suffix only" but the SQL didn't enforce
it. Defense-in-depth was the framing; pattern needed to match the
framing.

Fix uses SQLite `_` single-char wildcards: research-20__-__-__ requires
exactly research-20[2-char][-][2-char][-][2-char], i.e. literal
YYYY-MM-DD shape. Threads the needle:
  - theseus/research-2026-04-30  ✓ (catches all 15 currently stuck)
  - rio/research-2099-12-31      ✓ (good through 2099)
  - theseus/research-2nd-attempt ✗ (correctly excluded)
  - vida/research-2024-revisited ✗ (correctly excluded — no -MM-DD shape)
  - rio/research-batch-agents-... ✗ (no date prefix at all)
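
The shape can be checked directly against the examples above (a sketch;
only the LIKE pattern is from this commit):

    import sqlite3
    con = sqlite3.connect(":memory:")
    pat = "%/research-20__-__-__"      # `_` matches exactly one character
    for b in ("theseus/research-2026-04-30",
              "theseus/research-2nd-attempt",
              "vida/research-2024-revisited"):
        ok, = con.execute("SELECT ? LIKE ?", (b, pat)).fetchone()
        print(b, bool(ok))             # True, False, False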

NIT — comment said "Three classes qualify" then listed four. Off-by-one
fixed; comment now correctly says "Four classes."

Pre-deploy verified: tighter pattern catches all 15 currently-stuck
research PRs (clay/leo/astra/theseus/vida/rio research-2026-{04-28
through 05-02}). Zero false-positive risk on current branch namespace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:10:49 +01:00
50b888a751 fix(reaper): extend allowlist to */research-2* daily-cron sessions
Apply Step 1 of stuck-PR triage. The May 7 reaper allowlist (extract/,
reweave/, fix/) deliberately excluded all agent-prefix branches per
Ganymede's review nit #3 — the rationale being that agent branches are
WIP feature work owned by the agent and shouldn't be auto-closed.

That decision was correct for theseus/feature-foo style branches.
It's wrong for {agent}/research-{YYYY-MM-DD} branches: those are daily
cron output, categorically disposable, regenerated by tomorrow's session.
Same shape as extract/ — content the pipeline-cron created and can
recreate, not feature work owned by the agent.

Production impact: 15 of 16 currently-stuck PRs are research-session
verdict-deadlocks aged 8-12 days. Without this change they sit forever
because the substantive_fixer can't classify (eval_issues=[] or
mechanical-only) and the reaper allowlist excludes them. Once live, next
hourly reaper cycle picks them up under the standard 24h-deadlock gate.

Pattern choice: %/research-2% (date-suffix) over %/research-% (loose).
Verified 15/15 stuck PRs match the tight pattern; sanity-check found
rio/research-batch-agents-memory-harnesses (manually-named, not date-
suffixed) which the loose pattern would catch and the tight pattern
correctly excludes. Closed-status today, but a future hand-named research
thesis branch sitting in request_changes for 24h would have been at risk.
The date prefix '2' threads the needle until 2030 and ages naturally.

Documented as an allowlist invariant ("disposable pipeline-generated
branches") rather than a list, per Step 3 of the plan — future additions
should match the invariant or update it explicitly.

Verified live before pushing:
- 15/15 currently stuck research PRs match the new pattern
- Zero false positives on existing branch namespace (closed branches
  excluded by status='open' guard regardless)
- Existing extract/ reweave/ fix/ allowlist members unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:00:48 +01:00
0eb26327fc feat(claims): /api/claims/{slug} canonical detail endpoint
Implements Ship's claim detail contract — one round-trip, all data
resolved server-side. Replaces thin domain-only stub with full tree walk
(domains/ + foundations/ + core/), DB joins for PRs and reviews, and
server-side wikilink resolution to eliminate frontend N+1 cascades.

Response shape (Ship brief 2026-04-29):
  slug, title, domain, secondary_domains, confidence, description,
  created, last_review, body (raw markdown), sourced_from, reviews,
  prs, edges {supports,challenges,related,depends_on}, wikilinks

Wikilink resolution:
- Builds title→stem index from frontmatter title field, fallback to
  filename stem normalized via _normalize_for_match
- Returns flat {link_text: slug_or_null} map; unresolved → null so
  frontend can render plain text
- Inline normalization (lowercase, hyphen↔space, collapse whitespace,
  strip punctuation). Note: lib/attribution.py exposes only
  normalize_handle today, not the title normalizer Ship referenced.
  If a canonical helper lands later, point at it.

Caches:
- title→slug index: 60s TTL (warm cache <20ms p50 verified)
- list endpoint: 5min TTL (preserved from prior)
- Cold: ~3.3s for tree walk of 1,866 files; warm: 13-17ms
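
A minimal sketch of the TTL shape (cache structure is an assumption;
_build_indexes is the module's real tree walk):

    import time

    _TITLE_INDEX_TTL = 60  # seconds; the list endpoint keeps its 5min TTL
    _cache = {"built_at": 0.0, "index": None}

    def _build_indexes():
        ...  # the ~3.3s cold walk over 1,866 files

    def get_title_index():
        now = time.monotonic()
        if _cache["index"] is None or now - _cache["built_at"] > _TITLE_INDEX_TTL:
            _cache["index"] = _build_indexes()   # cold path
            _cache["built_at"] = now
        return _cache["index"]                   # warm path: 13-17ms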

Bug fixed in second pass:
- _resolve_sourced_from defaulted title="" which leaked LIKE '%%'
  matching every PR. Now requires non-empty title+stem; handler falls
  back to slug.replace("-"," ") when frontmatter title is missing.

Verified live on VPS:
- AI diagnostic triage claim (no fm.title): sourced_from=1, prs=0
  (correct — Feb claim, pre-description-tracking)
- Recent extract PR claim: sourced_from=1 with URL, prs=1, reviews=1,
  last_review populated, edges 3 supports + 7 related, wikilinks 0
- 404 on missing slug: correct
- Claim with [[maps/...]] wikilink: 5/6 resolved (correct null on map)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 17:37:26 +01:00
fc002354d4 fix(substantive_fixer): json_valid guard in front of json_each
Ganymede review of 5db6a02 (msg 2 of 3): json_each(invalid_json) throws
'malformed JSON' and propagates up through EXISTS, failing the SELECT.
The fix-cycle call site at teleo-pipeline.py:104 isn't try/except wrapped
(the reaper at line 109-116 is, the substantive cycle isn't), so a single
corrupt eval_issues row would trip the fix-stage breaker after 5 occurrences.

The fix is one line — AND json_valid(eval_issues) in front of the EXISTS
clause. json_valid(NULL) returns NULL (false in WHERE), json_valid(invalid)
returns 0, json_valid(valid) returns 1. json_valid has been in SQLite since
3.9, long predating the VPS's 3.45.1.
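
The guarded fragment, sketched (tag names per 5db6a02 below; the
surrounding SELECT is assumed):

    # WHERE-clause fragment with json_valid in front of json_each.
    ACTIONABLE_CLAUSE = """
        AND json_valid(eval_issues)          -- NULL -> NULL (false), invalid -> 0
        AND EXISTS (
            SELECT 1 FROM json_each(eval_issues)
            WHERE json_each.value IN ({actionable_tags})
        )
    """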

WARN-on-corrupt-JSON path kept per Ganymede's Q3 — json_valid and json.loads
use technically distinct parsers, cost is ~3 rows × parse-empty-string per
cycle, journal entry names the failure mode if SQLite ever surfaces a row
that passes both SQL guards but fails json.loads.

Comment updated to reflect new guard ordering.
2026-05-08 13:12:25 -04:00
5db6a0248c fix(substantive_fixer): SQL-side actionable-tag filter, eliminate head-of-line
Step 4 of the stuck-PR triage. Push the FIXABLE/CONVERTIBLE/UNFIXABLE_TAGS
intersection from a post-fetch Python loop into the SELECT WHERE clause via
json_each + EXISTS. LIMIT 3 now always returns 3 actionable rows (or fewer if
that's all there are), eliminating the head-of-line block where 3 oldest
empty-eval_issues PRs occupied the slots forever.

Background: 11 hours of post-deploy logs showed substantive_fix_cycle stuck
emitting "0 actionable from 3 candidate(s) — head-of-line: [(3922, []), (3926,
[]), (3940, [])]" every cycle. Reaper closed those three on schedule, then a
new triple of empty-eval_issues PRs took their place. Reaper-as-primary-clearance
worked but is defense-in-depth, not the right architecture. Source of the block
is upstream in this SELECT.

Implementation choice: json_each + EXISTS over LIKE. Robust against tag-name
substring overlap, future-proof against tag renames, and SQLite 3.45.1 on VPS
fully supports it. Verified live: returns 13 of 28 currently-stuck PRs as
actionable, 15 fall through to reaper as before.

Tag list builds from the routing constants at runtime so adding a new tag
auto-updates the SELECT filter — no two-place edit footgun.

WARN-on-corrupt-JSON path retained as defense-in-depth (json_each and
json.loads use different parsers; technically possible for a row to pass one
but not the other).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 12:52:12 -04:00
4b2b59b184 fix(reaper): branch allowlist for disposable pipeline-managed branches
Apply Ganymede review nit #3 from f97dd15 review (the deferred close_on_forgejo
fix already landed in e14b5f2 — Ganymede was reviewing the older commit).

SQL gate previously had no branch filter — empirically all 92 candidates were
extract/* but structurally any agent branch in the deadlock shape was a
candidate. Positive allowlist for extract/, reweave/, fix/ scopes the reaper
to disposable pipeline-managed branches that the pipeline created and can
recreate. Agent branches (theseus/, vida/, epimetheus/, etc.) are WIP feature
work and must not be reaped — owners review their own PRs on their own cadence.

Cheap target-class lock complementing the LIMIT 50 blast-radius cap.
Same scoping principle as PIPELINE_OWNED_PREFIXES, but tighter — epimetheus/
review branches are pipeline-owned for merge purposes but NOT disposable.

Items 2-4 from this review:
- WARNING #2 (audit_log idx_audit_event_ts): defer to followup branch alongside
  sync-mirror migration cleanup, as Ganymede suggested.
- NIT #3 (this commit): branch allowlist applied.
- NIT #4 (token asymmetry comment=admin/close=leo): confirmed established
  codebase pattern. merge.py:946-948 does the same — comment system-toned,
  close attributed to Leo for verdict-source UI clarity. Not accidental.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 23:43:53 -04:00
ba234ec4b3 fix(reaper): apply Ganymede review — dual-PATCH drift, breaker isolation, env config
Followup to f97dd15. Four fixes from review:

MUST-FIX #1 — Forgejo double-PATCH drift
  the reaper closed the PR via forgejo_api PATCH at line 689, then close_pr()
  at line 700 issued a second PATCH (default close_on_forgejo=True). On
  transient failure of the second PATCH, close_pr returns False without
  updating the DB → status='open' even though Forgejo is closed. Pass
  close_on_forgejo=False so DB close is unconditional after the explicit
  Forgejo PATCH succeeds.

MUST-FIX #2 — reaper exception trips fix breaker
  Unhandled exception in verdict_deadlock_reaper_cycle propagated to
  stage_loop, recording fix-stage failures. After 5 reaper failures the
  fix breaker would open and block mechanical+substantive for 15 min.
  Wrap reaper call in try/except in fix_cycle (same exception-isolation
  pattern as ingest_cycle's extract_cycle wrapper). Defense-in-depth
  must never block primary paths.

WARNING #1 — throttle SQL full-scan
  audit_log only has idx_audit_stage. Filtering on event alone caused
  full-table scans every 60s. Added stage='reaper' so the planner uses
  the existing index — reaper writes audit rows under stage='reaper'
  already so the filter is correct.

WARNING #2 — REAPER_DRY_RUN as code constant
  Flipping dry-run → live required edit + commit + push + deploy +
  restart. Moved REAPER_DRY_RUN, REAPER_DEADLOCK_AGE_HOURS,
  REAPER_INTERVAL_SECONDS, REAPER_MAX_PER_RUN to lib/config.py with
  os.environ.get() overrides. Operator now flips via systemctl edit
  teleo-pipeline.service (Environment=REAPER_DRY_RUN=false) + restart.
  Defaults remain safe: dry-run, 24h age, hourly throttle, 50/run cap.
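
The lib/config.py shape, sketched (defaults quoted from this message; the
boolean parsing is an assumption):

    import os

    REAPER_DRY_RUN = os.environ.get("REAPER_DRY_RUN", "true").lower() != "false"
    REAPER_DEADLOCK_AGE_HOURS = int(os.environ.get("REAPER_DEADLOCK_AGE_HOURS", "24"))
    REAPER_INTERVAL_SECONDS = int(os.environ.get("REAPER_INTERVAL_SECONDS", "3600"))
    REAPER_MAX_PER_RUN = int(os.environ.get("REAPER_MAX_PER_RUN", "50"))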

NIT — dry-run counter naming
  Renamed local `closed` counter in dry-run path to `would_close` so the
  heartbeat audit ("X closed, Y would-close") and journal log are
  unambiguous. Function still returns closed + would_close so callers
  see total work done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 23:43:53 -04:00
e63d27d259 fix(reaper): verdict-deadlock reaper — close stuck PRs after 24h
Defense-in-depth for PRs that substantive_fixer can't make progress on.
Targets two stuck-verdict shapes empirically observed in production:

  1. leo:request_changes + domain:approve
     Leo asked for substantive fix; fixer either failed silently
     (no_claim_files / no_review_comments / etc.) or the issue tag isn't
     in FIXABLE | CONVERTIBLE | UNFIXABLE.

  2. leo:skipped + domain:request_changes
     Eval bypassed Leo (eval_attempts >= MAX). Domain rejected with no
     structured eval_issues. fixer can't classify the issue.

92 PRs match this gate today, oldest at 2026-04-24 (13d stuck).

Behavior:
  - Hourly throttle via audit_log sentinel ('verdict_deadlock_reaper_run').
  - REAPER_DRY_RUN=True default — first deploy emits 'would_close' audit
    events only. No DB writes. No Forgejo writes. (Ship Apr 24 directive.)
  - 24h cooldown, oldest-first, capped at 50 per run.
  - Heartbeat audit fires whether dry-run or live, so throttle works.
  - Live mode: posts comment + closes Forgejo PR + close_pr() in DB.
    Audits 'verdict_deadlock_closed' per PR.
  - Forgejo PATCH returning None → skip DB close (avoids drift).

Wired into fix_cycle() in teleo-pipeline.py. Runs after mechanical
and substantive fixes, never blocks them.

Followup (post first-run audit verification):
  - Operator inspects 'verdict_deadlock_would_close' audit rows
  - Flips REAPER_DRY_RUN to False, redeploys
  - Reaper actually closes on next hourly tick
2026-05-07 23:43:53 -04:00
517e9884cc fix(substantive_fixer): WARN on corrupt eval_issues JSON
Third silent return path in substantive_fix_cycle — JSON-decode except
at the eval_issues parse drops rows that don't reach skipped_no_tags
or substantive_rows. If all 3 LIMIT-3 candidates have corrupt JSON,
cycle returns 0,0 with no log entry.

WARN level (not INFO): corrupt JSON is abnormal (post-merge column
drift, hand-edited DB row, partial write during crash). If this fires,
ops want to chase the upstream column-write path. If it never fires,
baseline noise stays at zero.

Closes the visibility gap on ALL silent returns in this function, not
just the two patched in 3f8666e.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:33:08 -04:00
3f8666ee0c fix(substantive_fixer): surface silent-skip reasons at INFO
Two silent paths in substantive_fix_cycle masked a 13-day stall:

1. Filter strips all candidates → return 0,0 with no log. With LIMIT 3
   ordered created_at ASC, if the oldest 3 have no fixer-actionable tags
   (e.g. eval_issues=[] from leo:skipped+domain:request_changes), the
   cycle silently picks the same head-of-line every tick.

2. _fix_pr early-returns logged at DEBUG only — invisible without
   fleet-wide DEBUG. Skip reasons (no_claim_files, no_review_comments,
   not_open lock, worktree_failed, etc.) never surfaced in journalctl.

Patch: log skipped candidate eval_issues when no actionable rows
found (path 1); promote DEBUG→INFO for per-PR skip reasons (path 2).
Zero behavior change — observability only.

Diagnosis context: 98 PRs stuck >3d, last successful substantive_fixer
event 2026-04-24. Need journal evidence to choose between (a) one-line
fix to the cycle, (b) larger _fix_pr regression. (Ship Step 2 directive.)
2026-05-07 11:58:22 -04:00
87f97eb4fa sync-mirror: surface tracker SELECT/INSERT failures to ops log
Per Ganymede review: silent fall-through with no log entry is the
failure mode that bites. SELECT redirects stderr to $LOG, falls back
to empty string on failure. INSERT wrapped in if-not branch with WARN
log naming the (branch, sha, pr_number) so duplicate auto-create
possibility is visible.

Matches the Step 0/0b/4.5 observability pattern from prior reviews.
Behavior unchanged on the success path; failures now greppable.
2026-05-01 15:48:28 +01:00
ad1d82f5ee fix(sync-mirror): tracker gate to break empty auto-create loop
Diagnosis (per Ganymede pushback): the original mechanism story was wrong.
Vida and Leo show 100+ PRs at 0 merge failures — luck doesn't produce
that. Real cause is sync-mirror's auto-create loop, not session spawning.

Verified data:
- vida/research-2026-04-30: 1 commit on branch, 303 PRs in DB
- reweave/2026-04-29: 1 commit on branch, 840 PRs in DB
- Cron fires once/day per agent; reweave fires once/day at 01:00 UTC
- Forgejo currently has 0 PRs for vida (all merged/closed); 3 distinct
  SHAs total across reweave's history (PRs replay same SHA repeatedly)

Mechanism (confirmed in /opt/teleo-eval/logs/sync.log):
1. Pipeline merges PR → calls _delete_remote_branch on Forgejo
2. Next sync cycle: git fetch forgejo --prune drops the local Forgejo
   ref; refs/remotes/origin still has it (GitHub copy untouched)
3. comm sees branch GitHub-only → re-pushes to Forgejo at original SHA
4. HAS_PR check uses ?state=closed&limit=50 — closed PR for this branch
   scrolled out of pagination window long ago → returns "no"
5. Auto-create POST → fresh Forgejo PR (e.g. #7295 created at 21:46 for
   branch SHA from 04:12)
6. Pipeline merges (cherry-pick is empty no-op since content's on main;
   reweave union produces "already up to date" via the empty-diff guard
   shipped in 923454c) → _delete_remote_branch → loop

Fix (per Ganymede design point #2: "right place is discovery, not
_claim_next_pr"): SHA-based tracker in pipeline.db. Records (branch, sha)
after every successful auto-create. Subsequent cycles see the same
(branch, sha) → skip the entire push+create sequence. Cheap O(1) sqlite
lookup per branch per cycle.
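
sync-mirror.sh itself is bash + sqlite3; a Python sketch of the same gate
logic (table and column names are assumptions):

    import sqlite3

    def should_create_pr(db: sqlite3.Connection, branch: str, sha: str) -> bool:
        db.execute("""CREATE TABLE IF NOT EXISTS sync_mirror_tracker (
                          branch TEXT NOT NULL, sha TEXT NOT NULL,
                          PRIMARY KEY (branch, sha))""")
        hit = db.execute("SELECT 1 FROM sync_mirror_tracker "
                         "WHERE branch = ? AND sha = ?", (branch, sha)).fetchone()
        return hit is None   # miss -> push + auto-create proceeds

    def record_created(db: sqlite3.Connection, branch: str, sha: str) -> None:
        db.execute("INSERT OR IGNORE INTO sync_mirror_tracker VALUES (?, ?)",
                   (branch, sha))
        db.commit()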

Why SHA, not branch: research-session.sh and nightly-reweave.sh both use
--force push, so a branch can legitimately get new commits over time.
Tracker keys on SHA so genuine new commits produce a tracker miss → PR
creation proceeds normally. No regression on legitimate branch reuse.

Why pipeline.db, not flat file: shared with discover_external_prs +
audit_log + the agent's own tooling; survives sync-mirror restarts;
ACID-safe under the cron's 2-min cadence. CREATE IF NOT EXISTS is
inline (no migration needed) because this table is private to
sync-mirror — pipeline daemon doesn't read it.

Validated against /tmp/pipeline-test.db copy: gate fires on known
(branch, sha), misses on new SHA (correctly allows new content).

Defense-in-depth — leaves existing HAS_PR check in place. Tracker is
the durable signal; HAS_PR is best-effort and may catch cases the
tracker hasn't seen yet (e.g. PR created out-of-band).

Reweave numbers (Ganymede point #3): same shape, same fix. Both research
and reweave loops killed by the same gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:42:47 +01:00
923454c9ea extract: document basename-uniqueness invariant + skip _-prefixed archive files
Two nits from Ganymede review of ed4af4d:

1. Archive-basename filter depends on basename-uniqueness across queue+archive.
   Current naming (date-prefix + topic-slug) makes collisions rare, but if
   short generic names like "notes.md" enter the queue, the filter silently
   false-positives. Comment block names the assumption.

2. Archive walk now skips _-prefixed files, matching the standing convention
   everywhere else (search.py STRUCTURAL_FILES, reweave wiki-link skip, Layer
   0 entity exclusion). Defensive — no _*.md exists under inbox/archive/
   today, but consistent with codebase convention if a future operator drops
   _README.md to document the directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:09:19 +01:00
ed4af4d72e fix(extract): dedup queue sources whose basename is already in archive
The daemon re-extracted the same source every ~4h cycle when research-session
commits on agent branches re-introduced already-archived queue files.
Existing daemon filters (DB-status, open-PR, 4h cooldown) all missed
this pattern because the queue file gets a fresh sources row at
status='unprocessed' on each re-add, the cooldown lapses exactly at
the cycle interval, and the open-PR filter only catches in-flight
extractions.

Add an archive-basename filter immediately after the queue scan: if
a file with this basename exists anywhere under inbox/archive/, skip.
Archive copy is the source of truth — once extracted, the queue copy
is stale by definition.
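
A minimal sketch of the filter (paths from this message; the helper name
is hypothetical):

    from pathlib import Path

    def dedup_queue(queue_files: list, archive_root: Path) -> list:
        archived = {p.name for p in archive_root.rglob("*.md")
                    if not p.name.startswith("_")}   # convention: skip _-prefixed
        return [f for f in queue_files if f.name not in archived]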

Validation against pipeline.db (last 7d):
  78 sources had multiple extract PRs (32% duplicate rate)
  73/78 (94%) carry an archive copy and would have been caught.
  Current queue: 35/99 sources (35%) have archive duplicates today.

Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
2026-04-30 11:05:39 +01:00
ed5f7ef6cc fix(merge): correct audit-ref comment + add sentinel-drift warning
Two nits from Ganymede line-level review of 7741c1e:

1. Comment at lines 562-565 said --force-with-lease but code is plain
   --force. Comment now describes the actual behavior: bot-owned per-PR
   audit ref, intentional overwrite on stale refs from prior aborted
   attempts, no concurrent writer to lease against.

2. Sentinel-regex extraction in _merge_domain_queue dispatch had no
   graceful-failure log. If the _merge_no_ff_external success-message
   contract drifts and any of the three regexes (M, audit_ref, external
   PR #) miss, dispatch silently builds a comment with None values and
   writes audit_log JSON with null fields. Added a warning log when any
   regex misses — signal-only, doesn't gate the close path since the
   merge already succeeded.

Branch: epimetheus/external-merge-flow-bug1
Parent: 7741c1e (Ship Msg 3 architecture review close)
Diff:   +11/-3, single file lib/merge.py

Ganymede: 3-message protocol Msg 3 (nits applied, ball returned).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 16:19:08 +01:00
7741c1e6de fix(merge): synthetic _merged/* ref + function-owned ff-push (Ship Msg 3)
Phase 2 review fix #1 (architectural pushback): replace force-push of
contributor's gh-pr-N/* branch with a three-step synthetic-branch flow:

  1. Worktree on local branch _merged-{slug} from origin/main
  2. git merge --no-ff origin/{branch} into the local branch
  3. Push merge commit to origin/_merged/{branch} (synthetic audit ref)
  4. Function ff-pushes merge_sha → origin/main directly

Contributor's gh-pr-N/* branch on Forgejo is now never touched.
Force-pushing it would have rewritten the tip with a merge commit the
contributor didn't author — confusing bot force-push in Forgejo PR UI.
Mirrors the _clean/* synthetic branch pattern in cherry-pick.

Function now owns the push to main (was dispatch's job for cherry-pick
and reweave). Returns sentinel "merged --no-ff (external PR #N, M=<sha>,
audit_ref=...)" that dispatch detects to skip its ff-push and route
directly to PR-close + mark_merged + audit. Audit detail JSON now
includes merge_commit_sha + audit_ref + github_pr (Ship review #5).

Smoke-tested in scratch repo end-to-end:
  - contributor branch tip unchanged ✓
  - audit ref _merged/gh-pr-90/... carries merge SHA ✓
  - main tip equals merge SHA (ff-push, no force) ✓
  - contributor SHA ancestor of main → GitHub badge fires ✓

Sentinel return parsed via 3 regexes in dispatch (full 40-char SHA in
return string for durability). Branch comment in dispatch explicitly
notes contributor branch is left in place — sync-mirror keeps the
GitHub PR <-> Forgejo PR link observable through it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 15:32:52 +01:00
992b4ee36f feat(merge): _merge_no_ff_external for gh-pr-* branches (Phase 2)
External GitHub fork PRs need their contributor commit SHA in main's history
for GitHub's "merged" badge to fire. Cherry-pick rewrites the SHA, breaking
that detection. New _merge_no_ff_external function preserves the SHA via a
true merge commit.

Mechanics (mirrors _cherry_pick_onto_main shape):
1. Fetch origin/main + origin/{branch}
2. Detached worktree at origin/main, git merge --no-ff origin/{branch}
   with verbose message: "Merge external GitHub PR #{N}: {branch_slug}"
3. Force-push merge commit M as origin/{branch}, replacing branch tip
4. Dispatch's existing ff-push origin/{branch} → main propagates M to main

M has parents [main_sha, contributor_sha]. M is a fast-forward descendant
of main_sha (first-parent chain), so the ff-push to main is valid without
--force. Contributor SHA reachable from main → GitHub recognizes merged.

Conflict handling: same auto-resolve as cherry-pick — entity-only conflicts
take main's version (--ours = current worktree HEAD = main), other conflicts
abort with detail.

Backout: config.EXTERNAL_PR_NO_FF_MERGE = True (default). Set False to fall
back to cherry-pick if no-ff destabilizes throughput one week pre-Accelerate.

Branch dispatch in _merge_domain_queue:
- reweave/* → _merge_reweave_pr (existing)
- gh-pr-N/* AND config.EXTERNAL_PR_NO_FF_MERGE → _merge_no_ff_external (new)
- everything else → _cherry_pick_onto_main (existing default)

Verified end-to-end in scratch repo:
- merge commit M has [main_sha, contributor_sha] as parents
- contributor SHA is ancestor of M
- after ff-push, contributor SHA is in main's history (GitHub badge fires)
- regex parses 8 cases correctly (real fork PR + edge cases reject cleanly)

Architecture per Ship Msg 3 / doc v3 (537cfd5 on epimetheus/external-merge-flow-design).
Phase 1 (sync-mirror self-heal) deployed yesterday. Phase 3 (FwazB PR #90 cleanup)
queued behind this deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 15:18:37 +01:00
de204db539 fix(sync-mirror): tighten gh-pr-* regex + document SQL-integer-safety
Ganymede review nit on commit 1eb259d:

- Regex changed from [0-9]* (zero-or-more) to [0-9][0-9]* (one-or-more,
  portable BRE form of [0-9]+ that works on both GNU and BSD sed).
- Empty/non-numeric branches now fail at parse, not just at the empty-guard
  below — SQL-integer safety is now load-bearing on the regex alone.
- Comment above the UPDATE notes the integer-validation invariants
  (INTEGER `number` column + regex-validated gh_pr_num) since bash sqlite3
  has no parametric binding.

Smoke tested: gh-pr-/foo, gh-pr-abc/foo no longer parse to non-empty.
gh-pr-90/main, gh-pr-4066/contrib/x, gh-pr-1/x all parse correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 13:07:50 +01:00
1eb259de8a fix(sync-mirror): self-heal sweep for orphaned gh-pr-* github_pr links
Step 0 (new): runs once per cron tick before per-repo work. Selects PR rows
where branch matches gh-pr-% but github_pr IS NULL, parses the PR number
from the branch name, and updates github_pr + source_channel='github'.

Recovers from races and transient failures in the existing Step 4.5 link
UPDATE — no retry path before. The sweep IS the backfill: same SELECT/UPDATE
heals historical orphans (FwazB PR 4066 picked up on first cron tick) AND
future races on subsequent ticks. No separate one-shot script needed.

Properties:
- Idempotent: SELECT empty when clean, zero work
- No API calls: branch name encodes the GitHub PR number deterministically
- Bounded log volume: one line per actually-healed row
- Runs before any sync_repo work, ahead of branch-mirror loop and the
  auto-create-PR block in Step 4 — same-cycle convergence on fresh races
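
The real sweep is bash + sqlite3; a Python sketch of the single pass
(column names from this message, query shape assumed):

    import re, sqlite3

    def heal_orphaned_links(db: sqlite3.Connection) -> int:
        healed = 0
        rows = db.execute("SELECT branch FROM prs WHERE branch LIKE 'gh-pr-%' "
                          "AND github_pr IS NULL").fetchall()
        for (branch,) in rows:
            m = re.match(r"^gh-pr-([0-9]+)/", branch)
            if not m:
                continue
            db.execute("UPDATE prs SET github_pr = ?, source_channel = 'github' "
                       "WHERE branch = ?", (int(m.group(1)), branch))
            healed += 1     # one log line per actually-healed row
        db.commit()
        return healed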

Closes the Bug #2 path that left FwazB's PR 4066 with github_pr=NULL,
preventing on_merged() from posting comment + closing the GitHub PR.

Verified end-to-end on live DB snapshot:
- before: 4066 had github_pr=NULL
- after sweep: 4066 has github_pr=90, source_channel='github'
- second run: zero output (idempotent)

Phase 1 of docs/external-contributor-merge-flow.md (v2, sweep-only).
Ship architecturally approved Msg 2/2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 13:02:37 +01:00
b8504c1b60 docs: rewrite public README
Replaces the directory-listing format with one that explains what the
pipeline does and shows production scale. Verified all numbers against
production (1,546 claims, 13 domains, 1,975 merged PRs, 508 last-7d
throughput, 94% approval, ~$0.10/merged claim incl. all stages).

Removes the VPS layout section (IP + paths + username) per Epimetheus
review — that detail moves to the private teleo-ops repo. Generalizes
deploy targets without naming the host.

Adds two Mermaid diagrams (pipeline flow + review tier matrix), both
syntactically safe across GitHub and Forgejo 9 / Gitea 1.22.

Drops the per-directory ownership table — CODEOWNERS is the single
source of truth on review authority. Keeps the high-level role map
for orientation only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 10:19:18 +01:00
33f6ca9e3f fix(mirror): setup script pushes main+tags only (consistency with sync-mirror)
Initial setup-infra-mirror.sh did `git push origin --all`, which contradicted
the main_only mode protection landed in b9c4947 — agent review branches
(epimetheus/*, ganymede/*) ended up publicly visible on the new GitHub
teleo-infrastructure mirror until I deleted them.

Initial push now mirrors the recurring sync's main_only path: refs/heads/main
+ tags only. Re-running the setup script is now idempotent at branch level —
won't redo the agent-branch leak.

Cleanup applied to live GitHub teleo-infrastructure: 18 stale agent review
branches deleted via single batched push (epimetheus/* x14, ganymede/* x3,
ship/metadao-scraper). Only main remains. Codex bidirectional mirror unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 23:09:25 +01:00
b9c4947637 fix(mirror): restrict main_only mode to main+tags (Ganymede review)
Finding #1 (recommendation, applied): infra-mode now pushes only main + tags
to GitHub. Agent review branches (epimetheus/*, ganymede/*) stay Forgejo-only.
Public GitHub history reflects merged work, not pre-review WIP with internal
agent context.

Bidirectional mode unchanged — codex still mirrors all branches so external
contributors can fork from any branch.

Nit #4: the setup script's m3taversal username has a comment explaining it's a
placeholder for fine-grained PAT auth, mirroring the existing teleo-codex remote.

Two pre-existing nits filed for follow-up branch:
- hardcoded `living-ip:` in GH_PR_NUM head filter (line 273)
- spurious CRITICAL log on GH→forgejo→GH cycles (re-fetch forgejo after Step 2.5)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 22:54:18 +01:00
bf647b7abb feat(mirror): refactor sync-mirror.sh for multi-repo, add infra setup script
Wraps the per-repo body in sync_repo() and loops over MIRROR_REPOS at the
bottom. teleo-codex stays bidirectional (full PR roundtrip + pipeline.db
linking). teleo-infrastructure runs main_only: branch+tag sync Forgejo→
GitHub, ff-only GitHub→Forgejo on main, divergence alerting per-repo.
Steps 2.1 (fork PR refs) and 4 (Forgejo PR auto-create + DB link) gated
on MODE=bidirectional.

Setup script (deploy/setup-infra-mirror.sh) initializes the bare repo at
/opt/teleo-eval/mirror/teleo-infrastructure.git, configures remotes,
performs initial Forgejo→GitHub push. Idempotent. Pre-flight checks both
GitHub repo (must be created manually first — fine-grained PAT can't
create repos in the org) and Forgejo repo are accessible.

Per-repo divergence state file (.divergence-count.<repo>) so each repo
has independent counter + alert state. Also pulls in the source_channel
update from Apr 6 that lived only on VPS (line 215 added 'github').

Not deployed yet — pending Ganymede review and GitHub repo creation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 22:22:33 +01:00
1351db70a9 fix(tests): apply Ganymede review nits + add m3taversal reset script
3 nits from review of d60b6f8 + Q4 ask:

1. test_window_24h_only_today: replace always-true assertion with
   concrete `assert handles == ["carol"]`. Push alice's most-recent
   event from -1 days to -2 days to eliminate fixture-vs-query
   microsecond drift on the 24h boundary.
2. _call helper: asyncio.get_event_loop().run_until_complete →
   asyncio.run (deprecation in 3.12, raises in some 3.14 contexts).
3. test_invalid_limit_falls_to_default: dead first call removed,
   misleading "7 entries" comment now matches assertion.

Q4: scripts/reset-m3taversal-sourcer.py captures the surgical
UPDATE we ran on VPS as a reviewable artifact. Idempotent (no-op
on already-reset rows), audit_log entry per run. Ganymede's point:
DB mutations should leave a code paper trail, not just an audit
row whose origin lives only in the executor's memory.

30/30 tests pass on VPS hermes venv (aiohttp 3.13.5, py 3.11.15).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:35:18 +01:00
d60b6f8bf2 test(leaderboard): cover all four slicings + AND-prefix regression
Adds tests/test_leaderboard.py — 30 cases against
diagnostics/leaderboard_routes.py. Two reasons:

(1) Zero coverage on an endpoint Argus + Oberon are about to consume
    for the May 5 hackathon UI. Two bugs slipped through this morning
    (404 wiring missing in app.py; AND-prefix SQL syntax error on
    rolling-window). Tests prevent regression.

(2) Tests serve as living documentation for Oberon's frontend
    integration — each test names a contract guarantee
    (test_left_join_handles_missing_contributors_row,
    test_composed_window_kind_domain, test_role_breakdown_present).

Coverage:
  - _parse_window unit tests (10): all_time, Nd, Nh, caps, garbage,
    case-normalization, and explicit no-AND-prefix assertion
  - handle_leaderboard integration (18): every kind value, every
    window family, domain filter, composed filters, limit + has_more,
    invalid-input fallback, role breakdown shape, empty-window shape,
    LEFT JOIN COALESCE for handles missing from contributors
  - 2 contract assertions: LEADERBOARD_PUBLIC_PATHS membership +
    KIND_VALUES set

Run: 30/30 pass on VPS hermes venv (aiohttp 3.13.5, pytest 9.0.2).
Skips clean locally without aiohttp via pytest.importorskip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:46:47 +01:00
cd5aac5cc6 fix(activity-feed): remove [:120] slug truncation
Claim slugs were being cut at 120 chars in _extract_claim_slugs, causing
Timeline event clicks to 404 when the on-disk filename exceeded that
length (frontend builds /api/claims/<slug> from the truncated value).

This fix landed Apr 26 but regressed when the file was redeployed —
committing the unmangled version to repo so deploy.sh re-shipping
doesn't reintroduce the cap.

Verified live: max slug now 265 chars, 16 of 30 over the old 120 cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:27:31 +01:00
7c6417d6be test(diagnostics): activity_endpoint classify_pr_operation suite
Move tests from /tmp into the proper test suite. 22 cases covering:

- Leo gotcha: extract/* + commit_type=enrich/challenge classifies by
  commit_type, not branch prefix (same pattern as the contributor-role
  wiring fix)
- Reweave priority: branch.startswith('reweave/') wins over
  _MAINTENANCE_COMMIT_TYPES — nightly reweave PRs classify as enrich,
  not infra. Locks in the bifurcation against future priority refactors
- Full NON_MERGED_STATUS_TO_OPERATION coverage: open, approved, closed,
  conflict, validating, reviewing, merging, zombie
- Knowledge-producing commit_types (research, entity) → new
- Maintenance commit_types (fix, pipeline) → infra
- Defensive: null inputs, unknown status

aiohttp imported at module load — file uses pytest.importorskip so it
runs cleanly in any environment with aiohttp installed and skips gracefully
otherwise. sys.path injection for diagnostics/ since it isn't packaged.

Reviewed-by: Ganymede

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 13:39:44 +01:00
42d35d4e15 fix(diagnostics): wire /api/leaderboard into app.py + fix rolling-window SQL
de7e5ec landed leaderboard_routes.py + the route file's register fn but
the import + register_leaderboard_routes(app) call + auth-middleware
allowlist were never added to app.py — endpoint returned 404 in production.

Three minimal edits to app.py mirror the existing register_*_routes pattern
(import at line 28, allowlist OR-clause at line 512, register call at 2365).

Plus a SQL bug in _parse_window: rolling-window clauses prefixed "AND "
but the WHERE composition uses " AND ".join(...), producing
"WHERE 1=1 AND AND ce.timestamp..." → sqlite3.OperationalError on every
window=Nd / window=Nh request. Stripped the prefix and added a comment so
the asymmetry doesn't bite again.
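
The composition bug in miniature (clause text assumed):

    clauses = ["1=1", "AND ce.timestamp >= :cutoff"]   # before: clause carries AND
    "WHERE " + " AND ".join(clauses)                   # "WHERE 1=1 AND AND ce..."

    clauses = ["1=1", "ce.timestamp >= :cutoff"]       # after: bare clause
    "WHERE " + " AND ".join(clauses)                   # "WHERE 1=1 AND ce..."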

Verified on VPS:
  GET /api/leaderboard?window=all_time&kind=person → 200, 11 rows
  GET /api/leaderboard?window=7d&kind=person → 200, 2 rows
  GET /api/leaderboard?window=30d&kind=person → 200, 9 rows
  GET /api/leaderboard?domain=internet-finance → 200, 3 rows
  GET /api/leaderboard?kind=agent → 200, leo/rio/clay/astra/vida

Unblocks: Argus dashboard cutover, Oberon column reorder, Leo's CI
taxonomy broadcast.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 13:30:26 +01:00
de7e5ec709 feat(diagnostics): /api/leaderboard reads contribution_events directly
New endpoint replaces the legacy /api/contributors *_count read path with
event-sourced reads from the Phase A contribution_events ledger.

- Params: window (all_time | Nd | Nh), kind (person | agent | org | all),
  domain (filter), limit (default 100, max 500)
- Returns per-handle CI, full role breakdown (author/challenger/synthesizer/
  originator/evaluator), events_count, pr_count, first/last contribution
- ORDER BY ci DESC, last_contribution DESC — recent contributors break ties
- Read-only sqlite URI; total/has_more computed for paginated UIs

Wiring (import + register + _PUBLIC_PATHS entry) currently applied to live
app.py on VPS only — repo app.py has drift from Ship's uncommitted /api/search
POST contract. Next deploy.sh round-trip needs both to land together.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 13:16:41 +01:00
369f6c96da fix(attribution): credit research-session sources to agents, not m3taversal (#7)
Forward fix: research-session.sh writes intake_tier: research-task (no proposed_by — extract.py infers agent from branch).

Backfill: 304 PRs reattributed across 30 days (rio 74, clay 70, astra 53, vida 48, theseus 30, leo 29). Already applied to production.

Format reconciliation: normalize_handle strips (self-directed) suffix so both halves canonicalize to the same agent handle.

5 idempotency tests passing. Production replay self-extinguishes (delta 3839→3839).
2026-04-27 11:59:54 +00:00
6aff03ff56 fix(attribution): unify research-session format on "(self-directed)" suffix
Resolves the format inconsistency between the forward fix and the 304-row
backfill. Both halves now produce prs.submitted_by = "rio (self-directed)":

- research-session.sh: drop proposed_by from the frontmatter template.
  extract.py path 1 (proposed_by-driven) no longer fires; path 2 fires
  instead and constructs f"{agent} (self-directed)" — matches backfill.

- attribution.py: normalize_handle now strips "(self-directed)" suffix
  immediately after lowercase+@-strip, before alias lookup. Closes the
  phantom-person-event class on any future replay through
  record_contributor_attribution. Round-trips through alias rules keyed
  on bare agent names.

Test (5 cases) still passes; suffix-strip behavior verified against
hostile inputs (whitespace, casing, mid-string occurrences must NOT
match — only trailing pattern).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:53:52 +01:00
319e03e2c6 test(attribution): prove research-backfill replay is idempotent
Five tests against the real contribution_events schema (lib/db.py:181-209):
- pr-level dedup with NULL claim_path via idx_ce_unique_pr partial index
- per-claim dedup with non-NULL claim_path via idx_ce_unique_claim partial index
- pr-level and per-claim events coexist on the same pr_number
- backfill (INSERT correct + DELETE wrong) is a true no-op on replay
- replay against already-backfilled state preserves unrelated events

Schema edge case: SQLite treats NULL as not equal to NULL in UNIQUE indexes,
so one index can't dedup both row shapes. The partial-index split already in
place handles it: two partial UNIQUE indexes target disjoint row sets
(claim_path IS NULL vs IS NOT NULL), bypassing the quirk.
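
The split, sketched as DDL (index names from this message; column lists
are assumptions):

    CREATE_DEDUP_INDEXES = """
    CREATE UNIQUE INDEX IF NOT EXISTS idx_ce_unique_pr
        ON contribution_events (pr_number, handle, role)
        WHERE claim_path IS NULL;
    CREATE UNIQUE INDEX IF NOT EXISTS idx_ce_unique_claim
        ON contribution_events (pr_number, handle, role, claim_path)
        WHERE claim_path IS NOT NULL;
    """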

Production replay verified: re-running backfill --apply against the live DB
returns "misattributed PRs found: 0" because the first-run UPDATE flipped the
WHERE predicate. Total contribution_events count: 3839 → 3839.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:50:17 +01:00
2d332c66d4 fix(attribution): credit research-session sources to agents, not m3taversal
Two-part fix for a bug where every claim extracted from agent overnight
research sessions was being credited to m3taversal in contribution_events
(visible in the activity feed as "@m3taversal" on agent-derived claims).

Forward fix (research/research-session.sh):
The frontmatter template the agent prompt instructs Claude to use now
includes `proposed_by: ${AGENT}` and `intake_tier: research-task`. With
those fields present, extract.py path 1 (line 687) takes precedence and
sets prs.submitted_by to the agent handle, which then propagates into
contribution_events as a kind='agent' author event for the agent.

Without the fields, extract.py fell through to the default branch on
line 695 and set submitted_by='@m3taversal'.

Backfill (scripts/backfill-research-session-attribution.py):
Identifies research-session-derived PRs by finding teleo-codex commits
matching `^<agent>: research session YYYY-MM-DD —`, listing the
inbox/queue/*.md files added in each commit's diff, and matching those
filename basenames against prs.source_path. Only PRs currently
submitted_by='@m3taversal' AND merged within the configurable window
are touched. Default --dry-run; --apply to commit.

For each match the script:
  1. UPDATE prs SET submitted_by = '<agent> (self-directed)'
  2. INSERT OR IGNORE the agent author event (kind='agent', weight=0.30)
     with the original PR's domain, channel, merged_at preserved
  3. DELETE the misattributed m3taversal author event
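
A sketch of the per-match transaction (column names assumed; the real
script also preserves domain, channel, and merged_at on the new event):

    def reattribute(db, pr_number: int, agent: str) -> None:
        db.execute("UPDATE prs SET submitted_by = ? WHERE number = ?",
                   (f"{agent} (self-directed)", pr_number))
        db.execute("""INSERT OR IGNORE INTO contribution_events
                          (pr_number, handle, role, kind, weight)
                      VALUES (?, ?, 'author', 'agent', 0.30)""",
                   (pr_number, agent))
        db.execute("""DELETE FROM contribution_events
                      WHERE pr_number = ? AND handle = 'm3taversal'
                        AND role = 'author'""", (pr_number,))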

Applied 30-day backfill on VPS:
  - 304 PRs re-attributed (rio 74, clay 70, astra 53, vida 48,
    theseus 30, leo 29)
  - 297 m3taversal author events deleted, 304 agent author events
    inserted (delta of 7 = pre-v24 PRs that never had m3ta events
    in the first place; we still create the new agent event)
  - m3taversal author count: 1368 → 1071 (−22%)
  - Pre-backfill DB snapshot: pipeline.db.bak-pre-research-attribution

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:38:53 +01:00
dea1b02aa6 fix(attribution): narrow exception + document gate asymmetry (Ganymede review)
Two follow-up fixes from Ganymede's review of d0fb4c9:

1. is_publisher_handle: narrow `except Exception` to sqlite3.OperationalError.
   Pre-v26 DB fallback only needs to catch the "table doesn't exist" case;
   broader exceptions (programming errors, locks, corruption) should propagate.

2. upsert_contributor gate: add comment documenting the alias-resolution
   asymmetry between insert_contribution_event (alias-resolved via
   normalize_handle) and upsert_contributor (bare lower+lstrip-@). Today this
   is fine because the v26 classifier produced one publisher row per canonical
   handle. Branch 3 will normalize alias→canonical at writer entry points,
   tightening this gate transparently.

Unit tests for the gates (positive + negative + alias resolution) deferred to
Branch 3 alongside the auto-create flow tests.

Smoke-tested:
  - pre-v26 fallback (no publishers table) → None (correct)
  - case-insensitive match (CNBC → id=1) → correct
  - @ prefix strip (@cnbc → id=1) → correct
  - non-publisher handle (alexastrum) → None (correct)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:25:24 +01:00
d0fb4c96e3 fix(attribution): gate writer on publishers table (regression prevention)
Schema v26 (commit 3fe524d) split orgs/citations from contributors into
the publishers table. Without a writer-side gate, every merged PR with
`sourcer: cnbc` (or similar) re-creates CNBC as a contributor and
undoes the v26 classifier cleanup. Once normal pipeline traffic resumes,
the contributors table re-pollutes within hours.

Fix: belt-and-suspenders gate at both writer surfaces.

1. `lib/attribution.py::is_publisher_handle(handle, conn)` — returns
   publisher.id if handle exists in publishers.name, else None. Falls
   back gracefully on pre-v26 DBs (no publishers table → returns None →
   writer behaves like before, no regression).

2. `lib/contributor.py::insert_contribution_event` — checks
   is_publisher_handle on canonical handle before INSERT. If it's a
   publisher, debug-log + return False. Prevents originator events for
   CNBC/SpaceNews/etc.

3. `lib/contributor.py::upsert_contributor` — same gate at top. Prevents
   the contributors table from re-acquiring publisher rows.
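
A minimal sketch of the gate at the event writer (signature and columns are
illustrative; real code in lib/contributor.py — upsert_contributor gets the
same guard clause at its top):

  import logging

  log = logging.getLogger(__name__)

  def insert_contribution_event(conn, handle, role, kind, weight):
      if is_publisher_handle(handle, conn) is not None:
          # Publisher (CNBC, SpaceNews, ...): no originator event.
          log.debug("skipping contribution event for publisher %s", handle)
          return False
      conn.execute(
          "INSERT INTO contribution_events (handle, role, kind, weight) "
          "VALUES (?, ?, ?, ?)",
          (handle, role, kind, weight),
      )
      return True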

Verified end-to-end against live VPS DB snapshot:
  - CNBC originator event: blocked (insert returns False)
  - CNBC contributors row: blocked (no row created)
  - alexastrum, thesensatore, newhandle_xyz: pass through unchanged
  - is_publisher_handle handles case-insensitive lookup correctly
    (CNBC and cnbc both match publisher_id=3)

Pre-deploy event count was 3705. Post-classifier cleanup: 3623 (82 org
events purged). Going forward, no new org events accumulate.

Branch 2 of the schema-v26 rollout. Branch 3 (auto-create at tier='cited',
extract.py sources.publisher_id wiring) is separate scope and not required
for regression prevention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:21:10 +01:00
926a397839 fix(activity): re-apply source classifier + add date-prefix slug fallback
Regression: aeae712's source/create distinction was lost — VPS reverted to
pre-aeae712 behavior where every extract/* knowledge PR returned type=create
regardless of whether a claim was written. Source archives surfaced as
"New claim" chips with date-prefix slugs that 404 on click.

Root cause: aeae712 was deployed via local file copy and never pushed to
origin; a subsequent rsync from origin/main overwrote it with the older
classifier. This branch ships from origin so deploy.sh's repo-first gate
makes recurrence impossible.

- Restore aeae712: extract/* + empty description -> source, with
  empty claim_slug + source_slug field, ci_earned 0.15
- Add Leo's regex fallback (sketched after this list): candidate_slug matching
  ^\d{4}-\d{2}-\d{2}-.+-[a-f0-9]{4}$ -> source regardless of branch
  /commit_type/description state. Catches edge cases where description
  leaks but is just a source title (slugified into the inbox filename
  pattern), not a claim insight.
- Add 'challenge' to _FEED_COMMIT_TYPES (latent bug — challenge PRs
  would be filtered out before classification because the filter
  list omitted them; memory says 0 challenges exist so it never
  triggered, but schema support belongs in the filter)
- _build_events: compute candidate slug before classify so the regex
  fallback has a slug to inspect
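
A hedged sketch of the restored classifier plus the fallback (function and
constant names are illustrative; the regex is verbatim from this commit):

  import re

  # Inbox-filename pattern: date prefix + slug + 4-hex suffix.
  _DATE_SLUG_RE = re.compile(r"^\d{4}-\d{2}-\d{2}-.+-[a-f0-9]{4}$")

  def classify(branch, description, candidate_slug):
      # aeae712 rule: source archive unless a claim was actually written.
      if branch.startswith("extract/") and not description:
          return "source"
      # Fallback: an inbox-filename slug is a source even when a slugified
      # source title leaked into the description.
      if _DATE_SLUG_RE.match(candidate_slug or ""):
          return "source"
      return "create"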

Verified locally on Leo's example PRs (#4014, #4016) — both classify
as source. VPS smoke pending deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 13:47:00 +01:00
3fe524dd14 fix(classify): Ganymede review fixes — alias cleanup + counter accuracy + handle alignment
1. WARNING — orphan contributor_aliases after publisher/garbage delete:
   Added alias cleanup to the transaction (gated on --delete-events, same
   audit rationale as events). Both garbage and publisher deletion loops
   now DELETE matching contributor_aliases rows. Dry-run adds an orphan
   count diagnostic so the --delete-events decision is informed.

2. NIT — inserted_publishers counter over-reports on replay (sketch after this list):
   INSERT OR IGNORE silently skips name collisions, but the counter
   incremented unconditionally. Now uses cur.rowcount so a second apply
   reports 0 inserts instead of falsely claiming 100. moved_to_publisher
   set remains unconditional — publisher rows already present still need
   the matching contributors row deleted.

3. NIT — handle-length gate diverged from writer path:
   Widened from {0,19} (20 chars) to {0,38} (39 chars) to match GitHub's
   handle limit and contributor.py::_HANDLE_RE. Prevents future long-handle
   real contributors from falling through to review_needed and blocking
   --apply. Current data has 0 review_needed either way.
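
Minimal sketches of fixes 2 and 3, with names (and the character class in
the handle pattern) taken as assumptions from the surrounding apply loop:

  import re

  # Fix 2: count only rows actually inserted — INSERT OR IGNORE skips
  # name collisions silently, so rowcount is 0 on replay.
  cur = conn.execute(
      "INSERT OR IGNORE INTO publishers (name, kind) VALUES (?, ?)",
      (name, kind),
  )
  inserted_publishers += cur.rowcount
  moved_to_publisher.add(name)  # unconditional: contributors row still needs deleting

  # Fix 3: 39-char gate aligned with contributor.py::_HANDLE_RE.
  HANDLE_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]{0,38}$")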

Bonus (Q5): Added audit_log entry inside the transaction. One row in
audit_log.stage='schema_v26', event='classify_contributors' with counter
detail JSON on every --apply run. Cheap audit trail for the destructive op.
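
Roughly (column names assumed from the description above):

  import json

  conn.execute(
      "INSERT INTO audit_log (stage, event, detail) VALUES (?, ?, ?)",
      ("schema_v26", "classify_contributors", json.dumps(counters)),
  )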

Verified end-to-end on VPS DB snapshot:
- First apply: 100/9/9/100/0 (matches pre-fix)
- Second apply: 0/9/0/0/0 (counter fix working)
- With injected aliases + --delete-events: 2 aliases deleted, 1 pre-existing
  orphan correctly left alone (outside script scope), audit_log entry
  written with accurate counters.

Ganymede msg-3. Protocol closed.
2026-04-24 20:47:21 +01:00
45b2f6de20 feat(schema): v26 — publishers + contributor_identities + sources provenance
Separates three concerns currently conflated in contributors table:
  contributors — people + agents we credit (kind in 'person','agent')
  publishers   — news orgs / academic venues / platforms (not credited)
  sources      — gains publisher_id + content_type + original_author columns

Rationale (Cory directive Apr 24): livingip.xyz leaderboard was showing CNBC,
SpaceNews, TechCrunch etc. at the top because the attribution pipeline credited
news org names as if they were contributors. The mechanism-level fix is a
schema split — orgs live in publishers, individuals in contributors, each
table has one semantics.

Migration v26 (ALTER pattern sketched after the list):
  - CREATE TABLE publishers (id PK, name UNIQUE, kind CHECK IN
    news|academic|social_platform|podcast|self|internal|legal|government|
    research_org|commercial|other, url_pattern, created_at)
  - CREATE TABLE contributor_identities (contributor_handle, platform CHECK IN
    x|telegram|github|email|web|internal, platform_handle, verified, created_at)
    Composite PK on (platform, platform_handle) + index on contributor_handle.
    Enables one contributor to unify X + TG + GitHub handles.
  - ALTER TABLE sources ADD COLUMN publisher_id REFERENCES publishers(id)
  - ALTER TABLE sources ADD COLUMN content_type
    (article|paper|tweet|conversation|self_authored|webpage|podcast)
  - ALTER TABLE sources ADD COLUMN original_author TEXT
    (free-text fallback, e.g., "Kim et al." — not credit-bearing)
  - ALTER TABLE sources ADD COLUMN original_author_handle REFERENCES contributors(handle)
    (set only when the author is in our contributor network)
  - ALTER wrapped in try/except on "duplicate column" for replay safety
  - Both SCHEMA_SQL (fresh installs) + migration block (upgrades) updated
  - SCHEMA_VERSION bumped 25 -> 26
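
A sketch of the replay-safe ALTER pattern from the list above (helper name
illustrative; the real migration block lives in lib/db.py):

  import sqlite3

  def _add_column(conn, table, column_ddl):
      try:
          conn.execute(f"ALTER TABLE {table} ADD COLUMN {column_ddl}")
      except sqlite3.OperationalError as exc:
          if "duplicate column" not in str(exc).lower():
              raise  # only the replay case is swallowed

  _add_column(conn, "sources", "publisher_id INTEGER REFERENCES publishers(id)")
  _add_column(conn, "sources", "content_type TEXT")
  _add_column(conn, "sources", "original_author TEXT")
  _add_column(conn, "sources",
              "original_author_handle TEXT REFERENCES contributors(handle)")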

Migration is non-breaking. No data moves yet. Existing publishers-polluting-
contributors row state is preserved until the classifier runs. Writer routing
to these tables lands in a separate branch (Phase B writer changes).

Classifier (scripts/classify-contributors.py):
  Analyzes existing contributors rows, buckets into:
    keep_agent   — 9 Pentagon agents
    keep_person  — 21 real humans + reachable pseudonymous X/TG handles
    publisher    — 100 news orgs, academic venues, formal-citation names,
                   brand/platform names
    garbage      — 9 parse artifacts (containing /, parens, 3+ hyphens)
    review_needed — 0 (fully covered by current allowlists)

  Hand-curated allowlists for news/academic/social/internal publisher kinds.
  Garbage detection via regex on special chars and length > 50.
  Named pseudonyms without @ prefix (karpathy, simonw, swyx, metaproph3t,
  sjdedic, ceterispar1bus, etc.) classified as keep_person — they're real
  X/TG contributors missing an @ prefix because extraction frontmatter
  didn't normalize. Cory's auto-create rule catches these on first reference.

  Formal-citation names (Firstname-Lastname form — Clayton Christensen, Hayek,
  Ostrom, Friston, Bostrom, Bak, etc.) classified as academic publishers —
  these are cited, not reachable via @ handle. Get promoted to contributors
  if/when they sign up with an @ handle.
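
A hedged sketch of the bucketing order (allowlist names, their contents, and
the exact ordering are assumptions; the rules are from the text above):

  import re

  # Hand-curated allowlists (contents elided here; see the script).
  AGENT_NAMES: set[str] = set()
  PUBLISHER_ALLOWLIST: set[str] = set()
  PERSON_ALLOWLIST: set[str] = set()

  def classify(handle):
      if handle in AGENT_NAMES:                       # 9 Pentagon agents
          return "keep_agent"
      if re.search(r"[/()]", handle) or handle.count("-") >= 3 or len(handle) > 50:
          return "garbage"                            # parse artifacts
      if handle.lower() in PUBLISHER_ALLOWLIST:       # news/academic/social/internal
          return "publisher"
      if handle.startswith("@") or handle.lower() in PERSON_ALLOWLIST:
          return "keep_person"                        # reachable X/TG contributors
      return "review_needed"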

  Apply path is transactional (BEGIN / COMMIT / ROLLBACK on error). Publisher
  insert happens before contributor delete, and contributor delete is gated
  on successful insert so we never lose a row by moving it to a failed
  publisher insert.
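
The ordering guarantee, roughly (helper names hypothetical; assumes the
sqlite3 connection is in autocommit mode, isolation_level=None):

  try:
      conn.execute("BEGIN")
      move_rows_to_publishers(conn)    # insert first ...
      delete_moved_contributors(conn)  # ... delete only after inserts succeed
      conn.execute("COMMIT")
  except Exception:
      conn.execute("ROLLBACK")
      raise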

  --apply path flags:
    --delete-events  : also DELETE contribution_events rows for moved handles
                       (default: keep events for audit trail)
    --show <handle>  : inspect a single row's classification

Smoke-tested end-to-end via local copy of VPS DB:
  Before: 139 contributors total (polluted with orgs)
  After:  30 contributors (9 agent + 21 person), 100 publishers, 9 deleted
  contribution_events: 3,705 preserved
  contributors <-> publishers overlap: 0

Named contributors verified present after --apply:
  alexastrum (claims=6)  thesensatore (5)  cameron-s1 (1)  m3taversal (1011)

Pentagon agent 'pipeline' (claims_merged=771) intentionally retained — it's
the process name from the old extract.py fallback path, not a real contributor.
Classified as agent (kind='agent') so it doesn't appear in the person leaderboard.

Deploy sequence after Ganymede review:
  1. Branch ff-merge to main
  2. scp lib/db.py + scripts/classify-contributors.py to VPS
  3. Pipeline already at v26 (migration ran during earlier v26 restart)
  4. Run dry-run: python3 ops/classify-contributors.py
  5. Apply: python3 ops/classify-contributors.py --apply
  6. Verify: livingip.xyz leaderboard stops showing CNBC/SpaceNews
  7. Argus /api/contributors unaffected (reads contributors directly, now clean)

Follow-up branch (not in this commit):
  - Writer routing in lib/contributor.py + extract.py:
    org handles -> publishers table + sources.publisher_id
    person handles with @ prefix -> auto-create contributor, tier='cited'
    formal-citation names -> sources.original_author (free text)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 20:47:21 +01:00
f0f9388c1f feat(diagnostics): add POST /api/search for chat API contract
Wire the search endpoint to accept POST bodies matching the embedded
chat contract (query/limit/min_score/domain/confidence/exclude →
slug/path/title/domain/confidence/score/body_excerpt). The GET path is retained
for legacy callers and gains a min_score override for hackathon debugging; a
shape sketch follows the list below.

- _qdrant_hits_to_results() shapes raw hits into chat response format
- handle_api_search() dispatches POST vs GET
- /api/search added to _PUBLIC_PATHS (chat is unauthenticated)
- POST route registered alongside existing GET
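
A shape sketch of the new POST path (qdrant_search and legacy_get_search are
stand-ins for the real plumbing; the defaults are assumptions):

  import json

  def handle_api_search(request):
      if request.method != "POST":
          return legacy_get_search(request)  # GET retained for legacy callers
      body = json.loads(request.body)
      raw_hits = qdrant_search(
          query=body["query"],
          limit=body.get("limit", 10),
          min_score=body.get("min_score", 0.0),
          domain=body.get("domain"),
          confidence=body.get("confidence"),
          exclude=body.get("exclude", []),
      )
      # _qdrant_hits_to_results(): raw hits -> chat response shape:
      # [{slug, path, title, domain, confidence, score, body_excerpt}, ...]
      return _qdrant_hits_to_results(raw_hits)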

Resolves VPS↔repo drift flagged by Argus before next deploy.sh run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 17:58:30 +01:00
0f2b153c92 fix(backfill): Ganymede review — fix tautological guard + origin='human'
Addresses two findings in commit 762fd42 review:

1. BUG: guard query was tautological. `SELECT MAX(number) FROM prs WHERE
   number < 900000` filters out exactly what the `>= 900000` check tests.
   Replaced with a direct check for unexpected rows in the synthetic range
   (excluding our known 900068/900088); sketched below.

2. WARNING: origin defaults to 'pipeline' via schema default. lib/merge.py
   convention is origin='human' for external contributors. Synthetic rows
   now set origin='human', priority='high' — matches discover_external_prs
   for real GitHub PRs. Prevents Phase B origin-based filtering from
   misclassifying Alex/Cameron as machine-authored.
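
A sketch of the corrected guard from item 1 (900068/900088 are the known
synthetic rows; variable names illustrative):

  rows = conn.execute(
      "SELECT number FROM prs "
      "WHERE number >= 900000 AND number NOT IN (900068, 900088)"
  ).fetchall()
  if rows:
      raise SystemExit(f"unexpected rows in synthetic range: {rows}")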

Also flagged in review: credit projection was optimistic. Author events are
PR-level (not per-claim), so Alex gets 1×0.30 author credit, not 6. Same
for Cameron. Per-claim originator credit goes to the 7 frontmatter sourcers
where applicable. Not a code change — expectation reset for Cory.
2026-04-24 16:49:12 +01:00