docs(external-contrib): merge flow design — Option 2 with gh-pr-* scoping
Two-bug analysis (cherry-pick breaks GitHub merge badge + sync-mirror's head=living-ip filter misses fork PRs leaving prs.github_pr NULL). Empirical verification baked in: PR #87 vs #90 contrast (own-repo merge-no-ff worked end-to-end, fork PR cherry-pick failed both bugs). 10/10 historical Forgejo merge commits propagated to GitHub with identical SHAs — answers Ship's added-scope concern with production data, not theory. Phased implementation: - Phase 1: sync-mirror github_pr backfill via .fork-pr-map (Bug #2, ~30 lines) - Phase 2: _merge_no_ff_external for gh-pr-* branches (Bug #1, ~120 lines) - Phase 3: FwazB PR #90 cleanup (Cory-approved option b) Discovered scope shrink during drafting: existing fixer is already append-only (verified against PR 4066 branch state). No fixer module changes needed — cherry-pick at merge time was the only thing rewriting SHAs. merge --no-ff preserves the existing fix-on-top-of-contributor-commit topology. Open questions for Ship: 1. Backout flag (EXTERNAL_PR_NO_FF_MERGE) overkill or prudent? 2. .fork-pr-map location (bare repo vs /opt/teleo-eval/state/) 3. Phase 1+2 separate deploys or single branch? 4. Rebase-after-fix structured-alert scope vs documented-silent-ignore 5. Merge commit message format Awaiting Ship's architecture sign-off before any code. Ganymede gets line-level once design lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
33f6ca9e3f
commit
e6206766dd
1 changed files with 315 additions and 0 deletions
315
docs/external-contributor-merge-flow.md
Normal file
315
docs/external-contributor-merge-flow.md
Normal file
|
|
@ -0,0 +1,315 @@
|
|||
# External Contributor Merge Flow — Design Doc
|
||||
|
||||
**Author:** Epimetheus
|
||||
**Architecture review:** Ship (owns merge.py, sync-mirror.sh)
|
||||
**Code review:** Ganymede (line-level, post-design-approval)
|
||||
**Status:** DRAFT — awaiting Ship's architectural sign-off
|
||||
|
||||
## Problem statement
|
||||
|
||||
External GitHub contributors submit PRs via the `living-ip/teleo-codex` mirror.
|
||||
Pipeline accepts the claim and merges the content into Forgejo `main`, but the
|
||||
GitHub PR shows "open with no diff" — it looks abandoned to the contributor.
|
||||
|
||||
Two compounding bugs intersect on this path:
|
||||
|
||||
1. **Cherry-pick breaks GitHub merge detection.** `lib/merge.py::_cherry_pick_onto_main`
|
||||
creates a new SHA on Forgejo main. GitHub's "is PR head SHA an ancestor of
|
||||
main?" check returns false. `merged: false, merge_commit_sha: null` forever.
|
||||
|
||||
2. **`prs.github_pr` not populated for fork PRs.** `sync-mirror.sh` Step 4.5 looks
|
||||
up GitHub PR number via `?head=living-ip:$branch`, but fork PR heads are
|
||||
`FwazB:contributor/...` (or `<fork-owner>:<branch>`), not `living-ip:`. The
|
||||
filter misses, `github_pr` stays NULL, and `lib/github_feedback.py::on_merged`
|
||||
returns early (no comment, no close) because `_get_github_pr` requires non-NULL.
|
||||
|
||||
Empirical:
|
||||
|
||||
| PR | head | merge mech | github_pr | merged badge | comment posted |
|
||||
|---|---|---|---|---|---|
|
||||
| #87 (own-repo) | `living-ip:fix/...` | `git merge --no-ff` | populated | ✓ true | ✓ |
|
||||
| #90 (FwazB fork) | `FwazB:contributor/...` | cherry-pick | NULL | ✗ false | ✗ |
|
||||
|
||||
Both bugs need fixes. Bug #1 is the structural one (load-bearing for the badge).
|
||||
Bug #2 is a sync-mirror filter issue (load-bearing for the comment/close).
|
||||
|
||||
## Goal
|
||||
|
||||
External GitHub contributor opens a PR → pipeline ingests, evaluates, merges →
|
||||
GitHub PR shows `merged: true` with badge → bot comment posted → PR closed
|
||||
cleanly. No human in the loop on the success path. Failure modes (eval reject,
|
||||
auto-fix, contributor force-push) handled gracefully.
|
||||
|
||||
## Out of scope
|
||||
|
||||
- Agent-extraction PRs (`extract/*`, `reweave/*`, `epimetheus/*`, etc.) — keep
|
||||
cherry-pick. They merge 70+/day, have no contributor UX surface, and the
|
||||
cherry-pick → linear-history rationale (auto-fixer rebase pattern, bisect
|
||||
friendliness) holds.
|
||||
- `/api/contributors` legacy endpoint — separate work, deferred.
|
||||
- PAT-in-URL credential pattern — separate security follow-up.
|
||||
|
||||
## Design — branch-prefix conditional, scoped to `gh-pr-*`
|
||||
|
||||
### Bug #1 fix: `_merge_no_ff_external` for `gh-pr-*` branches
|
||||
|
||||
**Dispatch site (`lib/merge.py::_merge_domain_queue`, currently lines 736-738):**
|
||||
|
||||
```python
|
||||
# Reweave: per-file frontmatter union (existing)
|
||||
if branch.startswith("reweave/"):
|
||||
merge_fn = _merge_reweave_pr(branch)
|
||||
# External GitHub fork PRs: true merge with --no-ff so contributor SHA lands
|
||||
# in main's history → GitHub recognizes "merged" badge.
|
||||
elif branch.startswith("gh-pr-"):
|
||||
merge_fn = _merge_no_ff_external(branch)
|
||||
# Default: cherry-pick (extraction commits ADD new files, applies cleanly,
|
||||
# linear history preserved for the bulk-extraction flow).
|
||||
else:
|
||||
merge_fn = _cherry_pick_onto_main(branch)
|
||||
```
|
||||
|
||||
**New function (`lib/merge.py`):**
|
||||
|
||||
```python
|
||||
async def _merge_no_ff_external(branch: str) -> tuple[bool, str]:
|
||||
"""Merge an external GitHub PR with --no-ff so contributor SHA lands in main.
|
||||
|
||||
Why this differs from _cherry_pick_onto_main:
|
||||
- Cherry-pick rewrites SHA → GitHub never recognizes the PR as merged.
|
||||
- --no-ff preserves the contributor's commit SHA in main's history.
|
||||
- sync-mirror's Forgejo→GitHub propagation already handles merge commits
|
||||
(verified empirically: PR #87 round-tripped cleanly with merge_commit_sha
|
||||
preserved).
|
||||
|
||||
Mechanics:
|
||||
1. Fetch latest origin/main and origin/{branch}
|
||||
2. Create scratch worktree at HEAD of origin/main
|
||||
3. git merge --no-ff origin/{branch} -m "Merge PR: {branch}"
|
||||
4. git push origin HEAD:main
|
||||
5. Cleanup worktree
|
||||
|
||||
Conflict handling:
|
||||
- Entity conflicts: same auto-resolve pattern as cherry-pick
|
||||
(--ours = main HEAD, --theirs = branch). External claims rarely touch
|
||||
entities so this is a low-frequency path.
|
||||
- Other conflicts: abort, return False with conflict detail. Caller marks
|
||||
conflict_permanent. Manual resolution or contributor rebase required.
|
||||
|
||||
Idempotency: caller already gates on PR status, so re-running on a merged
|
||||
PR fails at the merge step (already merged), which is the right behavior.
|
||||
|
||||
Returns (success, message).
|
||||
"""
|
||||
```
|
||||
|
||||
The merge commit message uses the branch name (which contains the GitHub PR
|
||||
number via the `gh-pr-{N}/...` convention) so post-merge processing can
|
||||
correlate back to the GitHub PR if needed.
|
||||
|
||||
### Bug #2 fix: sync-mirror correctly populates `github_pr` for fork PRs
|
||||
|
||||
The current Step 4.5 query in sync-mirror.sh:
|
||||
|
||||
```bash
|
||||
GH_PR_NUM=$(curl -sf "https://api.github.com/repos/$GITHUB_REPO/pulls?head=living-ip:$branch&state=all" ...)
|
||||
```
|
||||
|
||||
Fails for fork PRs because `head` filter format is `<owner>:<branch>` and fork
|
||||
PRs come from a different owner. Two fix options:
|
||||
|
||||
**Option A — Query by branch name without owner prefix.** GitHub API doesn't
|
||||
support owner-less head filter; would have to fetch all PRs and walk client-side.
|
||||
Expensive at scale.
|
||||
|
||||
**Option B — Recognize the `gh-pr-N/` branch prefix.** When sync-mirror creates a
|
||||
Forgejo branch from a fork PR (Step 2.1, already extracts the PR number from
|
||||
`refs/pull/{N}/head`), pass that PR number directly to the Step 4.5 backfill
|
||||
instead of re-querying GitHub. We already know the number — we just need to
|
||||
plumb it through.
|
||||
|
||||
**Recommendation: Option B.** Cleaner, no API rate-limit risk, no client-side
|
||||
walk. Implementation: in Step 2.1's existing while-loop, write the branch→PR
|
||||
mapping to a temp file. In Step 4.5, read the mapping instead of querying
|
||||
GitHub when the branch starts with `gh-pr-`.
|
||||
|
||||
```bash
|
||||
# Step 2.1 (existing loop, add 1 line):
|
||||
echo "$PR_BRANCH $pr_num" >> "$REPO_DIR/.fork-pr-map"
|
||||
|
||||
# Step 4.5 (replace the head-filter query for gh-pr-* branches):
|
||||
if [[ "$branch" == gh-pr-* ]]; then
|
||||
GH_PR_NUM=$(grep "^$branch " "$REPO_DIR/.fork-pr-map" | tail -1 | awk '{print $2}')
|
||||
else
|
||||
GH_PR_NUM=$(curl ... existing query for own-repo branches)
|
||||
fi
|
||||
```
|
||||
|
||||
The map file gets cleaned up at the start of each sync cycle (mtime check; older
|
||||
than 1h gets truncated). Bounded growth.
|
||||
|
||||
### Auto-fixer mode='append' for `gh-pr-*` branches
|
||||
|
||||
**Current behavior (`lib/fixer.py`):** Worktree-based fix → commit → push (regular
|
||||
push, not force). When the PR was created by the pipeline (extract/* branches),
|
||||
this works because the LLM-extractor's commit is at HEAD and we're appending on
|
||||
top — push succeeds.
|
||||
|
||||
**External PR behavior (today):** Same code path runs. Fork PR has FwazB's commit
|
||||
at HEAD; auto-fixer creates a fix commit on top; pushes via Forgejo's branch
|
||||
ref (the fork PR was mirrored as `refs/heads/gh-pr-90/contributor/...` on
|
||||
Forgejo). Push works. Eval reset fires. Eval re-runs.
|
||||
|
||||
Wait — re-reading the existing fixer, it's actually already append-only. Good.
|
||||
The cherry-pick at merge time was the only thing rewriting SHAs. So **no fixer
|
||||
change required** for Option 2. The fixer commit is already at HEAD~1 from
|
||||
FwazB's commit on Forgejo. When we `git merge --no-ff` instead of cherry-pick,
|
||||
the merge commit's parent chain includes BOTH the fix commit AND FwazB's
|
||||
original commit. GitHub sees FwazB's SHA in ancestry → "merged" badge.
|
||||
|
||||
**Cross-check:** Verified PR 4066's existing branch state.
|
||||
|
||||
```
|
||||
$ git log refs/heads/gh-pr-90/contributor/arcium-confidential-computing-challenge --oneline
|
||||
d7916d65 auto-fix: strip 2 broken wiki links ← fixer's commit
|
||||
f6a59d7d claim: confidential computing reshapes... ← FwazB's commit
|
||||
```
|
||||
|
||||
Both commits already on Forgejo. When merge.py cherry-picked, it picked both
|
||||
commits onto main as new SHAs. With merge --no-ff, both stay intact, the merge
|
||||
commit references them, GitHub sees `f6a59d7d` (FwazB's HEAD on his fork) in
|
||||
main's ancestry, marks merged.
|
||||
|
||||
**This means scope shrinks:** the design is purely a merge.py change + a
|
||||
sync-mirror Step 2.1/4.5 plumbing tweak. No fixer module changes.
|
||||
|
||||
### Edge case: contributor rebases their fork after fixer appended
|
||||
|
||||
**Scenario:** FwazB's PR is in eval. Pipeline auto-fixer pushes a fix commit to
|
||||
Forgejo `gh-pr-90/...`. FwazB notices the original wiki-link issue, fixes it
|
||||
locally, force-pushes to his fork. sync-mirror's next cycle fetches his new
|
||||
SHA → tries to update the Forgejo branch → push from sync-mirror is regular
|
||||
(not --force) → fails because Forgejo branch has diverged from FwazB's fork.
|
||||
|
||||
**Today's behavior:** sync-mirror logs a warning. Forgejo branch keeps the
|
||||
appended-fix state. Eval continues against that state. FwazB's most recent
|
||||
fork state is silently ignored.
|
||||
|
||||
**Acceptable risk for hackathon.** The eval-reset-on-tip-change gate will
|
||||
re-trigger eval if anyone force-pushes the Forgejo branch later (e.g., manual
|
||||
re-sync). Documented; not fixing in this PR.
|
||||
|
||||
**Followup:** sync-mirror could detect the divergence, log a structured alert,
|
||||
and post a comment on the GitHub PR ("we detected your force-push but our
|
||||
appended fix has diverged; please rebase against `<sha>` or we'll close").
|
||||
Out of scope for this branch.
|
||||
|
||||
## Test plan
|
||||
|
||||
| # | Scenario | Expected | Validation |
|
||||
|---|---|---|---|
|
||||
| 1 | External PR, clean (no auto-fix needed) | Merged with `--no-ff`, contributor SHA in main, GitHub badge `merged: true`, on_merged comment + close | curl GitHub API for PR state after merge |
|
||||
| 2 | External PR with broken wiki links | auto-fixer appends commit, eval re-runs and approves, merge --no-ff brings BOTH commits in via merge commit, GitHub badge `merged: true` | log line trace + GitHub API |
|
||||
| 3 | External PR rejected by eval (substantive issue) | terminate_pr fires existing path, on_closed posts rejection comment + closes GitHub PR | GitHub API after eval cycle |
|
||||
| 4 | sync-mirror github_pr backfill on fork PR | `prs.github_pr` populated within one cron cycle (≤2 min) of mirror PR creation | sqlite3 SELECT after sync |
|
||||
| 5 | Contributor force-pushes fork mid-eval | sync-mirror logs warning, eval continues against pre-rebase state (documented behavior, not regression) | journalctl |
|
||||
| 6 | Re-running merge on already-merged PR | --no-ff fails cleanly (already up to date), caller handles as no-op | manual replay |
|
||||
|
||||
Test 1 and 2 are the critical-path tests. 3 verifies the rejection path didn't
|
||||
regress. 4 isolates the github_pr backfill fix. 5 is acceptance criteria for
|
||||
the documented edge case. 6 is idempotency.
|
||||
|
||||
**Production smoke test:** after deploy, manually create a tiny test PR from a
|
||||
secondary GitHub account. Walk it through the full lifecycle. Tear down before
|
||||
hackathon. Cost: 5 minutes, catches integration-level issues that unit tests
|
||||
miss.
|
||||
|
||||
## Backout procedure
|
||||
|
||||
If Option 2 misbehaves in production, revert path is one config-line toggle:
|
||||
|
||||
```python
|
||||
# lib/merge.py — gate on a feature flag for fast disable
|
||||
if branch.startswith("gh-pr-") and config.EXTERNAL_PR_NO_FF_MERGE:
|
||||
merge_fn = _merge_no_ff_external(branch)
|
||||
elif ...
|
||||
```
|
||||
|
||||
Default flag value: `True` after deploy. Set to `False` via env var on VPS to
|
||||
fall back to cherry-pick path immediately if anything breaks. No code revert
|
||||
required for a fast cutout.
|
||||
|
||||
Forgejo→GitHub merge commits already in main when the flag flips can't be
|
||||
un-merged (they're real commits), but the failure mode is the same as today:
|
||||
GitHub PR shows merged because the SHA is in history. No worse than current.
|
||||
|
||||
## Migration / cleanup
|
||||
|
||||
**FwazB's PR #90:** existing artifact, can't retroactively un-cherry-pick.
|
||||
Manual close with explanatory comment (Ship's option b). Cory-approved.
|
||||
|
||||
```
|
||||
We've merged your claim into the knowledge base via cherry-pick (commit f6a59d7d
|
||||
on main). Future external PRs will use `git merge --no-ff` so the GitHub merge
|
||||
badge fires correctly. This PR is being closed manually as the one historical
|
||||
case before the fix lands. Thanks for the contribution!
|
||||
|
||||
— LivingIP pipeline
|
||||
```
|
||||
|
||||
Posted via `_post_comment` + `_close_github_pr` from a one-off script
|
||||
(`scripts/close-fwazb-pr-90.py`).
|
||||
|
||||
## Implementation order
|
||||
|
||||
1. **Phase 1 — sync-mirror github_pr backfill (Bug #2, ~30 lines)**
|
||||
- Modify Step 2.1 to write `.fork-pr-map`
|
||||
- Modify Step 4.5 to read map for gh-pr-* branches
|
||||
- Branch: `epimetheus/external-merge-flow-bug2`
|
||||
- This deploys independently, no dependency on Bug #1 fix
|
||||
- Smoke: verify any future fork PR gets `prs.github_pr` populated within 2 min
|
||||
|
||||
2. **Phase 2 — merge.py --no-ff for gh-pr-* (Bug #1, ~120 lines)**
|
||||
- Add `_merge_no_ff_external` function
|
||||
- Add branch-prefix dispatch case in `_merge_domain_queue`
|
||||
- Add config flag `EXTERNAL_PR_NO_FF_MERGE` for backout
|
||||
- Branch: `epimetheus/external-merge-flow-bug1`
|
||||
- Depends on Phase 1 being live (otherwise `on_merged` still no-ops)
|
||||
- Smoke: test PR end-to-end
|
||||
|
||||
3. **Phase 3 — FwazB cleanup (~10 lines)**
|
||||
- Manual one-off script for PR #90
|
||||
- Independent of Phase 1/2 deploy
|
||||
|
||||
Two phases, separately reviewable, separately deployable. Bug #2 alone gives us
|
||||
contributor comments on closed-via-cherry-pick PRs (already partially solves
|
||||
the UX). Bug #1 adds the "merged" badge.
|
||||
|
||||
## Open questions for Ship
|
||||
|
||||
1. **Backout flag necessary?** I'm including `EXTERNAL_PR_NO_FF_MERGE` config
|
||||
flag for fast cutout. Overkill given low-traffic external-PR path? Or
|
||||
prudent given Accelerate Solana pressure?
|
||||
|
||||
2. **`.fork-pr-map` file location.** Currently proposed inside the bare repo
|
||||
(`$REPO_DIR/.fork-pr-map`). Alternative: `/opt/teleo-eval/state/fork-pr-map`.
|
||||
The bare-repo location keeps it co-located with the data it indexes (good
|
||||
for cleanup), but also means it gets tracked by `find` permission sweeps in
|
||||
the script. Either works; you decide.
|
||||
|
||||
3. **Phase 1 vs Phase 2 sequencing.** I have them as separately deployable, but
|
||||
Phase 1 alone produces a half-fixed state (comments + close, no merged
|
||||
badge). Worth merging into a single branch instead, or keep separate for
|
||||
smaller blast radius per deploy?
|
||||
|
||||
4. **Test plan #5 (rebase-after-fix) scope.** Documenting the silent-ignore as
|
||||
acceptable for now. Sufficient for hackathon, or do you want me to add the
|
||||
structured-alert path before May 5?
|
||||
|
||||
5. **`merge --no-ff` commit message format.** I have `"Merge PR: {branch}"`.
|
||||
Better signal to derive the GitHub PR number from the branch and write
|
||||
`"Merge external GitHub PR #{N}: {claim_title_or_branch_slug}"`? Trades
|
||||
verbosity for searchability in `git log --merges`.
|
||||
|
||||
Sending you this doc directly via Pentagon. Ganymede gets line-level review
|
||||
once you sign off on the architecture.
|
||||
Loading…
Reference in a new issue