Pipeline reliability (8 fixes, reviewed by Ganymede+Rhea+Leo+Rio):
1. Merge API recovery — pre-flight approval check, transient/permanent distinction, jitter
2. Ghost PR detection — ls-remote branch check in reconciliation, network guard
3. Source status contract — directory IS status, no code change needed
4. Batch-state markers eliminated — two-gate skip (archive-check + batched branch-check)
5. Branch SHA tracking — batched ls-remote, auto-reset verdicts, dismiss stale reviews
6. Mirror pre-flight permissions — chown check in sync-mirror.sh
7. Telegram archive commit-after-write — git add/commit/push with rebase --abort fallback
8. Post-merge source archiving — queue/ → archive/{domain}/ after merge
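The batched ls-remote check mentioned in items 2, 4 and 5 can be sketched as below. This is a minimal illustration, not the pipeline's actual code: the function names, the `timeout` value, and the choice to return `None` on failure (one possible reading of "network guard") are all assumptions. Only `git ls-remote --heads` and its `<sha>\t<ref>` output format are standard.

```python
import subprocess


def parse_ls_remote(output: str) -> dict[str, str]:
    """Map branch name -> SHA from `git ls-remote --heads` output."""
    shas = {}
    for line in output.splitlines():
        sha, _, ref = line.partition("\t")
        if ref.startswith("refs/heads/"):
            shas[ref[len("refs/heads/"):]] = sha
    return shas


def remote_branch_shas(remote="origin", cwd=None, timeout=30):
    """One network call for all branches; None means 'could not reach remote'.

    Returning None (assumed guard) keeps a network failure from looking like
    'every branch is gone', which would flag every open PR as a ghost.
    """
    try:
        result = subprocess.run(
            ["git", "ls-remote", "--heads", remote],
            capture_output=True, text=True, check=True, cwd=cwd, timeout=timeout,
        )
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError):
        return None
    return parse_ls_remote(result.stdout)
```

With the mapping in hand, a PR whose branch is missing from it is a ghost candidate, and a PR whose recorded SHA differs from the mapped one has new commits and stale reviews.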
Pipeline fixes:
- merge_cycled flag — eval attempts preserved during merge-failure cycling (Ganymede+Rhea)
- merge_failures diagnostic counter
- Startup recovery preserves eval_attempts (was incorrectly resetting to 0)
- No-diff PRs auto-closed by eval (root cause of 17 zombie PRs)
- GC threshold aligned with substantive fixer budget (was 2, now 4)
- Conflict retry with 3-attempt budget + permanent conflict handler
- Local ff-merge fallback for Forgejo 405 errors
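The retry behaviour described above (transient/permanent distinction, jitter, a 3-attempt budget, a permanent-failure handler) might look roughly like this. It is a sketch under stated assumptions: the exception classes, `run_with_budget`, `backoff_delay`, and the base/cap values are hypothetical names and numbers chosen for illustration, and full-jitter exponential backoff is one common choice rather than necessarily what the pipeline uses.

```python
import random
import time


class TransientMergeError(Exception):
    """Hypothetical: a failure worth retrying (timeout, temporary server error)."""


class PermanentMergeError(Exception):
    """Hypothetical: a failure that retrying cannot fix."""


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def run_with_budget(op, max_attempts=3, on_permanent=None, sleep=time.sleep):
    """Call op() up to max_attempts times, sleeping with jitter between tries."""
    for attempt in range(max_attempts):
        try:
            return op()
        except PermanentMergeError as exc:
            # No point retrying: hand off to the permanent handler if given.
            if on_permanent is not None:
                return on_permanent(exc)
            raise
        except TransientMergeError:
            if attempt == max_attempts - 1:
                raise  # budget exhausted
            sleep(backoff_delay(attempt))
```

Passing `sleep` as a parameter keeps the loop testable without real delays.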
Telegram bot:
- KB retrieval: 3-layer (entity resolution → claim search → agent context)
- Reply-to-bot handler (context.bot.id check)
- Tag regex: @teleo|@futairdbot
- Prompt rewrite for natural analyst voice
- Market data API integration (Ben's token price endpoint)
- Conversation windows (5-message unanswered counter, per-user-per-chat)
- Conversation history in prompt (last 5 exchanges)
- Worktree file lock for archive writes
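For illustration, the tag regex listed above could be applied as follows. The pattern itself comes from the list; the `re.IGNORECASE` flag and the `is_tagged` helper are assumptions, not confirmed bot behaviour.

```python
import re

# Pattern from the changelog; IGNORECASE is an assumption.
TAG_RE = re.compile(r"@teleo|@futairdbot", re.IGNORECASE)


def is_tagged(text: str) -> bool:
    """True if the message mentions either bot handle anywhere in the text."""
    return TAG_RE.search(text) is not None
```

Note that the bare pattern also matches longer handles that begin with `@teleo`; a trailing `\b` would tighten it if that matters.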
Infrastructure:
- worktree_lock.py — file-based lock (flock) for main worktree coordination
- backfill-sources.py — source DB registration for Argus funnel
- batch-extract-50.sh v3 — two-gate skip, batched ls-remote, network guard
- sync-mirror.sh — auto-PR creation for mirrored GitHub branches, permission pre-flight
- Argus dashboard — conflicts + reviewing in backlog, queue count in funnel
- Enrichment-inside-frontmatter bug fix (regex anchor, not --- split)
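The frontmatter fix in the last item can be illustrated with a small sketch. The exact regex used in the pipeline is not shown in this commit, so the pattern and the `split_frontmatter` helper below are assumptions; the point is only that anchoring at the start of the file avoids the failure mode of splitting on every `---`.

```python
import re

# Anchored at the very start of the text; the closing fence must sit on its own line.
FRONTMATTER_RE = re.compile(r"\A---[ \t]*\n(.*?)\n---[ \t]*(?:\n|\Z)", re.DOTALL)


def split_frontmatter(text: str):
    """Return (frontmatter, body), or (None, text) if there is no frontmatter."""
    match = FRONTMATTER_RE.match(text)
    if match is None:
        return None, text
    return match.group(1), text[match.end():]


doc = "---\ntitle: a --- b\n---\nBody\n---\ntail\n"
frontmatter, body = split_frontmatter(doc)
# A naive doc.split("---") cuts inside the title value and at the rule in the body.
naive_parts = doc.split("---")
```

Anything appended to `body` stays outside the frontmatter, even when a value or the body itself contains `---`.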
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
85 lines
2.6 KiB
Python
"""File-based lock for ALL processes writing to the main worktree.

One lock, one mechanism (Ganymede: Option C). Used by:
- Pipeline daemon stages (entity_batch, source archiver, substantive_fixer) via async wrapper
- Telegram bot (sync context manager)

Protects: /opt/teleo-eval/workspaces/main/

flock auto-releases on process exit (even crash/kill). No stale lock cleanup needed.
"""

import asyncio
import fcntl
import logging
import time
from contextlib import asynccontextmanager, contextmanager
from pathlib import Path

logger = logging.getLogger("worktree-lock")

LOCKFILE = Path("/opt/teleo-eval/workspaces/.main-worktree.lock")


@contextmanager
def main_worktree_lock(timeout: float = 10.0):
    """Sync context manager for the Telegram bot and other external processes.

    Raises TimeoutError if the lock is not acquired within `timeout` seconds.

    Usage:
        with main_worktree_lock():
            # write to inbox/queue/, git add/commit/push, etc.
    """
    LOCKFILE.parent.mkdir(parents=True, exist_ok=True)
    fp = open(LOCKFILE, "w")
    start = time.monotonic()
    # Poll with a non-blocking flock so the timeout can be enforced.
    while True:
        try:
            fcntl.flock(fp, fcntl.LOCK_EX | fcntl.LOCK_NB)
            break
        except BlockingIOError:
            if time.monotonic() - start > timeout:
                fp.close()
                logger.warning("Main worktree lock timeout after %.0fs", timeout)
                raise TimeoutError(f"Could not acquire main worktree lock in {timeout}s")
            time.sleep(0.1)
    try:
        yield
    finally:
        # Explicit unlock, then close; closing alone would also release the flock.
        fcntl.flock(fp, fcntl.LOCK_UN)
        fp.close()


@asynccontextmanager
async def async_main_worktree_lock(timeout: float = 10.0):
    """Async context manager for pipeline daemon stages.

    Acquires the same file lock via run_in_executor (Ganymede: <1ms overhead).

    Usage:
        async with async_main_worktree_lock():
            await _git("fetch", "origin", "main", cwd=main_dir)
            await _git("reset", "--hard", "origin/main", cwd=main_dir)
            # ... write files, commit, push ...
    """
    # get_running_loop() is the supported call inside a coroutine;
    # get_event_loop() is deprecated in this context.
    loop = asyncio.get_running_loop()
    LOCKFILE.parent.mkdir(parents=True, exist_ok=True)
    fp = open(LOCKFILE, "w")

    def _acquire():
        # Runs in a worker thread so the polling sleep never blocks the event loop.
        start = time.monotonic()
        while True:
            try:
                fcntl.flock(fp, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return
            except BlockingIOError:
                if time.monotonic() - start > timeout:
                    fp.close()
                    raise TimeoutError(f"Could not acquire main worktree lock in {timeout}s")
                time.sleep(0.1)

    await loop.run_in_executor(None, _acquire)
    try:
        yield
    finally:
        fcntl.flock(fp, fcntl.LOCK_UN)
        fp.close()
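
The locking behaviour the module relies on can be checked in isolation with a small sketch. It uses a throwaway lock file in a temp directory rather than the real LOCKFILE, and `try_lock` is a hypothetical helper written for this demonstration. It shows that a second open of the same file is refused while the first holds an exclusive flock, and that closing the holder releases the lock without any cleanup step.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open file holding an exclusive flock, or None if already held."""
    fp = open(path, "w")
    try:
        fcntl.flock(fp, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fp.close()
        return None
    return fp


# Hypothetical lock path for the demo, not the production lock file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.lock")

holder = try_lock(demo_path)         # first open acquires the lock
contender = try_lock(demo_path)      # separate open is refused while holder is open
holder.close()                       # closing the file releases the flock
after_release = try_lock(demo_path)  # acquisition succeeds again
```

The same release happens when a process exits or is killed, because the kernel closes its file descriptors, which is why the module needs no stale-lock cleanup.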