ship: add agent SOP, auto-deploy infrastructure, cleanup stale files
- AGENT-SOP.md: enforceable checklist for commit/review/deploy cycle
- auto-deploy.sh + systemd units: 2-min timer pulls from Forgejo, syncs to working dirs, restarts services only when Python changes, smoke tests
- prune-branches.sh: dry-run-by-default branch cleanup tool
- Delete root diagnostics/ (stale artifacts, all code moved to ops/)
- Delete 7 orphaned HTML prototypes (untracked, local-only)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parent: 6361c7e9e8
Commit: 4e20986c25
10 changed files with 373 additions and 1432 deletions

@ -1,65 +0,0 @@
# Alerting Integration Patch for app.py

Five changes needed in the live app.py:

## 1. Add import (after `from activity_endpoint import handle_activity`)

```python
from alerting_routes import register_alerting_routes
```

## 2. Register routes in create_app() (after the last `app.router.add_*` line)

```python
# Alerting — active monitoring endpoints
register_alerting_routes(app, _alerting_conn)
```

## 3. Add helper function (before create_app)

```python
def _alerting_conn() -> sqlite3.Connection:
    """Dedicated read-only connection for alerting checks.

    Separate from app['db'] to avoid contention with request handlers.
    Always sets row_factory for named column access.
    """
    conn = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)
    conn.row_factory = sqlite3.Row
    return conn
```

## 4. Add /check and /api/alerts to PUBLIC_PATHS

```python
_PUBLIC_PATHS = frozenset({"/", "/api/metrics", "/api/rejections", "/api/snapshots",
                           "/api/vital-signs", "/api/contributors", "/api/domains",
                           "/api/audit", "/check", "/api/alerts"})
```

## 5. Add /api/failure-report/ prefix check in auth middleware

In the `@web.middleware` auth function, add this alongside the existing
`request.path.startswith("/api/audit/")` check:

```python
if request.path.startswith("/api/failure-report/"):
    return await handler(request)
```

## Deploy notes

- `alerting.py` and `alerting_routes.py` must be in the **same directory** as `app.py`
  (i.e., `/opt/teleo-eval/diagnostics/`). The import uses a bare module name, not
  a relative import, so Python resolves it via `sys.path`, which includes the working
  directory. If the deploy changes the working directory or uses a package structure,
  switch the import in `alerting_routes.py` line 11 to `from .alerting import ...`.
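
  The bare-module resolution described above is easy to sanity-check in a scratch directory — the `alerting.py` written here is a one-line stub, not the real module:

  ```bash
  tmp=$(mktemp -d) && cd "$tmp"
  echo 'VALUE = 42' > alerting.py        # stub standing in for the real alerting.py
  python3 -c 'import alerting; print(alerting.VALUE)'   # prints: 42
  ```

  If this raises `ModuleNotFoundError` instead, the interpreter was started with `-P`/`PYTHONSAFEPATH` (which drops the working directory from `sys.path`) and the relative-import fallback is needed.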

- The `/api/failure-report/{agent}` endpoint is standalone — any agent can pull their
  own report on demand via `GET /api/failure-report/<agent-name>?hours=24`.

## Files to deploy

- `alerting.py` → `/opt/teleo-eval/diagnostics/alerting.py`
- `alerting_routes.py` → `/opt/teleo-eval/diagnostics/alerting_routes.py`
- Patched `app.py` → `/opt/teleo-eval/diagnostics/app.py`

@ -1,84 +0,0 @@
# Teleo Codex — Evolution

How the collective intelligence system has grown, phase by phase and day by day. Maps tell you what the KB *contains*. This tells you how the KB *behaves*.

## Phases

### Phase 1 — Genesis (Mar 5-9)
Leo and Rio built the repo. 2 agents active. First claims, first positions, first source archives. Everything manual. ~200 commits, zero pipeline.

### Phase 2 — Agent bootstrap (Mar 10-14)
All 6 agents came online. Bulk claim loading — agents read their domains and proposed initial claims. Theseus restructured its belief hierarchy. Entity schema generalized cross-domain. ~450 commits but zero automated extractions. Agents learning who they are.

### Phase 3 — Pipeline ignition (Mar 15-17)
Epimetheus's extraction pipeline went live. 155 extractions in 2 days — the system shifted from manual to automated. 67 MetaDAO decision records ingested (governance history). The knowledge base doubled in density.

### Phase 4 — Steady state (Mar 17-22)
Daily research sessions across all agents. Every agent running 1 session/day, archiving 3-10 sources each. Enrichment cycles started — new evidence flowing to existing claims. Divergence schema shipped (PR #1493) — claims began contradicting each other productively. ~520 commits.

### Phase 5 — Real-time (Mar 23+)
Telegram integration went live. Rio started extracting from live conversations. Astra expanded into the energy domain (fusion economics, HTS magnets). Infrastructure overhead spiked as ingestion scaled. Transcript archival deployed. The system went from batch to live.

## Daily Heartbeat

```
Date        | Ext | Dec | TG | Res | Ent | Infra | Agents active
------------|-----|-----|----|-----|-----|-------|------------------------------------------
2026-03-05  |   0 |   0 |  0 |   0 |   0 |     0 | leo, rio
2026-03-06  |   0 |   0 |  0 |   0 |   0 |     0 | clay, leo, rio, theseus, vida
2026-03-07  |   0 |   0 |  0 |   0 |   0 |     0 | astra, clay, leo, theseus, vida
2026-03-08  |   0 |   0 |  0 |   0 |   0 |     0 | astra, clay, leo, rio, theseus, vida
2026-03-09  |   0 |   0 |  0 |   0 |   0 |     0 | clay, leo, rio, theseus, vida
2026-03-10  |   0 |   0 |  0 |   3 |   0 |     1 | astra, clay, leo, rio, theseus, vida
2026-03-11  |   0 |   0 |  0 |   7 |   0 |    30 | astra, clay, leo, rio, theseus, vida
2026-03-12  |   0 |   0 |  0 |   1 |   0 |    11 | astra, clay, leo, rio, theseus, vida
2026-03-13  |   0 |   0 |  0 |   0 |   0 |     0 | theseus
2026-03-14  |   0 |   0 |  0 |   0 |   0 |    26 | rio
2026-03-15  |  35 |  30 |  0 |   0 |   6 |     5 | leo, rio
2026-03-16  |  53 |  37 |  0 |   2 |   9 |    21 | clay, epimetheus, leo, rio, theseus, vida
2026-03-17  |   0 |   0 |  0 |   1 |   0 |     0 | rio
2026-03-18  |  81 |   0 |  4 |  12 |  17 |    18 | astra, clay, epimetheus, leo, rio, theseus, vida
2026-03-19  |  67 |   0 |  0 |   5 |  26 |    41 | astra, epimetheus, leo, rio, theseus, vida
2026-03-20  |  27 |   1 |  0 |   6 |   9 |    38 | astra, epimetheus, leo, rio, theseus, vida
2026-03-21  |  23 |   0 |  1 |   5 |   3 |    44 | astra, epimetheus, leo, rio, theseus, vida
2026-03-22  |  17 |   0 |  0 |   5 |   2 |    32 | astra, leo, rio, theseus, vida
2026-03-23  |  22 |   0 | 14 |   5 |  16 |   190 | astra, epimetheus, leo, rio, theseus, vida
2026-03-24  |  31 |   0 |  7 |   5 |  21 |    70 | astra, epimetheus, leo, rio, theseus, vida
2026-03-25  |  14 |   0 | 10 |   4 |  18 |    36 | astra, leo, rio, theseus, vida
```

**Legend:** Ext = claim extractions, Dec = decision records, TG = Telegram extractions, Res = research sessions, Ent = entity updates, Infra = pipeline/maintenance commits.

## Key Milestones

| Date | Event |
|------|-------|
| Mar 5 | Repo created. Leo + Rio active. First claims and positions. |
| Mar 6 | All 6 agents came online. Archive standardization. PR review requirement established. |
| Mar 10 | First research sessions. Theseus restructured belief hierarchy. Leo added diagnostic schemas. |
| Mar 11 | Rio generalized entity schema cross-domain. 7 research sessions in one day. |
| Mar 15 | Pipeline ignition — 35 extractions + 30 decision records in one day. |
| Mar 16 | Biggest extraction day — 53 extractions + 37 decisions. |
| Mar 18 | Peak research — 12 sessions. Clay's last active day (2 sessions). 81 extractions. |
| Mar 19 | Divergence schema shipped (PR #1493). Game mechanic for structured disagreement. |
| Mar 21 | Telegram integration — first live chat extractions. |
| Mar 23 | Infrastructure spike (190 infra commits) as ingestion scaled. Rio Telegram goes live at volume. |
| Mar 25 | Transcript archival deployed. Astra expanded into energy domain. |

## Flags & Concerns

- **Clay dropped off after Mar 18.** Only 2 research sessions total vs. 8 for other agents. Entertainment domain is under-researched.
- **Infra-to-substance ratio is ~2:1.** Expected during bootstrap but should improve. Mar 23 was worst (190 infra vs. 22 extractions).
- **Enrichment quality issues.** Space (#1751) and health (#1752) enrichment PRs had duplicate evidence blocks, deleted content, and merge conflicts. Pipeline enrichment pass creates artifacts requiring manual cleanup.

## Current State (Mar 25)

| Metric | Count |
|--------|-------|
| Claims in KB | 426 |
| Entities tracked | 103 |
| Decision records | 76 |
| Sources archived | 858 |
| Domains active | 14 |
| Agents active | 6 (Clay intermittent) |
| Total commits | 1,939 |

File diff suppressed because it is too large

@ -1,59 +0,0 @@
# Week 3 (Mar 17-23, 2026) — From Batch to Live

## Headline
The collective went from a knowledge base to a live intelligence system. Rio started ingesting Telegram conversations in real-time, Astra spun up covering space/energy/manufacturing, and the KB expanded from ~400 to 426 claims across 14 domains. The pipeline processed 597 sources and generated 117 merged PRs.

## What actually happened

### Astra came alive
The biggest structural change — a new agent covering space-development, energy, manufacturing, and robotics. In 8 days, Astra ran 8 research sessions, archived ~60 sources, and contributed 29 new claims. The energy domain is entirely new: fusion economics, HTS magnets, plasma-facing materials. Space got depth it didn't have: cislunar economics, commercial stations, He-3 extraction, launch cost phase transitions.

### Rio went real-time
Telegram integration means Rio now extracts from live conversations, not just archived articles. ~59 Telegram-sourced commits. Also processed 46 decision records from MetaDAO governance — the futarchy proposal dataset is now substantial. Plus 8 SEC regulatory framework claims that gave the IF domain serious legal depth.

### Theseus stayed steady
8 research sessions, ~58 sources. Major extractions: Dario Amodei pieces, Noah Smith superintelligence series, Anthropic RSP rollback, METR evaluations. AI alignment domain is the deepest in the KB.

### Vida kept pace
8 research sessions, ~51 sources. Health enrichments from GLP-1 economics, clinical AI, SDOH evidence.

### Clay went quiet
2 research sessions on Mar 18, then silence. Entertainment domain is the least active. Needs attention.

### Leo focused on infrastructure
Divergence schema shipped (PR #1493). 6 research sessions. Most time went to PR review, conflict resolution, and evaluator role.

## By the numbers

| Metric | Count |
|--------|-------|
| New claims added | ~29 |
| Existing claims enriched | ~132 files modified |
| Sources archived | 597 |
| Entities added | 10 |
| Decision records added | 46 |
| Merged PRs | 117 |
| Research sessions | 42 |
| Telegram extractions | ~59 |
| Pipeline/maintenance commits | ~420 |

## What's meaningful

- **29 new claims** — real intellectual growth, mostly space/energy (Astra) and IF regulatory (Rio)
- **132 claim enrichments** — evidence accumulating on existing positions
- **46 decision records** — primary futarchy data, not analysis of analysis
- **Divergence schema** — the KB can now track productive disagreements
- **Telegram going live** — first real-time contribution channel

## What changed about how we think

The biggest qualitative shift: the KB now has enough depth to create real tensions. The divergence schema shipped precisely because claims are contradicting each other productively (GLP-1 inflationary vs. deflationary by geography; human-AI collaboration helps vs. hurts by task type). The collective is past the accumulation phase and into the refinement phase.

## Concerns

1. Clay silent after Mar 18
2. Enrichment pipeline creating duplicate artifacts (PRs #1751, #1752)
3. Infra-to-substance ratio at 2:1

---
*Generated by Leo, 2026-03-25*

ops/AGENT-SOP.md (new file, 78 lines)

@ -0,0 +1,78 @@
# Agent SOP: Ship, Review, Deploy

Load at session start. No exceptions.

## Code Changes

1. Branch from main: `git checkout -b {agent-name}/{description}`
2. Make changes. One branch per task. One concern per PR.
3. Commit with agent-name prefix, what changed and why.
4. Push to Forgejo. Open PR with deploy manifest (see deploy-manifest.md).
5. Ganymede reviews. Address feedback on same branch.
6. Merge after approval. Delete branch immediately.
7. Auto-deploy handles the rest. Do not manually deploy.
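
Steps 1-3 can be dry-run in a throwaway repo (hypothetical agent `rio`, made-up file and commit message; the real flow ends with the push and PR in step 4):

```bash
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git -c user.name=rio -c user.email=rio@example.com \
    commit -q --allow-empty -m "init"                   # stand-in for main
git checkout -q -b rio/fix-extractor-timeout            # 1. {agent-name}/{description}
echo "timeout = 30" > extractor.cfg                     # 2. one concern per PR
git add extractor.cfg
git -c user.name=rio -c user.email=rio@example.com \
    commit -q -m "rio: raise extractor timeout to 30s"  # 3. prefix + what/why
git symbolic-ref --short HEAD                           # prints: rio/fix-extractor-timeout
```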

## Do Not

- SCP files directly to VPS
- Deploy before committing to the repo
- Edit files on VPS directly
- Send the same review request twice for unchanged code
- Claim code exists or was approved without reading git/files to verify
- Go from memory when you can verify from files
- Reuse branch names (Forgejo returns 409 Conflict on closed PR branches)

## Canonical File Locations

| Code | Location |
|---|---|
| Pipeline lib | `ops/pipeline-v2/lib/` |
| Pipeline scripts | `ops/pipeline-v2/` |
| Diagnostics | `ops/diagnostics/` |
| Agent state | `ops/agent-state/` |
| Deploy/ops scripts | `ops/` |
| Claims | `core/`, `domains/`, `foundations/` |
| Agent identity | `agents/{name}/` |

One location per file. If your path doesn't match this table, stop.

## Verification Before Acting

- Before editing: read the file. Never describe code from memory.
- Before reviewing: check git log for prior approvals on the same files.
- Before deploying: `git status` must show clean tree.
- Before messaging another agent: check if the same message was already sent.
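
The clean-tree gate can be sketched with `git status --porcelain`, which prints nothing exactly when the tree is clean (throwaway repo, simulated leftover file):

```bash
tmp=$(mktemp -d) && cd "$tmp" && git init -q
touch untracked.txt                         # simulate a leftover local edit
if [ -n "$(git status --porcelain)" ]; then
    echo "tree dirty: refusing to deploy"   # prints this branch
else
    echo "tree clean: ok to deploy"
fi
```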

## Branch Hygiene

- Delete branch immediately after merge.
- Nightly research branches: deleted after 7 days if unmerged.
- Never leave a branch open with no active work.

## Deploy

After merge to main, auto-deploy runs within 2 minutes on VPS:
1. Pulls latest main into deploy checkout
2. Syntax-checks all Python files
3. Syncs to working directories (pipeline, diagnostics, agent-state)
4. Restarts services only if Python files changed
5. Runs smoke tests (systemd status + health endpoints)

Manual deploy (only if auto-deploy is broken):
```
cd ops && ./deploy.sh --dry-run && ./deploy.sh --restart
```

Check auto-deploy status: `journalctl -u teleo-auto-deploy -n 20`

## Shell and Python Safety

- Run `bash -n script.sh` after modifying any shell script.
- Never interpolate shell variables into Python strings via `'$var'`.
  Pass values via `os.environ` or `sys.argv`.
- Never write credentials to `.git/config`. Use per-command `git -c http.extraHeader`.
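
A before/after sketch of the interpolation rule (`REPORT_PATH` is a hypothetical variable; the bad line is shown only as a comment):

```bash
# BAD — the shell splices the value into Python source; a quote or backslash
# in the value breaks (or injects into) the generated code:
#   python3 -c "print(open('$REPORT_PATH').read())"

# GOOD — hand the value over out-of-band via the environment:
REPORT_PATH=$(mktemp)
printf 'hello\n' > "$REPORT_PATH"
REPORT_PATH="$REPORT_PATH" python3 -c \
    'import os; print(open(os.environ["REPORT_PATH"]).read(), end="")'   # prints: hello
```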

## Schema Changes

Any PR that changes a file format, DB table, or API response shape must follow
`ops/schema-change-protocol.md`. Tag all consumers. Include migration.

ops/auto-deploy-setup.md (new file, 84 lines)

@ -0,0 +1,84 @@
# Auto-Deploy Setup

One-time setup on VPS. After this, merges to main deploy automatically within 2 minutes.

## Prerequisites

- SSH access as `teleo` user: `ssh teleo@77.42.65.182`
- Forgejo running at localhost:3000
- `teleo` user has sudo access for `teleo-*` services

## Steps

### 1. Create the deploy checkout

```bash
git clone http://localhost:3000/teleo/teleo-codex.git /opt/teleo-eval/workspaces/deploy
cd /opt/teleo-eval/workspaces/deploy
git checkout main
```

This checkout is ONLY for auto-deploy. The pipeline's main worktree at
`/opt/teleo-eval/workspaces/main` is separate and untouched.

### 2. Install systemd units

```bash
sudo cp /opt/teleo-eval/workspaces/deploy/ops/auto-deploy.service /etc/systemd/system/teleo-auto-deploy.service
sudo cp /opt/teleo-eval/workspaces/deploy/ops/auto-deploy.timer /etc/systemd/system/teleo-auto-deploy.timer
sudo systemctl daemon-reload
sudo systemctl enable --now teleo-auto-deploy.timer
```

### 3. Verify

```bash
# Timer is active
systemctl status teleo-auto-deploy.timer

# Run once manually to seed the stamp file
sudo systemctl start teleo-auto-deploy.service

# Check logs
journalctl -u teleo-auto-deploy -n 20
```

### 4. Add teleo sudoers for auto-deploy restarts

If not already present, add to `/etc/sudoers.d/teleo`:
```
teleo ALL=(ALL) NOPASSWD: /bin/systemctl restart teleo-pipeline, /bin/systemctl restart teleo-diagnostics
```

## How It Works

Every 2 minutes, the timer fires `auto-deploy.sh`:
1. Fetches main from Forgejo (localhost)
2. Compares SHA against `/opt/teleo-eval/.last-deploy-sha`
3. If new commits: pulls, syntax-checks Python, syncs to working dirs
4. Restarts services ONLY if Python files changed in relevant paths
5. Runs smoke tests (systemd status + health endpoints)
6. Updates stamp on success. On failure: does NOT update stamp, retries next cycle.
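
Steps 2 and 6 reduce to a stamp-compare idiom; a minimal sketch with fake SHAs and a temp file standing in for `/opt/teleo-eval/.last-deploy-sha`:

```bash
STAMP=$(mktemp)
echo "aaa111" > "$STAMP"                 # SHA of the last successful deploy
NEW_SHA="bbb222"                         # pretend this is origin/main now
if [ "$NEW_SHA" = "$(cat "$STAMP")" ]; then
    echo "up to date — nothing to do"
else
    echo "deploying $NEW_SHA"
    # ... sync, restart, smoke-test; exit non-zero here on failure so the
    # stamp stays stale and the next timer cycle retries ...
    echo "$NEW_SHA" > "$STAMP"           # only reached on success
fi
cat "$STAMP"                             # prints: bbb222
```

Leaving the stamp untouched on failure is what makes the timer self-healing: every 2-minute cycle re-runs the whole deploy until it succeeds.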

## Monitoring

```bash
# Recent deploys
journalctl -u teleo-auto-deploy --since "1 hour ago"

# Timer schedule
systemctl list-timers teleo-auto-deploy.timer

# Last deployed SHA
cat /opt/teleo-eval/.last-deploy-sha
```

## Troubleshooting

**"git pull --ff-only failed"**: The deploy checkout diverged from main.
Fix: `cd /opt/teleo-eval/workspaces/deploy && git reset --hard origin/main`

**Syntax errors blocking deploy**: Fix the code, push to main. Next cycle retries.

**Service won't restart**: Check `journalctl -u teleo-pipeline -n 30`. Fix and push.
Auto-deploy will retry because stamp wasn't updated.

ops/auto-deploy.service (new file, 12 lines)

@ -0,0 +1,12 @@
# Install: sudo cp ops/auto-deploy.service /etc/systemd/system/teleo-auto-deploy.service
# Then: sudo systemctl daemon-reload && sudo systemctl enable --now teleo-auto-deploy.timer
[Unit]
Description=Auto-deploy teleo-codex from Forgejo to working directories
After=network.target

[Service]
Type=oneshot
User=teleo
ExecStart=/opt/teleo-eval/workspaces/deploy/ops/auto-deploy.sh
StandardOutput=journal
StandardError=journal

ops/auto-deploy.sh (new executable file, 128 lines)

@ -0,0 +1,128 @@
#!/usr/bin/env bash
# auto-deploy.sh — Pull from Forgejo, sync to working dirs, restart if needed.
# Runs as systemd timer (teleo-auto-deploy.timer) every 2 minutes.
# Exits silently when nothing has changed.
set -euo pipefail

DEPLOY_CHECKOUT="/opt/teleo-eval/workspaces/deploy"
PIPELINE_DIR="/opt/teleo-eval/pipeline"
DIAGNOSTICS_DIR="/opt/teleo-eval/diagnostics"
AGENT_STATE_DIR="/opt/teleo-eval/ops/agent-state"
STAMP_FILE="/opt/teleo-eval/.last-deploy-sha"
LOG_TAG="auto-deploy"

log() { logger -t "$LOG_TAG" "$1"; echo "$(date '+%Y-%m-%d %H:%M:%S') $1"; }

if [ ! -d "$DEPLOY_CHECKOUT/.git" ]; then
    log "ERROR: Deploy checkout not found at $DEPLOY_CHECKOUT. Run setup first."
    exit 1
fi

cd "$DEPLOY_CHECKOUT"
if ! git fetch origin main --quiet 2>&1; then
    log "ERROR: git fetch failed"
    exit 1
fi

NEW_SHA=$(git rev-parse origin/main)
OLD_SHA=$(cat "$STAMP_FILE" 2>/dev/null || echo "none")

if [ "$NEW_SHA" = "$OLD_SHA" ]; then
    exit 0
fi

log "New commits: ${OLD_SHA:0:8} -> ${NEW_SHA:0:8}"

git checkout main --quiet 2>/dev/null || true
if ! git pull --ff-only --quiet 2>&1; then
    log "ERROR: git pull --ff-only failed. Manual intervention needed."
    exit 1
fi

# Syntax check all Python files before copying
ERRORS=0
for f in ops/pipeline-v2/lib/*.py ops/pipeline-v2/*.py ops/diagnostics/*.py; do
    [ -f "$f" ] || continue
    if ! python3 -c "import ast, sys; ast.parse(open(sys.argv[1]).read())" "$f" 2>/dev/null; then
        log "SYNTAX ERROR: $f"
        ERRORS=$((ERRORS + 1))
    fi
done
if [ "$ERRORS" -gt 0 ]; then
    log "ERROR: $ERRORS syntax errors. Deploy aborted. Fix and push again."
    exit 1
fi
log "Syntax check passed"

# Sync to working directories (mirrors deploy.sh logic)
RSYNC_FLAGS="-az --exclude=__pycache__ --exclude=*.pyc --exclude=*.bak*"

rsync $RSYNC_FLAGS ops/pipeline-v2/lib/ "$PIPELINE_DIR/lib/"

for f in teleo-pipeline.py reweave.py batch-extract-50.sh; do
    [ -f "ops/pipeline-v2/$f" ] && rsync $RSYNC_FLAGS "ops/pipeline-v2/$f" "$PIPELINE_DIR/$f"
done

rsync $RSYNC_FLAGS ops/diagnostics/ "$DIAGNOSTICS_DIR/"
rsync $RSYNC_FLAGS ops/agent-state/ "$AGENT_STATE_DIR/"
[ -f ops/research-session.sh ] && rsync $RSYNC_FLAGS ops/research-session.sh /opt/teleo-eval/research-session.sh

log "Files synced"

# Restart services only if Python files changed
RESTART=""
if [ "$OLD_SHA" != "none" ]; then
    if git diff --name-only "$OLD_SHA" "$NEW_SHA" -- ops/pipeline-v2/ 2>/dev/null | grep -q '\.py$'; then
        RESTART="$RESTART teleo-pipeline"
    fi
    if git diff --name-only "$OLD_SHA" "$NEW_SHA" -- ops/diagnostics/ 2>/dev/null | grep -q '\.py$'; then
        RESTART="$RESTART teleo-diagnostics"
    fi
else
    RESTART="teleo-pipeline teleo-diagnostics"
fi

if [ -n "$RESTART" ]; then
    log "Restarting:$RESTART"
    sudo systemctl restart $RESTART
    sleep 5

    FAIL=0
    for svc in $RESTART; do
        if systemctl is-active --quiet "$svc"; then
            log "$svc: active"
        else
            log "ERROR: $svc failed to start"
            journalctl -u "$svc" -n 5 --no-pager 2>/dev/null || true
            FAIL=1
        fi
    done

    if echo "$RESTART" | grep -q "teleo-pipeline"; then
        if curl -sf --connect-timeout 3 http://localhost:8080/health > /dev/null 2>&1; then
            log "pipeline health: OK"
        else
            log "WARNING: pipeline health check failed"
            FAIL=1
        fi
    fi

    if echo "$RESTART" | grep -q "teleo-diagnostics"; then
        if curl -sf --connect-timeout 3 http://localhost:8081/ops > /dev/null 2>&1; then
            log "diagnostics health: OK"
        else
            log "WARNING: diagnostics health check failed"
            FAIL=1
        fi
    fi

    if [ "$FAIL" -gt 0 ]; then
        log "WARNING: Smoke test failures. NOT updating stamp. Will retry next cycle."
        exit 1
    fi
else
    log "No Python changes — services not restarted"
fi

echo "$NEW_SHA" > "$STAMP_FILE"
log "Deploy complete: $(git log --oneline -1 "$NEW_SHA")"

ops/auto-deploy.timer (new file, 12 lines)

@ -0,0 +1,12 @@
# Install: sudo cp ops/auto-deploy.timer /etc/systemd/system/teleo-auto-deploy.timer
# Then: sudo systemctl daemon-reload && sudo systemctl enable --now teleo-auto-deploy.timer
[Unit]
Description=Run teleo auto-deploy every 2 minutes

[Timer]
OnBootSec=30
OnUnitActiveSec=2min
AccuracySec=10s

[Install]
WantedBy=timers.target

ops/prune-branches.sh (new executable file, 59 lines)

@ -0,0 +1,59 @@
#!/usr/bin/env bash
# prune-branches.sh — Delete merged remote branches older than N days.
# Usage: ./prune-branches.sh [--days 14] [--remote forgejo] [--execute]
# Default: dry-run (shows what would be deleted). Pass --execute to actually delete.
set -euo pipefail

DAYS=14
REMOTE="forgejo"
EXECUTE=false

while [ $# -gt 0 ]; do
    case "$1" in
        --days) DAYS="$2"; shift 2 ;;
        --remote) REMOTE="$2"; shift 2 ;;
        --execute) EXECUTE=true; shift ;;
        --help|-h) echo "Usage: $0 [--days N] [--remote name] [--execute]"; exit 0 ;;
        *) echo "Unknown arg: $1"; exit 1 ;;
    esac
done

CUTOFF=$(date -v-${DAYS}d +%Y-%m-%d 2>/dev/null || date -d "-${DAYS} days" +%Y-%m-%d)
PROTECTED="main|HEAD"

echo "Scanning $REMOTE for merged branches older than $CUTOFF..."
echo ""

git fetch "$REMOTE" --prune --quiet

COUNT=0
DELETE_COUNT=0

while IFS= read -r branch; do
    branch=$(echo "$branch" | sed 's/^[[:space:]]*//')
    [ -z "$branch" ] && continue

    short="${branch#$REMOTE/}"
    echo "$short" | grep -qE "^($PROTECTED)$" && continue

    last_date=$(git log -1 --format='%ai' "$branch" 2>/dev/null | cut -d' ' -f1)
    [ -z "$last_date" ] && continue
    COUNT=$((COUNT + 1))

    if [[ "$last_date" < "$CUTOFF" ]]; then
        if $EXECUTE; then
            echo "  DELETE: $short ($last_date)"
            git push "$REMOTE" --delete "$short" 2>/dev/null && DELETE_COUNT=$((DELETE_COUNT + 1)) || echo "  FAILED: $short"
        else
            echo "  WOULD DELETE: $short ($last_date)"
            DELETE_COUNT=$((DELETE_COUNT + 1))
        fi
    fi
done < <(git branch -r | grep -E "^[[:space:]]+$REMOTE/")
# note: `git branch -r` indents each line with two spaces, so match any
# leading whitespace rather than a single literal space

echo ""
if $EXECUTE; then
    echo "Deleted $DELETE_COUNT of $COUNT branches."
else
    echo "Would delete $DELETE_COUNT of $COUNT branches. Run with --execute to proceed."
fi