teleo-infrastructure/deploy-manifest.md
m3taversal 681afad506
Consolidate pipeline code from teleo-codex + VPS into single repo
Sources merged:
- teleo-codex/ops/pipeline-v2/ (11 newer lib files, 5 new lib modules)
- teleo-codex/ops/ (agent-state, diagnostics expansion, systemd units, ops scripts)
- VPS /opt/teleo-eval/telegram/ (10 new bot files, agent configs)
- VPS /opt/teleo-eval/pipeline/ops/ (vector-gc, backfill-descriptions)
- VPS /opt/teleo-eval/sync-mirror.sh (Bug 2 + Step 2.5 fixes)

Non-trivial merges:
- connect.py: kept codex threshold (0.65) + added infra domain parameter
- watchdog.py: kept infra version (stale_pr integration, superset of codex)
- deploy.sh: codex rsync version (interim, until VPS git clone migration)
- diagnostics/app.py: codex decomposed dashboard (14 new route modules)

81 files changed, +17105/-200 lines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:52:26 +01:00


Deploy Manifest

Every PR that touches VPS-deployed code must include a deploy manifest, either in the PR description or as a comment posted before requesting a deploy. Rhea can reject deploys without one.

Template

Copy this into your PR description and fill it in:

## Deploy Manifest

**Files changed:**
- path/to/file.py (new | modified | deleted)

**Services to restart:**
- teleo-bot.service
- teleo-eval.service

**New ReadWritePaths:** (leave blank if none)
- /opt/teleo-eval/data/new-directory

**Migration steps:** (leave blank if none)
- Run: sqlite3 pipeline.db < migrations/001-add-column.sql

**Endpoints affected:**
- GET /health
- GET /api/alerts

**Expected behavior after deploy:**
- /health returns 200 with new field X
- New cron runs every 5 minutes
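
Because the template uses fixed bold section headers, a manifest can be checked mechanically before a deploy is requested. The following is a minimal sketch, not part of the existing tooling: `parse_manifest` and `manifest_problems` are hypothetical helper names, and the rule that every section other than the two marked "leave blank if none" needs at least one bullet is an assumption read off the template.

```python
import re

# Section names copied from the template above.
REQUIRED_SECTIONS = [
    "Files changed",
    "Services to restart",
    "New ReadWritePaths",
    "Migration steps",
    "Endpoints affected",
    "Expected behavior after deploy",
]
# The template marks only these two as "leave blank if none".
MAY_BE_BLANK = {"New ReadWritePaths", "Migration steps"}


def parse_manifest(text):
    """Map each bold section header to the bullet items listed under it."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = re.match(r"\*\*(.+?):\*\*", stripped)
        if header:
            current = header.group(1)
            sections[current] = []
        elif current is not None and stripped.startswith("- "):
            sections[current].append(stripped[2:].strip())
    return sections


def manifest_problems(text):
    """Return human-readable problems; an empty list means the manifest looks complete."""
    found = parse_manifest(text)
    problems = []
    for name in REQUIRED_SECTIONS:
        if name not in found:
            problems.append(f"missing section: {name}")
        elif not found[name] and name not in MAY_BE_BLANK:
            # Assumption: sections not marked optional need at least one item.
            problems.append(f"empty section: {name}")
    return problems
```

A check like this only confirms the manifest's shape. Whether the listed services and paths are actually correct still needs review.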

What Counts as VPS-Deployed Code

| File type | Example | Needs manifest? |
|---|---|---|
| Python application code | bot.py, app.py, alerting.py | Yes |
| Shell scripts on VPS | extract-cron.sh, evaluate-trigger.sh | Yes |
| systemd service/timer files | teleo-bot.service | Yes |
| Database migrations | ALTER TABLE, new tables | Yes |
| HTML/CSS/JS served by app | dashboard.html, teleo-app | Yes |
| Claim/source/entity markdown | domains/ai-alignment/claim.md | No |
| Schema definitions | schemas/claim.md | No (but see schema-change-protocol.md) |
| Agent identity/beliefs | agents/theseus/identity.md | No |
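
The table can be approximated as a path check over a PR's changed files. This is a rough sketch under stated assumptions: the exempt prefixes and deployed suffixes below are inferred from the example filenames in the table (plus `.sql` from the template's migration example) and are not an authoritative list, and `needs_manifest` and `pr_needs_manifest` are hypothetical names.

```python
from pathlib import PurePosixPath

# Assumption: knowledge-base content lives under these top-level directories.
EXEMPT_PREFIXES = ("domains/", "agents/", "schemas/")
# Assumption: these suffixes cover the "Yes" rows of the table.
DEPLOYED_SUFFIXES = {
    ".py", ".sh", ".service", ".timer", ".sql", ".html", ".css", ".js",
}


def needs_manifest(path):
    """Return True if a changed file looks like VPS-deployed code."""
    if path.startswith(EXEMPT_PREFIXES):
        return False
    return PurePosixPath(path).suffix in DEPLOYED_SUFFIXES


def pr_needs_manifest(changed_paths):
    """A PR needs a manifest if any changed file is VPS-deployed code."""
    return any(needs_manifest(p) for p in changed_paths)
```

A heuristic like this will misclassify edge cases, so treat it as a prompt for the author, not a gate. Schema changes in particular are still covered by schema-change-protocol.md.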

Rules

  1. No deploy without manifest. If the PR lacks one, Rhea bounces it back.
  2. List every service that needs a restart. "Just restart everything" is not acceptable because it causes unnecessary downtime.
  3. Declaring new ReadWritePaths is mandatory. If your code writes to a path it did not write to before, list that path. Missing ReadWritePaths are the #1 cause of silent deploy failures.
  4. The "Endpoints affected" field enables verification. Argus uses it to run post-deploy smoke tests. Without it, verification is guesswork.
  5. Migration steps must be idempotent. If the deploy is retried, re-running a migration must neither fail nor change the data a second time.
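
Rule 5 deserves care with SQLite, because `ALTER TABLE ... ADD COLUMN` has no `IF NOT EXISTS` form and fails if the column is already present. One way to make such a step safe to retry is to inspect the table first. The sketch below uses Python's standard `sqlite3` module. The function name `add_column_if_missing` and the `alerts` and `severity` names in the usage note are hypothetical, not taken from the actual pipeline schema.

```python
import sqlite3


def add_column_if_missing(conn, table, column, decl):
    """Add a column only when it is absent, so a retried deploy is a no-op.

    Returns True if the column was added, False if it already existed.
    The table and column names are interpolated into the SQL, so they must
    come from trusted migration code and never from user input.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    conn.commit()
    return True
```

For example, calling `add_column_if_missing(conn, "alerts", "severity", "TEXT")` twice adds the column on the first call and leaves the table unchanged on the second. For new tables and indexes, SQLite's `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` give the same property directly in SQL.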

Post-Deploy Verification

After Rhea restarts the service:

  1. Argus hits every endpoint listed in "Endpoints affected"
  2. Argus checks the systemd journal for errors logged in the last 60 seconds
  3. Argus reports pass/fail in the Engineering group chat

If verification fails, Rhea rolls back. The PR author fixes and resubmits.
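
The endpoint step of verification can be sketched as follows. This is an illustration, not Argus's actual implementation: `parse_endpoints`, `http_status`, and `smoke_test` are hypothetical names, the base URL is supplied by the caller, and treating only HTTP 200 as a pass is an assumption based on the template's "/health returns 200" example. The journal check is not covered here.

```python
import urllib.error
import urllib.request


def parse_endpoints(manifest_text):
    """Collect (method, path) pairs listed under "Endpoints affected"."""
    endpoints = []
    in_section = False
    for line in manifest_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("**"):
            # Any bold header starts a new section; track only the one we want.
            in_section = stripped.startswith("**Endpoints affected:**")
            continue
        if in_section and stripped.startswith("- "):
            method, _, path = stripped[2:].partition(" ")
            endpoints.append((method, path.strip()))
    return endpoints


def http_status(method, url, timeout=5.0):
    """Return the HTTP status code for one request."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses raise HTTPError; the code is still the answer.
        return exc.code


def smoke_test(base_url, endpoints, fetch=http_status):
    """Map "METHOD /path" to True (pass) or False (fail)."""
    results = {}
    for method, path in endpoints:
        try:
            status = fetch(method, base_url.rstrip("/") + path)
        except OSError:
            # Connection refused, DNS failure, or timeout counts as a failure.
            status = None
        # Assumption: only HTTP 200 counts as a pass.
        results[f"{method} {path}"] = status == 200
    return results
```

The `fetch` parameter is injectable so the logic can be exercised without a live service. Any `False` in the result would correspond to the rollback path described above.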