teleo-codex/ops/auto-deploy-setup.md
m3taversal 4e20986c25 ship: add agent SOP, auto-deploy infrastructure, cleanup stale files
- AGENT-SOP.md: enforceable checklist for commit/review/deploy cycle
- auto-deploy.sh + systemd units: 2-min timer pulls from Forgejo, syncs
  to working dirs, restarts services only when Python changes, smoke tests
- prune-branches.sh: dry-run-by-default branch cleanup tool
- Delete root diagnostics/ (stale artifacts, all code moved to ops/)
- Delete 7 orphaned HTML prototypes (untracked, local-only)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:46:25 +01:00

# Auto-Deploy Setup
One-time setup on the VPS. After this, merges to `main` deploy automatically within about 2 minutes (one timer interval).
## Prerequisites
- SSH access as `teleo` user: `ssh teleo@77.42.65.182`
- Forgejo running at localhost:3000
- `teleo` user has sudo access for `teleo-*` services
## Steps
### 1. Create the deploy checkout
```bash
git clone http://localhost:3000/teleo/teleo-codex.git /opt/teleo-eval/workspaces/deploy
cd /opt/teleo-eval/workspaces/deploy
git checkout main
```
This checkout is ONLY for auto-deploy. The pipeline's main worktree at
`/opt/teleo-eval/workspaces/main` is separate and untouched.
### 2. Install systemd units
```bash
sudo cp /opt/teleo-eval/workspaces/deploy/ops/auto-deploy.service /etc/systemd/system/teleo-auto-deploy.service
sudo cp /opt/teleo-eval/workspaces/deploy/ops/auto-deploy.timer /etc/systemd/system/teleo-auto-deploy.timer
sudo systemctl daemon-reload
sudo systemctl enable --now teleo-auto-deploy.timer
```
### 3. Verify
```bash
# Timer is active
systemctl status teleo-auto-deploy.timer
# Run once manually to seed the stamp file
sudo systemctl start teleo-auto-deploy.service
# Check logs
journalctl -u teleo-auto-deploy -n 20
```
### 4. Add teleo sudoers for auto-deploy restarts
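If not already present, add the following line to `/etc/sudoers.d/teleo`. Editing with `sudo visudo -f /etc/sudoers.d/teleo` validates the syntax before saving: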
```
teleo ALL=(ALL) NOPASSWD: /bin/systemctl restart teleo-pipeline, /bin/systemctl restart teleo-diagnostics
```
## How It Works
Every 2 minutes, the timer fires `auto-deploy.sh`:
1. Fetches main from Forgejo (localhost)
2. Compares the fetched SHA against the stamp file, `/opt/teleo-eval/.last-deploy-sha`
3. If new commits: pulls, syntax-checks Python, syncs to working dirs
4. Restarts services ONLY if Python files changed in relevant paths
5. Runs smoke tests (systemd status + health endpoints)
6. Updates the stamp on success. On failure, the stamp is NOT updated, so the next cycle retries.
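
The stamp logic in the cycle above can be sketched in shell. This is a minimal illustration under stated assumptions, not the contents of `ops/auto-deploy.sh`: the function names `needs_deploy`, `python_changed`, and `update_stamp` are hypothetical, and because the "relevant paths" filter is not described here, the sketch matches on the `.py` suffix only.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the real ops/auto-deploy.sh may differ.

# Succeeds (exit 0) when the stamp file is missing or holds a different SHA,
# i.e. when there is something new to deploy.
needs_deploy() {
    local stamp_file="$1" remote_sha="$2"
    if [ ! -f "$stamp_file" ]; then
        return 0
    fi
    [ "$(cat "$stamp_file")" != "$remote_sha" ]
}

# Reads changed paths on stdin, one per line; succeeds if any ends in .py.
# (Assumption: the real script also restricts this to specific directories.)
python_changed() {
    grep -q '\.py$'
}

# Records the deployed SHA. Call this only after every earlier step succeeded,
# so that a failed cycle leaves the old stamp in place and is retried.
update_stamp() {
    local stamp_file="$1" sha="$2"
    printf '%s\n' "$sha" > "$stamp_file"
}
```

In an actual run, the SHA would presumably come from `git rev-parse origin/main` after a fetch, and the changed-file list from `git diff --name-only` between the old and new SHAs. Writing the stamp last is what makes retries automatic: any failure before that point leaves the stamp unchanged.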
## Monitoring
```bash
# Recent deploys
journalctl -u teleo-auto-deploy --since "1 hour ago"
# Timer schedule
systemctl list-timers teleo-auto-deploy.timer
# Last deployed SHA
cat /opt/teleo-eval/.last-deploy-sha
```
## Troubleshooting
**"git pull --ff-only failed"**: The deploy checkout diverged from main.
Fix: `cd /opt/teleo-eval/workspaces/deploy && git reset --hard origin/main`
This discards any local commits or changes in the deploy checkout, which is safe because that checkout is used only for auto-deploy.

**Syntax errors blocking deploy**: Fix the code and push to main. The next cycle retries.

**Service won't restart**: Check `journalctl -u teleo-pipeline -n 30`. Fix and push.
Auto-deploy will retry because the stamp wasn't updated.
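
To catch syntax errors before pushing, a similar check can be run locally. The sketch below assumes the deploy script's syntax check is equivalent to Python's standard `py_compile` module, which is not confirmed here; `syntax_check` is a hypothetical helper name.

```bash
# Illustrative sketch: byte-compile every .py file under a directory.
# Returns non-zero if any file fails to compile.
# Side effect: py_compile writes __pycache__ directories next to the sources.
syntax_check() {
    local dir="$1"
    find "$dir" -name '*.py' -exec python3 -m py_compile {} +
}
```

For example, `syntax_check . && git push` from the repository root pushes only if every Python file compiles.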