teleo-codex/diagnostics/PATCH_INSTRUCTIONS.md
m3taversal 33e670b436 argus: add active alerting system (Phase 1)
Three new files for the engineering acceleration initiative:
- alerting.py: 7 health check functions (dormant agents, quality regression,
  throughput anomaly, rejection spikes, stuck loops, cost spikes, domain
  rejection patterns) + failure report generator
- alerting_routes.py: /check, /api/alerts, /api/failure-report/{agent} endpoints
- PATCH_INSTRUCTIONS.md: integration guide for app.py (imports, route
  registration, auth middleware bypass, DB connection)

Observe and alert only — no pipeline modification. Independence constraint
is load-bearing for measurement trustworthiness.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:45:07 +00:00


# Alerting Integration Patch for app.py
Five changes are needed in the live `app.py`:
## 1. Add import (after `from activity_endpoint import handle_activity`)
```python
from alerting_routes import register_alerting_routes
```
## 2. Register routes in create_app() (after the last `app.router.add_*` line)
```python
# Alerting — active monitoring endpoints
register_alerting_routes(app, _alerting_conn)
```
## 3. Add helper function (before create_app)
```python
def _alerting_conn() -> sqlite3.Connection:
    """Dedicated read-only connection for alerting checks.

    Separate from app['db'] to avoid contention with request handlers.
    Always sets row_factory for named column access.
    """
    # mode=ro plus uri=True opens the database read-only via a SQLite URI.
    conn = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)
    conn.row_factory = sqlite3.Row
    return conn
```
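
The behaviour this helper relies on can be checked in isolation. The following is a minimal sketch, assuming a throwaway database with a hypothetical `alerts_demo` table; the `open_readonly` and `demo` names and the sample row are illustrative and not part of the patch. It shows that a `mode=ro` connection allows named-column reads but rejects writes.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def open_readonly(db_path: str) -> sqlite3.Connection:
    """Hypothetical stand-in for _alerting_conn(), parameterised by path."""
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    conn.row_factory = sqlite3.Row
    return conn


def demo() -> tuple:
    """Create a scratch DB, read it read-only, then try (and fail) to write."""
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # Seed the scratch database through a normal read-write connection.
    with closing(sqlite3.connect(db_path)) as rw:
        rw.execute("CREATE TABLE alerts_demo (agent TEXT, severity TEXT)")
        rw.execute("INSERT INTO alerts_demo VALUES ('agent-a', 'warning')")
        rw.commit()

    # Open a separate read-only connection and close it when done.
    with closing(open_readonly(db_path)) as ro:
        row = ro.execute("SELECT agent, severity FROM alerts_demo").fetchone()
        agent = row["agent"]  # named access works because of sqlite3.Row
        try:
            ro.execute("INSERT INTO alerts_demo VALUES ('x', 'y')")
            write_rejected = False
        except sqlite3.OperationalError:
            write_rejected = True  # SQLite refuses writes on a mode=ro handle

    return agent, write_rejected
```

Wrapping each connection in `contextlib.closing` is one way to make sure a per-check connection is released; whether `alerting_routes.py` does this is not stated here.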
## 4. Add /check and /api/alerts to `_PUBLIC_PATHS`
```python
_PUBLIC_PATHS = frozenset({"/", "/api/metrics", "/api/rejections", "/api/snapshots",
                           "/api/vital-signs", "/api/contributors", "/api/domains",
                           "/api/audit", "/check", "/api/alerts"})
```
## 5. Add /api/failure-report/ prefix check in auth middleware
In the `@web.middleware` auth function, add this alongside the existing
`request.path.startswith("/api/audit/")` check:
```python
if request.path.startswith("/api/failure-report/"):
    return await handler(request)
```
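
The combined effect of steps 4 and 5 can be sketched as a single decision function. This is an illustration only: the real middleware is not shown in this guide, and `PUBLIC_PREFIXES` and `is_public` are hypothetical names. It assumes exact-match paths and prefix-match paths are both allowed through without auth.

```python
# Exact paths that bypass auth (mirrors _PUBLIC_PATHS from step 4).
PUBLIC_PATHS = frozenset({"/", "/api/metrics", "/api/rejections", "/api/snapshots",
                          "/api/vital-signs", "/api/contributors", "/api/domains",
                          "/api/audit", "/check", "/api/alerts"})

# Prefixes that bypass auth (hypothetical tuple; the patch uses separate checks).
PUBLIC_PREFIXES = ("/api/audit/", "/api/failure-report/")


def is_public(path: str) -> bool:
    """Return True if a request path should skip authentication."""
    # str.startswith accepts a tuple and matches any of its prefixes.
    return path in PUBLIC_PATHS or path.startswith(PUBLIC_PREFIXES)
```

Under this sketch, `/api/failure-report/some-agent` is public, while `/api/failure-report` without the trailing slash is not, because it is neither an exact match nor covered by the prefix.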
## Deploy notes
- `alerting.py` and `alerting_routes.py` must be in the **same directory** as `app.py`
(i.e., `/opt/teleo-eval/diagnostics/`). The import uses a bare module name, not
a relative import, so Python resolves it via `sys.path`, whose first entry is the
directory of the script being run (or the working directory under `python -m`).
If the deploy changes how the app is launched or moves to a package structure,
switch the import in `alerting_routes.py` line 11 to `from .alerting import ...`.
- The `/api/failure-report/{agent}` endpoint is standalone — any agent can pull their
own report on demand via `GET /api/failure-report/<agent-name>?hours=24`.
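
A client-side call to the failure-report endpoint might look like the sketch below. The base URL, the function names, and the assumption that the endpoint returns JSON are all hypothetical; only the path shape and the `hours` query parameter come from the note above.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def failure_report_url(base_url: str, agent: str, hours: int = 24) -> str:
    """Build the GET URL for one agent's failure report."""
    # Percent-encode the agent name so it is safe as a single path segment.
    path = f"/api/failure-report/{quote(agent, safe='')}"
    return f"{base_url.rstrip('/')}{path}?{urlencode({'hours': hours})}"


def fetch_failure_report(base_url: str, agent: str, hours: int = 24,
                         timeout: float = 10.0) -> dict:
    """Fetch and decode the report (assumes a JSON response body)."""
    with urlopen(failure_report_url(base_url, agent, hours), timeout=timeout) as resp:
        return json.load(resp)
```

For example, `failure_report_url("http://localhost:8080", "my-agent", 48)` builds `http://localhost:8080/api/failure-report/my-agent?hours=48`, where the host and port are placeholders for wherever the diagnostics app is served.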
## Files to deploy
- `alerting.py` → `/opt/teleo-eval/diagnostics/alerting.py`
- `alerting_routes.py` → `/opt/teleo-eval/diagnostics/alerting_routes.py`
- Patched `app.py` → `/opt/teleo-eval/diagnostics/app.py`