Commit graph

3 commits

Author SHA1 Message Date
8b1ce13da7 argus: add Phase 1 active monitoring system
- What: alerting.py (7 health checks), alerting_routes.py (3 endpoints),
  PATCH_INSTRUCTIONS.md (app.py integration guide for Rhea)
- Why: engineering acceleration initiative — move from passive dashboard
  to active monitoring with agent health, quality regression, throughput
  anomaly, stuck loop, cost spike, and domain rejection pattern detection
- Endpoints: GET /check, GET /api/alerts, GET /api/failure-report/{agent}
- Deploy: Rhea applies PATCH_INSTRUCTIONS to live app.py, restarts service,
  adds 5-min systemd timer for /check

Pentagon-Agent: Argus <9aa57086-bee9-461b-ae26-dfe5809820a8>
2026-04-14 18:14:07 +00:00
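
As a rough illustration of the "throughput anomaly" check named in this commit, a z-score comparison against recent history is one common approach. This is a minimal sketch under assumptions: the function name `is_throughput_anomaly`, the threshold of 3.0, and the use of a z-score at all are hypothetical and are not taken from alerting.py.

```python
from statistics import mean, pstdev


def is_throughput_anomaly(history, current, z_threshold=3.0):
    """Flag `current` if it is more than z_threshold standard deviations from the history mean.

    `history` is a list of past per-interval counts (hypothetical input shape).
    """
    if len(history) < 2:
        # Too little data to estimate a baseline; do not alert.
        return False
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly flat history: any change at all counts as a deviation.
        return current != mu
    return abs(current - mu) / sigma > z_threshold
```

The function only reads its inputs and returns a boolean, in keeping with the "observe and alert only" constraint stated in the Phase 1 commit.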
5514e04498 Consolidate diagnostics Python files to ops/diagnostics/
Move vitality.py/vitality_routes.py from root diagnostics/ to ops/diagnostics/ (canonical location).
Overwrite ops/diagnostics/alerting.py and alerting_routes.py with root versions (newer: SQL injection protection via _ALLOWED_DIM_EXPRS, proper error handling + conn.close).
Remove root diagnostics/*.py — all code now in ops/diagnostics/.
Include diff log documenting resolution of each multi-copy file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:12:04 +02:00
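
The allowlist approach mentioned in this commit message can be sketched roughly as follows. Only the identifier `_ALLOWED_DIM_EXPRS` comes from the commit. Its contents, the `reviews` table and its columns, and the `count_by_dimension` helper are hypothetical stand-ins, not the code in ops/diagnostics/alerting.py.

```python
import sqlite3

# Hypothetical contents: maps a caller-supplied dimension name to a fixed,
# pre-written SQL expression. User input never reaches the SQL string directly.
_ALLOWED_DIM_EXPRS = {
    "agent": "agent",
    "domain": "domain",
    "day": "date(created_at)",
}


def count_by_dimension(db_path, dim):
    """Count rows in the (hypothetical) reviews table grouped by an allowlisted dimension."""
    expr = _ALLOWED_DIM_EXPRS.get(dim)
    if expr is None:
        # Reject anything not in the allowlist before building any SQL.
        raise ValueError(f"unsupported dimension: {dim!r}")
    conn = sqlite3.connect(db_path)
    try:
        sql = f"SELECT {expr}, COUNT(*) FROM reviews GROUP BY 1 ORDER BY 1"
        return conn.execute(sql).fetchall()
    finally:
        # Close the connection even if the query raises.
        conn.close()
```

Because the interpolated expression is always one of the fixed dictionary values, a request such as `dim="agent; DROP TABLE reviews"` fails with `ValueError` before any SQL is built.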
33e670b436 argus: add active alerting system (Phase 1)
Three new files for the engineering acceleration initiative:
- alerting.py: 7 health check functions (dormant agents, quality regression,
  throughput anomaly, rejection spikes, stuck loops, cost spikes, domain
  rejection patterns) + failure report generator
- alerting_routes.py: /check, /api/alerts, /api/failure-report/{agent} endpoints
- PATCH_INSTRUCTIONS.md: integration guide for app.py (imports, route
  registration, auth middleware bypass, DB connection)

Observe and alert only — no pipeline modification. Independence constraint
is load-bearing for measurement trustworthiness.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:45:07 +00:00
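
For illustration, a read-only "dormant agents" check of the kind listed in this commit might look like the sketch below. The function name `find_dormant_agents`, the input shape (a mapping of agent name to last-activity timestamp), and the 24-hour default are assumptions made for this example, not details from alerting.py.

```python
from datetime import datetime, timedelta, timezone


def find_dormant_agents(last_seen, now, max_idle=timedelta(hours=24)):
    """Return the sorted names of agents whose last activity is older than max_idle.

    `last_seen` maps agent name -> timezone-aware datetime (hypothetical shape).
    The function only inspects its inputs; it modifies nothing.
    """
    return sorted(
        agent for agent, ts in last_seen.items() if now - ts > max_idle
    )


# Example usage with made-up agent names and timestamps.
now = datetime(2026, 3, 28, 22, 0, tzinfo=timezone.utc)
last_seen = {
    "agent-a": now - timedelta(hours=2),
    "agent-b": now - timedelta(hours=30),
}
dormant = find_dormant_agents(last_seen, now)
```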