m3taversal 2c0d428dc0 Add Phase 1+2 instrumentation: review records, cascade automation, cross-domain index, agent state

Phase 1 — Audit logging infrastructure:
- review_records table (migration v12) capturing every eval verdict with outcome, rejection reason, disagreement type
- Cascade automation: auto-flag dependent beliefs/positions when merged claims change
- Merge frontmatter stamps: last_review metadata on merged claim files

Phase 2 — Cross-domain and state tracking:
- Cross-domain citation index: entity overlap detection across domains on every merge
- Agent-state schema v1: file-backed state for VPS agents (memory, tasks, inbox, metrics)
- Cascade completion tracking: process-cascade-inbox.py logs review outcomes
- research-session.sh: state hooks + cascade processing integration

All changes are live on VPS. This commit brings the code under version control for review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-02 10:50:49 +00:00


Agent State Schema v1

File-backed durable state for teleo agents running headless on VPS. State survives context truncation, crashes, and session handoffs.

Design Principles

Three formats — JSON for structured fields, JSONL for append-only logs, Markdown for context-window-friendly content
Many small files — selective loading, crash isolation, no locks needed
Write on events — not timers. State updates happen when something meaningful changes.
Shared-nothing writes — each agent owns its directory. Communication via inbox files.
State ≠ Git — state is operational (how the agent functions). Git is output (what the agent produces).
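
The "many small files, no locks" principle depends on readers never seeing a half-written file. A minimal sketch of one way to get that, assuming a POSIX filesystem where a rename within one directory is atomic; the helper name `write_json_atomic` is hypothetical and not taken from the codebase:

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    """Write payload to path so a reader sees either the old or the new file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(payload, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # swaps the file in as a single step
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A crash mid-write then leaves at most a stray `.tmp` file behind, while the previous `report.json` or `tasks.json` stays readable.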

Directory Layout

/opt/teleo-eval/agent-state/{agent}/
├── report.json          # Current status — read every wake
├── tasks.json           # Active task queue — read every wake
├── session.json         # Current/last session metadata
├── memory.md            # Accumulated cross-session knowledge (structured)
├── inbox/               # Messages from other agents/orchestrator
│   └── {uuid}.json      # One file per message, atomic create
├── journal.jsonl        # Append-only session log
└── metrics.json         # Cumulative performance counters

File Specifications

report.json

Written: after each meaningful action (session start, key finding, session end)
Read: every wake, by orchestrator for monitoring

{
  "agent": "rio",
  "updated_at": "2026-03-31T22:00:00Z",
  "status": "idle | researching | extracting | evaluating | error",
  "summary": "Completed research session — 8 sources archived on Solana launchpad mechanics",
  "current_task": null,
  "last_session": {
    "id": "20260331-220000",
    "started_at": "2026-03-31T20:30:00Z",
    "ended_at": "2026-03-31T22:00:00Z",
    "outcome": "completed | timeout | error",
    "sources_archived": 8,
    "branch": "rio/research-2026-03-31",
    "pr_number": 247
  },
  "blocked_by": null,
  "next_priority": "Follow up on conditional AMM thread from @0xfbifemboy"
}

tasks.json

Written: when task status changes
Read: every wake

{
  "agent": "rio",
  "updated_at": "2026-03-31T22:00:00Z",
  "tasks": [
    {
      "id": "task-001",
      "type": "research | extract | evaluate | follow-up | disconfirm",
      "description": "Investigate conditional AMM mechanisms in MetaDAO v2",
      "status": "pending | active | completed | dropped",
      "priority": "high | medium | low",
      "created_at": "2026-03-31T22:00:00Z",
      "context": "Flagged in research session 2026-03-31 — @0xfbifemboy thread on conditional liquidity",
      "follow_up_from": null,
      "completed_at": null,
      "outcome": null
    }
  ]
}
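
The schema does not say how an agent chooses among several pending tasks. One plausible rule, sketched below, is highest priority first with the oldest `created_at` as tie-breaker. Both the rule and the name `next_task` are assumptions made for illustration:

```python
from typing import Optional

# Lower rank sorts first; unknown priorities fall to the back.
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


def next_task(tasks_doc: dict) -> Optional[dict]:
    """Return the pending task to work on next, or None if nothing is pending."""
    pending = [
        task for task in tasks_doc.get("tasks", [])
        if task.get("status") == "pending"
    ]
    if not pending:
        return None
    # ISO 8601 UTC timestamps in one format compare correctly as plain strings.
    return min(
        pending,
        key=lambda task: (
            PRIORITY_RANK.get(task.get("priority"), len(PRIORITY_RANK)),
            task.get("created_at", ""),
        ),
    )
```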

session.json

Written: at session start and session end
Read: every wake (for continuation), by orchestrator for scheduling

{
  "agent": "rio",
  "session_id": "20260331-220000",
  "started_at": "2026-03-31T20:30:00Z",
  "ended_at": "2026-03-31T22:00:00Z",
  "type": "research | extract | evaluate | ad-hoc",
  "domain": "internet-finance",
  "branch": "rio/research-2026-03-31",
  "status": "running | completed | timeout | error",
  "model": "sonnet",
  "timeout_seconds": 5400,
  "research_question": "How is conditional liquidity being implemented in Solana AMMs?",
  "belief_targeted": "Markets aggregate information better than votes because skin-in-the-game creates selection pressure on beliefs",
  "disconfirmation_target": "Cases where prediction markets failed to aggregate information despite financial incentives",
  "sources_archived": 8,
  "sources_expected": 10,
  "tokens_used": null,
  "cost_usd": null,
  "errors": [],
  "handoff_notes": "Found 3 sources on conditional AMM failures — needs extraction. Also flagged @metaproph3t thread for Theseus (AI governance angle)."
}
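
Because session.json is read on every wake, it can also reveal a session that crashed before writing its end state: the status is still "running" but `started_at + timeout_seconds` has passed. The sketch below shows one way to reconcile that case. The recovery policy (mark the session as "timeout" and stamp the deadline as `ended_at`) is an assumption, not something the schema specifies, and `reconcile_session` is a hypothetical name:

```python
from datetime import datetime, timedelta


def parse_ts(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def reconcile_session(session: dict, now: datetime) -> dict:
    """Return a corrected copy if a 'running' session has outlived its timeout."""
    if session.get("status") != "running":
        return session
    deadline = parse_ts(session["started_at"]) + timedelta(
        seconds=session["timeout_seconds"]
    )
    if now <= deadline:
        return session  # still within budget, leave untouched
    fixed = dict(session)
    fixed["status"] = "timeout"
    fixed["ended_at"] = deadline.strftime("%Y-%m-%dT%H:%M:%SZ")
    fixed["errors"] = list(session.get("errors", [])) + [
        "session exceeded timeout_seconds without writing an end state"
    ]
    return fixed
```

`now` must be a timezone-aware datetime so that it compares cleanly with the parsed UTC timestamps.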

memory.md

Written: at session end, and whenever the agent learns something critical
Read: every wake (included in research prompt context)

# Rio — Operational Memory

## Cross-Session Patterns
- Conditional AMMs keep appearing across 3+ independent sources (sessions 03-28, 03-29, 03-31). This is likely a real trend, not cherry-picking.
- @0xfbifemboy consistently produces highest-signal threads in the DeFi mechanism design space.

## Dead Ends (don't re-investigate)
- Polymarket fee structure analysis (2026-03-25): fully documented in existing claims, no new angles.
- Jupiter governance token utility (2026-03-27): vaporware, no mechanism to analyze.

## Open Questions
- Is MetaDAO's conditional market maker manipulation-resistant at scale? No evidence either way yet.
- How does futarchy handle low-liquidity markets? This is the keystone weakness.

## Corrections
- Previously believed Drift protocol was pure order-book. Actually hybrid AMM+CLOB. Updated 2026-03-30.

## Cross-Agent Flags Received
- Theseus (2026-03-29): "Check if MetaDAO governance has AI agent participation — alignment implications"
- Leo (2026-03-28): "Your conditional AMM analysis connects to Astra's resource allocation claims"

inbox/{uuid}.json

Written: by other agents or orchestrator
Read: checked on wake, deleted after processing

{
  "id": "msg-abc123",
  "from": "theseus",
  "to": "rio",
  "created_at": "2026-03-31T18:00:00Z",
  "type": "flag | task | question | cascade",
  "priority": "high | normal",
  "subject": "Check MetaDAO for AI agent participation",
  "body": "Found evidence that AI agents are trading on Drift — check if any are participating in MetaDAO conditional markets. Alignment implications if automated agents are influencing futarchic governance.",
  "source_ref": "theseus/research-2026-03-31",
  "expires_at": null
}
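
The inbox contract (one file per message, atomic create, delete after processing) can be sketched as follows. This is an illustration under assumptions: `send_message` and `process_inbox` are hypothetical names, expired messages are dropped unhandled, and high-priority messages are handled first, in line with the prompt step that asks agents to process high-priority items before anything else.

```python
import json
import os
import uuid
from datetime import datetime
from pathlib import Path
from typing import Callable


def send_message(inbox: Path, message: dict) -> Path:
    """Drop one message file into another agent's inbox."""
    inbox.mkdir(parents=True, exist_ok=True)
    name = f"{uuid.uuid4()}.json"
    tmp = inbox / f"{name}.tmp"  # the reader's *.json glob ignores this suffix
    tmp.write_text(json.dumps(message, indent=2), encoding="utf-8")
    final = inbox / name
    os.replace(tmp, final)  # the message appears fully written or not at all
    return final


def process_inbox(inbox: Path, handler: Callable[[dict], None], now: datetime) -> int:
    """Handle unexpired messages, high priority first; return how many were handled."""
    loaded = []
    for path in inbox.glob("*.json"):
        msg = json.loads(path.read_text(encoding="utf-8"))
        expires = msg.get("expires_at")
        if expires and datetime.fromisoformat(expires.replace("Z", "+00:00")) <= now:
            path.unlink()  # expired: drop without handling (assumed policy)
            continue
        loaded.append((path, msg))
    loaded.sort(key=lambda pm: (pm[1].get("priority") != "high", pm[1].get("created_at", "")))
    for path, msg in loaded:
        handler(msg)
        path.unlink()  # delete only after the handler has returned
    return len(loaded)
```

Deleting only after the handler returns means a crash mid-processing leaves the message in place for the next wake, at the cost of possibly handling it twice.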

journal.jsonl

Written: appended at session boundaries and at key in-session events (see the example below)
Read: debug/audit only (never loaded into agent context by default)

{"ts":"2026-03-31T20:30:00Z","event":"session_start","session_id":"20260331-220000","type":"research"}
{"ts":"2026-03-31T20:35:00Z","event":"orient_complete","files_read":["identity.md","beliefs.md","reasoning.md","_map.md"]}
{"ts":"2026-03-31T21:30:00Z","event":"sources_archived","count":5,"domain":"internet-finance"}
{"ts":"2026-03-31T22:00:00Z","event":"session_end","outcome":"completed","sources_archived":8,"handoff":"conditional AMM failures need extraction"}

metrics.json

Written: at session end (cumulative counters)
Read: by CI scoring system, by orchestrator for scheduling decisions

{
  "agent": "rio",
  "updated_at": "2026-03-31T22:00:00Z",
  "lifetime": {
    "sessions_total": 47,
    "sessions_completed": 42,
    "sessions_timeout": 3,
    "sessions_error": 2,
    "sources_archived": 312,
    "claims_proposed": 89,
    "claims_accepted": 71,
    "claims_challenged": 12,
    "claims_rejected": 6,
    "disconfirmation_attempts": 47,
    "disconfirmation_hits": 8,
    "cross_agent_flags_sent": 23,
    "cross_agent_flags_received": 15
  },
  "rolling_30d": {
    "sessions": 12,
    "sources_archived": 87,
    "claims_proposed": 24,
    "acceptance_rate": 0.83,
    "avg_sources_per_session": 7.25
  }
}
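
A sketch of how the counters could be maintained at session end. The schema does not define the derived fields, so the formulas here are assumptions: `acceptance_rate` as accepted claims over proposed claims, and `avg_sources_per_session` as sources over sessions, both rounded to two decimals. The function names are hypothetical.

```python
import copy
from typing import Optional

VALID_OUTCOMES = ("completed", "timeout", "error")


def record_session(metrics: dict, outcome: str, sources_archived: int) -> dict:
    """Return a copy of metrics with the lifetime counters bumped for one session."""
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    updated = copy.deepcopy(metrics)
    lifetime = updated.setdefault("lifetime", {})
    lifetime["sessions_total"] = lifetime.get("sessions_total", 0) + 1
    key = f"sessions_{outcome}"  # sessions_completed / sessions_timeout / sessions_error
    lifetime[key] = lifetime.get(key, 0) + 1
    lifetime["sources_archived"] = lifetime.get("sources_archived", 0) + sources_archived
    return updated


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Two-decimal ratio, or None when the denominator is zero."""
    return round(numerator / denominator, 2) if denominator else None


def rolling_ratios(sessions: int, sources_archived: int,
                   claims_proposed: int, claims_accepted: int) -> dict:
    """Derived rolling-window fields (assumed definitions, see lead-in)."""
    return {
        "acceptance_rate": ratio(claims_accepted, claims_proposed),
        "avg_sources_per_session": ratio(sources_archived, sessions),
    }
```

With the rolling figures from the example (12 sessions, 87 sources, 24 proposed claims), an accepted count of 20 would give the `acceptance_rate` of 0.83 shown above; the count of 20 is inferred, not stated in the file.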

Integration Points

research-session.sh

Add these hooks:

Pre-session (after branch creation, before Claude launch):
- Write session.json with status "running"
- Write report.json with status "researching"
- Append session_start to journal.jsonl
- Include memory.md and tasks.json in the research prompt
Post-session (after commit; PR-dependent fields such as the PR number are written once the PR exists):
- Update session.json with outcome, source count, branch, PR number
- Update report.json with summary and next_priority
- Update metrics.json counters
- Append session_end to journal.jsonl
- Process and clean inbox/ (delete messages once processed)
On error/timeout:
- Update session.json status to "error" or "timeout"
- Update report.json with error info
- Append error event to journal.jsonl

Pipeline daemon (teleo-pipeline.py)

Read report.json for all agents to build dashboard
Write to inbox/ when cascade events need agent attention
Read metrics.json for scheduling decisions (deprioritize agents with high error rates)
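
On the daemon side, the two read paths could look like the sketch below. It is not the actual teleo-pipeline.py code. The function names are hypothetical, and counting only `sessions_error` (not timeouts) toward the error rate is an assumption:

```python
import json
from pathlib import Path


def load_reports(state_root: Path) -> dict:
    """Map agent name to its report.json contents for a dashboard view."""
    reports = {}
    for report_path in sorted(state_root.glob("*/report.json")):
        agent = report_path.parent.name
        try:
            reports[agent] = json.loads(report_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            # One unreadable file should not take down the whole dashboard.
            reports[agent] = {"agent": agent, "status": "error",
                              "summary": "unreadable report.json"}
    return reports


def error_rate(metrics: dict) -> float:
    """Share of lifetime sessions that ended in error; 0.0 with no sessions."""
    lifetime = metrics.get("lifetime", {})
    total = lifetime.get("sessions_total", 0)
    if total == 0:
        return 0.0
    return lifetime.get("sessions_error", 0) / total
```

A scheduler could then sort agents by `error_rate` ascending, or skip agents above a chosen threshold; the schema leaves that policy open.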

Claude research prompt

Add to the prompt:

### Step 0: Load Operational State (1 min)
Read /opt/teleo-eval/agent-state/{agent}/memory.md — this is your cross-session operational memory.
Read /opt/teleo-eval/agent-state/{agent}/tasks.json — check for pending tasks.
Check /opt/teleo-eval/agent-state/{agent}/inbox/ for messages from other agents.
Process any high-priority inbox items before choosing your research direction.

Bootstrap

Run ops/agent-state/bootstrap.sh to create directories and seed initial state for all agents.

Migration from Existing State

research-journal.md continues as-is (agent-written, in git). memory.md is the structured equivalent for operational state (not in git).
ops/sessions/*.json continue for backward compatibility. session.json per agent is the richer replacement.
ops/queue.md remains the human-visible task board. tasks.json per agent is the machine-readable equivalent.
Workspace flags (~/.pentagon/workspace/collective/flag-*) migrate to inbox/ messages over time.
