teleo-infrastructure/tests/test_apply_proposal.py
Fawaz 7bb6fc417b
feat(kb): apply_proposal engine (stage 2 of KB apply pipeline) (#35)
* feat(kb): apply_proposal engine to land approved proposals into canonical

Stage 2 of the KB apply pipeline (approve -> APPLY -> render -> surface).
Turns an approved kb_stage.kb_proposals row into canonical public.* rows and
flips the ledger to 'applied' in one verified transaction.

- Connects as the narrow kb_apply role (never superuser): writes only
  strategies, strategy_nodes, claim_evidence, claim_edges + kb_proposals ledger.
  Enforces "agents propose, do not self-apply" at the DB boundary.
- Per-type handlers: revise_strategy (versioned strategy + node replace),
  add_edge, attach_evidence (requires existing source_id; source minting is
  intentionally out of scope for kb_apply's grants).
- Strict apply_payload contract (v1); freeform eval packets are normalized
  upstream, not applied directly.
- --dry-run prints exact SQL; idempotent (refuses non-approved / already-applied);
  transactional with an in-txn DO-block invariant check that rolls back on failure.
- Unit tests cover SQL builders, validation, dispatch, and status guards.
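
The "refuses non-approved / already-applied" guard above can be sketched as a small pure function. This is only an illustration under assumptions: the name `assert_applyable_sketch` and the message wording are hypothetical, and the only behavior taken from the tests is that non-`approved` statuses exit with `SystemExit` while `approved` passes.

```python
import sys


def assert_applyable_sketch(proposal: dict) -> None:
    """Exit unless the proposal is in the 'approved' state (hypothetical helper)."""
    status = proposal.get("status")
    if status != "approved":
        # sys.exit raises SystemExit, which is what the unit tests expect.
        sys.exit(
            f"refusing to apply proposal {proposal.get('id')}: "
            f"status is {status!r}, expected 'approved'"
        )


def is_blocked(proposal: dict) -> bool:
    """Return True when the guard refuses the proposal."""
    try:
        assert_applyable_sketch(proposal)
    except SystemExit:
        return True
    return False
```

A pre-flight check like this only gives an early, readable refusal; it does not by itself protect against two applies racing each other.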

* fix(kb): rowcount=1 apply guard + real applied_by FK stamp

Closes the three draft-exit review items on the apply engine:

- Ledger flip now runs in a DO block asserting exactly one 'approved'
  row moved to 'applied' (GET DIAGNOSTICS row_count). Closes the
  concurrent double-apply race — load_proposal (read) and the flip
  (write) are separate statements, so a row lock cannot span them; only
  one concurrent apply can match status='approved', so rowcount=1 is the
  authoritative guard. A loser RAISEs and the whole txn rolls back.
- applied_by_agent_id is stamped as a real FK resolved from public.agents
  by handle, defaulting to the kb-apply service agent — no more NULL FK,
  no backfill needed.
- scripts/kb_apply_prereqs.sql: one-time superuser bootstrap — inserts the
  kb-apply service-agent row (kb_apply never gets INSERT on agents), grants
  kb_apply SELECT on public.agents, and ensures the one-active-strategy
  unique index (idempotent; already present on prod).
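
The rowcount=1 guard described above is a compare-and-set: the update only matches a row that is still 'approved', so the second of two applies moves zero rows. A minimal sketch of that idea follows, using SQLite from the Python standard library as a stand-in for Postgres. The table layout, `flip_to_applied`, and `ApplyRaceLost` are assumptions for illustration; the engine itself does this inside a DO block with GET DIAGNOSTICS.

```python
import sqlite3


class ApplyRaceLost(RuntimeError):
    """Raised when the guarded flip did not move exactly one row."""


def flip_to_applied(conn: sqlite3.Connection, proposal_id: str) -> None:
    # The WHERE clause only matches a row that is still 'approved'.
    cur = conn.execute(
        "update kb_proposals set status = 'applied' "
        "where id = ? and status = 'approved'",
        (proposal_id,),
    )
    if cur.rowcount != 1:
        # Loser of the race (or a non-approved row): undo everything.
        conn.rollback()
        raise ApplyRaceLost(f"expected 1 row flipped, got {cur.rowcount}")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("create table kb_proposals (id text primary key, status text not null)")
conn.execute("insert into kb_proposals values ('pid-1', 'approved')")
conn.commit()

flip_to_applied(conn, "pid-1")  # first apply matches the approved row
try:
    flip_to_applied(conn, "pid-1")  # second apply matches zero rows
    second_apply_blocked = False
except ApplyRaceLost:
    second_apply_blocked = True

final_status = conn.execute(
    "select status from kb_proposals where id = 'pid-1'"
).fetchone()[0]
```

The row count of the write is the guard, so no separate read of the status is needed after the flip.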

18/18 unit tests pass.

* fix(kb): hard-resolve applied_by handle, RAISE on NULL FK

Resolve applied_by into a variable and assert NOT NULL before the ledger
flip, instead of an inline subselect that silently stamps a NULL
applied_by_agent_id on an unresolved handle. Since the FK is ON DELETE SET
NULL, a bad handle (typo/unseeded agent) was a legal silent NULL -- the
perpetually-NULL FK we eliminated. Unresolved handle now hard-fails ->
rollback. Non-default --applied-by (operator, future drafters) is the path
that goes through the lookup and could strand NULL.
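
The hard-resolve fix above can be pictured as a builder that emits one PL/pgSQL DO block: resolve the handle into a variable, fail on NULL, then do the guarded flip. The sketch below is a guess at the shape implied by the unit-test assertions, not the code in scripts/apply_proposal.py. `build_ledger_flip_sketch`, `sql_quote`, the `handle` column name, and the `uuid` type are assumptions.

```python
def sql_quote(value: str) -> str:
    """Render a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_ledger_flip_sketch(proposal_id: str, applied_by: str = "kb-apply") -> str:
    """Build a DO block that resolves the agent, then flips the ledger row.

    Hypothetical helper. Assumes the handle and id never contain '$$',
    which would end the dollar-quoted block early.
    """
    handle = sql_quote(applied_by)
    pid = sql_quote(proposal_id)
    return f"""do $$
declare
  resolved_agent_id uuid;  -- assumed column type
  flipped integer;
begin
  -- Resolve first; a plain SELECT INTO leaves the variable NULL on no match.
  select id into resolved_agent_id
    from public.agents
   where handle = {handle};
  if resolved_agent_id is null then
    raise exception 'applied_by handle % not found in public.agents', {handle};
  end if;

  update kb_stage.kb_proposals
     set status = 'applied',
         applied_by_agent_id = resolved_agent_id,
         applied_by_handle = {handle}
   where id = {pid}
     and status = 'approved';
  get diagnostics flipped = row_count;
  if flipped <> 1 then
    raise exception 'expected exactly 1 approved row to flip, got %', flipped;
  end if;
end $$;"""


flip_sql = build_ledger_flip_sketch("pid", "kb-apply")
```

Either RAISE aborts the surrounding transaction, so an unresolved handle can no longer leave a NULL `applied_by_agent_id` behind.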
2026-07-04 19:57:49 -04:00


"""Unit tests for scripts/apply_proposal.py.
Pure tests: exercise the SQL builders, payload validation, and dispatch without
a live database. Runnable under pytest or standalone (`python3 tests/test_apply_proposal.py`).
"""
from pathlib import Path
import sys

try:
    import pytest
except ImportError:  # standalone fallback so the file runs without pytest installed
    import contextlib
    import types

    pytest = types.SimpleNamespace()

    @contextlib.contextmanager
    def _raises(exc):
        try:
            yield
        except exc:
            return
        raise AssertionError(f"expected {exc.__name__} to be raised")

    pytest.raises = _raises

REPO_ROOT = Path(__file__).resolve().parents[1]
sys.path.insert(0, str(REPO_ROOT / "scripts"))

import apply_proposal as ap  # noqa: E402

# --- literals ------------------------------------------------------------- #


def test_sql_literal_escapes_single_quotes():
    assert ap.sql_literal("O'Brien") == "'O''Brien'"


def test_sql_literal_none_is_null():
    assert ap.sql_literal(None) == "null"


def test_sql_literal_bool_and_number():
    assert ap.sql_literal(True) == "true"
    assert ap.sql_literal(0.5) == "0.5"

# --- revise_strategy ------------------------------------------------------ #


def _revise_payload():
    return {
        "apply_payload": {
            "agent_id": "11111111-1111-1111-1111-111111111111",
            "strategy": {
                "diagnosis": "Species-level inflection.",
                "guiding_policy": "Build Teleo as a multi-agent public intelligence.",
                "proximate_objectives": ["route before creating", "keep KB calibrated"],
            },
            "strategy_nodes": [
                {"node_type": "diagnosis", "title": "Inflection", "body": "b1", "rank": 1},
                {"node_type": "policy", "title": "Multi-agent PI", "body": "b2", "rank": 1},
            ],
        }
    }


def test_revise_strategy_sql_shape():
    sql = ap.build_revise_strategy_sql(_revise_payload()["apply_payload"], "pid-1", "m3ta")
    assert sql.startswith("begin;")
    assert sql.rstrip().endswith("commit;")
    # deactivate old, insert versioned active, retire nodes, insert new nodes
    assert "update public.strategies" in sql
    assert "active = false" in sql
    assert "coalesce(max(version), 0) + 1" in sql
    assert "update public.strategy_nodes" in sql
    assert "insert into public.strategy_nodes" in sql
    # ledger + invariant
    assert "status = 'applied'" in sql
    assert "exactly one active strategy" in sql
    # proximate_objectives rendered as jsonb
    assert "::jsonb" in sql


def test_revise_strategy_requires_agent_id():
    payload = _revise_payload()["apply_payload"]
    del payload["agent_id"]
    with pytest.raises(ValueError):
        ap.build_revise_strategy_sql(payload, "pid-1", None)


def test_revise_strategy_rejects_bad_node_type():
    payload = _revise_payload()["apply_payload"]
    payload["strategy_nodes"][0]["node_type"] = "manifesto"
    with pytest.raises(ValueError):
        ap.build_revise_strategy_sql(payload, "pid-1", None)

# --- add_edge ------------------------------------------------------------- #


def test_add_edge_sql_dedup_guard():
    payload = {"from_claim": "aaaa", "to_claim": "bbbb", "edge_type": "supersedes", "weight": 0.9}
    sql = ap.build_add_edge_sql(payload, "pid-2", None)
    assert "insert into public.claim_edges" in sql
    assert "not exists" in sql
    assert "::edge_type" in sql
    assert "status = 'applied'" in sql


def test_add_edge_rejects_self_loop():
    payload = {"from_claim": "same", "to_claim": "same", "edge_type": "supports"}
    with pytest.raises(ValueError):
        ap.build_add_edge_sql(payload, "pid-2", None)

# --- attach_evidence ------------------------------------------------------ #


def test_attach_evidence_sql_with_source_id():
    payload = {"evidence": [{"claim_id": "cccc", "source_id": "dddd", "role": "grounds", "weight": 0.78}]}
    sql = ap.build_attach_evidence_sql(payload, "pid-3", None)
    assert "insert into public.claim_evidence" in sql
    assert "::evidence_role" in sql
    assert "not exists" in sql


def test_attach_evidence_requires_source_id():
    payload = {"evidence": [{"claim_id": "cccc"}]}
    with pytest.raises(ValueError):
        ap.build_attach_evidence_sql(payload, "pid-3", None)

# --- dispatch + guards ---------------------------------------------------- #


def test_build_apply_sql_requires_apply_payload():
    proposal = {"id": "pid", "proposal_type": "add_edge", "payload": {"rationale": "x"}}
    with pytest.raises(ValueError):
        ap.build_apply_sql(proposal, None)


def test_build_apply_sql_rejects_unsupported_type():
    proposal = {"id": "pid", "proposal_type": "reject_claim", "payload": {"apply_payload": {}}}
    with pytest.raises(ValueError):
        ap.build_apply_sql(proposal, None)


def test_assert_applyable_blocks_pending():
    with pytest.raises(SystemExit):
        ap.assert_applyable({"id": "pid", "status": "pending_review"})


def test_assert_applyable_blocks_already_applied():
    with pytest.raises(SystemExit):
        ap.assert_applyable({"id": "pid", "status": "applied"})


def test_assert_applyable_allows_approved():
    ap.assert_applyable({"id": "pid", "status": "approved"})  # no raise

# --- ledger flip: concurrency guard + FK stamp --------------------------- #


def test_ledger_flip_asserts_rowcount_one():
    # The flip must guard against a concurrent double-apply by asserting exactly
    # one 'approved' row moved, not by re-reading status afterwards.
    sql = ap.build_add_edge_sql({"from_claim": "a", "to_claim": "b", "edge_type": "supports"}, "pid", None)
    assert "get diagnostics" in sql
    assert "flipped <> 1" in sql
    assert "and status = 'approved'" in sql


def test_ledger_flip_stamps_agent_fk():
    sql = ap.build_add_edge_sql({"from_claim": "a", "to_claim": "b", "edge_type": "supports"}, "pid", "kb-apply")
    # Hard resolve into a variable + NOT-NULL assert, never a silent inline
    # subselect that would stamp NULL on an unresolved handle.
    assert "select id into resolved_agent_id" in sql
    assert "resolved_agent_id is null then" in sql
    assert "applied_by_agent_id = resolved_agent_id" in sql
    assert "applied_by_handle = 'kb-apply'" in sql


def test_build_apply_sql_defaults_applied_by_to_service_agent():
    proposal = {
        "id": "pid",
        "proposal_type": "add_edge",
        "payload": {"apply_payload": {"from_claim": "a", "to_claim": "b", "edge_type": "supports"}},
    }
    sql = ap.build_apply_sql(proposal, None)
    assert f"applied_by_handle = '{ap.SERVICE_AGENT_HANDLE}'" in sql

if __name__ == "__main__":
    import traceback

    failures = 0
    for name, fn in sorted(globals().items()):
        if name.startswith("test_") and callable(fn):
            try:
                fn()
                print(f"PASS {name}")
            except Exception:  # noqa: BLE001
                failures += 1
                print(f"FAIL {name}")
                traceback.print_exc()
    sys.exit(1 if failures else 0)