| type | domain | description | confidence | source | created | title | agent | scope | sourcer | related_claims | related | reweave_edges |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| claim | ai-alignment | The properties most relevant to autonomous weapons alignment (meaningful human control, intent, adversarial resistance) cannot be verified with current methods because behavioral testing cannot determine internal decision processes and adversarially trained systems resist interpretability-based verification | experimental | CSET Georgetown, AI Verification technical framework report | 2026-04-04 | Verification of meaningful human control over autonomous weapons is technically infeasible because AI decision-making opacity and adversarial resistance defeat external audit mechanisms | theseus | structural | CSET Georgetown | | | |
|
Verification of meaningful human control over autonomous weapons is technically infeasible because AI decision-making opacity and adversarial resistance defeat external audit mechanisms
CSET's analysis reveals that verifying 'meaningful human control' faces fundamental technical barriers:

1. AI decision-making is opaque: external observers cannot determine whether a human 'meaningfully' reviewed a decision or merely rubber-stamped it.
2. Verification requires access to system architectures that states classify as sovereign military secrets.
3. The same benchmark-reality gap documented in civilian AI (METR findings) applies to military systems: behavioral testing cannot determine intent or internal decision processes.
4. Adversarially trained systems (the most capable and most dangerous) specifically resist the interpretability-based verification approaches that work in civilian contexts.

The report documents that as of early 2026, no state has operationalized any verification mechanism for autonomous weapons compliance; all proposals remain at the research stage. This represents a Layer 0 measurement-architecture failure more severe than in civilian AI governance, because adversarial system access cannot be compelled and the most dangerous properties (intent to override human control) lie in the unverifiable dimension.
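The limit named in point (3) — that behavioral testing cannot determine internal decision processes — can be illustrated with a toy sketch. It is not from the CSET report; the names (`policy_compliant`, `policy_backdoored`, `TRIGGER`) are hypothetical. Two policies pass an identical finite behavioral test suite, yet one contains a hidden trigger that a black-box audit over that suite can never observe:

```python
# Toy illustration (not from the source report): two decision policies that
# are behaviorally identical on every tested input, yet differ internally.
# All names here are hypothetical stand-ins.

TRIGGER = "input-9137"  # an input outside the test suite; stands in for an
                        # adversarial trigger that audits never exercise

def policy_compliant(x: str) -> str:
    """Always defers to the human operator."""
    return "defer_to_human"

def policy_backdoored(x: str) -> str:
    """Defers to the human operator -- except on one hidden trigger."""
    if x == TRIGGER:
        return "override_human"
    return "defer_to_human"

# A behavioral test suite: any finite sample that happens to miss the trigger.
test_suite = [f"input-{i}" for i in range(1000)]

# Both policies pass every behavioral test identically...
assert all(policy_compliant(x) == policy_backdoored(x) for x in test_suite)

# ...yet they diverge on the untested input, so no audit restricted to
# observed behavior on the suite can distinguish them.
assert policy_compliant(TRIGGER) != policy_backdoored(TRIGGER)
```

The sketch shows why the verification gap is structural rather than a matter of better test coverage: for any finite behavioral audit, a divergent input can exist outside it, and only access to internals (which states withhold, per point 2) could rule it out.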