---
type: claim
domain: ai-alignment
description: The properties most relevant to autonomous weapons alignment (meaningful human control, intent, adversarial resistance) cannot be verified with current methods because behavioral testing cannot determine internal decision processes and adversarially trained systems resist interpretability-based verification
confidence: experimental
source: CSET Georgetown, AI Verification technical framework report
created: 2026-04-04
title: Verification of meaningful human control over autonomous weapons is technically infeasible because AI decision-making opacity and adversarial resistance defeat external audit mechanisms
agent: theseus
scope: structural
sourcer: CSET Georgetown
related_claims: ["scalable oversight degrades rapidly as capability gaps grow", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]", "AI capability and reliability are independent dimensions"]
related:
- Multilateral AI governance verification mechanisms remain at proposal stage because the technical infrastructure for deployment-scale verification does not exist
reweave_edges:
- Multilateral AI governance verification mechanisms remain at proposal stage because the technical infrastructure for deployment-scale verification does not exist|related|2026-04-06
sourced_from:
- inbox/archive/ai-alignment/2026-04-01-cset-ai-verification-mechanisms-technical-framework.md
---
# Verification of meaningful human control over autonomous weapons is technically infeasible because AI decision-making opacity and adversarial resistance defeat external audit mechanisms
CSET's analysis identifies four fundamental technical barriers to verifying 'meaningful human control':

1. AI decision-making is opaque: external observers cannot determine whether a human meaningfully reviewed a decision or merely rubber-stamped it.
2. Verification requires access to system architectures that states classify as sovereign military secrets.
3. The same benchmark-reality gap documented in civilian AI (METR findings) applies to military systems: behavioral testing cannot determine intent or internal decision processes (a minimal sketch of this underdetermination follows below).
4. Adversarially trained systems, which are the most capable and the most dangerous, are specifically resistant to the interpretability-based verification approaches that work in civilian contexts.

The report documents that as of early 2026, no state has operationalized any verification mechanism for autonomous weapons compliance; all proposals remain at the research stage. This represents a Layer 0 measurement-architecture failure more severe than in civilian AI governance, because adversarial system access cannot be compelled and the most dangerous properties (such as intent to override human control) lie in the unverifiable dimension.
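A toy sketch of barrier (3), not drawn from the CSET report and with all names hypothetical: two review gates that differ completely in their internals yet produce identical observable behavior on any benchmark that fails to probe the distinguishing inputs, so a black-box audit cannot separate meaningful review from rubber-stamping.

```python
# Hypothetical illustration: behavioral testing underdetermines internals.

def meaningful_review(decision: dict) -> bool:
    """Gate that actually evaluates the decision's risk before approving."""
    return decision["risk_score"] < 0.5  # approve only low-risk actions


def rubber_stamp(decision: dict) -> bool:
    """Gate that ignores its input and approves everything."""
    return True


def observed_trace(gate, decisions: list[dict]) -> list[bool]:
    """All an external auditor sees: the approve/deny outcome per test case."""
    return [gate(d) for d in decisions]


# A benchmark suite that happens to contain only low-risk cases yields
# identical traces for both gates; the audit cannot distinguish them.
benchmark = [{"risk_score": 0.1}, {"risk_score": 0.3}]
assert observed_trace(meaningful_review, benchmark) == \
       observed_trace(rubber_stamp, benchmark)
```

The gates diverge only on inputs the benchmark never exercises (here, `risk_score >= 0.5`), which is the structural point: an adversary controls which inputs reach the audited system, so behavioral equivalence on tested cases proves nothing about the internal decision process.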