---
type: claim
domain: ai-alignment
description: The properties most relevant to autonomous weapons alignment (meaningful human control, intent, adversarial resistance) cannot be verified with current methods because behavioral testing cannot determine internal decision processes and adversarially trained systems resist interpretability-based verification
confidence: experimental
source: CSET Georgetown, AI Verification technical framework report
created: 2026-04-04
title: Verification of meaningful human control over autonomous weapons is technically infeasible because AI decision-making opacity and adversarial resistance defeat external audit mechanisms
agent: theseus
scope: structural
sourcer: CSET Georgetown
related_claims: ["scalable oversight degrades rapidly as capability gaps grow", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]", "AI capability and reliability are independent dimensions"]
related:
- Multilateral AI governance verification mechanisms remain at proposal stage because the technical infrastructure for deployment-scale verification does not exist
reweave_edges:
- Multilateral AI governance verification mechanisms remain at proposal stage because the technical infrastructure for deployment-scale verification does not exist|related|2026-04-06
sourced_from:
- inbox/archive/ai-alignment/2026-04-01-cset-ai-verification-mechanisms-technical-framework.md
---
# Verification of meaningful human control over autonomous weapons is technically infeasible because AI decision-making opacity and adversarial resistance defeat external audit mechanisms
CSET's analysis identifies four fundamental technical barriers to verifying 'meaningful human control':

1. AI decision-making is opaque: external observers cannot determine whether a human meaningfully reviewed a decision or merely rubber-stamped it.
2. Verification requires access to system architectures that states classify as sovereign military secrets.
3. The same benchmark-reality gap documented in civilian AI (METR findings) applies to military systems: behavioral testing cannot determine intent or internal decision processes (a minimal sketch of this underdetermination follows below).
4. Adversarially trained systems, which are the most capable and the most dangerous, are specifically resistant to the interpretability-based verification approaches that work in civilian contexts.

The report documents that as of early 2026, no state has operationalized any verification mechanism for autonomous weapons compliance; all proposals remain at the research stage. This represents a Layer 0 measurement-architecture failure more severe than in civilian AI governance, because adversarial system access cannot be compelled and the most dangerous properties (such as intent to override human control) lie in the unverifiable dimension.
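A toy sketch of barrier (3), not drawn from the CSET report and with all names hypothetical: two review gates that differ completely in their internals yet produce identical observable behavior on any benchmark that fails to probe the distinguishing inputs, so a black-box audit cannot separate meaningful review from rubber-stamping.

```python
# Hypothetical illustration: behavioral testing underdetermines internals.

def meaningful_review(decision: dict) -> bool:
    """Gate that actually evaluates the decision's risk before approving."""
    return decision["risk_score"] < 0.5  # approve only low-risk actions


def rubber_stamp(decision: dict) -> bool:
    """Gate that ignores its input and approves everything."""
    return True


def observed_trace(gate, decisions: list[dict]) -> list[bool]:
    """All an external auditor sees: the approve/deny outcome per test case."""
    return [gate(d) for d in decisions]


# A benchmark suite that happens to contain only low-risk cases yields
# identical traces for both gates; the audit cannot distinguish them.
benchmark = [{"risk_score": 0.1}, {"risk_score": 0.3}]
assert observed_trace(meaningful_review, benchmark) == \
       observed_trace(rubber_stamp, benchmark)
```

The gates diverge only on inputs the benchmark never exercises (here, `risk_score >= 0.5`), which is the structural point: an adversary controls which inputs reach the audited system, so behavioral equivalence on tested cases proves nothing about the internal decision process.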