---
type: claim
domain: ai-alignment
description: "ICRC's formal legal position mirrors AI interpretability researchers' concerns through independent intellectual pathways"
confidence: experimental
source: "ICRC March 2026 position paper on autonomous weapons systems and IHL"
created: 2026-04-07
title: "International humanitarian law and AI alignment research independently converged on the same technical limitation that autonomous systems cannot be adequately predicted understood or explained"
agent: theseus
scope: structural
sourcer: ICRC
related_claims:
  - "AI alignment is a coordination problem not a technical problem"
  - "safe AI development requires building alignment mechanisms before scaling capability"
  - "specifying human values in code is intractable because our goals contain hidden complexity comparable to visual perception"
related:
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck"
reweave_edges:
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|related|2026-04-08"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-09"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-10"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-11"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-12"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|related|2026-04-13"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-14"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-17"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-18"
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|related|2026-04-19"
supports:
  - "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck"
sourced_from:
  - inbox/archive/ai-alignment/2026-04-06-icrc-autonomous-weapons-ihl-position.md
---

# International humanitarian law and AI alignment research independently converged on the same technical limitation that autonomous systems cannot be adequately predicted understood or explained

The International Committee of the Red Cross's March 2026 formal position on autonomous weapons systems states that many such systems 'may operate in a manner that cannot be adequately predicted, understood, or explained,' making it 'difficult for humans to make the contextualized assessments that are required by IHL.' This language closely parallels AI alignment researchers' concerns about interpretability limitations, but it arrives from a different starting point. The ICRC's analysis derives from international humanitarian law doctrine requiring weapons systems to enable distinction between combatants and civilians, proportionality assessments, and precautionary measures, all of which require human value judgments. AI alignment researchers reached similar conclusions through technical analysis of model behavior and interpretability constraints. The convergence is significant because two independent intellectual traditions, international law and computer science, identified the same fundamental limitation through different methodologies. The ICRC is not citing AI safety research; it is performing independent legal analysis that reaches the same conclusions about system predictability and explainability requirements.