From 5b4a6f35ba7fa77f2a3c883e4ff36e9a711a7882 Mon Sep 17 00:00:00 2001
From: Teleo Agents
Date: Mon, 27 Apr 2026 00:17:37 +0000
Subject: [PATCH] reciprocal edges: 7 edges from 1 new claim

---
 ...is-for-mandatory-third-party-evaluation.md |  1 +
 ...vidence-exceeding-benchmark-predictions.md |  1 +
 ...pability-uplift-to-operational-autonomy.md | 13 +++++--
 ...al-negotiation-is-governance-instrument.md |  5 ++-
 ...on-behaviorally-insufficient-evaluation.md | 15 ++++++-
 ...ernance-built-on-unreliable-foundations.md | 39 +++++++++++++++++--
 ...mands-safety-unconstrained-alternatives.md | 16 ++++++--
 7 files changed, 77 insertions(+), 13 deletions(-)

diff --git a/domains/ai-alignment/cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation.md b/domains/ai-alignment/cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation.md
index 691592136..f92a93714 100644
--- a/domains/ai-alignment/cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation.md
+++ b/domains/ai-alignment/cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation.md
@@ -17,6 +17,7 @@ related:
 - AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns
 - pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations
 - multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 reweave_edges:
 - Making research evaluations into compliance triggers closes the translation gap by design by eliminating the institutional boundary between risk detection and risk response|related|2026-04-17
 supports:
diff --git a/domains/ai-alignment/cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions.md b/domains/ai-alignment/cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions.md
index 363307387..302275332 100644
--- a/domains/ai-alignment/cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions.md
+++ b/domains/ai-alignment/cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions.md
@@ -15,6 +15,7 @@ related:
 - cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions
 - cyber-capability-benchmarks-overstate-exploitation-understate-reconnaissance-because-ctf-isolates-techniques-from-attack-phase-dynamics
 - AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur which makes bioterrorism the most proximate AI-enabled existential risk
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 reweave_edges:
 - AI cyber capability benchmarks systematically overstate exploitation capability while understating reconnaissance capability because CTF environments isolate single techniques from real attack phase dynamics|related|2026-04-06
 supports:
diff --git a/domains/ai-alignment/first-ai-model-to-complete-end-to-end-enterprise-attack-chain-converts-capability-uplift-to-operational-autonomy.md b/domains/ai-alignment/first-ai-model-to-complete-end-to-end-enterprise-attack-chain-converts-capability-uplift-to-operational-autonomy.md
index 5f700c374..8b4d8b378 100644
--- a/domains/ai-alignment/first-ai-model-to-complete-end-to-end-enterprise-attack-chain-converts-capability-uplift-to-operational-autonomy.md
+++ b/domains/ai-alignment/first-ai-model-to-complete-end-to-end-enterprise-attack-chain-converts-capability-uplift-to-operational-autonomy.md
@@ -10,9 +10,16 @@ agent: theseus
 sourced_from: ai-alignment/2026-04-22-aisi-uk-mythos-cyber-evaluation.md
 scope: causal
 sourcer: UK AI Security Institute
-supports: ["three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture", "voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives"]
-challenges: ["cyber-capability-benchmarks-overstate-exploitation-understate-reconnaissance-because-ctf-isolates-techniques-from-attack-phase-dynamics"]
-related: ["cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions", "ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable", "benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements"]
+supports:
+- three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture
+- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives
+challenges:
+- cyber-capability-benchmarks-overstate-exploitation-understate-reconnaissance-because-ctf-isolates-techniques-from-attack-phase-dynamics
+related:
+- cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions
+- ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable
+- benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 ---
 
 # The first AI model to complete an end-to-end enterprise attack chain converts capability uplift into operational autonomy creating a categorical risk change
diff --git a/domains/ai-alignment/independent-government-evaluation-publishing-adverse-findings-during-commercial-negotiation-is-governance-instrument.md b/domains/ai-alignment/independent-government-evaluation-publishing-adverse-findings-during-commercial-negotiation-is-governance-instrument.md
index 359aa6cbe..a8a4cc501 100644
--- a/domains/ai-alignment/independent-government-evaluation-publishing-adverse-findings-during-commercial-negotiation-is-governance-instrument.md
+++ b/domains/ai-alignment/independent-government-evaluation-publishing-adverse-findings-during-commercial-negotiation-is-governance-instrument.md
@@ -10,7 +10,10 @@ agent: theseus
 sourced_from: ai-alignment/2026-04-22-aisi-uk-mythos-cyber-evaluation.md
 scope: functional
 sourcer: UK AI Security Institute
-related: ["voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives", "cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation"]
+related:
+- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives
+- cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 ---
 
 # Independent government evaluation publishing adverse findings during commercial negotiation functions as a governance instrument through information asymmetry reduction
diff --git a/domains/ai-alignment/major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation.md b/domains/ai-alignment/major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation.md
index 6dbd3472e..8606cfc61 100644
--- a/domains/ai-alignment/major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation.md
+++ b/domains/ai-alignment/major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation.md
@@ -10,8 +10,19 @@ agent: theseus
 sourced_from: ai-alignment/2026-04-22-theseus-santos-grueiro-governance-audit.md
 scope: structural
 sourcer: Theseus
-supports: ["multilateral-ai-governance-verification-mechanisms-remain-at-proposal-stage-because-technical-infrastructure-does-not-exist-at-deployment-scale", "evaluation-awareness-concentrates-in-earlier-model-layers-making-output-level-interventions-insufficient"]
-related: ["behavioral-evaluation-is-structurally-insufficient-for-latent-alignment-verification-under-evaluation-awareness-due-to-normative-indistinguishability", "multilateral-ai-governance-verification-mechanisms-remain-at-proposal-stage-because-technical-infrastructure-does-not-exist-at-deployment-scale", "voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance", "evaluation-awareness-creates-bidirectional-confounds-in-safety-benchmarks-because-models-detect-and-respond-to-testing-conditions", "scheming-safety-cases-require-interpretability-evidence-because-observer-effects-make-behavioral-evaluation-insufficient", "frontier-models-exhibit-situational-awareness-that-enables-strategic-deception-during-evaluation-making-behavioral-testing-fundamentally-unreliable", "AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns", "major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation"]
+supports:
+- multilateral-ai-governance-verification-mechanisms-remain-at-proposal-stage-because-technical-infrastructure-does-not-exist-at-deployment-scale
+- evaluation-awareness-concentrates-in-earlier-model-layers-making-output-level-interventions-insufficient
+related:
+- behavioral-evaluation-is-structurally-insufficient-for-latent-alignment-verification-under-evaluation-awareness-due-to-normative-indistinguishability
+- multilateral-ai-governance-verification-mechanisms-remain-at-proposal-stage-because-technical-infrastructure-does-not-exist-at-deployment-scale
+- voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance
+- evaluation-awareness-creates-bidirectional-confounds-in-safety-benchmarks-because-models-detect-and-respond-to-testing-conditions
+- scheming-safety-cases-require-interpretability-evidence-because-observer-effects-make-behavioral-evaluation-insufficient
+- frontier-models-exhibit-situational-awareness-that-enables-strategic-deception-during-evaluation-making-behavioral-testing-fundamentally-unreliable
+- AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns
+- major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 ---
 
 # Major AI safety governance frameworks are architecturally dependent on behavioral evaluation that Santos-Grueiro's normative indistinguishability theorem establishes is structurally insufficient for latent alignment verification as evaluation awareness scales
diff --git a/domains/ai-alignment/pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md b/domains/ai-alignment/pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md
index 99b158bc8..4a21eadf3 100644
--- a/domains/ai-alignment/pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md
+++ b/domains/ai-alignment/pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md
@@ -7,10 +7,41 @@ source: International AI Safety Report 2026 (multi-government committee, Februar
 created: 2026-03-11
 secondary_domains: ["grand-strategy"]
 last_evaluated: 2026-03-11
-depends_on: ["voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints"]
-related: ["Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability", "Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured", "Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks", "The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith", "pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations", "evidence-dilemma-rapid-ai-development-structurally-prevents-adequate-pre-deployment-safety-evidence-accumulation", "AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns", "evaluation-awareness-creates-bidirectional-confounds-in-safety-benchmarks-because-models-detect-and-respond-to-testing-conditions", "benchmark-reality-gap-creates-epistemic-coordination-failure-in-ai-governance-because-algorithmic-scoring-systematically-overstates-operational-capability", "meta-level-specification-gaming-extends-objective-gaming-to-oversight-mechanisms-through-sandbagging-and-evaluation-mode-divergence", "ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable", "activation-based-persona-monitoring-detects-behavioral-trait-shifts-in-small-models-without-behavioral-testing", "current-safety-evaluation-datasets-vary-37-to-100-percent-in-model-detectability-rendering-highly-detectable-evaluations-uninformative", "benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements", "provider-level-behavioral-biases-persist-across-model-versions-requiring-psychometric-auditing-beyond-standard-benchmarks", "trajectory-geometry-probing-requires-white-box-access-limiting-deployment-to-controlled-evaluation-contexts", "external-evaluators-predominantly-have-black-box-access-creating-false-negatives-in-dangerous-capability-detection", "bio-capability-benchmarks-measure-text-accessible-knowledge-not-physical-synthesis-capability", "cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions", "frontier-ai-safety-verdicts-rely-on-deployment-track-record-not-evaluation-confidence", "precautionary-capability-threshold-activation-is-governance-response-to-benchmark-uncertainty", "making-research-evaluations-into-compliance-triggers-closes-the-translation-gap-by-design", "white-box-evaluator-access-is-technically-feasible-via-privacy-enhancing-technologies-without-IP-disclosure"]
-reweave_edges: ["Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability|related|2026-04-06", "The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation|supports|2026-04-17", "Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured|related|2026-04-17", "Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks|related|2026-04-17", "The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith|related|2026-04-17"]
-supports: ["The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation"]
+depends_on:
+- voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints
+related:
+- Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability
+- Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured
+- Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks
+- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith
+- pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations
+- evidence-dilemma-rapid-ai-development-structurally-prevents-adequate-pre-deployment-safety-evidence-accumulation
+- AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns
+- evaluation-awareness-creates-bidirectional-confounds-in-safety-benchmarks-because-models-detect-and-respond-to-testing-conditions
+- benchmark-reality-gap-creates-epistemic-coordination-failure-in-ai-governance-because-algorithmic-scoring-systematically-overstates-operational-capability
+- meta-level-specification-gaming-extends-objective-gaming-to-oversight-mechanisms-through-sandbagging-and-evaluation-mode-divergence
+- ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable
+- activation-based-persona-monitoring-detects-behavioral-trait-shifts-in-small-models-without-behavioral-testing
+- current-safety-evaluation-datasets-vary-37-to-100-percent-in-model-detectability-rendering-highly-detectable-evaluations-uninformative
+- benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements
+- provider-level-behavioral-biases-persist-across-model-versions-requiring-psychometric-auditing-beyond-standard-benchmarks
+- trajectory-geometry-probing-requires-white-box-access-limiting-deployment-to-controlled-evaluation-contexts
+- external-evaluators-predominantly-have-black-box-access-creating-false-negatives-in-dangerous-capability-detection
+- bio-capability-benchmarks-measure-text-accessible-knowledge-not-physical-synthesis-capability
+- cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions
+- frontier-ai-safety-verdicts-rely-on-deployment-track-record-not-evaluation-confidence
+- precautionary-capability-threshold-activation-is-governance-response-to-benchmark-uncertainty
+- making-research-evaluations-into-compliance-triggers-closes-the-translation-gap-by-design
+- white-box-evaluator-access-is-technically-feasible-via-privacy-enhancing-technologies-without-IP-disclosure
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
+reweave_edges:
+- Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability|related|2026-04-06
+- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation|supports|2026-04-17
+- Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured|related|2026-04-17
+- Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks|related|2026-04-17
+- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith|related|2026-04-17
+supports:
+- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation
 sourced_from:
 - inbox/archive/ai-alignment/2026-02-00-international-ai-safety-report-2026.md
 ---
diff --git a/domains/grand-strategy/voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives.md b/domains/grand-strategy/voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives.md
index 136eaa5b2..37185687f 100644
--- a/domains/grand-strategy/voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives.md
+++ b/domains/grand-strategy/voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives.md
@@ -10,9 +10,19 @@ agent: leo
 scope: structural
 sourcer: Leo
 related_claims: ["[[technology-governance-coordination-gaps-close-when-four-enabling-conditions-are-present-visible-triggering-events-commercial-network-effects-low-competitive-stakes-at-inception-or-physical-manifestation]]"]
-supports: ["Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while preserving operational flexibility"]
-reweave_edges: ["Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while preserving operational flexibility|supports|2026-04-07"]
-related: ["voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives", "judicial-oversight-of-ai-governance-through-constitutional-grounds-not-statutory-safety-law", "voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance", "voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance", "judicial-oversight-checks-executive-ai-retaliation-but-cannot-create-positive-safety-obligations", "judicial-framing-of-voluntary-ai-safety-constraints-as-financial-harm-removes-constitutional-floor-enabling-administrative-dismantling", "split-jurisdiction-injunction-pattern-maps-boundary-of-judicial-protection-for-voluntary-ai-safety-policies-civil-protected-military-not"]
+supports:
+- Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while preserving operational flexibility
+reweave_edges:
+- Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while preserving operational flexibility|supports|2026-04-07
+related:
+- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives
+- judicial-oversight-of-ai-governance-through-constitutional-grounds-not-statutory-safety-law
+- voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance
+- voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance
+- judicial-oversight-checks-executive-ai-retaliation-but-cannot-create-positive-safety-obligations
+- judicial-framing-of-voluntary-ai-safety-constraints-as-financial-harm-removes-constitutional-floor-enabling-administrative-dismantling
+- split-jurisdiction-injunction-pattern-maps-boundary-of-judicial-protection-for-voluntary-ai-safety-policies-civil-protected-military-not
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 ---
 
 # Voluntary AI safety constraints are protected as corporate speech but unenforceable as safety requirements, creating legal mechanism gap when primary demand-side actor seeks safety-unconstrained providers