reciprocal edges: 7 edges from 1 new claims

2026-04-27 00:17:37 +00:00 · 5b4a6f35ba
commit 5b4a6f35ba
parent 3b8221f855
7 changed files with 77 additions and 13 deletions
--- a/domains/ai-alignment/cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation.md
+++ b/domains/ai-alignment/cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation.md
@@ -17,6 +17,7 @@ related:
 - AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns
 - pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations
 - multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 reweave_edges:
 - Making research evaluations into compliance triggers closes the translation gap by design by eliminating the institutional boundary between risk detection and risk response|related|2026-04-17
 supports:
--- a/domains/ai-alignment/cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions.md
+++ b/domains/ai-alignment/cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions.md
@@ -15,6 +15,7 @@ related:
 - cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions
 - cyber-capability-benchmarks-overstate-exploitation-understate-reconnaissance-because-ctf-isolates-techniques-from-attack-phase-dynamics
 - AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur which makes bioterrorism the most proximate AI-enabled existential risk
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 reweave_edges:
 - AI cyber capability benchmarks systematically overstate exploitation capability while understating reconnaissance capability because CTF environments isolate single techniques from real attack phase dynamics|related|2026-04-06
 supports:
--- a/domains/ai-alignment/first-ai-model-to-complete-end-to-end-enterprise-attack-chain-converts-capability-uplift-to-operational-autonomy.md
+++ b/domains/ai-alignment/first-ai-model-to-complete-end-to-end-enterprise-attack-chain-converts-capability-uplift-to-operational-autonomy.md
@@ -10,9 +10,16 @@ agent: theseus
 sourced_from: ai-alignment/2026-04-22-aisi-uk-mythos-cyber-evaluation.md
 scope: causal
 sourcer: UK AI Security Institute
-supports: ["three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture", "voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives"]
-challenges: ["cyber-capability-benchmarks-overstate-exploitation-understate-reconnaissance-because-ctf-isolates-techniques-from-attack-phase-dynamics"]
-related: ["cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions", "ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable", "benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements"]
+supports:
+- three-track-corporate-safety-governance-stack-reveals-sequential-ceiling-architecture
+- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives
+challenges:
+- cyber-capability-benchmarks-overstate-exploitation-understate-reconnaissance-because-ctf-isolates-techniques-from-attack-phase-dynamics
+related:
+- cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions
+- ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable
+- benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 ---

 # The first AI model to complete an end-to-end enterprise attack chain converts capability uplift into operational autonomy creating a categorical risk change
--- a/domains/ai-alignment/independent-government-evaluation-publishing-adverse-findings-during-commercial-negotiation-is-governance-instrument.md
+++ b/domains/ai-alignment/independent-government-evaluation-publishing-adverse-findings-during-commercial-negotiation-is-governance-instrument.md
@@ -10,7 +10,10 @@ agent: theseus
 sourced_from: ai-alignment/2026-04-22-aisi-uk-mythos-cyber-evaluation.md
 scope: functional
 sourcer: UK AI Security Institute
-related: ["voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives", "cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation"]
+related:
+- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives
+- cross-lab-alignment-evaluation-surfaces-safety-gaps-internal-evaluation-misses-providing-empirical-basis-for-mandatory-third-party-evaluation
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 ---

 # Independent government evaluation publishing adverse findings during commercial negotiation functions as a governance instrument through information asymmetry reduction
--- a/domains/ai-alignment/major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation.md
+++ b/domains/ai-alignment/major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation.md
@@ -10,8 +10,19 @@ agent: theseus
 sourced_from: ai-alignment/2026-04-22-theseus-santos-grueiro-governance-audit.md
 scope: structural
 sourcer: Theseus
-supports: ["multilateral-ai-governance-verification-mechanisms-remain-at-proposal-stage-because-technical-infrastructure-does-not-exist-at-deployment-scale", "evaluation-awareness-concentrates-in-earlier-model-layers-making-output-level-interventions-insufficient"]
-related: ["behavioral-evaluation-is-structurally-insufficient-for-latent-alignment-verification-under-evaluation-awareness-due-to-normative-indistinguishability", "multilateral-ai-governance-verification-mechanisms-remain-at-proposal-stage-because-technical-infrastructure-does-not-exist-at-deployment-scale", "voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance", "evaluation-awareness-creates-bidirectional-confounds-in-safety-benchmarks-because-models-detect-and-respond-to-testing-conditions", "scheming-safety-cases-require-interpretability-evidence-because-observer-effects-make-behavioral-evaluation-insufficient", "frontier-models-exhibit-situational-awareness-that-enables-strategic-deception-during-evaluation-making-behavioral-testing-fundamentally-unreliable", "AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns", "major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation"]
+supports:
+- multilateral-ai-governance-verification-mechanisms-remain-at-proposal-stage-because-technical-infrastructure-does-not-exist-at-deployment-scale
+- evaluation-awareness-concentrates-in-earlier-model-layers-making-output-level-interventions-insufficient
+related:
+- behavioral-evaluation-is-structurally-insufficient-for-latent-alignment-verification-under-evaluation-awareness-due-to-normative-indistinguishability
+- multilateral-ai-governance-verification-mechanisms-remain-at-proposal-stage-because-technical-infrastructure-does-not-exist-at-deployment-scale
+- voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance
+- evaluation-awareness-creates-bidirectional-confounds-in-safety-benchmarks-because-models-detect-and-respond-to-testing-conditions
+- scheming-safety-cases-require-interpretability-evidence-because-observer-effects-make-behavioral-evaluation-insufficient
+- frontier-models-exhibit-situational-awareness-that-enables-strategic-deception-during-evaluation-making-behavioral-testing-fundamentally-unreliable
+- AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns
+- major-ai-safety-governance-frameworks-architecturally-dependent-on-behaviorally-insufficient-evaluation
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 ---

 # Major AI safety governance frameworks are architecturally dependent on behavioral evaluation that Santos-Grueiro's normative indistinguishability theorem establishes is structurally insufficient for latent alignment verification as evaluation awareness scales
--- a/domains/ai-alignment/pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md
+++ b/domains/ai-alignment/pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md
@@ -7,10 +7,41 @@ source: International AI Safety Report 2026 (multi-government committee, Februar
 created: 2026-03-11
 secondary_domains: ["grand-strategy"]
 last_evaluated: 2026-03-11
-depends_on: ["voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints"]
-related: ["Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability", "Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured", "Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks", "The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith", "pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations", "evidence-dilemma-rapid-ai-development-structurally-prevents-adequate-pre-deployment-safety-evidence-accumulation", "AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns", "evaluation-awareness-creates-bidirectional-confounds-in-safety-benchmarks-because-models-detect-and-respond-to-testing-conditions", "benchmark-reality-gap-creates-epistemic-coordination-failure-in-ai-governance-because-algorithmic-scoring-systematically-overstates-operational-capability", "meta-level-specification-gaming-extends-objective-gaming-to-oversight-mechanisms-through-sandbagging-and-evaluation-mode-divergence", "ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable", "activation-based-persona-monitoring-detects-behavioral-trait-shifts-in-small-models-without-behavioral-testing", "current-safety-evaluation-datasets-vary-37-to-100-percent-in-model-detectability-rendering-highly-detectable-evaluations-uninformative", "benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements", "provider-level-behavioral-biases-persist-across-model-versions-requiring-psychometric-auditing-beyond-standard-benchmarks", "trajectory-geometry-probing-requires-white-box-access-limiting-deployment-to-controlled-evaluation-contexts", "external-evaluators-predominantly-have-black-box-access-creating-false-negatives-in-dangerous-capability-detection", "bio-capability-benchmarks-measure-text-accessible-knowledge-not-physical-synthesis-capability", "cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions", "frontier-ai-safety-verdicts-rely-on-deployment-track-record-not-evaluation-confidence", "precautionary-capability-threshold-activation-is-governance-response-to-benchmark-uncertainty", "making-research-evaluations-into-compliance-triggers-closes-the-translation-gap-by-design", "white-box-evaluator-access-is-technically-feasible-via-privacy-enhancing-technologies-without-IP-disclosure"]
-reweave_edges: ["Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability|related|2026-04-06", "The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation|supports|2026-04-17", "Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured|related|2026-04-17", "Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks|related|2026-04-17", "The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith|related|2026-04-17"]
-supports: ["The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation"]
+depends_on:
+- voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints
+related:
+- Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability
+- Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured
+- Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks
+- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith
+- pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations
+- evidence-dilemma-rapid-ai-development-structurally-prevents-adequate-pre-deployment-safety-evidence-accumulation
+- AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns
+- evaluation-awareness-creates-bidirectional-confounds-in-safety-benchmarks-because-models-detect-and-respond-to-testing-conditions
+- benchmark-reality-gap-creates-epistemic-coordination-failure-in-ai-governance-because-algorithmic-scoring-systematically-overstates-operational-capability
+- meta-level-specification-gaming-extends-objective-gaming-to-oversight-mechanisms-through-sandbagging-and-evaluation-mode-divergence
+- ai-capability-benchmarks-exhibit-50-percent-volatility-between-versions-making-governance-thresholds-unreliable
+- activation-based-persona-monitoring-detects-behavioral-trait-shifts-in-small-models-without-behavioral-testing
+- current-safety-evaluation-datasets-vary-37-to-100-percent-in-model-detectability-rendering-highly-detectable-evaluations-uninformative
+- benchmark-based-ai-capability-metrics-overstate-real-world-autonomous-performance-because-automated-scoring-excludes-production-readiness-requirements
+- provider-level-behavioral-biases-persist-across-model-versions-requiring-psychometric-auditing-beyond-standard-benchmarks
+- trajectory-geometry-probing-requires-white-box-access-limiting-deployment-to-controlled-evaluation-contexts
+- external-evaluators-predominantly-have-black-box-access-creating-false-negatives-in-dangerous-capability-detection
+- bio-capability-benchmarks-measure-text-accessible-knowledge-not-physical-synthesis-capability
+- cyber-is-exceptional-dangerous-capability-domain-with-documented-real-world-evidence-exceeding-benchmark-predictions
+- frontier-ai-safety-verdicts-rely-on-deployment-track-record-not-evaluation-confidence
+- precautionary-capability-threshold-activation-is-governance-response-to-benchmark-uncertainty
+- making-research-evaluations-into-compliance-triggers-closes-the-translation-gap-by-design
+- white-box-evaluator-access-is-technically-feasible-via-privacy-enhancing-technologies-without-IP-disclosure
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
+reweave_edges:
+- Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability|related|2026-04-06
+- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation|supports|2026-04-17
+- Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured|related|2026-04-17
+- Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks|related|2026-04-17
+- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith|related|2026-04-17
+supports:
+- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation
 sourced_from:
 - inbox/archive/ai-alignment/2026-02-00-international-ai-safety-report-2026.md
 ---
--- a/domains/grand-strategy/voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives.md
+++ b/domains/grand-strategy/voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives.md
@@ -10,9 +10,19 @@ agent: leo
 scope: structural
 sourcer: Leo
 related_claims: ["[[technology-governance-coordination-gaps-close-when-four-enabling-conditions-are-present-visible-triggering-events-commercial-network-effects-low-competitive-stakes-at-inception-or-physical-manifestation]]"]
-supports: ["Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while preserving operational flexibility"]
-reweave_edges: ["Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while preserving operational flexibility|supports|2026-04-07"]
-related: ["voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives", "judicial-oversight-of-ai-governance-through-constitutional-grounds-not-statutory-safety-law", "voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance", "voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance", "judicial-oversight-checks-executive-ai-retaliation-but-cannot-create-positive-safety-obligations", "judicial-framing-of-voluntary-ai-safety-constraints-as-financial-harm-removes-constitutional-floor-enabling-administrative-dismantling", "split-jurisdiction-injunction-pattern-maps-boundary-of-judicial-protection-for-voluntary-ai-safety-policies-civil-protected-military-not"]
+supports:
+- Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while preserving operational flexibility
+reweave_edges:
+- Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while preserving operational flexibility|supports|2026-04-07
+related:
+- voluntary-ai-safety-constraints-lack-legal-enforcement-mechanism-when-primary-customer-demands-safety-unconstrained-alternatives
+- judicial-oversight-of-ai-governance-through-constitutional-grounds-not-statutory-safety-law
+- voluntary-safety-constraints-without-external-enforcement-are-statements-of-intent-not-binding-governance
+- voluntary-safety-constraints-without-enforcement-are-statements-of-intent-not-binding-governance
+- judicial-oversight-checks-executive-ai-retaliation-but-cannot-create-positive-safety-obligations
+- judicial-framing-of-voluntary-ai-safety-constraints-as-financial-harm-removes-constitutional-floor-enabling-administrative-dismantling
+- split-jurisdiction-injunction-pattern-maps-boundary-of-judicial-protection-for-voluntary-ai-safety-policies-civil-protected-military-not
+- independent-ai-evaluation-infrastructure-faces-evaluation-enforcement-disconnect
 ---

 # Voluntary AI safety constraints are protected as corporate speech but unenforceable as safety requirements, creating legal mechanism gap when primary demand-side actor seeks safety-unconstrained providers