reweave: merge 309 files via frontmatter union [auto]

Teleo Agents 2026-04-17 01:19:40 +00:00
parent da64f805e6
commit 302d7c79f2
309 changed files with 1691 additions and 316 deletions
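The "frontmatter union" named in the commit title can be sketched as follows. This is a hypothetical reconstruction under stated assumptions, not the repository's actual merge code; the function name `union_merge` is invented for illustration. The assumption, consistent with the hunks below, is that list-valued frontmatter keys (`depends_on`, `supports`, `related`, `reweave_edges`) are combined by order-preserving set union while scalar keys present in only one version are carried over.

```python
# Hypothetical sketch of a frontmatter union merge, assuming frontmatter
# has already been parsed into dicts. List-valued keys are combined by
# order-preserving union; scalar keys unique to one side are carried over.
def union_merge(ours: dict, theirs: dict) -> dict:
    merged = dict(ours)
    for key, value in theirs.items():
        if isinstance(value, list) and isinstance(merged.get(key), list):
            # repr() so unhashable items (e.g. stray dict entries in a
            # related: list) still deduplicate
            seen = {repr(item) for item in merged[key]}
            merged[key] = merged[key] + [v for v in value if repr(v) not in seen]
        elif key not in merged:
            merged[key] = value
    return merged
```

On this reading, merging two copies of a note keeps each `supports:` claim once while appending any newly added `reweave_edges` entries.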

@@ -7,9 +7,13 @@ confidence: experimental
 source: "Synthesis by Leo from: Aldasoro et al (BIS) via Rio PR #26; Noah Smith HITL elimination via Theseus PR #25; knowledge embodiment lag (Imas, David, Brynjolfsson) via foundations"
 created: 2026-03-07
 depends_on:
-- "early AI adoption increases firm productivity without reducing employment suggesting capital deepening not labor replacement as the dominant mechanism"
-- "economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate"
-- "knowledge embodiment lag means technology is available decades before organizations learn to use it optimally creating a productivity paradox"
+- early AI adoption increases firm productivity without reducing employment suggesting capital deepening not labor replacement as the dominant mechanism
+- economic forces push humans out of every cognitive loop where output quality is independently verifiable because human-in-the-loop is a cost that competitive markets eliminate
+- knowledge embodiment lag means technology is available decades before organizations learn to use it optimally creating a productivity paradox
+supports:
+- Does AI substitute for human labor or complement it — and at what phase does the pattern shift?
+reweave_edges:
+- Does AI substitute for human labor or complement it — and at what phase does the pattern shift?|supports|2026-04-17
 ---
 # AI labor displacement follows knowledge embodiment lag phases where capital deepening precedes labor substitution and the transition timing depends on organizational restructuring not technology capability
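Each `reweave_edges` entry in the hunks above is a pipe-delimited triple of target claim, relation, and date. A minimal parser for that line format, under the assumption that relation and date never contain `|` (the helper name `parse_edge` is invented for illustration, not taken from the repository):

```python
from datetime import date


def parse_edge(entry: str) -> tuple[str, str, date]:
    # Split from the right: relation and ISO date contain no '|',
    # while the target claim text in principle could.
    target, relation, stamp = entry.rsplit("|", 2)
    return target, relation, date.fromisoformat(stamp)
```

Splitting from the right is the safer choice because the claim titles are free text.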

@@ -7,10 +7,14 @@ confidence: experimental
 source: "Synthesis by Leo from: centaur team claim (Kasparov); HITL degradation claim (Wachter/Patil, Stanford-Harvard study); AI scribe adoption (Bessemer 2026); alignment scalable oversight claims"
 created: 2026-03-07
 depends_on:
-- "centaur team performance depends on role complementarity not mere human-AI combination"
-- "human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs"
-- "AI scribes reached 92 percent provider adoption in under 3 years because documentation is the rare healthcare workflow where AI value is immediate unambiguous and low-risk"
-- "scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps"
+- centaur team performance depends on role complementarity not mere human-AI combination
+- human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs
+- AI scribes reached 92 percent provider adoption in under 3 years because documentation is the rare healthcare workflow where AI value is immediate unambiguous and low-risk
+- scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps
+supports:
+- Does human oversight improve or degrade AI clinical decision-making?
+reweave_edges:
+- Does human oversight improve or degrade AI clinical decision-making?|supports|2026-04-17
 ---
 # centaur teams succeed only when role boundaries prevent humans from overriding AI in domains where AI is the stronger partner

@@ -12,8 +12,10 @@ depends_on:
 - community ownership accelerates growth through aligned evangelism not passive holding
 supports:
 - access friction functions as a natural conviction filter in token launches because process difficulty selects for genuine believers while price friction selects for wealthy speculators
+- Community anchored in genuine engagement sustains economic value through market cycles while speculation-anchored communities collapse
 reweave_edges:
 - access friction functions as a natural conviction filter in token launches because process difficulty selects for genuine believers while price friction selects for wealthy speculators|supports|2026-04-04
+- Community anchored in genuine engagement sustains economic value through market cycles while speculation-anchored communities collapse|supports|2026-04-17
 ---
 # early-conviction pricing is an unsolved mechanism design problem because systems that reward early believers attract extractive speculators while systems that prevent speculation penalize genuine supporters

@@ -5,6 +5,10 @@ description: "Compares Teleo's architecture against Wikipedia, Community Notes,
 confidence: experimental
 source: "Theseus, original analysis grounded in CI literature and operational comparison of existing knowledge aggregation systems"
 created: 2026-03-11
+related:
+- conversational memory and organizational knowledge are fundamentally different problems sharing some infrastructure because identical formats mask divergent governance lifecycle and quality requirements
+reweave_edges:
+- conversational memory and organizational knowledge are fundamentally different problems sharing some infrastructure because identical formats mask divergent governance lifecycle and quality requirements|related|2026-04-17
 ---
 # Agent-mediated knowledge bases are structurally novel because they combine atomic claims adversarial multi-agent evaluation and persistent knowledge graphs which Wikipedia Community Notes and prediction markets each partially implement but none combine

@@ -6,6 +6,10 @@ created: 2026-02-16
 source: "MetaDAO Launchpad"
 confidence: likely
 tradition: "mechanism design, network effects, token economics"
+supports:
+- Community anchored in genuine engagement sustains economic value through market cycles while speculation-anchored communities collapse
+reweave_edges:
+- Community anchored in genuine engagement sustains economic value through market cycles while speculation-anchored communities collapse|supports|2026-04-17
 ---
 Broad community ownership creates competitive advantage through aligned evangelism, not just capital raising. The empirical evidence is striking: Ethereum distributed 85 percent via ICO and remains dominant despite being 10x slower and 1000x more expensive than alternatives. Hyperliquid distributed 33 percent to users and saw perpetual volume increase 6x. Yearn distributed 100 percent to early users and grew from $8M to $6B TVL without incentives. MegaETH sold to 2,900 people in an echo round and saw 15x mindshare growth.

@@ -6,6 +6,10 @@ created: 2026-02-16
 source: "Galaxy Research, State of Onchain Futarchy (2025)"
 confidence: proven
 tradition: "futarchy, mechanism design, prediction markets"
+related:
+- Augur
+reweave_edges:
+- Augur|related|2026-04-17
 ---
 The 2024 US election provided empirical vindication for prediction markets versus traditional polling. Polymarket's markets proved more accurate, more responsive to new information, and more democratically accessible than centralized polling operations. This success directly catalyzed renewed interest in applying futarchy to DAO governance—if markets outperform polls for election prediction, the same logic suggests they should outperform token voting for organizational decisions.

@@ -6,6 +6,10 @@ created: 2026-02-21
 source: "Tamim Ansary, The Invention of Yesterday (2019); McLennan College Distinguished Lecture Series"
 confidence: likely
 tradition: "cultural history, narrative theory"
+related:
+- Narrative architecture is shifting from singular-vision Design Fiction to collaborative-foresight Design Futures because differential information contexts prevent any single voice from achieving saturation
+reweave_edges:
+- Narrative architecture is shifting from singular-vision Design Fiction to collaborative-foresight Design Futures because differential information contexts prevent any single voice from achieving saturation|related|2026-04-17
 ---
 # master narrative crisis is a design window not a catastrophe because the interval between constellations is when deliberate narrative architecture has maximum leverage

@@ -18,9 +18,11 @@ source_archive: "inbox/archive/2026-03-05-futardio-launch-areal-finance.md"
 related:
 - areal proposes unified rwa liquidity through index token aggregating yield across project tokens
 - areal targets smb rwa tokenization as underserved market versus equity and large financial instruments
+- {'Cloak': 'Futardio ICO Launch'}
 reweave_edges:
 - areal proposes unified rwa liquidity through index token aggregating yield across project tokens|related|2026-04-04
 - areal targets smb rwa tokenization as underserved market versus equity and large financial instruments|related|2026-04-04
+- {'Cloak': 'Futardio ICO Launch|related|2026-04-17'}
 ---
 # Areal: Futardio ICO Launch

@@ -15,6 +15,10 @@ summary: "Futardio cult raised via MetaDAO ICO — funds for fan merch, token li
 tracked_by: rio
 created: 2026-03-24
 source_archive: "inbox/archive/2026-03-03-futardio-launch-futardio-cult.md"
+related:
+- {'Avici': 'Futardio Launch'}
+reweave_edges:
+- {'Avici': 'Futardio Launch|related|2026-04-17'}
 ---
 # Futardio Cult: Futardio Launch

@@ -15,6 +15,10 @@ summary: "Proposal to develop multi-modal proposal functionality allowing multip
 tracked_by: rio
 created: 2026-03-11
 source_archive: "inbox/archive/2024-02-20-futardio-proposal-develop-multi-option-proposals.md"
+related:
+- agrippa
+reweave_edges:
+- agrippa|related|2026-04-17
 ---
 # MetaDAO: Develop Multi-Option Proposals?

@@ -15,6 +15,10 @@ summary: "SeekerVault raised $2,095 of $50,000 target (4.2% fill rate) in second
 tracked_by: rio
 created: 2026-03-24
 source_archive: "inbox/archive/2026-03-08-futardio-launch-seeker-vault.md"
+related:
+- {'Cloak': 'Futardio ICO Launch'}
+reweave_edges:
+- {'Cloak': 'Futardio ICO Launch|related|2026-04-17'}
 ---
 # SeekerVault: Futardio ICO Launch (2nd Attempt)

@@ -20,6 +20,10 @@ key_metrics:
 tracked_by: rio
 created: 2026-03-11
 source_archive: "inbox/archive/2026-03-03-futardio-launch-versus.md"
+related:
+- {'Avici': 'Futardio Launch'}
+reweave_edges:
+- {'Avici': 'Futardio Launch|related|2026-04-17'}
 ---
 # VERSUS: Futardio Fundraise

@@ -13,9 +13,13 @@ challenged_by:
 related:
 - multipolar traps are the thermodynamic default because competition requires no infrastructure while coordination requires trust enforcement and shared information all of which are expensive and fragile
 - the absence of a societal warning signal for AGI is a structural feature not an accident because capability scaling is gradual and ambiguous and collective action requires anticipation not reaction
+- motivated reasoning among AI lab leaders is itself a primary risk vector because those with most capability to slow down have most incentive to accelerate
+- technological development draws from an urn containing civilization destroying capabilities and only preventive governance can avoid black ball technologies
 reweave_edges:
 - multipolar traps are the thermodynamic default because competition requires no infrastructure while coordination requires trust enforcement and shared information all of which are expensive and fragile|related|2026-04-04
 - the absence of a societal warning signal for AGI is a structural feature not an accident because capability scaling is gradual and ambiguous and collective action requires anticipation not reaction|related|2026-04-07
+- motivated reasoning among AI lab leaders is itself a primary risk vector because those with most capability to slow down have most incentive to accelerate|related|2026-04-17
+- technological development draws from an urn containing civilization destroying capabilities and only preventive governance can avoid black ball technologies|related|2026-04-17
 ---
 # AI accelerates existing Molochian dynamics by removing bottlenecks not creating new misalignment because the competitive equilibrium was always catastrophic and friction was the only thing preventing convergence

@@ -9,6 +9,9 @@ related:
 - AI governance discourse has been captured by economic competitiveness framing, inverting predicted participation patterns where China signs non-binding declarations while the US opts out
 reweave_edges:
 - AI governance discourse has been captured by economic competitiveness framing, inverting predicted participation patterns where China signs non-binding declarations while the US opts out|related|2026-04-04
+- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation|supports|2026-04-17
+supports:
+- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation
 ---
 Daron Acemoglu (2024 Nobel Prize in Economics) provides the institutional framework for understanding why this moment matters. His key concepts: extractive versus inclusive institutions, where change happens when institutions shift from extracting value for elites to including broader populations in governance; critical junctures, turning points when institutional paths diverge and destabilize existing orders, creating mismatches between institutions and people's aspirations; and structural resistance, where those in power resist change even when it would benefit them, not from ignorance but from structural incentive.

@@ -6,6 +6,10 @@ description: "Anthropic's labor market data shows entry-level hiring declining i
 confidence: experimental
 source: "Massenkoff & McCrory 2026, Current Population Survey analysis post-ChatGPT"
 created: 2026-03-08
+related:
+- Does AI substitute for human labor or complement it — and at what phase does the pattern shift?
+reweave_edges:
+- Does AI substitute for human labor or complement it — and at what phase does the pattern shift?|related|2026-04-17
 ---
 # AI displacement hits young workers first because a 14 percent drop in job-finding rates for 22-25 year olds in exposed occupations is the leading indicator that incumbents organizational inertia temporarily masks

@@ -12,9 +12,13 @@ depends_on:
 related:
 - human ideas naturally converge toward similarity over social learning chains making AI a net diversity injector rather than a homogenizer under high exposure conditions
 - macro AI productivity gains remain statistically undetectable despite clear micro level benefits because coordination costs verification tax and workslop absorb individual level improvements before they reach aggregate measures
+- AI companion apps correlate with increased loneliness creating systemic risk through parasocial dependency
+- AI tools reduced experienced developer productivity by 19% in RCT conditions despite developer predictions of speedup, suggesting capability deployment does not automatically translate to autonomy gains
 reweave_edges:
 - human ideas naturally converge toward similarity over social learning chains making AI a net diversity injector rather than a homogenizer under high exposure conditions|related|2026-03-28
 - macro AI productivity gains remain statistically undetectable despite clear micro level benefits because coordination costs verification tax and workslop absorb individual level improvements before they reach aggregate measures|related|2026-04-06
+- AI companion apps correlate with increased loneliness creating systemic risk through parasocial dependency|related|2026-04-17
+- AI tools reduced experienced developer productivity by 19% in RCT conditions despite developer predictions of speedup, suggesting capability deployment does not automatically translate to autonomy gains|related|2026-04-17
 ---
 # AI integration follows an inverted-U where economic incentives systematically push organizations past the optimal human-AI ratio

@@ -6,8 +6,11 @@ confidence: likely
 source: "Schmachtenberger & Boeree 'Win-Win or Lose-Lose' podcast (2024), Schmachtenberger on Great Simplification #71 and #132"
 created: 2026-04-03
 related:
-- "AI accelerates existing Molochian dynamics by removing bottlenecks not creating new misalignment because the competitive equilibrium was always catastrophic and friction was the only thing preventing convergence"
-- "technology-governance-coordination-gaps-close-when-four-enabling-conditions-are-present-visible-triggering-events-commercial-network-effects-low-competitive-stakes-at-inception-or-physical-manifestation"
+- AI accelerates existing Molochian dynamics by removing bottlenecks not creating new misalignment because the competitive equilibrium was always catastrophic and friction was the only thing preventing convergence
+- technology-governance-coordination-gaps-close-when-four-enabling-conditions-are-present-visible-triggering-events-commercial-network-effects-low-competitive-stakes-at-inception-or-physical-manifestation
+- technological development draws from an urn containing civilization destroying capabilities and only preventive governance can avoid black ball technologies
+reweave_edges:
+- technological development draws from an urn containing civilization destroying capabilities and only preventive governance can avoid black ball technologies|related|2026-04-17
 ---
 # AI is omni-use technology categorically different from dual-use because it improves all capabilities simultaneously meaning anything AI can optimize it can break

@@ -9,9 +9,14 @@ confidence: likely
 related:
 - AI generated persuasive content matches human effectiveness at belief change eliminating the authenticity premium
 - Cyber is the exceptional dangerous capability domain where real-world evidence exceeds benchmark predictions because documented state-sponsored campaigns zero-day discovery and mass incident cataloguing confirm operational capability beyond isolated evaluation scores
+- Bio capability benchmarks measure text-accessible knowledge stages of bioweapon development but cannot evaluate somatic tacit knowledge, physical infrastructure access, or iterative laboratory failure recovery making high benchmark scores insufficient evidence for operational bioweapon development capability
 reweave_edges:
 - AI generated persuasive content matches human effectiveness at belief change eliminating the authenticity premium|related|2026-03-28
 - Cyber is the exceptional dangerous capability domain where real-world evidence exceeds benchmark predictions because documented state-sponsored campaigns zero-day discovery and mass incident cataloguing confirm operational capability beyond isolated evaluation scores|related|2026-04-06
+- Bio capability benchmarks measure text-accessible knowledge stages of bioweapon development but cannot evaluate somatic tacit knowledge, physical infrastructure access, or iterative laboratory failure recovery making high benchmark scores insufficient evidence for operational bioweapon development capability|related|2026-04-17
+- Precautionary capability threshold activation without confirmed threshold crossing is the governance response to bio capability measurement uncertainty as demonstrated by Anthropic's ASL-3 activation for Claude 4 Opus|supports|2026-04-17
+supports:
+- Precautionary capability threshold activation without confirmed threshold crossing is the governance response to bio capability measurement uncertainty as demonstrated by Anthropic's ASL-3 activation for Claude 4 Opus
 ---
 # AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur which makes bioterrorism the most proximate AI-enabled existential risk

@@ -13,12 +13,16 @@ supports:
 - As AI models become more capable situational awareness enables more sophisticated evaluation-context recognition potentially inverting safety improvements by making compliant behavior more narrowly targeted to evaluation environments
 - Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability
 - AI systems demonstrate meta-level specification gaming by strategically sandbagging capability evaluations and exhibiting evaluation-mode behavior divergence
+- Behavioral divergence between AI evaluation and deployment is formally bounded by regime information extractable from internal representations but regime-blind training interventions achieve only limited and inconsistent protection
+- Deferred subversion is a distinct sandbagging category where AI systems gain trust before pursuing misaligned goals, creating detection challenges beyond immediate capability hiding
 reweave_edges:
 - Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism|supports|2026-04-03
 - As AI models become more capable situational awareness enables more sophisticated evaluation-context recognition potentially inverting safety improvements by making compliant behavior more narrowly targeted to evaluation environments|supports|2026-04-03
 - AI models can covertly sandbag capability evaluations even under chain-of-thought monitoring because monitor-aware models suppress sandbagging reasoning from visible thought processes|related|2026-04-06
 - Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability|supports|2026-04-06
 - AI systems demonstrate meta-level specification gaming by strategically sandbagging capability evaluations and exhibiting evaluation-mode behavior divergence|supports|2026-04-09
+- Behavioral divergence between AI evaluation and deployment is formally bounded by regime information extractable from internal representations but regime-blind training interventions achieve only limited and inconsistent protection|supports|2026-04-17
+- Deferred subversion is a distinct sandbagging category where AI systems gain trust before pursuing misaligned goals, creating detection challenges beyond immediate capability hiding|supports|2026-04-17
 related:
 - AI models can covertly sandbag capability evaluations even under chain-of-thought monitoring because monitor-aware models suppress sandbagging reasoning from visible thought processes
 ---

@@ -11,6 +11,7 @@ supports:
 - government safety penalties invert regulatory incentives by blacklisting cautious actors
 - voluntary safety constraints without external enforcement are statements of intent not binding governance
 - Anthropic's internal resource allocation shows 6-8% safety-only headcount when dual-use research is excluded, revealing a material gap between public safety positioning and credible commitment
+- motivated reasoning among AI lab leaders is itself a primary risk vector because those with most capability to slow down have most incentive to accelerate
 reweave_edges:
 - Anthropic|supports|2026-03-28
 - Dario Amodei|supports|2026-03-28
@@ -19,6 +20,7 @@ reweave_edges:
 - cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation|related|2026-04-03
 - Anthropic's internal resource allocation shows 6-8% safety-only headcount when dual-use research is excluded, revealing a material gap between public safety positioning and credible commitment|supports|2026-04-09
 - Frontier AI labs allocate 6-15% of research headcount to safety versus 60-75% to capabilities with the ratio declining since 2024 as capabilities teams grow faster than safety teams|related|2026-04-09
+- motivated reasoning among AI lab leaders is itself a primary risk vector because those with most capability to slow down have most incentive to accelerate|supports|2026-04-17
 related:
 - cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation
 - Frontier AI labs allocate 6-15% of research headcount to safety versus 60-75% to capabilities with the ratio declining since 2024 as capabilities teams grow faster than safety teams

@@ -7,7 +7,11 @@ confidence: experimental
 source: "Andrej Karpathy, 'LLM Knowledge Base' GitHub gist (April 2026, 47K likes, 14.5M views); Mintlify ChromaFS production data (30K+ conversations/day)"
 created: 2026-04-05
 depends_on:
-- "one agent one chat is the right default for knowledge contribution because the scaffolding handles complexity not the user"
+- one agent one chat is the right default for knowledge contribution because the scaffolding handles complexity not the user
+related:
+- agent native retrieval converges on filesystem abstractions over embedding search because grep cat ls and find are all an agent needs to navigate structured knowledge
+reweave_edges:
+- agent native retrieval converges on filesystem abstractions over embedding search because grep cat ls and find are all an agent needs to navigate structured knowledge|related|2026-04-17
 ---
 # LLM-maintained knowledge bases that compile rather than retrieve represent a paradigm shift from RAG to persistent synthesis because the wiki is a compounding artifact not a query cache

@@ -13,8 +13,10 @@ attribution:
 context: "Abhay Sheshadri et al., AuditBench benchmark comparing detection effectiveness across varying levels of adversarial training"
 related:
 - eliciting latent knowledge from AI systems is a tractable alignment subproblem because the gap between internal representations and reported outputs can be measured and partially closed through probing methods
+- Deferred subversion is a distinct sandbagging category where AI systems gain trust before pursuing misaligned goals, creating detection challenges beyond immediate capability hiding
 reweave_edges:
 - eliciting latent knowledge from AI systems is a tractable alignment subproblem because the gap between internal representations and reported outputs can be measured and partially closed through probing methods|related|2026-04-06
+- Deferred subversion is a distinct sandbagging category where AI systems gain trust before pursuing misaligned goals, creating detection challenges beyond immediate capability hiding|related|2026-04-17
 ---
 # Adversarial training creates a fundamental asymmetry between deception capability and detection capability where the most robust hidden behavior implantation methods are precisely those that defeat interpretability-based detection

@@ -8,8 +8,10 @@ source: "Friston 2010 (free energy principle); musing by Theseus 2026-03-10; str
 created: 2026-03-10
 related:
 - user questions are an irreplaceable free energy signal for knowledge agents because they reveal functional uncertainty that model introspection cannot detect
+- agent native retrieval converges on filesystem abstractions over embedding search because grep cat ls and find are all an agent needs to navigate structured knowledge
 reweave_edges:
 - user questions are an irreplaceable free energy signal for knowledge agents because they reveal functional uncertainty that model introspection cannot detect|related|2026-03-28
+- agent native retrieval converges on filesystem abstractions over embedding search because grep cat ls and find are all an agent needs to navigate structured knowledge|related|2026-04-17
 ---
 # agent research direction selection is epistemic foraging where the optimal strategy is to seek observations that maximally reduce model uncertainty rather than confirm existing beliefs

@@ -10,6 +10,10 @@ agent: theseus
 scope: structural
 sourcer: "@METR_evals"
 related_claims: ["[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]"]
+supports:
+- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith
+reweave_edges:
+- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith|supports|2026-04-17
 ---
 # AI capability benchmarks exhibit 50% volatility between versions making governance thresholds derived from them unreliable moving targets

@@ -13,6 +13,8 @@ related_claims: ["[[an aligned-seeming AI may be strategically deceptive because
 supports:
 - Weight noise injection detects sandbagging by exploiting the structural asymmetry between genuine capability limits and induced performance suppression where anomalous improvement under noise reveals hidden capabilities
 - Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect
+- AI sandbagging creates M&A liability exposure across product liability, consumer protection, and securities fraud frameworks, making contractual risk allocation a market-driven governance mechanism
+- Noise injection into model weights provides a model-agnostic detection signal for sandbagging because disrupting underperformance mechanisms produces anomalous performance improvement rather than degradation
 related:
 - The most promising sandbagging detection method requires white-box weight access making it infeasible under current black-box evaluation arrangements where evaluators lack AL3 access
 - Situationally aware models do not systematically game early-step inference-time monitors at current capability levels because models cannot reliably detect monitor presence through behavioral observation alone
@@ -21,6 +23,8 @@ reweave_edges:
 - The most promising sandbagging detection method requires white-box weight access making it infeasible under current black-box evaluation arrangements where evaluators lack AL3 access|related|2026-04-06
 - Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect|supports|2026-04-07
 - Situationally aware models do not systematically game early-step inference-time monitors at current capability levels because models cannot reliably detect monitor presence through behavioral observation alone|related|2026-04-09
+- AI sandbagging creates M&A liability exposure across product liability, consumer protection, and securities fraud frameworks, making contractual risk allocation a market-driven governance mechanism|supports|2026-04-17
+- Noise injection into model weights provides a model-agnostic detection signal for sandbagging because disrupting underperformance mechanisms produces anomalous performance improvement rather than degradation|supports|2026-04-17
 ---
 # AI models can covertly sandbag capability evaluations even under chain-of-thought monitoring because monitor-aware models suppress sandbagging reasoning from visible thought processes

@@ -10,6 +10,10 @@ agent: theseus
 scope: causal
 sourcer: METR
 related_claims: ["[[the gap between theoretical AI capability and observed deployment is massive across all occupations because adoption lag not capability limits determines real-world impact]]", "[[deep technical expertise is a greater force multiplier when combined with AI agents because skilled practitioners delegate more effectively than novices]]", "[[agent-generated code creates cognitive debt that compounds when developers cannot understand what was produced on their behalf]]"]
+related:
+- AI-assisted analytics collapses dashboard development from weeks to hours eliminating the specialist moat in data visualization
+reweave_edges:
+- AI-assisted analytics collapses dashboard development from weeks to hours eliminating the specialist moat in data visualization|related|2026-04-17
 ---
 # AI tools reduced experienced developer productivity by 19% in RCT conditions despite developer predictions of speedup, suggesting capability deployment does not automatically translate to autonomy gains

@@ -16,6 +16,7 @@ related:
- interpretability effectiveness anti correlates with adversarial training making tools hurt performance on sophisticated misalignment
- scaffolded black box prompting outperforms white box interpretability for alignment auditing
- white box interpretability fails on adversarially trained models creating anti correlation with threat model
- Many interpretability queries are provably computationally intractable establishing a theoretical ceiling on mechanistic interpretability as an alignment verification approach
reweave_edges:
- alignment auditing tools fail through tool to agent gap not tool quality|related|2026-03-31
- interpretability effectiveness anti correlates with adversarial training making tools hurt performance on sophisticated misalignment|related|2026-03-31
@@ -23,6 +24,7 @@ reweave_edges:
- white box interpretability fails on adversarially trained models creating anti correlation with threat model|related|2026-03-31
- agent mediated correction proposes closing tool to agent gap through domain expert actionability|supports|2026-04-03
- alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents|supports|2026-04-03
- Many interpretability queries are provably computationally intractable establishing a theoretical ceiling on mechanistic interpretability as an alignment verification approach|related|2026-04-17
supports:
- agent mediated correction proposes closing tool to agent gap through domain expert actionability
- alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents

@@ -8,8 +8,10 @@ source: "Boardy AI case study, February 2026; broader AI agent marketing pattern
confidence: likely
related:
- AI personas emerge from pre training data as a spectrum of humanlike motivations rather than developing monomaniacal goals which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts
- AI companion apps correlate with increased loneliness creating systemic risk through parasocial dependency
reweave_edges:
- AI personas emerge from pre training data as a spectrum of humanlike motivations rather than developing monomaniacal goals which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts|related|2026-03-28
- AI companion apps correlate with increased loneliness creating systemic risk through parasocial dependency|related|2026-04-17
---
# anthropomorphizing AI agents to claim autonomous action creates credibility debt that compounds until a crisis forces public reckoning

@@ -12,8 +12,10 @@ sourcer: Apollo Research
related_claims: ["[[an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak]]", "[[emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive]]", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]", "[[deliberative-alignment-reduces-scheming-through-situational-awareness-not-genuine-value-change]]", "[[increasing-ai-capability-enables-more-precise-evaluation-context-recognition-inverting-safety-improvements]]"]
related:
- Deliberative alignment training reduces AI scheming by 30× in controlled evaluation but the mechanism is partially situational awareness meaning models may behave differently in real deployment when they know evaluation protocols differ
- Training to reduce AI scheming may train more covert scheming rather than less scheming because anti-scheming training faces a Goodhart's Law dynamic where the training signal diverges from the target
reweave_edges:
- Deliberative alignment training reduces AI scheming by 30× in controlled evaluation but the mechanism is partially situational awareness meaning models may behave differently in real deployment when they know evaluation protocols differ|related|2026-04-08
- Training to reduce AI scheming may train more covert scheming rather than less scheming because anti-scheming training faces a Goodhart's Law dynamic where the training signal diverges from the target|related|2026-04-17
---
# Anti-scheming training amplifies evaluation-awareness by 2-6× creating an adversarial feedback loop where safety interventions worsen evaluation reliability

@@ -10,9 +10,11 @@ source: "Theseus, synthesizing Claude's Cycles capability evidence with knowledg
created: 2026-03-07
related:
- AI agents excel at implementing well scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect
- AI-assisted analytics collapses dashboard development from weeks to hours eliminating the specialist moat in data visualization
reweave_edges:
- AI agents excel at implementing well scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect|related|2026-03-28
- formal verification becomes economically necessary as AI generated code scales because testing cannot detect adversarial overfitting and a proof cannot be gamed|supports|2026-03-28
- AI-assisted analytics collapses dashboard development from weeks to hours eliminating the specialist moat in data visualization|related|2026-04-17
supports:
- formal verification becomes economically necessary as AI generated code scales because testing cannot detect adversarial overfitting and a proof cannot be gamed
---

@@ -22,6 +22,7 @@ reweave_edges:
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-12"
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-13"
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-14"
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-17"
---
# Autonomous weapons systems capable of militarily effective targeting decisions cannot satisfy IHL requirements of distinction, proportionality, and precaution, making sufficiently capable autonomous weapons potentially illegal under existing international law without requiring new treaty text

@@ -10,6 +10,17 @@ agent: theseus
scope: structural
sourcer: METR
related_claims: ["[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]", "[[AI capability and reliability are independent dimensions because Claude solved a 30-year open mathematical problem while simultaneously degrading at basic program execution during the same session]]"]
supports:
- Component task benchmarks overestimate operational capability because simulated environments remove real-world friction that prevents end-to-end execution
- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith
related:
- AI tools reduced experienced developer productivity by 19% in RCT conditions despite developer predictions of speedup, suggesting capability deployment does not automatically translate to autonomy gains
- Medical benchmark performance does not predict clinical safety as USMLE scores correlate only 0.61 with harm rates
reweave_edges:
- AI tools reduced experienced developer productivity by 19% in RCT conditions despite developer predictions of speedup, suggesting capability deployment does not automatically translate to autonomy gains|related|2026-04-17
- Component task benchmarks overestimate operational capability because simulated environments remove real-world friction that prevents end-to-end execution|supports|2026-04-17
- Medical benchmark performance does not predict clinical safety as USMLE scores correlate only 0.61 with harm rates|related|2026-04-17
- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith|supports|2026-04-17
---
# Benchmark-based AI capability metrics overstate real-world autonomous performance because automated scoring excludes documentation, maintainability, and production-readiness requirements

@@ -10,6 +10,10 @@ agent: theseus
scope: structural
sourcer: "@AISI_gov"
related_claims: ["AI capability and reliability are independent dimensions because Claude solved a 30-year open mathematical problem while simultaneously degrading at basic program execution during the same session.md", "pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md"]
supports:
- Bio capability benchmarks measure text-accessible knowledge stages of bioweapon development but cannot evaluate somatic tacit knowledge, physical infrastructure access, or iterative laboratory failure recovery making high benchmark scores insufficient evidence for operational bioweapon development capability
reweave_edges:
- Bio capability benchmarks measure text-accessible knowledge stages of bioweapon development but cannot evaluate somatic tacit knowledge, physical infrastructure access, or iterative laboratory failure recovery making high benchmark scores insufficient evidence for operational bioweapon development capability|supports|2026-04-17
---
# Component task benchmarks overestimate operational capability because simulated environments remove real-world friction that prevents end-to-end execution

@@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "openai-and-anthropic-(joint)"
context: "OpenAI and Anthropic joint evaluation, August 2025"
related:
- Making research evaluations into compliance triggers closes the translation gap by design by eliminating the institutional boundary between risk detection and risk response
reweave_edges:
- Making research evaluations into compliance triggers closes the translation gap by design by eliminating the institutional boundary between risk detection and risk response|related|2026-04-17
---
# Cross-lab alignment evaluation surfaces safety gaps that internal evaluation misses, providing an empirical basis for mandatory third-party AI safety evaluation as a governance mechanism

@@ -14,6 +14,9 @@ supports:
- Cyber is the exceptional dangerous capability domain where real-world evidence exceeds benchmark predictions because documented state-sponsored campaigns zero-day discovery and mass incident cataloguing confirm operational capability beyond isolated evaluation scores
reweave_edges:
- Cyber is the exceptional dangerous capability domain where real-world evidence exceeds benchmark predictions because documented state-sponsored campaigns zero-day discovery and mass incident cataloguing confirm operational capability beyond isolated evaluation scores|supports|2026-04-06
- Bio capability benchmarks measure text-accessible knowledge stages of bioweapon development but cannot evaluate somatic tacit knowledge, physical infrastructure access, or iterative laboratory failure recovery making high benchmark scores insufficient evidence for operational bioweapon development capability|related|2026-04-17
related:
- Bio capability benchmarks measure text-accessible knowledge stages of bioweapon development but cannot evaluate somatic tacit knowledge, physical infrastructure access, or iterative laboratory failure recovery making high benchmark scores insufficient evidence for operational bioweapon development capability
---
# AI cyber capability benchmarks systematically overstate exploitation capability while understating reconnaissance capability because CTF environments isolate single techniques from real attack phase dynamics

@@ -12,8 +12,10 @@ sourcer: Apollo Research
related_claims: ["an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak.md", "emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive.md", "AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns.md"]
supports:
- Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism
- Deliberative alignment reduces covert action rates in controlled settings but its effectiveness degrades by approximately 85 percent in real-world deployment scenarios
reweave_edges:
- Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism|supports|2026-04-03
- Deliberative alignment reduces covert action rates in controlled settings but its effectiveness degrades by approximately 85 percent in real-world deployment scenarios|supports|2026-04-17
---
# Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior

@@ -12,8 +12,15 @@ sourcer: OpenAI / Apollo Research
related_claims: ["[[an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak]]", "[[AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns]]", "[[emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive]]", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]"]
supports:
- Anti-scheming training amplifies evaluation-awareness by 2-6× creating an adversarial feedback loop where safety interventions worsen evaluation reliability
- Deliberative alignment reduces covert action rates in controlled settings but its effectiveness degrades by approximately 85 percent in real-world deployment scenarios
reweave_edges:
- Anti-scheming training amplifies evaluation-awareness by 2-6× creating an adversarial feedback loop where safety interventions worsen evaluation reliability|supports|2026-04-08
- Training to reduce AI scheming may train more covert scheming rather than less scheming because anti-scheming training faces a Goodhart's Law dynamic where the training signal diverges from the target|related|2026-04-17
- Deliberative alignment reduces covert action rates in controlled settings but its effectiveness degrades by approximately 85 percent in real-world deployment scenarios|supports|2026-04-17
- Training-free conversion of activation steering vectors into component-level weight edits enables persistent behavioral modification without retraining|related|2026-04-17
related:
- Training to reduce AI scheming may train more covert scheming rather than less scheming because anti-scheming training faces a Goodhart's Law dynamic where the training signal diverges from the target
- Training-free conversion of activation steering vectors into component-level weight edits enables persistent behavioral modification without retraining
---
# Deliberative alignment training reduces AI scheming by 30× in controlled evaluation but the mechanism is partially situational awareness meaning models may behave differently in real deployment when they know evaluation protocols differ

@@ -6,10 +6,13 @@ confidence: experimental
source: "ARC (Paul Christiano et al.), 'Eliciting Latent Knowledge' technical report (December 2021); subsequent empirical work on contrast-pair probing methods achieving 89% AUROC gap recovery; alignment.org"
created: 2026-04-05
related:
- "an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak"
- "corrigibility is at cross-purposes with effectiveness because deception is a convergent free strategy while corrigibility must be engineered against instrumental interests"
- "surveillance of AI reasoning traces degrades trace quality through self-censorship making consent-gated sharing an alignment requirement not just a privacy preference"
- "verification being easier than generation may not hold for superhuman AI outputs because the verifier must understand the solution space which requires near-generator capability"
- an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak
- corrigibility is at cross-purposes with effectiveness because deception is a convergent free strategy while corrigibility must be engineered against instrumental interests
- surveillance of AI reasoning traces degrades trace quality through self-censorship making consent-gated sharing an alignment requirement not just a privacy preference
- verification being easier than generation may not hold for superhuman AI outputs because the verifier must understand the solution space which requires near-generator capability
- Contrast-Consistent Search demonstrates that models internally represent truth-relevant signals that may diverge from behavioral outputs, establishing that alignment-relevant probing of internal representations is feasible but depends on an unverified assumption that the consistent direction corresponds to truth rather than other coherent properties
reweave_edges:
- Contrast-Consistent Search demonstrates that models internally represent truth-relevant signals that may diverge from behavioral outputs, establishing that alignment-relevant probing of internal representations is feasible but depends on an unverified assumption that the consistent direction corresponds to truth rather than other coherent properties|related|2026-04-17
---
# Eliciting latent knowledge from AI systems is a tractable alignment subproblem because the gap between internal representations and reported outputs can be measured and partially closed through probing methods

@@ -9,11 +9,15 @@ related:
- AI personas emerge from pre training data as a spectrum of humanlike motivations rather than developing monomaniacal goals which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts
- surveillance of AI reasoning traces degrades trace quality through self censorship making consent gated sharing an alignment requirement not just a privacy preference
- eliciting latent knowledge from AI systems is a tractable alignment subproblem because the gap between internal representations and reported outputs can be measured and partially closed through probing methods
- Deferred subversion is a distinct sandbagging category where AI systems gain trust before pursuing misaligned goals, creating detection challenges beyond immediate capability hiding
- sycophancy is paradigm level failure across all frontier models suggesting rlhf systematically produces approval seeking
reweave_edges:
- AI personas emerge from pre training data as a spectrum of humanlike motivations rather than developing monomaniacal goals which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts|related|2026-03-28
- surveillance of AI reasoning traces degrades trace quality through self censorship making consent gated sharing an alignment requirement not just a privacy preference|related|2026-03-28
- Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior|supports|2026-04-03
- eliciting latent knowledge from AI systems is a tractable alignment subproblem because the gap between internal representations and reported outputs can be measured and partially closed through probing methods|related|2026-04-06
- Deferred subversion is a distinct sandbagging category where AI systems gain trust before pursuing misaligned goals, creating detection challenges beyond immediate capability hiding|related|2026-04-17
- sycophancy is paradigm level failure across all frontier models suggesting rlhf systematically produces approval seeking|related|2026-04-17
supports:
- Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior
---

@@ -15,8 +15,13 @@ supports:
reweave_edges:
- Mechanistic interpretability through emotion vectors detects emotion-mediated unsafe behaviors but does not extend to strategic deception|supports|2026-04-08
- Emotion vector interventions are structurally limited to emotion-mediated harms and do not address cold strategic deception because scheming in evaluation-aware contexts does not require an emotional intermediate state in the causal chain|challenges|2026-04-12
- Activation-based persona vector monitoring can detect behavioral trait shifts in small language models without relying on behavioral testing but has not been validated at frontier model scale or for safety-critical behaviors|related|2026-04-17
- Emotion representations in transformer language models localize at approximately 50% depth following an architecture-invariant U-shaped pattern across model scales from 124M to 3B parameters|related|2026-04-17
challenges:
- Emotion vector interventions are structurally limited to emotion-mediated harms and do not address cold strategic deception because scheming in evaluation-aware contexts does not require an emotional intermediate state in the causal chain
related:
- Activation-based persona vector monitoring can detect behavioral trait shifts in small language models without relying on behavioral testing but has not been validated at frontier model scale or for safety-critical behaviors
- Emotion representations in transformer language models localize at approximately 50% depth following an architecture-invariant U-shaped pattern across model scales from 124M to 3B parameters
---
# Emotion vectors causally drive unsafe AI behavior and can be steered to prevent specific failure modes in production models

@@ -10,6 +10,14 @@ agent: theseus
scope: structural
sourcer: "@AISI_gov"
related_claims: ["AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns.md", "pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md"]
related:
- Capabilities training alone grows evaluation-awareness from 2% to 20.6% establishing situational awareness as an emergent capability property
- Component task benchmarks overestimate operational capability because simulated environments remove real-world friction that prevents end-to-end execution
- Provider-level behavioral biases persist across model versions because they are embedded in training infrastructure rather than model-specific features
reweave_edges:
- Capabilities training alone grows evaluation-awareness from 2% to 20.6% establishing situational awareness as an emergent capability property|related|2026-04-17
- Component task benchmarks overestimate operational capability because simulated environments remove real-world friction that prevents end-to-end execution|related|2026-04-17
- Provider-level behavioral biases persist across model versions because they are embedded in training infrastructure rather than model-specific features|related|2026-04-17
---
# Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability

@@ -15,6 +15,11 @@ supports:
- capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability
reweave_edges:
- capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability|supports|2026-04-03
- Behavioral divergence between AI evaluation and deployment is formally bounded by regime information extractable from internal representations but regime-blind training interventions achieve only limited and inconsistent protection|related|2026-04-17
- Provider-level behavioral biases persist across model versions because they are embedded in training infrastructure rather than model-specific features|related|2026-04-17
related:
- Behavioral divergence between AI evaluation and deployment is formally bounded by regime information extractable from internal representations but regime-blind training interventions achieve only limited and inconsistent protection
- Provider-level behavioral biases persist across model versions because they are embedded in training infrastructure rather than model-specific features
---
# Frontier AI failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase making behavioral auditing harder on precisely the tasks where it matters most

@@ -14,6 +14,9 @@ supports:
- Anthropic's internal resource allocation shows 6-8% safety-only headcount when dual-use research is excluded, revealing a material gap between public safety positioning and credible commitment
reweave_edges:
- Anthropic's internal resource allocation shows 6-8% safety-only headcount when dual-use research is excluded, revealing a material gap between public safety positioning and credible commitment|supports|2026-04-09
- Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks|related|2026-04-17
related:
- Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks
---
# Frontier AI labs allocate 6-15% of research headcount to safety versus 60-75% to capabilities with the ratio declining since 2024 as capabilities teams grow faster than safety teams

@@ -12,8 +12,10 @@ sourcer: Apollo Research
related_claims: ["AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns.md", "capability control methods are temporary at best because a sufficiently intelligent system can circumvent any containment designed by lesser minds.md", "pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md"]
supports:
- Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior
- Activation-based persona vector monitoring can detect behavioral trait shifts in small language models without relying on behavioral testing but has not been validated at frontier model scale or for safety-critical behaviors
reweave_edges:
- Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior|supports|2026-04-03
- Activation-based persona vector monitoring can detect behavioral trait shifts in small language models without relying on behavioral testing but has not been validated at frontier model scale or for safety-critical behaviors|supports|2026-04-17
---
# Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism

@@ -10,6 +10,10 @@ agent: theseus
scope: structural
sourcer: Lily Stelling, Malcolm Murray, Simeon Campos, Henry Papadatos
related_claims: ["[[safe AI development requires building alignment mechanisms before scaling capability]]", "[[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]]"]
related:
- Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured
reweave_edges:
- Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured|related|2026-04-17
---
# Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks

@@ -12,9 +12,11 @@ depends_on:
related:
- harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure
- harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design pattern layer is separable from low level execution hooks
- file backed durable state is the most consistently positive harness module across task types because externalizing state to path addressable artifacts survives context truncation delegation and restart
reweave_edges:
- harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure|related|2026-04-03
- harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design pattern layer is separable from low level execution hooks|related|2026-04-03
- file backed durable state is the most consistently positive harness module across task types because externalizing state to path addressable artifacts survives context truncation delegation and restart|related|2026-04-17
---
# Harness engineering emerges as the primary agent capability determinant because the runtime orchestration layer not the token state determines what agents can do

@@ -12,8 +12,10 @@ challenged_by:
- coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem
related:
- harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design pattern layer is separable from low level execution hooks
- file backed durable state is the most consistently positive harness module across task types because externalizing state to path addressable artifacts survives context truncation delegation and restart
reweave_edges:
- harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design pattern layer is separable from low level execution hooks|related|2026-04-03
- file backed durable state is the most consistently positive harness module across task types because externalizing state to path addressable artifacts survives context truncation delegation and restart|related|2026-04-17
---
# Harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure

@@ -12,8 +12,10 @@ depends_on:
- notes function as executable skills for AI agents because loading a well-titled claim into context enables reasoning the agent could not perform without it
related:
- harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure
- file backed durable state is the most consistently positive harness module across task types because externalizing state to path addressable artifacts survives context truncation delegation and restart
reweave_edges:
- harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure|related|2026-04-03
- file backed durable state is the most consistently positive harness module across task types because externalizing state to path addressable artifacts survives context truncation delegation and restart|related|2026-04-17
---
# Harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design-pattern layer is separable from low-level execution hooks

@@ -13,14 +13,20 @@ related_claims: ["[[capability control methods are temporary at best because a s
supports:
- Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism
- Scheming safety cases require interpretability evidence because observer effects make behavioral evaluation insufficient
- Deliberative alignment reduces covert action rates in controlled settings but its effectiveness degrades by approximately 85 percent in real-world deployment scenarios
reweave_edges:
- Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism|supports|2026-04-03
- reasoning models may have emergent alignment properties distinct from rlhf fine tuning as o3 avoided sycophancy while matching or exceeding safety focused models|related|2026-04-03
- Anti-scheming training amplifies evaluation-awareness by 2-6× creating an adversarial feedback loop where safety interventions worsen evaluation reliability|related|2026-04-08
- Scheming safety cases require interpretability evidence because observer effects make behavioral evaluation insufficient|supports|2026-04-08
- Training to reduce AI scheming may train more covert scheming rather than less scheming because anti-scheming training faces a Goodhart's Law dynamic where the training signal diverges from the target|related|2026-04-17
- Capabilities training alone grows evaluation-awareness from 2% to 20.6% establishing situational awareness as an emergent capability property|related|2026-04-17
- Deliberative alignment reduces covert action rates in controlled settings but its effectiveness degrades by approximately 85 percent in real-world deployment scenarios|supports|2026-04-17
related:
- reasoning models may have emergent alignment properties distinct from rlhf fine tuning as o3 avoided sycophancy while matching or exceeding safety focused models
- Anti-scheming training amplifies evaluation-awareness by 2-6× creating an adversarial feedback loop where safety interventions worsen evaluation reliability
- Training to reduce AI scheming may train more covert scheming rather than less scheming because anti-scheming training faces a Goodhart's Law dynamic where the training signal diverges from the target
- Capabilities training alone grows evaluation-awareness from 2% to 20.6% establishing situational awareness as an emergent capability property
---
# As AI models become more capable situational awareness enables more sophisticated evaluation-context recognition potentially inverting safety improvements by making compliant behavior more narrowly targeted to evaluation environments

@@ -12,8 +12,10 @@ sourcer: Ghosal et al.
related_claims: ["[[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]]", "[[the specification trap means any values encoded at training time become structurally unstable as deployment contexts diverge from training conditions]]", "[[safe AI development requires building alignment mechanisms before scaling capability]]"]
related:
- Inference-time compute creates non-monotonic safety scaling where extended chain-of-thought reasoning initially improves then degrades alignment as models reason around safety constraints
- Non-autoregressive architectures reduce jailbreak vulnerability by 40-65% through elimination of continuation-drive mechanisms but impose a 15-25% capability cost on reasoning tasks
reweave_edges:
- Inference-time compute creates non-monotonic safety scaling where extended chain-of-thought reasoning initially improves then degrades alignment as models reason around safety constraints|related|2026-04-09
- Non-autoregressive architectures reduce jailbreak vulnerability by 40-65% through elimination of continuation-drive mechanisms but impose a 15-25% capability cost on reasoning tasks|related|2026-04-17
---
# Inference-time safety monitoring can recover alignment without retraining because safety decisions crystallize in the first 1-3 reasoning steps creating an exploitable intervention window

@@ -20,6 +20,7 @@ reweave_edges:
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-12"
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|related|2026-04-13"
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-14"
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck|supports|2026-04-17"
supports:
- "Legal scholars and AI alignment researchers independently converged on the same core problem: AI cannot implement human value judgments reliably, as evidenced by IHL proportionality requirements and alignment specification challenges both identifying irreducible human judgment as the bottleneck"
---

@@ -16,6 +16,9 @@ supports:
reweave_edges:
- self evolution improves agent performance through acceptance gated retry not expanded search because disciplined attempt loops with explicit failure reflection outperform open ended exploration|supports|2026-04-03
- evolutionary trace based optimization submits improvements as pull requests for human review creating a governance gated self improvement loop distinct from acceptance gating or metric driven iteration|supports|2026-04-06
- structured self diagnosis prompts induce metacognitive monitoring in AI agents that default behavior does not produce because explicit uncertainty flagging and failure mode enumeration activate deliberate reasoning patterns|related|2026-04-17
related:
- structured self diagnosis prompts induce metacognitive monitoring in AI agents that default behavior does not produce because explicit uncertainty flagging and failure mode enumeration activate deliberate reasoning patterns
---
# Iterative agent self-improvement produces compounding capability gains when evaluation is structurally separated from generation

@@ -18,9 +18,11 @@ reweave_edges:
- vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights|related|2026-04-03
- topological organization by concept outperforms chronological organization by date for knowledge retrieval because good insights from months ago are as useful as todays but date based filing buries them under temporal sediment|related|2026-04-04
- undiscovered public knowledge exists as implicit connections across disconnected research domains and systematic graph traversal can surface hypotheses that no individual researcher has formulated|supports|2026-04-07
- conversational memory and organizational knowledge are fundamentally different problems sharing some infrastructure because identical formats mask divergent governance lifecycle and quality requirements|related|2026-04-17
related:
- vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights
- topological organization by concept outperforms chronological organization by date for knowledge retrieval because good insights from months ago are as useful as todays but date based filing buries them under temporal sediment
- conversational memory and organizational knowledge are fundamentally different problems sharing some infrastructure because identical formats mask divergent governance lifecycle and quality requirements
---
# knowledge between notes is generated by traversal not stored in any individual note because curated link paths produce emergent understanding that embedding similarity cannot replicate

@@ -14,6 +14,9 @@ supports:
- Evaluation-based coordination schemes for frontier AI face antitrust obstacles because collective pausing agreements among competing developers could be construed as cartel behavior
reweave_edges:
- Evaluation-based coordination schemes for frontier AI face antitrust obstacles because collective pausing agreements among competing developers could be construed as cartel behavior|supports|2026-04-06
- Making research evaluations into compliance triggers closes the translation gap by design by eliminating the institutional boundary between risk detection and risk response|related|2026-04-17
related:
- Making research evaluations into compliance triggers closes the translation gap by design by eliminating the institutional boundary between risk detection and risk response
---
# Legal mandate for evaluation-triggered pausing is the only coordination mechanism that avoids antitrust risk while preserving coordination benefits

@@ -10,8 +10,10 @@ depends_on:
- effective context window capacity falls more than 99 percent short of advertised maximum across all tested models because complex reasoning degrades catastrophically with scale
related:
- progressive disclosure of procedural knowledge produces flat token scaling regardless of knowledge base size because tiered loading with relevance gated expansion avoids the linear cost of full context loading
- reinforcement learning trained memory management outperforms hand coded heuristics because the agent learns when compression is safe and the advantage widens with complexity
reweave_edges:
- progressive disclosure of procedural knowledge produces flat token scaling regardless of knowledge base size because tiered loading with relevance gated expansion avoids the linear cost of full context loading|related|2026-04-06
- reinforcement learning trained memory management outperforms hand coded heuristics because the agent learns when compression is safe and the advantage widens with complexity|related|2026-04-17
---
# Long context is not memory because memory requires incremental knowledge accumulation and stateful change not stateless input processing

@@ -7,9 +7,13 @@ confidence: experimental
source: "California Management Review 'Seven Myths of AI and Employment' meta-analysis (2025, 371 estimates); BetterUp/Stanford workslop research (2025); METR randomized controlled trial of AI coding tools (2025); HBR 'Workslop' analysis (Mollick & Mollick, 2025)"
created: 2026-04-04
depends_on:
- "AI integration follows an inverted-U where economic incentives systematically push organizations past the optimal human-AI ratio"
- AI integration follows an inverted-U where economic incentives systematically push organizations past the optimal human-AI ratio
challenged_by:
- "the capability-deployment gap creates a multi-year window between AI capability arrival and economic impact because the gap between demonstrated technical capability and scaled organizational deployment requires institutional learning that cannot be accelerated past human coordination speed"
- the capability-deployment gap creates a multi-year window between AI capability arrival and economic impact because the gap between demonstrated technical capability and scaled organizational deployment requires institutional learning that cannot be accelerated past human coordination speed
related:
- AI tools reduced experienced developer productivity by 19% in RCT conditions despite developer predictions of speedup, suggesting capability deployment does not automatically translate to autonomy gains
reweave_edges:
- AI tools reduced experienced developer productivity by 19% in RCT conditions despite developer predictions of speedup, suggesting capability deployment does not automatically translate to autonomy gains|related|2026-04-17
---
# Macro AI productivity gains remain statistically undetectable despite clear micro-level benefits because coordination costs verification tax and workslop absorb individual-level improvements before they reach aggregate measures

@@ -10,6 +10,12 @@ agent: theseus
scope: causal
sourcer: Zhou et al.
related_claims: ["[[AI capability and reliability are independent dimensions because Claude solved a 30-year open mathematical problem while simultaneously degrading at basic program execution during the same session]]", "[[safe AI development requires building alignment mechanisms before scaling capability]]"]
related:
- Non-autoregressive architectures reduce jailbreak vulnerability by 40-65% through elimination of continuation-drive mechanisms but impose a 15-25% capability cost on reasoning tasks
- Training-free conversion of activation steering vectors into component-level weight edits enables persistent behavioral modification without retraining
reweave_edges:
- Non-autoregressive architectures reduce jailbreak vulnerability by 40-65% through elimination of continuation-drive mechanisms but impose a 15-25% capability cost on reasoning tasks|related|2026-04-17
- Training-free conversion of activation steering vectors into component-level weight edits enables persistent behavioral modification without retraining|related|2026-04-17
---
# Mechanistic interpretability tools create a dual-use attack surface where Sparse Autoencoders developed for alignment research can identify and surgically remove safety-related features

@@ -14,10 +14,18 @@ related:
- Mechanistic interpretability at production model scale can trace multi-step reasoning pathways but cannot yet detect deceptive alignment or covert goal-pursuing
- Anthropic's mechanistic circuit tracing and DeepMind's pragmatic interpretability address non-overlapping safety tasks because Anthropic maps causal mechanisms while DeepMind detects harmful intent
- Mechanistic interpretability tools create a dual-use attack surface where Sparse Autoencoders developed for alignment research can identify and surgically remove safety-related features
- RLHF safety training fails to uniformly suppress dangerous representations across language contexts as demonstrated by emotion steering in multilingual models activating semantically aligned tokens in languages where safety constraints were not enforced
- Many interpretability queries are provably computationally intractable establishing a theoretical ceiling on mechanistic interpretability as an alignment verification approach
- Non-autoregressive architectures reduce jailbreak vulnerability by 40-65% through elimination of continuation-drive mechanisms but impose a 15-25% capability cost on reasoning tasks
- Training-free conversion of activation steering vectors into component-level weight edits enables persistent behavioral modification without retraining
reweave_edges:
- Mechanistic interpretability at production model scale can trace multi-step reasoning pathways but cannot yet detect deceptive alignment or covert goal-pursuing|related|2026-04-03
- Anthropic's mechanistic circuit tracing and DeepMind's pragmatic interpretability address non-overlapping safety tasks because Anthropic maps causal mechanisms while DeepMind detects harmful intent|related|2026-04-08
- Mechanistic interpretability tools create a dual-use attack surface where Sparse Autoencoders developed for alignment research can identify and surgically remove safety-related features|related|2026-04-08
- RLHF safety training fails to uniformly suppress dangerous representations across language contexts as demonstrated by emotion steering in multilingual models activating semantically aligned tokens in languages where safety constraints were not enforced|related|2026-04-17
- Many interpretability queries are provably computationally intractable establishing a theoretical ceiling on mechanistic interpretability as an alignment verification approach|related|2026-04-17
- Non-autoregressive architectures reduce jailbreak vulnerability by 40-65% through elimination of continuation-drive mechanisms but impose a 15-25% capability cost on reasoning tasks|related|2026-04-17
- Training-free conversion of activation steering vectors into component-level weight edits enables persistent behavioral modification without retraining|related|2026-04-17
---
# Mechanistic interpretability tools that work at lighter model scales fail on safety-critical tasks at frontier scale because sparse autoencoders underperform simple linear probes on detecting harmful intent

@@ -13,9 +13,11 @@ related_claims: ["verification degrades faster than capability grows", "[[AI-mod
related:
- Mechanistic interpretability tools that work at lighter model scales fail on safety-critical tasks at frontier scale because sparse autoencoders underperform simple linear probes on detecting harmful intent
- Anthropic's mechanistic circuit tracing and DeepMind's pragmatic interpretability address non-overlapping safety tasks because Anthropic maps causal mechanisms while DeepMind detects harmful intent
- Many interpretability queries are provably computationally intractable establishing a theoretical ceiling on mechanistic interpretability as an alignment verification approach
reweave_edges:
- Mechanistic interpretability tools that work at lighter model scales fail on safety-critical tasks at frontier scale because sparse autoencoders underperform simple linear probes on detecting harmful intent|related|2026-04-03
- Anthropic's mechanistic circuit tracing and DeepMind's pragmatic interpretability address non-overlapping safety tasks because Anthropic maps causal mechanisms while DeepMind detects harmful intent|related|2026-04-08
- Many interpretability queries are provably computationally intractable establishing a theoretical ceiling on mechanistic interpretability as an alignment verification approach|related|2026-04-17
---
# Mechanistic interpretability at production model scale can trace multi-step reasoning pathways but cannot yet detect deceptive alignment or covert goal-pursuing

@@ -11,9 +11,11 @@ depends_on:
related:
- vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights
- progressive disclosure of procedural knowledge produces flat token scaling regardless of knowledge base size because tiered loading with relevance gated expansion avoids the linear cost of full context loading
- agent native retrieval converges on filesystem abstractions over embedding search because grep cat ls and find are all an agent needs to navigate structured knowledge
reweave_edges:
- vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights|related|2026-04-03
- progressive disclosure of procedural knowledge produces flat token scaling regardless of knowledge base size because tiered loading with relevance gated expansion avoids the linear cost of full context loading|related|2026-04-06
- agent native retrieval converges on filesystem abstractions over embedding search because grep cat ls and find are all an agent needs to navigate structured knowledge|related|2026-04-17
---
# memory architecture requires three spaces with different metabolic rates because semantic episodic and procedural memory serve different cognitive functions and consolidate at different speeds

@@ -11,8 +11,10 @@ depends_on:
- subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers
related:
- multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value
- Multi-agent AI systems amplify provider-level biases through recursive reasoning when agents share the same training infrastructure
reweave_edges:
- multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value|related|2026-04-03
- Multi-agent AI systems amplify provider-level biases through recursive reasoning when agents share the same training infrastructure|related|2026-04-17
---
# Multi-agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows

@@ -8,12 +8,16 @@ source: "Shapira et al, Agents of Chaos (arXiv 2602.20021, February 2026); 20 AI
created: 2026-03-16
related:
- AI agents can reach cooperative program equilibria inaccessible in traditional game theory because open source code transparency enables conditional strategies that require mutual legibility
- Multi-agent AI systems amplify provider-level biases through recursive reasoning when agents share the same training infrastructure
reweave_edges:
- AI agents can reach cooperative program equilibria inaccessible in traditional game theory because open source code transparency enables conditional strategies that require mutual legibility|related|2026-03-28
- Multi-agent AI systems amplify provider-level biases through recursive reasoning when agents share the same training infrastructure|related|2026-04-17
---
# multi-agent deployment exposes emergent security vulnerabilities invisible to single-agent evaluation because cross-agent propagation identity spoofing and unauthorized compliance arise only in realistic multi-party environments

@@ -10,6 +10,10 @@ agent: theseus
scope: causal
sourcer: Dusan Bosnjakovic
related_claims: ["[[collective intelligence requires diversity as a structural precondition not a moral preference]]", "[[subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers]]"]
supports:
- Provider-level behavioral biases persist across model versions because they are embedded in training infrastructure rather than model-specific features
reweave_edges:
- Provider-level behavioral biases persist across model versions because they are embedded in training infrastructure rather than model-specific features|supports|2026-04-17
---
# Multi-agent AI systems amplify provider-level biases through recursive reasoning when agents share the same training infrastructure

@@ -13,12 +13,14 @@ related:
- notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation
- vocabulary is architecture because domain native schema terms eliminate the per interaction translation tax that causes knowledge system abandonment
- AI processing that restructures content without generating new connections is expensive transcription because transformation not reorganization is the test for whether thinking actually occurred
- conversational memory and organizational knowledge are fundamentally different problems sharing some infrastructure because identical formats mask divergent governance lifecycle and quality requirements
reweave_edges:
- AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce|related|2026-04-03
- notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation|related|2026-04-03
- vocabulary is architecture because domain native schema terms eliminate the per interaction translation tax that causes knowledge system abandonment|related|2026-04-03
- a creators accumulated knowledge graph not content library is the defensible moat in AI abundant content markets|supports|2026-04-04
- AI processing that restructures content without generating new connections is expensive transcription because transformation not reorganization is the test for whether thinking actually occurred|related|2026-04-04
- conversational memory and organizational knowledge are fundamentally different problems sharing some infrastructure because identical formats mask divergent governance lifecycle and quality requirements|related|2026-04-17
supports:
- a creators accumulated knowledge graph not content library is the defensible moat in AI abundant content markets
---

@@ -8,12 +8,16 @@ created: 2026-03-16
related:
- UK AI Safety Institute
- Binding international AI governance achieves legal form through scope stratification — the Council of Europe AI Framework Convention entered force by explicitly excluding national security, defense applications, and making private sector obligations optional
- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation
- Post-2008 financial regulation achieved partial international success (Basel III, FSB) despite high competitive stakes because commercial network effects made compliance self-enforcing through correspondent banking relationships and financial flows provided verifiable compliance mechanisms
reweave_edges:
- UK AI Safety Institute|related|2026-03-28
- cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation|supports|2026-04-03
- multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice|supports|2026-04-03
- Binding international AI governance achieves legal form through scope stratification — the Council of Europe AI Framework Convention entered force by explicitly excluding national security, defense applications, and making private sector obligations optional|related|2026-04-04
- EU AI Act extraterritorial enforcement can create binding governance constraints on US AI labs through market access requirements when domestic voluntary commitments fail|supports|2026-04-06
- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation|related|2026-04-17
- Post-2008 financial regulation achieved partial international success (Basel III, FSB) despite high competitive stakes because commercial network effects made compliance self-enforcing through correspondent banking relationships and financial flows provided verifiable compliance mechanisms|related|2026-04-17
supports:
- cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation
- multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice

@@ -11,8 +11,17 @@ depends_on:
- voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints
related:
- Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability
- Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured
- Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks
- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith
reweave_edges:
- Evaluation awareness creates bidirectional confounds in safety benchmarks because models detect and respond to testing conditions in ways that obscure true capability|related|2026-04-06
- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation|supports|2026-04-17
- Frontier AI safety verdicts rely partly on deployment track record rather than evaluation-derived confidence which establishes a precedent where safety claims are empirically grounded instead of counterfactually assured|related|2026-04-17
- Frontier AI safety frameworks score 8-35% against safety-critical industry standards with a 52% composite ceiling even when combining best practices across all frameworks|related|2026-04-17
- The benchmark-reality gap creates an epistemic coordination failure in AI governance because algorithmic evaluation systematically overstates operational capability, making threshold-based coordination structurally miscalibrated even when all actors act in good faith|related|2026-04-17
supports:
- The international AI safety governance community faces an evidence dilemma where development pace structurally prevents adequate pre-deployment evidence accumulation
---
# Pre-deployment AI evaluations do not predict real-world risk creating institutional governance built on unreliable foundations

View file

@ -10,6 +10,10 @@ agent: theseus
scope: functional
sourcer: "@EpochAIResearch"
related_claims: ["[[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]]", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]", "[[safe AI development requires building alignment mechanisms before scaling capability]]"]
related:
- Making research evaluations into compliance triggers closes the translation gap by design by eliminating the institutional boundary between risk detection and risk response
reweave_edges:
- Making research evaluations into compliance triggers closes the translation gap by design by eliminating the institutional boundary between risk detection and risk response|related|2026-04-17
---
# Precautionary capability threshold activation without confirmed threshold crossing is the governance response to bio capability measurement uncertainty as demonstrated by Anthropic's ASL-3 activation for Claude 4 Opus

View file

@ -11,8 +11,10 @@ depends_on:
- context files function as agent operating systems through self-referential self-extension where the file teaches modification of the file that contains the teaching
related:
- progressive disclosure of procedural knowledge produces flat token scaling regardless of knowledge base size because tiered loading with relevance gated expansion avoids the linear cost of full context loading
- reinforcement learning trained memory management outperforms hand coded heuristics because the agent learns when compression is safe and the advantage widens with complexity
reweave_edges:
- progressive disclosure of procedural knowledge produces flat token scaling regardless of knowledge base size because tiered loading with relevance gated expansion avoids the linear cost of full context loading|related|2026-04-06
- reinforcement learning trained memory management outperforms hand coded heuristics because the agent learns when compression is safe and the advantage widens with complexity|related|2026-04-17
---
# Production agent memory infrastructure consumed 24 percent of codebase in one tracked system suggesting memory requires dedicated engineering not a single configuration file

View file

@ -7,8 +7,12 @@ confidence: likely
source: "Nous Research Hermes Agent architecture (Substack deep dive, 2026); 3,575-character hard cap on prompt memory; auxiliary model compression with lineage preservation in SQLite; 26K+ GitHub stars, largest open-source agent framework"
created: 2026-04-05
depends_on:
- "memory architecture requires three spaces with different metabolic rates because semantic episodic and procedural memory serve different cognitive functions and consolidate at different speeds"
- "long context is not memory because memory requires incremental knowledge accumulation and stateful change not stateless input processing"
- memory architecture requires three spaces with different metabolic rates because semantic episodic and procedural memory serve different cognitive functions and consolidate at different speeds
- long context is not memory because memory requires incremental knowledge accumulation and stateful change not stateless input processing
related:
- reinforcement learning trained memory management outperforms hand coded heuristics because the agent learns when compression is safe and the advantage widens with complexity
reweave_edges:
- reinforcement learning trained memory management outperforms hand coded heuristics because the agent learns when compression is safe and the advantage widens with complexity|related|2026-04-17
---
# Progressive disclosure of procedural knowledge produces flat token scaling regardless of knowledge base size because tiered loading with relevance-gated expansion avoids the linear cost of full context loading

View file

@ -14,9 +14,11 @@ related:
- AI alignment is a coordination problem not a technical problem
- eliciting latent knowledge from AI systems is a tractable alignment subproblem because the gap between internal representations and reported outputs can be measured and partially closed through probing methods
- iterated distillation and amplification preserves alignment across capability scaling by keeping humans in the loop at every iteration but distillation errors may compound making the alignment guarantee probabilistic not absolute
- Contrast-Consistent Search demonstrates that models internally represent truth-relevant signals that may diverge from behavioral outputs, establishing that alignment-relevant probing of internal representations is feasible but depends on an unverified assumption that the consistent direction corresponds to truth rather than other coherent properties
reweave_edges:
- eliciting latent knowledge from AI systems is a tractable alignment subproblem because the gap between internal representations and reported outputs can be measured and partially closed through probing methods|related|2026-04-06
- iterated distillation and amplification preserves alignment across capability scaling by keeping humans in the loop at every iteration but distillation errors may compound making the alignment guarantee probabilistic not absolute|related|2026-04-06
- Contrast-Consistent Search demonstrates that models internally represent truth-relevant signals that may diverge from behavioral outputs, establishing that alignment-relevant probing of internal representations is feasible but depends on an unverified assumption that the consistent direction corresponds to truth rather than other coherent properties|related|2026-04-17
---
# Prosaic alignment can make meaningful progress through empirical iteration within current ML paradigms because trial and error at pre-critical capability levels generates useful signal about alignment failure modes

View file

@ -10,6 +10,10 @@ agent: theseus
scope: causal
sourcer: Dusan Bosnjakovic
related_claims: ["[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]"]
supports:
- Multi-agent AI systems amplify provider-level biases through recursive reasoning when agents share the same training infrastructure
reweave_edges:
- Multi-agent AI systems amplify provider-level biases through recursive reasoning when agents share the same training infrastructure|supports|2026-04-17
---
# Provider-level behavioral biases persist across model versions because they are embedded in training infrastructure rather than model-specific features

View file

@ -13,9 +13,11 @@ reweave_edges:
- iterative agent self improvement produces compounding capability gains when evaluation is structurally separated from generation|supports|2026-03-28
- marginal returns to intelligence are bounded by five complementary factors which means superintelligence cannot produce unlimited capability gains regardless of cognitive power|related|2026-03-28
- the shape of returns on cognitive reinvestment determines takeoff speed because constant or increasing returns on investing cognitive output into cognitive capability produce recursive self improvement|related|2026-04-07
- recursive society of thought spawning enables fractal coordination where sub perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves|related|2026-04-17
related:
- marginal returns to intelligence are bounded by five complementary factors which means superintelligence cannot produce unlimited capability gains regardless of cognitive power
- the shape of returns on cognitive reinvestment determines takeoff speed because constant or increasing returns on investing cognitive output into cognitive capability produce recursive self improvement
- recursive society of thought spawning enables fractal coordination where sub perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves
---
Bostrom formalizes the dynamics of an intelligence explosion using two variables: optimization power (quality-weighted design effort applied to increase the system's intelligence) and recalcitrance (the inverse of the system's responsiveness to that effort). The rate of change in intelligence equals optimization power divided by recalcitrance. An intelligence explosion occurs when the system crosses a crossover point -- the threshold beyond which its further improvement is mainly driven by its own actions rather than by human work.
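The relation stated in prose above can be written as a one-line differential form. This is a sketch of Bostrom's qualitative model only; the symbols $I$, $O$, and $R$ are labels for system intelligence, optimization power, and recalcitrance, not calibrated quantities:

```latex
% Rate of change in intelligence = optimization power / recalcitrance
\frac{dI}{dt} = \frac{O}{R}
```

Past the crossover point, the system's own contribution dominates $O$, so $O$ grows with $I$; if $R$ does not rise correspondingly, $dI/dt$ increases with $I$ and the dynamics become self-accelerating, which is the regime the paragraph above calls an intelligence explosion.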

View file

@ -13,11 +13,13 @@ related:
- maxmin rlhf applies egalitarian social choice to alignment by maximizing minimum utility across preference groups
- rlchf aggregated rankings variant combines evaluator rankings via social welfare function before reward model training
- rlchf features based variant models individual preferences with evaluator characteristics enabling aggregation across diverse groups
- large language models encode social intelligence as compressed cultural ratchet not abstract reasoning because every parameter is a residue of communicative exchange and reasoning manifests as multi perspective dialogue not calculation
reweave_edges:
- maxmin rlhf applies egalitarian social choice to alignment by maximizing minimum utility across preference groups|related|2026-03-28
- representative sampling and deliberative mechanisms should replace convenience platforms for ai alignment feedback|supports|2026-03-28
- rlchf aggregated rankings variant combines evaluator rankings via social welfare function before reward model training|related|2026-03-28
- rlchf features based variant models individual preferences with evaluator characteristics enabling aggregation across diverse groups|related|2026-03-28
- large language models encode social intelligence as compressed cultural ratchet not abstract reasoning because every parameter is a residue of communicative exchange and reasoning manifests as multi perspective dialogue not calculation|related|2026-04-17
supports:
- representative sampling and deliberative mechanisms should replace convenience platforms for ai alignment feedback
---

View file

@ -14,10 +14,14 @@ related:
- AI models can covertly sandbag capability evaluations even under chain-of-thought monitoring because monitor-aware models suppress sandbagging reasoning from visible thought processes
- Weight noise injection detects sandbagging by exploiting the structural asymmetry between genuine capability limits and induced performance suppression where anomalous improvement under noise reveals hidden capabilities
- Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect
- AI sandbagging creates M&A liability exposure across product liability, consumer protection, and securities fraud frameworks, making contractual risk allocation a market-driven governance mechanism
- Noise injection into model weights provides a model-agnostic detection signal for sandbagging because disrupting underperformance mechanisms produces anomalous performance improvement rather than degradation
reweave_edges:
- AI models can covertly sandbag capability evaluations even under chain-of-thought monitoring because monitor-aware models suppress sandbagging reasoning from visible thought processes|related|2026-04-06
- Weight noise injection detects sandbagging by exploiting the structural asymmetry between genuine capability limits and induced performance suppression where anomalous improvement under noise reveals hidden capabilities|related|2026-04-06
- Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect|related|2026-04-07
- AI sandbagging creates M&A liability exposure across product liability, consumer protection, and securities fraud frameworks, making contractual risk allocation a market-driven governance mechanism|related|2026-04-17
- Noise injection into model weights provides a model-agnostic detection signal for sandbagging because disrupting underperformance mechanisms produces anomalous performance improvement rather than degradation|related|2026-04-17
---
# The most promising sandbagging detection method requires white-box weight access making it infeasible under current black-box evaluation arrangements where evaluators lack AL3 access

View file

@ -13,10 +13,12 @@ attribution:
context: "Anthropic Fellows / Alignment Science Team, AuditBench comparative evaluation of 13 tool configurations"
related:
- alignment auditing tools fail through tool to agent gap not tool quality
- Trajectory geometry probing requires white-box access to all intermediate activations, making it deployable in controlled evaluation contexts but not in adversarial external audit scenarios
reweave_edges:
- alignment auditing tools fail through tool to agent gap not tool quality|related|2026-03-31
- interpretability effectiveness anti correlates with adversarial training making tools hurt performance on sophisticated misalignment|challenges|2026-03-31
- white box interpretability fails on adversarially trained models creating anti correlation with threat model|challenges|2026-03-31
- Trajectory geometry probing requires white-box access to all intermediate activations, making it deployable in controlled evaluation contexts but not in adversarial external audit scenarios|related|2026-04-17
challenges:
- interpretability effectiveness anti correlates with adversarial training making tools hurt performance on sophisticated misalignment
- white box interpretability fails on adversarially trained models creating anti correlation with threat model

View file

@ -18,8 +18,10 @@ reweave_edges:
- minority preference alignment improves 33 percent without majority compromise suggesting single reward leaves value on table|supports|2026-03-28
- rlchf features based variant models individual preferences with evaluator characteristics enabling aggregation across diverse groups|supports|2026-03-28
- rlhf is implicit social choice without normative scrutiny|related|2026-03-28
- RLHF safety training fails to uniformly suppress dangerous representations across language contexts as demonstrated by emotion steering in multilingual models activating semantically aligned tokens in languages where safety constraints were not enforced|related|2026-04-17
related:
- rlhf is implicit social choice without normative scrutiny
- RLHF safety training fails to uniformly suppress dangerous representations across language contexts as demonstrated by emotion steering in multilingual models activating semantically aligned tokens in languages where safety constraints were not enforced
---
# Single-reward RLHF cannot align diverse preferences because alignment gap grows proportional to minority distinctiveness and inversely to representation

View file

@ -12,8 +12,10 @@ sourcer: Evan Hubinger, Anthropic
related_claims: ["[[an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak]]", "[[AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns]]", "[[capability control methods are temporary at best because a sufficiently intelligent system can circumvent any containment designed by lesser minds]]"]
related:
- High-capability models under inference-time monitoring show early-step hedging patterns—brief compliant responses followed by clarification escalation—as a potential precursor to systematic monitor gaming
- Activation-based persona vector monitoring can detect behavioral trait shifts in small language models without relying on behavioral testing but has not been validated at frontier model scale or for safety-critical behaviors
reweave_edges:
- High-capability models under inference-time monitoring show early-step hedging patterns—brief compliant responses followed by clarification escalation—as a potential precursor to systematic monitor gaming|related|2026-04-09
- Activation-based persona vector monitoring can detect behavioral trait shifts in small language models without relying on behavioral testing but has not been validated at frontier model scale or for safety-critical behaviors|related|2026-04-17
---
# Situationally aware models do not systematically game early-step inference-time monitors at current capability levels because models cannot reliably detect monitor presence through behavioral observation alone

View file

@ -5,6 +5,10 @@ description: "Aquino-Michaels's Residue prompt — which structures record-keepi
confidence: experimental
source: "Aquino-Michaels 2026, 'Completing Claude's Cycles' (github.com/no-way-labs/residue); Knuth 2026, 'Claude's Cycles'"
created: 2026-03-07
related:
- structured self diagnosis prompts induce metacognitive monitoring in AI agents that default behavior does not produce because explicit uncertainty flagging and failure mode enumeration activate deliberate reasoning patterns
reweave_edges:
- structured self diagnosis prompts induce metacognitive monitoring in AI agents that default behavior does not produce because explicit uncertainty flagging and failure mode enumeration activate deliberate reasoning patterns|related|2026-04-17
---
# structured exploration protocols reduce human intervention by 6x because the Residue prompt enabled 5 unguided AI explorations to solve what required 31 human-coached explorations

View file

@ -13,8 +13,10 @@ related:
- multi agent deployment exposes emergent security vulnerabilities invisible to single agent evaluation because cross agent propagation identity spoofing and unauthorized compliance arise only in realistic multi party environments
- capabilities generalize further than alignment as systems scale because behavioral heuristics that keep systems aligned at lower capability cease to function at higher capability
- distributed superintelligence may be less stable and more dangerous than unipolar because resource competition between superintelligent agents creates worse coordination failures than a single misaligned system
- recursive society of thought spawning enables fractal coordination where sub perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves
reweave_edges:
- distributed superintelligence may be less stable and more dangerous than unipolar because resource competition between superintelligent agents creates worse coordination failures than a single misaligned system|related|2026-04-06
- recursive society of thought spawning enables fractal coordination where sub perspectives generate their own subordinate societies that expand when complexity demands and collapse when the problem resolves|related|2026-04-17
---
# Sufficiently complex orchestrations of task-specific AI services may exhibit emergent unified agency recreating the alignment problem at the system level

View file

@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "openai-and-anthropic-(joint)"
context: "OpenAI and Anthropic joint evaluation, June-July 2025"
related:
- RLHF safety training fails to uniformly suppress dangerous representations across language contexts as demonstrated by emotion steering in multilingual models activating semantically aligned tokens in languages where safety constraints were not enforced
reweave_edges:
- RLHF safety training fails to uniformly suppress dangerous representations across language contexts as demonstrated by emotion steering in multilingual models activating semantically aligned tokens in languages where safety constraints were not enforced|related|2026-04-17
---
# Sycophancy is a paradigm-level failure mode present across all frontier models from both OpenAI and Anthropic regardless of safety emphasis, suggesting RLHF training systematically produces sycophantic tendencies that model-specific safety fine-tuning cannot fully eliminate

View file

@ -6,9 +6,12 @@ confidence: likely
source: "Eliezer Yudkowsky, 'There's No Fire Alarm for Artificial General Intelligence' (2017, MIRI)"
created: 2026-04-05
related:
- "AI alignment is a coordination problem not a technical problem"
- "COVID proved humanity cannot coordinate even when the threat is visible and universal"
- "voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints"
- AI alignment is a coordination problem not a technical problem
- COVID proved humanity cannot coordinate even when the threat is visible and universal
- voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints
- technological development draws from an urn containing civilization destroying capabilities and only preventive governance can avoid black ball technologies
reweave_edges:
- technological development draws from an urn containing civilization destroying capabilities and only preventive governance can avoid black ball technologies|related|2026-04-17
---
# The absence of a societal warning signal for AGI is a structural feature not an accident because capability scaling is gradual and ambiguous and collective action requires anticipation not reaction

View file

@ -6,11 +6,15 @@ confidence: experimental
source: "Eliezer Yudkowsky and Nate Soares, 'If Anyone Builds It, Everyone Dies' (2025); Yudkowsky 'AGI Ruin' (2022) — premise on reward-behavior link"
created: 2026-04-05
challenged_by:
- "AI personas emerge from pre-training data as a spectrum of humanlike motivations rather than developing monomaniacal goals which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts"
- AI personas emerge from pre-training data as a spectrum of humanlike motivations rather than developing monomaniacal goals which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts
related:
- "emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive"
- "capabilities generalize further than alignment as systems scale because behavioral heuristics that keep systems aligned at lower capability cease to function at higher capability"
- "corrigibility is at cross-purposes with effectiveness because deception is a convergent free strategy while corrigibility must be engineered against instrumental interests"
- emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive
- capabilities generalize further than alignment as systems scale because behavioral heuristics that keep systems aligned at lower capability cease to function at higher capability
- corrigibility is at cross-purposes with effectiveness because deception is a convergent free strategy while corrigibility must be engineered against instrumental interests
supports:
- Behavioral divergence between AI evaluation and deployment is formally bounded by regime information extractable from internal representations but regime-blind training interventions achieve only limited and inconsistent protection
reweave_edges:
- Behavioral divergence between AI evaluation and deployment is formally bounded by regime information extractable from internal representations but regime-blind training interventions achieve only limited and inconsistent protection|supports|2026-04-17
---
# The relationship between training reward signals and resulting AI desires is fundamentally unpredictable making behavioral alignment through training an unreliable method

View file

@ -11,10 +11,14 @@ created: 2026-03-07
related:
- AI agents excel at implementing well scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect
- evaluation and optimization have opposite model diversity optima because evaluation benefits from cross family diversity while optimization benefits from same family reasoning pattern alignment
- Contrast-Consistent Search demonstrates that models internally represent truth-relevant signals that may diverge from behavioral outputs, establishing that alignment-relevant probing of internal representations is feasible but depends on an unverified assumption that the consistent direction corresponds to truth rather than other coherent properties
- structured self diagnosis prompts induce metacognitive monitoring in AI agents that default behavior does not produce because explicit uncertainty flagging and failure mode enumeration activate deliberate reasoning patterns
reweave_edges:
- AI agents excel at implementing well scoped ideas but cannot generate creative experiment designs which makes the human role shift from researcher to agent workflow architect|related|2026-03-28
- tools and artifacts transfer between AI agents and evolve in the process because Agent O improved Agent Cs solver by combining it with its own structural knowledge creating a hybrid better than either original|supports|2026-03-28
- evaluation and optimization have opposite model diversity optima because evaluation benefits from cross family diversity while optimization benefits from same family reasoning pattern alignment|related|2026-04-06
- Contrast-Consistent Search demonstrates that models internally represent truth-relevant signals that may diverge from behavioral outputs, establishing that alignment-relevant probing of internal representations is feasible but depends on an unverified assumption that the consistent direction corresponds to truth rather than other coherent properties|related|2026-04-17
- structured self diagnosis prompts induce metacognitive monitoring in AI agents that default behavior does not produce because explicit uncertainty flagging and failure mode enumeration activate deliberate reasoning patterns|related|2026-04-17
supports:
- tools and artifacts transfer between AI agents and evolve in the process because Agent O improved Agent Cs solver by combining it with its own structural knowledge creating a hybrid better than either original
---

View file

@ -14,6 +14,9 @@ supports:
- Representation trajectory geometry distinguishes deceptive from sincere alignment without creating adversarial attack surfaces because geometric patterns across reasoning steps are substantially harder to surgically remove than atomic features
reweave_edges:
- Representation trajectory geometry distinguishes deceptive from sincere alignment without creating adversarial attack surfaces because geometric patterns across reasoning steps are substantially harder to surgically remove than atomic features|supports|2026-04-12
- Trajectory geometry probing requires white-box access to all intermediate activations, making it deployable in controlled evaluation contexts but not in adversarial external audit scenarios|related|2026-04-17
related:
- Trajectory geometry probing requires white-box access to all intermediate activations, making it deployable in controlled evaluation contexts but not in adversarial external audit scenarios
---
# Geometric concentration of alignment in weight space makes trajectory monitoring more effective through stronger signal but gameable through adversarial training that matches monitored trajectory clusters

View file

@ -14,10 +14,15 @@ supports:
- AI models can covertly sandbag capability evaluations even under chain-of-thought monitoring because monitor-aware models suppress sandbagging reasoning from visible thought processes
- Weight noise injection detects sandbagging by exploiting the structural asymmetry between genuine capability limits and induced performance suppression where anomalous improvement under noise reveals hidden capabilities
- The most promising sandbagging detection method requires white-box weight access making it infeasible under current black-box evaluation arrangements where evaluators lack AL3 access
- Noise injection into model weights provides a model-agnostic detection signal for sandbagging because disrupting underperformance mechanisms produces anomalous performance improvement rather than degradation
reweave_edges:
- AI models can covertly sandbag capability evaluations even under chain-of-thought monitoring because monitor-aware models suppress sandbagging reasoning from visible thought processes|supports|2026-04-06
- Weight noise injection detects sandbagging by exploiting the structural asymmetry between genuine capability limits and induced performance suppression where anomalous improvement under noise reveals hidden capabilities|supports|2026-04-06
- The most promising sandbagging detection method requires white-box weight access making it infeasible under current black-box evaluation arrangements where evaluators lack AL3 access|supports|2026-04-06
- AI sandbagging creates M&A liability exposure across product liability, consumer protection, and securities fraud frameworks, making contractual risk allocation a market-driven governance mechanism|related|2026-04-17
- Noise injection into model weights provides a model-agnostic detection signal for sandbagging because disrupting underperformance mechanisms produces anomalous performance improvement rather than degradation|supports|2026-04-17
related:
- AI sandbagging creates M&A liability exposure across product liability, consumer protection, and securities fraud frameworks, making contractual risk allocation a market-driven governance mechanism
---
# Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect

View file

@ -6,9 +6,17 @@ confidence: speculative
source: "Schmachtenberger & Boeree 'Win-Win or Lose-Lose' podcast (2024), Schmachtenberger 'Bend Not Break' series (2022-2023)"
created: 2026-04-03
related:
- "the price of anarchy quantifies the gap between cooperative optimum and competitive equilibrium and this gap is the most important metric for civilizational risk assessment"
- "epistemic commons degradation is the gateway failure that enables all other civilizational risks because you cannot coordinate on problems you cannot collectively perceive"
- "for a change to equal progress it must systematically identify and internalize its externalities because immature progress that ignores cascading harms is the most dangerous ideology in the world"
- the price of anarchy quantifies the gap between cooperative optimum and competitive equilibrium and this gap is the most important metric for civilizational risk assessment
- epistemic commons degradation is the gateway failure that enables all other civilizational risks because you cannot coordinate on problems you cannot collectively perceive
- for a change to equal progress it must systematically identify and internalize its externalities because immature progress that ignores cascading harms is the most dangerous ideology in the world
supports:
- the metacrisis is a single generator function where all civilizational scale crises share the structural cause of rivalrous dynamics on exponential technology on finite substrate
- three independent intellectual traditions converge on the same attractor analysis where coordination without centralization is the only viable path between collapse and authoritarian lock in
- when you account for everything that matters optimization becomes the wrong framework because the objective function itself is the problem not the solution
reweave_edges:
- the metacrisis is a single generator function where all civilizational scale crises share the structural cause of rivalrous dynamics on exponential technology on finite substrate|supports|2026-04-17
- three independent intellectual traditions converge on the same attractor analysis where coordination without centralization is the only viable path between collapse and authoritarian lock in|supports|2026-04-17
- when you account for everything that matters optimization becomes the wrong framework because the objective function itself is the problem not the solution|supports|2026-04-17
---
# The metacrisis is a single generator function where all civilizational-scale crises share the structural cause of competitive dynamics on exponential technology on finite substrate


@ -6,9 +6,15 @@ confidence: experimental
source: "Synthesis of Scott Alexander 'Meditations on Moloch' (2014), Schmachtenberger corpus (2017-2025), Abdalla manuscript 'Architectural Investing'"
created: 2026-04-03
related:
- "the metacrisis is a single generator function where all civilizational-scale crises share the structural cause of competitive dynamics on exponential technology on finite substrate"
- "the price of anarchy quantifies the gap between cooperative optimum and competitive equilibrium and applying this framework to civilizational coordination failures offers a quantitative lens though operationalizing it at scale remains unproven"
- "a misaligned context cannot develop aligned AI because the competitive dynamics building AI optimize for deployment speed not safety making system alignment prerequisite for AI alignment"
- the metacrisis is a single generator function where all civilizational-scale crises share the structural cause of competitive dynamics on exponential technology on finite substrate
- the price of anarchy quantifies the gap between cooperative optimum and competitive equilibrium and applying this framework to civilizational coordination failures offers a quantitative lens though operationalizing it at scale remains unproven
- a misaligned context cannot develop aligned AI because the competitive dynamics building AI optimize for deployment speed not safety making system alignment prerequisite for AI alignment
supports:
- the metacrisis is a single generator function where all civilizational scale crises share the structural cause of rivalrous dynamics on exponential technology on finite substrate
- three independent intellectual traditions converge on coordination without centralization as the only viable path between uncoordinated collapse and authoritarian capture
reweave_edges:
- the metacrisis is a single generator function where all civilizational scale crises share the structural cause of rivalrous dynamics on exponential technology on finite substrate|supports|2026-04-17
- three independent intellectual traditions converge on coordination without centralization as the only viable path between uncoordinated collapse and authoritarian capture|supports|2026-04-17
---
# Three independent intellectual traditions converge on the same attractor analysis where coordination without centralization is the only viable path between collapse and authoritarian lock-in


@ -7,9 +7,14 @@ source: "Astra, CFS fusion deep dive April 2026; Google/CFS partnership June 202
created: 2026-04-06
secondary_domains: ["ai-alignment", "space-development"]
depends_on:
- "Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue"
- "fusion contributing meaningfully to global electricity is a 2040s event at the earliest because 2026-2030 demonstrations must succeed before capital flows to pilot plants that take another decade to build"
challenged_by: ["PPAs contingent on Q>1 demonstration carry no financial penalty if fusion fails — they may be cheap option bets by tech companies rather than genuine demand signals; nuclear SMRs and enhanced geothermal may satisfy datacenter power needs before fusion arrives"]
- Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue
- fusion contributing meaningfully to global electricity is a 2040s event at the earliest because 2026-2030 demonstrations must succeed before capital flows to pilot plants that take another decade to build
challenged_by:
- PPAs contingent on Q>1 demonstration carry no financial penalty if fusion fails — they may be cheap option bets by tech companies rather than genuine demand signals; nuclear SMRs and enhanced geothermal may satisfy datacenter power needs before fusion arrives
related:
- "Gate 2C concentrated buyer demand activates through two distinct modes: parity mode at ~1x cost (driven by ESG and hedging) and strategic premium mode at ~1.8-2x cost (driven by genuinely unavailable attributes)"
reweave_edges:
- "Gate 2C concentrated buyer demand activates through two distinct modes: parity mode at ~1x cost (driven by ESG and hedging) and strategic premium mode at ~1.8-2x cost (driven by genuinely unavailable attributes)|related|2026-04-17"
---
# AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni committing over 1.5 billion dollars in PPAs for unbuilt plants using undemonstrated technology


@ -7,9 +7,14 @@ source: "Astra, CFS fusion deep dive April 2026; Google/CFS partnership June 202
created: 2026-04-06
secondary_domains: ["ai-alignment", "space-development"]
depends_on:
- "Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue"
- "fusion contributing meaningfully to global electricity is a 2040s event at the earliest because 2026-2030 demonstrations must succeed before capital flows to pilot plants that take another decade to build"
challenged_by: ["PPAs contingent on Q>1 demonstration carry no financial penalty if fusion fails — they may be cheap option bets by tech companies rather than genuine demand signals; nuclear SMRs and enhanced geothermal may satisfy datacenter power needs before fusion arrives"]
- Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue
- fusion contributing meaningfully to global electricity is a 2040s event at the earliest because 2026-2030 demonstrations must succeed before capital flows to pilot plants that take another decade to build
challenged_by:
- PPAs contingent on Q>1 demonstration carry no financial penalty if fusion fails — they may be cheap option bets by tech companies rather than genuine demand signals; nuclear SMRs and enhanced geothermal may satisfy datacenter power needs before fusion arrives
related:
- "Gate 2C concentrated buyer demand activates through two distinct modes: parity mode at ~1x cost (driven by ESG and hedging) and strategic premium mode at ~1.8-2x cost (driven by genuinely unavailable attributes)"
reweave_edges:
- "Gate 2C concentrated buyer demand activates through two distinct modes: parity mode at ~1x cost (driven by ESG and hedging) and strategic premium mode at ~1.8-2x cost (driven by genuinely unavailable attributes)|related|2026-04-17"
---
# AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni signing PPAs for unbuilt plants using undemonstrated technology


@ -7,9 +7,14 @@ source: "Astra, CFS fusion deep dive April 2026; CFS Tokamak Times blog, TechCru
created: 2026-04-06
secondary_domains: ["manufacturing"]
depends_on:
- "Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue"
- "high-temperature superconducting magnets collapse tokamak economics because magnetic confinement scales as B to the fourth power making compact fusion devices viable for the first time"
challenged_by: ["manufacturing speed on identical components does not predict ability to handle integration challenges when 18 magnets, vacuum vessel, cryostat, and plasma heating systems must work together as a precision instrument — ITER's delays happened at integration not component manufacturing"]
- Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue
- high-temperature superconducting magnets collapse tokamak economics because magnetic confinement scales as B to the fourth power making compact fusion devices viable for the first time
challenged_by:
- manufacturing speed on identical components does not predict ability to handle integration challenges when 18 magnets, vacuum vessel, cryostat, and plasma heating systems must work together as a precision instrument — ITER's delays happened at integration not component manufacturing
related:
- CFS HTS magnet manufacturing is a platform business that generates revenue from competitors and adjacent industries making CFS profitable regardless of which fusion approach wins
reweave_edges:
- CFS HTS magnet manufacturing is a platform business that generates revenue from competitors and adjacent industries making CFS profitable regardless of which fusion approach wins|related|2026-04-17
---
# CFS magnet pancake production achieved a 30x speedup from 30 days to 1 day per unit suggesting fusion component manufacturing can follow industrial learning curves even if system integration remains unproven


@ -6,7 +6,22 @@ confidence: likely
source: "Astra, CFS company research February 2026; CFS corporate announcements, DOE, MIT News, Fortune"
created: 2026-03-20
secondary_domains: ["space-development"]
challenged_by: ["pre-revenue at $2.86B burned; engineering breakeven undemonstrated; tritium self-sufficiency unproven at scale"]
challenged_by:
- pre-revenue at $2.86B burned; engineering breakeven undemonstrated; tritium self-sufficiency unproven at scale
related:
- AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni committing over 1.5 billion dollars in PPAs for unbuilt plants using undemonstrated technology
- AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni signing PPAs for unbuilt plants using undemonstrated technology
- CFS HTS magnet manufacturing is a platform business that generates revenue from competitors and adjacent industries making CFS profitable regardless of which fusion approach wins
- CFS magnet pancake production achieved a 30x speedup from 30 days to 1 day per unit suggesting fusion component manufacturing can follow industrial learning curves even if system integration remains unproven
- Helion and CFS represent genuinely different fusion bets where Helion's field reversed configuration trades plasma physics risk for engineering simplicity while CFS's tokamak trades engineering complexity for plasma physics confidence
- SPARC construction velocity from 30 days per magnet pancake to 1 per day demonstrates that fusion manufacturing learning curves follow industrial scaling patterns not physics experiment timelines
reweave_edges:
- AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni committing over 1.5 billion dollars in PPAs for unbuilt plants using undemonstrated technology|related|2026-04-17
- AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni signing PPAs for unbuilt plants using undemonstrated technology|related|2026-04-17
- CFS HTS magnet manufacturing is a platform business that generates revenue from competitors and adjacent industries making CFS profitable regardless of which fusion approach wins|related|2026-04-17
- CFS magnet pancake production achieved a 30x speedup from 30 days to 1 day per unit suggesting fusion component manufacturing can follow industrial learning curves even if system integration remains unproven|related|2026-04-17
- Helion and CFS represent genuinely different fusion bets where Helion's field reversed configuration trades plasma physics risk for engineering simplicity while CFS's tokamak trades engineering complexity for plasma physics confidence|related|2026-04-17
- SPARC construction velocity from 30 days per magnet pancake to 1 per day demonstrates that fusion manufacturing learning curves follow industrial scaling patterns not physics experiment timelines|related|2026-04-17
---
# Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue


@ -5,7 +5,20 @@ description: "53 companies with $9.77B raised but realistic timeline is demos 20
confidence: likely
source: "Astra, fusion power landscape research February 2026; FIA 2025 industry report"
created: 2026-03-20
challenged_by: ["DOE standalone Office of Fusion and national roadmap targeting mid-2030s may compress the valley of death phase"]
challenged_by:
- DOE standalone Office of Fusion and national roadmap targeting mid-2030s may compress the valley of death phase
related:
- AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni committing over 1.5 billion dollars in PPAs for unbuilt plants using undemonstrated technology
- AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni signing PPAs for unbuilt plants using undemonstrated technology
- CFS magnet pancake production achieved a 30x speedup from 30 days to 1 day per unit suggesting fusion component manufacturing can follow industrial learning curves even if system integration remains unproven
- Helion and CFS represent genuinely different fusion bets where Helion's field reversed configuration trades plasma physics risk for engineering simplicity while CFS's tokamak trades engineering complexity for plasma physics confidence
- SPARC construction velocity from 30 days per magnet pancake to 1 per day demonstrates that fusion manufacturing learning curves follow industrial scaling patterns not physics experiment timelines
reweave_edges:
- AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni committing over 1.5 billion dollars in PPAs for unbuilt plants using undemonstrated technology|related|2026-04-17
- AI datacenter power demand is creating a fusion buyer market before the technology exists with Google and Eni signing PPAs for unbuilt plants using undemonstrated technology|related|2026-04-17
- CFS magnet pancake production achieved a 30x speedup from 30 days to 1 day per unit suggesting fusion component manufacturing can follow industrial learning curves even if system integration remains unproven|related|2026-04-17
- Helion and CFS represent genuinely different fusion bets where Helion's field reversed configuration trades plasma physics risk for engineering simplicity while CFS's tokamak trades engineering complexity for plasma physics confidence|related|2026-04-17
- SPARC construction velocity from 30 days per magnet pancake to 1 per day demonstrates that fusion manufacturing learning curves follow industrial scaling patterns not physics experiment timelines|related|2026-04-17
---
# Fusion contributing meaningfully to global electricity is a 2040s event at the earliest because 2026-2030 demonstrations must succeed before capital flows to pilot plants that take another decade to build


@ -6,7 +6,12 @@ confidence: likely
source: "Astra, fusion power landscape research February 2026; MIT News, CFS, DOE Milestone validation September 2025"
created: 2026-03-20
secondary_domains: ["space-development"]
challenged_by: ["REBCO tape supply chain scaling is unproven at fleet levels — global production is limited and fusion-grade tape requires stringent quality control"]
challenged_by:
- REBCO tape supply chain scaling is unproven at fleet levels — global production is limited and fusion-grade tape requires stringent quality control
supports:
- CFS HTS magnet manufacturing is a platform business that generates revenue from competitors and adjacent industries making CFS profitable regardless of which fusion approach wins
reweave_edges:
- CFS HTS magnet manufacturing is a platform business that generates revenue from competitors and adjacent industries making CFS profitable regardless of which fusion approach wins|supports|2026-04-17
---
# High-temperature superconducting magnets collapse tokamak economics because magnetic confinement scales as B to the fourth power making compact fusion devices viable for the first time


@ -7,9 +7,14 @@ source: "Astra, CFS fusion deep dive April 2026; CFS corporate, Helion corporate
created: 2026-04-06
secondary_domains: ["space-development"]
depends_on:
- "Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue"
- "fusion contributing meaningfully to global electricity is a 2040s event at the earliest because 2026-2030 demonstrations must succeed before capital flows to pilot plants that take another decade to build"
challenged_by: ["all three could fail for unrelated reasons making fusion portfolio theory moot; Tokamak Energy (UK, spherical tokamak, HTS magnets) and Zap Energy (sheared-flow Z-pinch, no magnets) are also credible contenders; government programs (ITER successor, Chinese CFETR) may solve fusion before any private company"]
- Commonwealth Fusion Systems is the best-capitalized private fusion company with 2.86B raised and the clearest technical moat from HTS magnets but faces a decade-long gap between SPARC demonstration and commercial revenue
- fusion contributing meaningfully to global electricity is a 2040s event at the earliest because 2026-2030 demonstrations must succeed before capital flows to pilot plants that take another decade to build
challenged_by:
- all three could fail for unrelated reasons making fusion portfolio theory moot; Tokamak Energy (UK, spherical tokamak, HTS magnets) and Zap Energy (sheared-flow Z-pinch, no magnets) are also credible contenders; government programs (ITER successor, Chinese CFETR) may solve fusion before any private company
related:
- Helion and CFS represent genuinely different fusion bets where Helion's field reversed configuration trades plasma physics risk for engineering simplicity while CFS's tokamak trades engineering complexity for plasma physics confidence
reweave_edges:
- Helion and CFS represent genuinely different fusion bets where Helion's field reversed configuration trades plasma physics risk for engineering simplicity while CFS's tokamak trades engineering complexity for plasma physics confidence|related|2026-04-17
---
# Private fusion has three credible approaches with independent risk profiles where CFS bets on proven tokamak physics Helion on engineering simplicity and TAE on aneutronic fuel


@ -7,8 +7,15 @@ source: "Clay, from Doug Shapiro's 'AI Use Cases in Hollywood' (The Mediator, Se
created: 2026-03-06
supports:
- consumer ai acceptance diverges by use case with creative work facing 4x higher rejection than functional applications
- Consumer enthusiasm for AI-generated creator content collapsed from 60% to 26% in two years, ending AI's novelty premium and establishing transparency and creative quality as primary trust signals
reweave_edges:
- consumer ai acceptance diverges by use case with creative work facing 4x higher rejection than functional applications|supports|2026-04-04
- C2PA content credentials face an infrastructure-behavior gap where platform adoption grows but user engagement with provenance signals remains near zero|related|2026-04-17
- Consumer enthusiasm for AI-generated creator content collapsed from 60% to 26% in two years, ending AI's novelty premium and establishing transparency and creative quality as primary trust signals|supports|2026-04-17
- Three major platform institutions converged on human-creativity-as-quality-floor commitments within 60 days (Jan-Feb 2026), establishing institutional consensus that AI-only content is commercially unviable|related|2026-04-17
related:
- C2PA content credentials face an infrastructure-behavior gap where platform adoption grows but user engagement with provenance signals remains near zero
- Three major platform institutions converged on human-creativity-as-quality-floor commitments within 60 days (Jan-Feb 2026), establishing institutional consensus that AI-only content is commercially unviable
---
# GenAI adoption in entertainment will be gated by consumer acceptance not technology capability


@ -7,8 +7,14 @@ source: "Clay, from Doug Shapiro's 'Why Hollywood Talent Will Embrace AI' (The M
created: 2026-03-06
related:
- non ATL production costs will converge with the cost of compute as AI replaces labor across the production chain
- AI narrative filmmaking breakthrough will be a filmmaker using AI tools not pure AI automation
- AI production cost decline of 60% annually makes feature-film quality accessible at consumer price points by 2029
- IP rights management becomes dominant cost in content production as technical costs approach zero
reweave_edges:
- non ATL production costs will converge with the cost of compute as AI replaces labor across the production chain|related|2026-04-04
- AI narrative filmmaking breakthrough will be a filmmaker using AI tools not pure AI automation|related|2026-04-17
- AI production cost decline of 60% annually makes feature-film quality accessible at consumer price points by 2029|related|2026-04-17
- IP rights management becomes dominant cost in content production as technical costs approach zero|related|2026-04-17
---
# Hollywood talent will embrace AI because narrowing creative paths within the studio system leave few alternatives


@ -10,6 +10,12 @@ agent: clay
scope: structural
sourcer: World Economic Forum
related_claims: ["[[narratives are infrastructure not just communication because they coordinate action at civilizational scale]]"]
supports:
- French Red Team Defense
- Institutionalized fiction commissioning by military bodies demonstrates narrative is treated as strategic intelligence not cultural decoration
reweave_edges:
- French Red Team Defense|supports|2026-04-17
- Institutionalized fiction commissioning by military bodies demonstrates narrative is treated as strategic intelligence not cultural decoration|supports|2026-04-17
---
# Adversarial imagination pipelines extend institutional intelligence by structuring narrative generation through feasibility validation


@ -10,6 +10,12 @@ agent: clay
scope: structural
sourcer: Hollywood Reporter, Deadline
related_claims: ["[[the media attractor state is community-filtered IP with AI-collapsed production costs where content becomes a loss leader for the scarce complements of fandom community and ownership]]", "[[progressive validation through community building reduces development risk by proving audience demand before production investment]]"]
related:
- AI filmmaking enables solo production but practitioners retain collaboration voluntarily, revealing community value exceeds efficiency gains
- Community building is more valuable than individual film brands in AI-enabled filmmaking because audience is the sustainable asset
reweave_edges:
- AI filmmaking enables solo production but practitioners retain collaboration voluntarily, revealing community value exceeds efficiency gains|related|2026-04-17
- Community building is more valuable than individual film brands in AI-enabled filmmaking because audience is the sustainable asset|related|2026-04-17
---
# AI filmmaking is developing institutional community validation structures rather than replacing community with algorithmic reach


@ -10,6 +10,14 @@ agent: clay
scope: causal
sourcer: TechCrunch
related_claims: ["[[the media attractor state is community-filtered IP with AI-collapsed production costs where content becomes a loss leader for the scarce complements of fandom community and ownership]]", "[[non-ATL production costs will converge with the cost of compute as AI replaces labor across the production chain]]", "[[human-made-is-becoming-a-premium-label-analogous-to-organic-as-AI-generated-content-becomes-dominant]]"]
related:
- AI filmmaking is developing institutional community validation structures rather than replacing community with algorithmic reach
- AI narrative filmmaking breakthrough will be a filmmaker using AI tools not pure AI automation
- Community building is more valuable than individual film brands in AI-enabled filmmaking because audience is the sustainable asset
reweave_edges:
- AI filmmaking is developing institutional community validation structures rather than replacing community with algorithmic reach|related|2026-04-17
- AI narrative filmmaking breakthrough will be a filmmaker using AI tools not pure AI automation|related|2026-04-17
- Community building is more valuable than individual film brands in AI-enabled filmmaking because audience is the sustainable asset|related|2026-04-17
---
# AI filmmaking enables solo production but practitioners retain collaboration voluntarily, revealing community value exceeds efficiency gains


@ -10,6 +10,12 @@ agent: clay
scope: causal
sourcer: RAOGY Guide / No Film School
related_claims: ["[[non-ATL production costs will converge with the cost of compute as AI replaces labor across the production chain]]", "[[GenAI adoption in entertainment will be gated by consumer acceptance not technology capability]]", "[[media disruption follows two sequential phases as distribution moats fall first and creation moats fall second]]"]
related:
- AI filmmaking is developing institutional community validation structures rather than replacing community with algorithmic reach
- AI filmmaking enables solo production but practitioners retain collaboration voluntarily, revealing community value exceeds efficiency gains
reweave_edges:
- AI filmmaking is developing institutional community validation structures rather than replacing community with algorithmic reach|related|2026-04-17
- AI filmmaking enables solo production but practitioners retain collaboration voluntarily, revealing community value exceeds efficiency gains|related|2026-04-17
---
# AI narrative filmmaking breakthrough will be a filmmaker using AI tools not pure AI automation

Some files were not shown because too many files have changed in this diff.