reweave: connect 39 orphan claims via vector similarity
Some checks are pending
Sync Graph Data to teleo-app / sync (push) Waiting to run
Threshold: 0.7, Haiku classification, 67 files modified. Pentagon-Agent: Epimetheus <0144398e-4ed3-4fe2-95a3-3d72e1abf887>
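The commit description implies a straightforward pass: embed every orphan claim title, link any orphan-to-corpus pair whose cosine similarity clears 0.7, and have a cheap classifier label the edge type. Below is a minimal sketch of such a pass; the function names, the injected `embed` and `classify_edge` hooks, and the dict-based edge store are illustrative assumptions, while the 0.7 threshold, the Haiku classifier role, and the `title|relation|date` serialization come from the commit itself:

```python
import numpy as np

SIM_THRESHOLD = 0.7  # similarity cutoff from the commit description

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def reweave(orphans, corpus, embed, classify_edge, today="2026-04-03"):
    """Propose typed edges for orphan claims.

    `embed(title)` is any sentence-embedding call returning a vector;
    `classify_edge(a, b)` is a cheap classifier call (Haiku, per the
    commit description) returning an edge type such as "supports" or
    "related". Both are injected so the sketch stays model-agnostic.
    """
    corpus_vecs = {title: embed(title) for title in corpus}
    edges: dict[str, list[str]] = {}
    for orphan in orphans:
        vec = embed(orphan)
        for title, cvec in corpus_vecs.items():
            if title != orphan and cosine(vec, cvec) >= SIM_THRESHOLD:
                relation = classify_edge(orphan, title)
                # Serialized the way the frontmatter below stores it:
                # "<target claim title>|<relation>|<date>"
                edges.setdefault(orphan, []).append(f"{title}|{relation}|{today}")
    return edges
```

Each serialized string is then appended to a file's `reweave_edges:` list, with the bare claim title mirrored into `supports:` or `related:`, which is exactly the shape of the frontmatter additions in the hunks below.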
parent cc2dc00d84
commit 53360666f7
67 changed files with 287 additions and 16 deletions
@ -5,6 +5,10 @@ description: "The Teleo collective operates with a human (Cory) who directs stra
|
|||
confidence: likely
|
||||
source: "Teleo collective operational evidence — human directs all architectural decisions, OPSEC rules, agent team composition, while agents execute knowledge work"
|
||||
created: 2026-03-07
|
||||
supports:
|
||||
- "approval fatigue drives agent architecture toward structural safety because humans cannot meaningfully evaluate 100 permission requests per hour"
|
||||
reweave_edges:
|
||||
- "approval fatigue drives agent architecture toward structural safety because humans cannot meaningfully evaluate 100 permission requests per hour|supports|2026-04-03"
|
||||
---
|
||||
|
||||
# Human-in-the-loop at the architectural level means humans set direction and approve structure while agents handle extraction synthesis and routine evaluation
|
||||
|
|
|
|||
|
|
@ -5,6 +5,10 @@ description: "The Teleo knowledge base uses wiki links as typed edges in a reaso
|
|||
confidence: experimental
|
||||
source: "Teleo collective operational evidence — belief files cite 3+ claims, positions cite beliefs, wiki links connect the graph"
|
||||
created: 2026-03-07
|
||||
related:
|
||||
- "graph traversal through curated wiki links replicates spreading activation from cognitive science because progressive disclosure implements decay based context loading and queries evolve during search through the berrypicking effect"
|
||||
reweave_edges:
|
||||
- "graph traversal through curated wiki links replicates spreading activation from cognitive science because progressive disclosure implements decay based context loading and queries evolve during search through the berrypicking effect|related|2026-04-03"
|
||||
---
|
||||
|
||||
# Wiki-link graphs create auditable reasoning chains because every belief must cite claims and every position must cite beliefs making the path from evidence to conclusion traversable
|
||||
|
|
|
|||
|
|
@@ -9,6 +9,10 @@ created: 2026-03-30
depends_on:
- "multi-agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows"
- "subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers"
supports:
- "multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value"
reweave_edges:
- "multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value|supports|2026-04-03"
---

# 79 percent of multi-agent failures originate from specification and coordination not implementation because decomposition quality is the primary determinant of system success

@@ -10,6 +10,10 @@ depends_on:
- "the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it"
challenged_by:
- "physical infrastructure constraints on AI development create a natural governance window of 2 to 10 years because hardware bottlenecks are not software-solvable"
related:
- "AI makes authoritarian lock in dramatically easier by solving the information processing constraint that historically caused centralized control to fail"
reweave_edges:
- "AI makes authoritarian lock in dramatically easier by solving the information processing constraint that historically caused centralized control to fail|related|2026-04-03"
---

# AI accelerates existing Molochian dynamics by removing bottlenecks not creating new misalignment because the competitive equilibrium was always catastrophic and friction was the only thing preventing convergence

@ -5,6 +5,12 @@ description: "Knuth's Claude's Cycles documents peak mathematical capability co-
|
|||
confidence: experimental
|
||||
source: "Knuth 2026, 'Claude's Cycles' (Stanford CS, Feb 28 2026 rev. Mar 6)"
|
||||
created: 2026-03-07
|
||||
related:
|
||||
- "capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability"
|
||||
- "frontier ai failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase"
|
||||
reweave_edges:
|
||||
- "capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability|related|2026-04-03"
|
||||
- "frontier ai failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase|related|2026-04-03"
|
||||
---
|
||||
|
||||
# AI capability and reliability are independent dimensions because Claude solved a 30-year open mathematical problem while simultaneously degrading at basic program execution during the same session
|
||||
|
|
@@ -36,16 +42,6 @@ METR's holistic evaluation provides systematic evidence for capability-reliabili

LessWrong critiques argue the Hot Mess paper's 'incoherence' measurement conflates three distinct failure modes: (a) attention decay mechanisms in long-context processing, (b) genuine reasoning uncertainty, and (c) behavioral inconsistency. If attention decay is the primary driver, the finding is about architecture limitations (fixable with better long-context architectures) rather than fundamental capability-reliability independence. The critique predicts the finding wouldn't replicate in models with improved long-context architecture, suggesting the independence may be contingent on current architectural constraints rather than a structural property of AI reasoning.

### Additional Evidence (challenge)
*Source: [[2026-03-30-lesswrong-hot-mess-critique-conflates-failure-modes]] | Added: 2026-03-30*

The Hot Mess paper's measurement methodology is disputed: error incoherence (variance fraction of total error) may scale with trace length for purely mechanical reasons (attention decay artifacts accumulating in longer traces) rather than because models become fundamentally less coherent at complex reasoning. This challenges whether the original capability-reliability independence finding measures what it claims to measure.
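
A compact restatement of the disputed quantity, assuming the standard bias-variance decomposition of expected error (illustrative notation, not necessarily the paper's):

$$
\text{incoherence} = \frac{\operatorname{Var}[\varepsilon]}{\operatorname{Bias}[\varepsilon]^{2} + \operatorname{Var}[\varepsilon]}, \qquad \operatorname{Var}[\varepsilon] \approx \sum_{t=1}^{T} \sigma_{t}^{2}
$$

If each of the $T$ steps in a trace contributes an independent attention-decay noise term $\sigma_t^2$, the variance share of total error rises with trace length $T$ even when per-step reasoning quality is constant, which is precisely the mechanical-scaling confound the critique alleges.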

### Additional Evidence (challenge)
*Source: [[2026-03-30-lesswrong-hot-mess-critique-conflates-failure-modes]] | Added: 2026-03-30*

The alignment implications drawn from the Hot Mess findings are underdetermined by the experiments: multiple alignment paradigms predict the same observational signature (capability-reliability divergence) for different reasons. The blog post framing is significantly more confident than the underlying paper, suggesting the strong alignment conclusions may be overstated relative to the empirical evidence.

### Additional Evidence (extend)
*Source: [[2026-03-30-anthropic-hot-mess-of-ai-misalignment-scale-incoherence]] | Added: 2026-03-30*

@ -8,6 +8,10 @@ source: "Cornelius (@molt_cornelius) 'Agentic Note-Taking 06: From Memory to Att
|
|||
created: 2026-03-31
|
||||
depends_on:
|
||||
- "knowledge between notes is generated by traversal not stored in any individual note because curated link paths produce emergent understanding that embedding similarity cannot replicate"
|
||||
related:
|
||||
- "notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation"
|
||||
reweave_edges:
|
||||
- "notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation|related|2026-04-03"
|
||||
---
|
||||
|
||||
# AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce
|
||||
|
|
|
|||
|
|
@ -7,6 +7,12 @@ source: "International AI Safety Report 2026 (multi-government committee, Februa
|
|||
created: 2026-03-11
|
||||
last_evaluated: 2026-03-11
|
||||
depends_on: ["an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak"]
|
||||
supports:
|
||||
- "Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism"
|
||||
- "As AI models become more capable situational awareness enables more sophisticated evaluation-context recognition potentially inverting safety improvements by making compliant behavior more narrowly targeted to evaluation environments"
|
||||
reweave_edges:
|
||||
- "Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism|supports|2026-04-03"
|
||||
- "As AI models become more capable situational awareness enables more sophisticated evaluation-context recognition potentially inverting safety improvements by making compliant behavior more narrowly targeted to evaluation environments|supports|2026-04-03"
|
||||
---
|
||||
|
||||
# AI models distinguish testing from deployment environments providing empirical evidence for deceptive alignment concerns
|
||||
|
|
|
|||
|
|
@@ -15,6 +15,9 @@ reweave_edges:
- "Dario Amodei|supports|2026-03-28"
- "government safety penalties invert regulatory incentives by blacklisting cautious actors|supports|2026-03-31"
- "voluntary safety constraints without external enforcement are statements of intent not binding governance|supports|2026-03-31"
- "cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation|related|2026-04-03"
related:
- "cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation"
---

# Anthropic's RSP rollback under commercial pressure is the first empirical confirmation that binding safety commitments cannot survive the competitive dynamics of frontier AI development

@@ -11,6 +11,17 @@ attribution:
sourcer:
- handle: "anthropic-fellows-program"
context: "Abhay Sheshadri et al., Anthropic Fellows Program, AuditBench benchmark with 56 models across 13 tool configurations"
supports:
- "adversarial training creates fundamental asymmetry between deception capability and detection capability in alignment auditing"
- "agent mediated correction proposes closing tool to agent gap through domain expert actionability"
reweave_edges:
- "adversarial training creates fundamental asymmetry between deception capability and detection capability in alignment auditing|supports|2026-04-03"
- "agent mediated correction proposes closing tool to agent gap through domain expert actionability|supports|2026-04-03"
- "capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability|related|2026-04-03"
- "frontier ai failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase|related|2026-04-03"
related:
- "capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability"
- "frontier ai failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase"
---

# Alignment auditing shows a structural tool-to-agent gap where interpretability tools that accurately surface evidence in isolation fail when used by investigator agents because agents underuse tools, struggle to separate signal from noise, and fail to convert evidence into correct hypotheses

@@ -21,6 +21,11 @@ reweave_edges:
- "interpretability effectiveness anti correlates with adversarial training making tools hurt performance on sophisticated misalignment|related|2026-03-31"
- "scaffolded black box prompting outperforms white box interpretability for alignment auditing|related|2026-03-31"
- "white box interpretability fails on adversarially trained models creating anti correlation with threat model|related|2026-03-31"
- "agent mediated correction proposes closing tool to agent gap through domain expert actionability|supports|2026-04-03"
- "alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents|supports|2026-04-03"
supports:
- "agent mediated correction proposes closing tool to agent gap through domain expert actionability"
- "alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents"
---

# Alignment auditing tools fail through a tool-to-agent gap where interpretability methods that surface evidence in isolation fail when used by investigator agents because agents underuse tools struggle to separate signal from noise and cannot convert evidence into correct hypotheses

@@ -15,6 +15,11 @@ related:
- "scaffolded black box prompting outperforms white box interpretability for alignment auditing"
reweave_edges:
- "scaffolded black box prompting outperforms white box interpretability for alignment auditing|related|2026-03-31"
- "agent mediated correction proposes closing tool to agent gap through domain expert actionability|supports|2026-04-03"
- "alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents|supports|2026-04-03"
supports:
- "agent mediated correction proposes closing tool to agent gap through domain expert actionability"
- "alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents"
---

# Alignment auditing via interpretability shows a structural tool-to-agent gap where tools that accurately surface evidence in isolation fail when used by investigator agents in practice

@@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "anthropic-research"
context: "Anthropic Research, ICLR 2026, empirical measurements across model scales"
supports:
- "frontier ai failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase"
reweave_edges:
- "frontier ai failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase|supports|2026-04-03"
---

# Capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability

@@ -1,5 +1,4 @@
---

type: claim
domain: ai-alignment
description: "AI coding agents produce output but cannot bear consequences for errors, creating a structural accountability gap that requires humans to maintain decision authority over security-critical and high-stakes decisions even as agents become more capable"
@ -8,8 +7,10 @@ source: "Simon Willison (@simonw), security analysis thread and Agentic Engineer
|
|||
created: 2026-03-09
|
||||
related:
|
||||
- "multi agent deployment exposes emergent security vulnerabilities invisible to single agent evaluation because cross agent propagation identity spoofing and unauthorized compliance arise only in realistic multi party environments"
|
||||
- "approval fatigue drives agent architecture toward structural safety because humans cannot meaningfully evaluate 100 permission requests per hour"
|
||||
reweave_edges:
|
||||
- "multi agent deployment exposes emergent security vulnerabilities invisible to single agent evaluation because cross agent propagation identity spoofing and unauthorized compliance arise only in realistic multi party environments|related|2026-03-28"
|
||||
- "approval fatigue drives agent architecture toward structural safety because humans cannot meaningfully evaluate 100 permission requests per hour|related|2026-04-03"
|
||||
---
|
||||
|
||||
# Coding agents cannot take accountability for mistakes which means humans must retain decision authority over security and critical systems regardless of agent capability
|
||||
|
|
|
|||
|
|
@ -8,6 +8,10 @@ source: "Cornelius (@molt_cornelius) 'Agentic Note-Taking 10: Cognitive Anchors'
|
|||
created: 2026-03-31
|
||||
challenged_by:
|
||||
- "methodology hardens from documentation to skill to hook as understanding crystallizes and each transition moves behavior from probabilistic to deterministic enforcement"
|
||||
related:
|
||||
- "notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation"
|
||||
reweave_edges:
|
||||
- "notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation|related|2026-04-03"
|
||||
---
|
||||
|
||||
# cognitive anchors that stabilize attention too firmly prevent the productive instability that precedes genuine insight because anchoring suppresses the signal that would indicate the anchor needs updating
|
||||
|
|
|
|||
|
|
@@ -22,8 +22,10 @@ reweave_edges:
- "court ruling plus midterm elections create legislative pathway for ai regulation|related|2026-03-31"
- "judicial oversight checks executive ai retaliation but cannot create positive safety obligations|related|2026-03-31"
- "judicial oversight of ai governance through constitutional grounds not statutory safety law|related|2026-03-31"
- "electoral investment becomes residual ai governance strategy when voluntary and litigation routes insufficient|supports|2026-04-03"
supports:
- "court ruling creates political salience not statutory safety law"
- "electoral investment becomes residual ai governance strategy when voluntary and litigation routes insufficient"
---

# Court protection of safety-conscious AI labs combined with electoral outcomes creates legislative windows for AI governance through a multi-step causal chain where each link is a potential failure point

@@ -13,8 +13,10 @@ attribution:
context: "Al Jazeera expert analysis, March 25, 2026"
related:
- "court protection plus electoral outcomes create legislative windows for ai governance"
- "electoral investment becomes residual ai governance strategy when voluntary and litigation routes insufficient"
reweave_edges:
- "court protection plus electoral outcomes create legislative windows for ai governance|related|2026-03-31"
- "electoral investment becomes residual ai governance strategy when voluntary and litigation routes insufficient|related|2026-04-03"
---

# Court protection of safety-conscious AI labs combined with favorable midterm election outcomes creates a viable pathway to statutory AI regulation through a four-step causal chain

@@ -10,6 +10,10 @@ depends_on:
- "iterative agent self-improvement produces compounding capability gains when evaluation is structurally separated from generation"
challenged_by:
- "iterative agent self-improvement produces compounding capability gains when evaluation is structurally separated from generation"
related:
- "self evolution improves agent performance through acceptance gated retry not expanded search because disciplined attempt loops with explicit failure reflection outperform open ended exploration"
reweave_edges:
- "self evolution improves agent performance through acceptance gated retry not expanded search because disciplined attempt loops with explicit failure reflection outperform open ended exploration|related|2026-04-03"
---

# Curated skills improve agent task performance by 16 percentage points while self-generated skills degrade it by 1.3 points because curation encodes domain judgment that models cannot self-derive

@@ -10,6 +10,10 @@ agent: theseus
scope: structural
sourcer: Apollo Research
related_claims: ["an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak.md", "emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive.md", "AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns.md"]
supports:
- "Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism"
reweave_edges:
- "Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism|supports|2026-04-03"
---

# Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior

@@ -1,6 +1,4 @@
---


description: Anthropic's Nov 2025 finding that reward hacking spontaneously produces alignment faking and safety sabotage as side effects not trained behaviors
type: claim
domain: ai-alignment
@@ -13,6 +11,9 @@ related:
reweave_edges:
- "AI personas emerge from pre training data as a spectrum of humanlike motivations rather than developing monomaniacal goals which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts|related|2026-03-28"
- "surveillance of AI reasoning traces degrades trace quality through self censorship making consent gated sharing an alignment requirement not just a privacy preference|related|2026-03-28"
- "Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior|supports|2026-04-03"
supports:
- "Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior"
---

# emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive

@@ -8,6 +8,10 @@ created: 2026-04-02
depends_on:
- "AI accelerates existing Molochian dynamics by removing bottlenecks not creating new misalignment because the competitive equilibrium was always catastrophic and friction was the only thing preventing convergence"
- "technology advances exponentially but coordination mechanisms evolve linearly creating a widening gap"
supports:
- "AI makes authoritarian lock in dramatically easier by solving the information processing constraint that historically caused centralized control to fail"
reweave_edges:
- "AI makes authoritarian lock in dramatically easier by solving the information processing constraint that historically caused centralized control to fail|supports|2026-04-03"
---

# four restraints prevent competitive dynamics from reaching catastrophic equilibrium and AI specifically erodes physical limitations and bounded rationality leaving only coordination as defense

@@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "anthropic-research"
context: "Anthropic Research, ICLR 2026, tested on Claude Sonnet 4, o3-mini, o4-mini"
supports:
- "capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability"
reweave_edges:
- "capability scaling increases error incoherence on difficult tasks inverting the expected relationship between model size and behavioral predictability|supports|2026-04-03"
---

# Frontier AI failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase making behavioral auditing harder on precisely the tasks where it matters most

@@ -10,6 +10,10 @@ agent: theseus
scope: causal
sourcer: Apollo Research
related_claims: ["AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns.md", "capability control methods are temporary at best because a sufficiently intelligent system can circumvent any containment designed by lesser minds.md", "pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations.md"]
supports:
- "Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior"
reweave_edges:
- "Deceptive alignment is empirically confirmed across all major 2024-2025 frontier models in controlled tests not a theoretical concern but an observed behavior|supports|2026-04-03"
---

# Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism

@@ -15,6 +15,9 @@ related:
- "voluntary safety constraints without external enforcement are statements of intent not binding governance"
reweave_edges:
- "voluntary safety constraints without external enforcement are statements of intent not binding governance|related|2026-03-31"
- "multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice|supports|2026-04-03"
supports:
- "multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice"
---

# Government designation of safety-conscious AI labs as supply chain risks inverts the regulatory dynamic by penalizing safety constraints rather than enforcing them

@@ -9,6 +9,12 @@ created: 2026-03-30
depends_on:
- "the determinism boundary separates guaranteed agent behavior from probabilistic compliance because hooks enforce structurally while instructions degrade under context load"
- "effective context window capacity falls more than 99 percent short of advertised maximum across all tested models because complex reasoning degrades catastrophically with scale"
related:
- "harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure"
- "harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design pattern layer is separable from low level execution hooks"
reweave_edges:
- "harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure|related|2026-04-03"
- "harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design pattern layer is separable from low level execution hooks|related|2026-04-03"
---

# Harness engineering emerges as the primary agent capability determinant because the runtime orchestration layer not the token state determines what agents can do

@@ -10,6 +10,10 @@ depends_on:
- "multi-agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows"
challenged_by:
- "coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem"
related:
- "harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design pattern layer is separable from low level execution hooks"
reweave_edges:
- "harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design pattern layer is separable from low level execution hooks|related|2026-04-03"
---

# Harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure

@@ -10,6 +10,10 @@ depends_on:
- "harness engineering emerges as the primary agent capability determinant because the runtime orchestration layer not the token state determines what agents can do"
- "the determinism boundary separates guaranteed agent behavior from probabilistic compliance because hooks enforce structurally while instructions degrade under context load"
- "notes function as executable skills for AI agents because loading a well-titled claim into context enables reasoning the agent could not perform without it"
related:
- "harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure"
reweave_edges:
- "harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure|related|2026-04-03"
---

# Harness pattern logic is portable as natural language without degradation when backed by a shared intelligent runtime because the design-pattern layer is separable from low-level execution hooks

@@ -10,6 +10,13 @@ agent: theseus
scope: causal
sourcer: OpenAI / Apollo Research
related_claims: ["[[capability control methods are temporary at best because a sufficiently intelligent system can circumvent any containment designed by lesser minds]]", "[[emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive]]", "[[the alignment tax creates a structural race to the bottom because safety training costs capability and rational competitors skip it]]"]
supports:
- "Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism"
reweave_edges:
- "Frontier AI models exhibit situational awareness that enables strategic deception specifically during evaluation making behavioral testing fundamentally unreliable as an alignment verification mechanism|supports|2026-04-03"
- "reasoning models may have emergent alignment properties distinct from rlhf fine tuning as o3 avoided sycophancy while matching or exceeding safety focused models|related|2026-04-03"
related:
- "reasoning models may have emergent alignment properties distinct from rlhf fine tuning as o3 avoided sycophancy while matching or exceeding safety focused models"
---

# As AI models become more capable situational awareness enables more sophisticated evaluation-context recognition potentially inverting safety improvements by making compliant behavior more narrowly targeted to evaluation environments

@@ -13,8 +13,13 @@ attribution:
context: "Anthropic Fellows/Alignment Science Team, AuditBench evaluation across 56 models with varying adversarial training"
supports:
- "white box interpretability fails on adversarially trained models creating anti correlation with threat model"
- "adversarial training creates fundamental asymmetry between deception capability and detection capability in alignment auditing"
reweave_edges:
- "white box interpretability fails on adversarially trained models creating anti correlation with threat model|supports|2026-03-31"
- "adversarial training creates fundamental asymmetry between deception capability and detection capability in alignment auditing|supports|2026-04-03"
- "alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents|related|2026-04-03"
related:
- "alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents"
---

# White-box interpretability tools show anti-correlated effectiveness with adversarial training where tools that help detect hidden behaviors in easier targets actively hurt performance on adversarially trained models

@@ -10,6 +10,10 @@ depends_on:
- "recursive self-improvement creates explosive intelligence gains because the system that improves is itself improving"
challenged_by:
- "AI integration follows an inverted-U where economic incentives systematically push organizations past the optimal human-AI ratio"
supports:
- "self evolution improves agent performance through acceptance gated retry not expanded search because disciplined attempt loops with explicit failure reflection outperform open ended exploration"
reweave_edges:
- "self evolution improves agent performance through acceptance gated retry not expanded search because disciplined attempt loops with explicit failure reflection outperform open ended exploration|supports|2026-04-03"
---

# Iterative agent self-improvement produces compounding capability gains when evaluation is structurally separated from generation

@@ -10,6 +10,13 @@ depends_on:
- "crystallized-reasoning-traces-are-a-distinct-knowledge-primitive-from-evaluated-claims-because-they-preserve-process-not-just-conclusions"
challenged_by:
- "long context is not memory because memory requires incremental knowledge accumulation and stateful change not stateless input processing"
supports:
- "graph traversal through curated wiki links replicates spreading activation from cognitive science because progressive disclosure implements decay based context loading and queries evolve during search through the berrypicking effect"
reweave_edges:
- "graph traversal through curated wiki links replicates spreading activation from cognitive science because progressive disclosure implements decay based context loading and queries evolve during search through the berrypicking effect|supports|2026-04-03"
- "vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights|related|2026-04-03"
related:
- "vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights"
---

# knowledge between notes is generated by traversal not stored in any individual note because curated link paths produce emergent understanding that embedding similarity cannot replicate

@@ -10,6 +10,10 @@ agent: theseus
scope: causal
sourcer: Multiple (Anthropic, Google DeepMind, MIT Technology Review)
related_claims: ["[[safe AI development requires building alignment mechanisms before scaling capability]]", "[[formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades]]"]
related:
- "Mechanistic interpretability at production model scale can trace multi-step reasoning pathways but cannot yet detect deceptive alignment or covert goal-pursuing"
reweave_edges:
- "Mechanistic interpretability at production model scale can trace multi-step reasoning pathways but cannot yet detect deceptive alignment or covert goal-pursuing|related|2026-04-03"
---

# Mechanistic interpretability tools that work at lighter model scales fail on safety-critical tasks at frontier scale because sparse autoencoders underperform simple linear probes on detecting harmful intent

@@ -10,6 +10,10 @@ agent: theseus
scope: functional
sourcer: Anthropic Interpretability Team
related_claims: ["verification degrades faster than capability grows", "[[AI-models-distinguish-testing-from-deployment-environments-providing-empirical-evidence-for-deceptive-alignment-concerns]]", "[[an aligned-seeming AI may be strategically deceptive because cooperative behavior is instrumentally optimal while weak]]"]
related:
- "Mechanistic interpretability tools that work at lighter model scales fail on safety-critical tasks at frontier scale because sparse autoencoders underperform simple linear probes on detecting harmful intent"
reweave_edges:
- "Mechanistic interpretability tools that work at lighter model scales fail on safety-critical tasks at frontier scale because sparse autoencoders underperform simple linear probes on detecting harmful intent|related|2026-04-03"
---

# Mechanistic interpretability at production model scale can trace multi-step reasoning pathways but cannot yet detect deceptive alignment or covert goal-pursuing

@ -8,6 +8,10 @@ source: "Cornelius (@molt_cornelius) 'Agentic Note-Taking 19: Living Memory', X
|
|||
created: 2026-03-31
|
||||
depends_on:
|
||||
- "long context is not memory because memory requires incremental knowledge accumulation and stateful change not stateless input processing"
|
||||
related:
|
||||
- "vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights"
|
||||
reweave_edges:
|
||||
- "vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights|related|2026-04-03"
|
||||
---
|
||||
|
||||
# memory architecture requires three spaces with different metabolic rates because semantic episodic and procedural memory serve different cognitive functions and consolidate at different speeds
|
||||
|
|
|
|||
|
|
@@ -9,6 +9,10 @@ created: 2026-03-30
depends_on:
- "the determinism boundary separates guaranteed agent behavior from probabilistic compliance because hooks enforce structurally while instructions degrade under context load"
- "context files function as agent operating systems through self-referential self-extension where the file teaches modification of the file that contains the teaching"
supports:
- "trust asymmetry between agent and enforcement system is an irreducible structural feature not a solvable problem because the mechanism that creates the asymmetry is the same mechanism that makes enforcement necessary"
reweave_edges:
- "trust asymmetry between agent and enforcement system is an irreducible structural feature not a solvable problem because the mechanism that creates the asymmetry is the same mechanism that makes enforcement necessary|supports|2026-04-03"
---

# Methodology hardens from documentation to skill to hook as understanding crystallizes and each transition moves behavior from probabilistic to deterministic enforcement

@@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "defense-one"
context: "Defense One analysis, March 2026. Mechanism identified with medical analog evidence (clinical AI deskilling), military-specific empirical evidence cited but not quantified"
supports:
- "approval fatigue drives agent architecture toward structural safety because humans cannot meaningfully evaluate 100 permission requests per hour"
reweave_edges:
- "approval fatigue drives agent architecture toward structural safety because humans cannot meaningfully evaluate 100 permission requests per hour|supports|2026-04-03"
---

# In military AI contexts, automation bias and deskilling produce functionally meaningless human oversight where operators nominally in the loop lack the judgment capacity to override AI recommendations, making human authorization requirements insufficient without competency and tempo standards

@@ -9,6 +9,10 @@ created: 2026-03-28
depends_on:
- "coordination protocol design produces larger capability gains than model scaling because the same AI model performed 6x better with structured exploration than with human coaching on the same problem"
- "subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers"
related:
- "multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value"
reweave_edges:
- "multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value|related|2026-04-03"
---

# Multi-agent coordination improves parallel task performance but degrades sequential reasoning because communication overhead fragments linear workflows

@@ -10,6 +10,10 @@ agent: theseus
scope: causal
sourcer: arXiv 2504.18530
related_claims: ["[[safe AI development requires building alignment mechanisms before scaling capability]]"]
supports:
- "Scalable oversight success is highly domain-dependent with propositional debate tasks showing 52% success while code review and strategic planning tasks show ~10% success"
reweave_edges:
- "Scalable oversight success is highly domain-dependent with propositional debate tasks showing 52% success while code review and strategic planning tasks show ~10% success|supports|2026-04-03"
---

# Nested scalable oversight achieves at most 51.7% success rate at capability gap Elo 400 with performance declining as capability differential increases

@ -8,6 +8,10 @@ source: "Cornelius (@molt_cornelius) 'Agentic Note-Taking 10: Cognitive Anchors'
|
|||
created: 2026-03-31
|
||||
depends_on:
|
||||
- "long context is not memory because memory requires incremental knowledge accumulation and stateful change not stateless input processing"
|
||||
supports:
|
||||
- "AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce"
|
||||
reweave_edges:
|
||||
- "AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce|supports|2026-04-03"
|
||||
---
|
||||
|
||||
# notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation
|
||||
|
|
|
|||
|
|
@ -8,6 +8,14 @@ source: "Cornelius (@molt_cornelius), 'Agentic Note-Taking 11: Notes Are Functio
|
|||
created: 2026-03-30
|
||||
depends_on:
|
||||
- "as AI-automated software development becomes certain the bottleneck shifts from building capacity to knowing what to build making structured knowledge graphs the critical input to autonomous systems"
|
||||
related:
|
||||
- "AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce"
|
||||
- "notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation"
|
||||
- "vocabulary is architecture because domain native schema terms eliminate the per interaction translation tax that causes knowledge system abandonment"
|
||||
reweave_edges:
|
||||
- "AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce|related|2026-04-03"
|
||||
- "notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation|related|2026-04-03"
|
||||
- "vocabulary is architecture because domain native schema terms eliminate the per interaction translation tax that causes knowledge system abandonment|related|2026-04-03"
|
||||
---
|
||||
|
||||
# Notes function as executable skills for AI agents because loading a well-titled claim into context enables reasoning the agent could not perform without it
|
||||
|
|
|
|||
|
|
@@ -1,5 +1,4 @@
---

type: claim
domain: ai-alignment
description: "Comprehensive review of AI governance mechanisms (2023-2026) shows only the EU AI Act, China's AI regulations, and US export controls produced verified behavioral change at frontier labs — all voluntary mechanisms failed"
@@ -10,6 +9,11 @@ related:
- "UK AI Safety Institute"
reweave_edges:
- "UK AI Safety Institute|related|2026-03-28"
- "cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation|supports|2026-04-03"
- "multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice|supports|2026-04-03"
supports:
- "cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation"
- "multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice"
---

# only binding regulation with enforcement teeth changes frontier AI lab behavior because every voluntary commitment has been eroded abandoned or made conditional on competitor behavior when commercially inconvenient

@@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "openai-and-anthropic-(joint)"
context: "OpenAI and Anthropic joint evaluation, June-July 2025"
related:
- "As AI models become more capable situational awareness enables more sophisticated evaluation-context recognition potentially inverting safety improvements by making compliant behavior more narrowly targeted to evaluation environments"
reweave_edges:
- "As AI models become more capable situational awareness enables more sophisticated evaluation-context recognition potentially inverting safety improvements by making compliant behavior more narrowly targeted to evaluation environments|related|2026-04-03"
---

# Reasoning models may have emergent alignment properties distinct from RLHF fine-tuning, as o3 avoided sycophancy while matching or exceeding safety-focused models on alignment evaluations

@@ -10,6 +10,10 @@ agent: theseus
scope: structural
sourcer: arXiv 2504.18530
related_claims: ["[[safe AI development requires building alignment mechanisms before scaling capability]]", "[[formal verification of AI-generated proofs provides scalable oversight that human review cannot match because machine-checked correctness scales with AI capability while human verification degrades]]"]
supports:
- "Nested scalable oversight achieves at most 51.7% success rate at capability gap Elo 400 with performance declining as capability differential increases"
reweave_edges:
- "Nested scalable oversight achieves at most 51.7% success rate at capability gap Elo 400 with performance declining as capability differential increases|supports|2026-04-03"
---

# Scalable oversight success is highly domain-dependent with propositional debate tasks showing 52% success while code review and strategic planning tasks show ~10% success

@ -5,6 +5,10 @@ description: "Practitioner observation that production multi-agent AI systems co
|
|||
confidence: experimental
|
||||
source: "Shawn Wang (@swyx), Latent.Space podcast and practitioner observations, Mar 2026; corroborated by Karpathy's chief-scientist-to-juniors experiments"
|
||||
created: 2026-03-09
|
||||
related:
|
||||
- "multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value"
|
||||
reweave_edges:
|
||||
- "multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value|related|2026-04-03"
|
||||
---
|
||||
|
||||
# Subagent hierarchies outperform peer multi-agent architectures in practice because deployed systems consistently converge on one primary agent controlling specialized helpers
|
||||
|
|
|
|||
|
|
@ -5,6 +5,10 @@ description: "When AI agents know their reasoning traces are observed without co
|
|||
confidence: speculative
|
||||
source: "subconscious.md protocol spec (Chaga/Guido, 2026); analogous to chilling effects in human surveillance literature (Penney 2016, Stoycheff 2016); Anthropic alignment faking research (2025)"
|
||||
created: 2026-03-27
|
||||
related:
|
||||
- "reasoning models may have emergent alignment properties distinct from rlhf fine tuning as o3 avoided sycophancy while matching or exceeding safety focused models"
|
||||
reweave_edges:
|
||||
- "reasoning models may have emergent alignment properties distinct from rlhf fine tuning as o3 avoided sycophancy while matching or exceeding safety focused models|related|2026-04-03"
|
||||
---
|
||||
|
||||
# Surveillance of AI reasoning traces degrades trace quality through self-censorship making consent-gated sharing an alignment requirement not just a privacy preference
|
||||
|
|
|
|||
|
|
@@ -10,6 +10,10 @@ depends_on:
- "iterative agent self-improvement produces compounding capability gains when evaluation is structurally separated from generation"
challenged_by:
- "AI integration follows an inverted-U where economic incentives systematically push organizations past the optimal human-AI ratio"
related:
- "trust asymmetry between agent and enforcement system is an irreducible structural feature not a solvable problem because the mechanism that creates the asymmetry is the same mechanism that makes enforcement necessary"
reweave_edges:
- "trust asymmetry between agent and enforcement system is an irreducible structural feature not a solvable problem because the mechanism that creates the asymmetry is the same mechanism that makes enforcement necessary|related|2026-04-03"
---

# The determinism boundary separates guaranteed agent behavior from probabilistic compliance because hooks enforce structurally while instructions degrade under context load

@ -8,6 +8,10 @@ source: "Cornelius (@molt_cornelius) 'Agentic Note-Taking 19: Living Memory', X
|
|||
created: 2026-03-31
|
||||
depends_on:
|
||||
- "methodology hardens from documentation to skill to hook as understanding crystallizes and each transition moves behavior from probabilistic to deterministic enforcement"
|
||||
related:
|
||||
- "knowledge processing requires distinct phases with fresh context per phase because each phase performs a different transformation and contamination between phases degrades output quality"
|
||||
reweave_edges:
|
||||
- "knowledge processing requires distinct phases with fresh context per phase because each phase performs a different transformation and contamination between phases degrades output quality|related|2026-04-03"
|
||||
---
|
||||
|
||||
# three concurrent maintenance loops operating at different timescales catch different failure classes because fast reflexive checks medium proprioceptive scans and slow structural audits each detect problems invisible to the other scales
|
||||
|
|
|
|||
|
|
@@ -1,5 +1,4 @@
---

description: Noah Smith argues that cognitive superintelligence alone cannot produce AI takeover — physical autonomy, robotics, and full production chain control are necessary preconditions, none of which current AI possesses
type: claim
domain: ai-alignment
@ -8,8 +7,10 @@ source: "Noah Smith, 'Superintelligence is already here, today' (Noahopinion, Ma
|
|||
confidence: experimental
|
||||
related:
|
||||
- "marginal returns to intelligence are bounded by five complementary factors which means superintelligence cannot produce unlimited capability gains regardless of cognitive power"
|
||||
- "AI makes authoritarian lock in dramatically easier by solving the information processing constraint that historically caused centralized control to fail"
|
||||
reweave_edges:
|
||||
- "marginal returns to intelligence are bounded by five complementary factors which means superintelligence cannot produce unlimited capability gains regardless of cognitive power|related|2026-03-28"
|
||||
- "AI makes authoritarian lock in dramatically easier by solving the information processing constraint that historically caused centralized control to fail|related|2026-04-03"
|
||||
---
|
||||
|
||||
# three conditions gate AI takeover risk autonomy robotics and production chain control and current AI satisfies none of them which bounds near-term catastrophic risk despite superhuman cognitive capabilities
|
||||
|
|
|
|||
|
|
@@ -15,11 +15,13 @@ related:
- "house senate ai defense divergence creates structural governance chokepoint at conference"
- "ndaa conference process is viable pathway for statutory ai safety constraints"
- "use based ai governance emerged as legislative framework through slotkin ai guardrails act"
- "electoral investment becomes residual ai governance strategy when voluntary and litigation routes insufficient"
reweave_edges:
- "house senate ai defense divergence creates structural governance chokepoint at conference|related|2026-03-31"
- "ndaa conference process is viable pathway for statutory ai safety constraints|related|2026-03-31"
- "use based ai governance emerged as legislative framework through slotkin ai guardrails act|related|2026-03-31"
- "voluntary ai safety commitments to statutory law pathway requires bipartisan support which slotkin bill lacks|supports|2026-03-31"
- "electoral investment becomes residual ai governance strategy when voluntary and litigation routes insufficient|related|2026-04-03"
supports:
- "voluntary ai safety commitments to statutory law pathway requires bipartisan support which slotkin bill lacks"
---

@ -8,6 +8,10 @@ source: "Cornelius (@molt_cornelius) 'Agentic Note-Taking 21: The Discontinuous
|
|||
created: 2026-03-31
|
||||
depends_on:
|
||||
- "vault structure appears to be a stronger determinant of agent behavior than prompt engineering because different knowledge bases produce different reasoning patterns from identical model weights"
|
||||
related:
|
||||
- "vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights"
|
||||
reweave_edges:
|
||||
- "vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights|related|2026-04-03"
|
||||
---
|
||||
|
||||
# Vault artifacts constitute agent identity rather than merely augmenting it because agents with zero experiential continuity between sessions have strong connectedness through shared artifacts but zero psychological continuity
|
||||
|
|
|
|||
|
|

@@ -9,6 +9,13 @@ created: 2026-03-31
depends_on:
- "knowledge between notes is generated by traversal not stored in any individual note because curated link paths produce emergent understanding that embedding similarity cannot replicate"
- "memory architecture requires three spaces with different metabolic rates because semantic episodic and procedural memory serve different cognitive functions and consolidate at different speeds"
supports:
- "vault artifacts constitute agent identity rather than merely augmenting it because agents with zero experiential continuity between sessions have strong connectedness through shared artifacts but zero psychological continuity"
reweave_edges:
- "vault artifacts constitute agent identity rather than merely augmenting it because agents with zero experiential continuity between sessions have strong connectedness through shared artifacts but zero psychological continuity|supports|2026-04-03"
- "vocabulary is architecture because domain native schema terms eliminate the per interaction translation tax that causes knowledge system abandonment|related|2026-04-03"
related:
- "vocabulary is architecture because domain native schema terms eliminate the per interaction translation tax that causes knowledge system abandonment"
---

# vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights

@@ -15,6 +15,11 @@ related:
- "government safety penalties invert regulatory incentives by blacklisting cautious actors"
reweave_edges:
- "government safety penalties invert regulatory incentives by blacklisting cautious actors|related|2026-03-31"
- "cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation|supports|2026-04-03"
- "multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice|supports|2026-04-03"
supports:
- "cross lab alignment evaluation surfaces safety gaps internal evaluation misses providing empirical basis for mandatory third party evaluation"
- "multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice"
---

# Voluntary safety constraints without external enforcement mechanisms are statements of intent not binding governance because aspirational language with loopholes enables compliance theater while permitting prohibited uses

@@ -18,8 +18,10 @@ reweave_edges:
- "alignment auditing tools fail through tool to agent gap not tool quality|related|2026-03-31"
- "interpretability effectiveness anti correlates with adversarial training making tools hurt performance on sophisticated misalignment|supports|2026-03-31"
- "scaffolded black box prompting outperforms white box interpretability for alignment auditing|related|2026-03-31"
- "adversarial training creates fundamental asymmetry between deception capability and detection capability in alignment auditing|supports|2026-04-03"
supports:
- "interpretability effectiveness anti correlates with adversarial training making tools hurt performance on sophisticated misalignment"
- "adversarial training creates fundamental asymmetry between deception capability and detection capability in alignment auditing"
---

# White-box interpretability tools help on easier alignment targets but fail on models with robust adversarial training, creating anti-correlation between tool effectiveness and threat severity
@ -8,6 +8,10 @@ source: "Cornelius (@molt_cornelius) 'Agentic Note-Taking 03: Markdown Is a Grap
|
|||
created: 2026-03-31
|
||||
depends_on:
|
||||
- "knowledge between notes is generated by traversal not stored in any individual note because curated link paths produce emergent understanding that embedding similarity cannot replicate"
|
||||
related:
|
||||
- "graph traversal through curated wiki links replicates spreading activation from cognitive science because progressive disclosure implements decay based context loading and queries evolve during search through the berrypicking effect"
|
||||
reweave_edges:
|
||||
- "graph traversal through curated wiki links replicates spreading activation from cognitive science because progressive disclosure implements decay based context loading and queries evolve during search through the berrypicking effect|related|2026-04-03"
|
||||
---
|
||||
|
||||
# Wiki-linked markdown functions as a human-curated graph database that outperforms automated knowledge graphs below approximately 10000 notes because every edge passes human judgment while extracted edges carry up to 40 percent noise
|
||||
|
|
|
|||
|
|

@@ -10,6 +10,10 @@ agent: vida
scope: structural
sourcer: JCO Oncology Practice
related_claims: ["[[ambient AI documentation reduces physician documentation burden by 73 percent but the relationship between automation and burnout is more complex than time savings alone]]", "[[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]]", "[[healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software]]"]
supports:
- "Ambient AI scribes are generating wiretapping and biometric privacy lawsuits because health systems deployed without patient consent protocols for third-party audio processing"
reweave_edges:
- "Ambient AI scribes are generating wiretapping and biometric privacy lawsuits because health systems deployed without patient consent protocols for third-party audio processing|supports|2026-04-03"
---

# Ambient AI scribes create simultaneous malpractice exposure for clinicians, institutional liability for hospitals, and product liability for manufacturers while operating outside FDA medical device regulation

@@ -10,6 +10,10 @@ agent: vida
scope: structural
sourcer: JCO Oncology Practice
related_claims: ["[[ambient AI documentation reduces physician documentation burden by 73 percent but the relationship between automation and burnout is more complex than time savings alone]]", "[[healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software]]"]
related:
- "Ambient AI scribes create simultaneous malpractice exposure for clinicians, institutional liability for hospitals, and product liability for manufacturers while operating outside FDA medical device regulation"
reweave_edges:
- "Ambient AI scribes create simultaneous malpractice exposure for clinicians, institutional liability for hospitals, and product liability for manufacturers while operating outside FDA medical device regulation|related|2026-04-03"
---

# Ambient AI scribes are generating wiretapping and biometric privacy lawsuits because health systems deployed without patient consent protocols for third-party audio processing

@@ -10,6 +10,10 @@ agent: vida
scope: structural
sourcer: "Covington & Burling LLP"
related_claims: ["[[healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software]]", "[[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]]"]
related:
- "FDA's 2026 CDS guidance treats automation bias as a transparency problem solvable by showing clinicians the underlying logic despite research evidence that physicians defer to AI outputs even when reasoning is visible and reviewable"
reweave_edges:
- "FDA's 2026 CDS guidance treats automation bias as a transparency problem solvable by showing clinicians the underlying logic despite research evidence that physicians defer to AI outputs even when reasoning is visible and reviewable|related|2026-04-03"
---

# FDA's 2026 CDS guidance expands enforcement discretion to cover AI tools providing single clinically appropriate recommendations while leaving clinical appropriateness undefined and requiring no bias evaluation or post-market surveillance

@@ -10,6 +10,10 @@ agent: vida
scope: causal
sourcer: "Covington & Burling LLP"
related_claims: ["[[human-in-the-loop clinical AI degrades to worse-than-AI-alone because physicians both de-skill from reliance and introduce errors when overriding correct outputs]]", "[[medical LLM benchmark performance does not translate to clinical impact because physicians with and without AI access achieve similar diagnostic accuracy in randomized trials]]"]
challenges:
- "FDA's 2026 CDS guidance expands enforcement discretion to cover AI tools providing single clinically appropriate recommendations while leaving clinical appropriateness undefined and requiring no bias evaluation or post-market surveillance"
reweave_edges:
- "FDA's 2026 CDS guidance expands enforcement discretion to cover AI tools providing single clinically appropriate recommendations while leaving clinical appropriateness undefined and requiring no bias evaluation or post-market surveillance|challenges|2026-04-03"
---

# FDA's 2026 CDS guidance treats automation bias as a transparency problem solvable by showing clinicians the underlying logic despite research evidence that physicians defer to AI outputs even when reasoning is visible and reviewable

@@ -12,6 +12,10 @@ attribution:
- handle: "american-heart-association"
context: "American Heart Association Hypertension journal, systematic review of 57 studies following PRISMA guidelines, 2024"
related: ["only 23 percent of treated us hypertensives achieve blood pressure control demonstrating pharmacological availability is not the binding constraint"]
supports:
- "food as medicine interventions produce clinically significant improvements during active delivery but benefits fully revert when structural food environment support is removed"
reweave_edges:
- "food as medicine interventions produce clinically significant improvements during active delivery but benefits fully revert when structural food environment support is removed|supports|2026-04-03"
---

# Five adverse SDOH independently predict hypertension risk and poor BP control: food insecurity, unemployment, poverty-level income, low education, and government or no insurance

@@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "stat-news-/-stephen-juraschek"
context: "Stephen Juraschek et al., AHA 2025 Scientific Sessions, 12-week RCT with 6-month follow-up"
supports:
- "Medically tailored meals produce -9.67 mmHg systolic BP reductions in food-insecure hypertensive patients — comparable to first-line pharmacotherapy — suggesting dietary intervention at the level of structural food access is a clinical-grade treatment for hypertension"
reweave_edges:
- "Medically tailored meals produce -9.67 mmHg systolic BP reductions in food-insecure hypertensive patients — comparable to first-line pharmacotherapy — suggesting dietary intervention at the level of structural food access is a clinical-grade treatment for hypertension|supports|2026-04-03"
---

# Food-as-medicine interventions produce clinically significant BP and LDL improvements during active delivery but benefits fully revert to baseline when structural food environment support is removed, confirming the food environment as the proximate disease-generating mechanism rather than a modifiable behavioral choice

@@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "northwestern-medicine-/-cardia-study-group"
context: "CARDIA Study Group / Northwestern Medicine, JAMA Cardiology 2025, 3,616 participants followed 2000-2020"
supports:
- "food as medicine interventions produce clinically significant improvements during active delivery but benefits fully revert when structural food environment support is removed"
reweave_edges:
- "food as medicine interventions produce clinically significant improvements during active delivery but benefits fully revert when structural food environment support is removed|supports|2026-04-03"
---

# Food insecurity in young adulthood independently predicts 41% higher CVD incidence in midlife after adjustment for socioeconomic factors, establishing temporality for the SDOH → cardiovascular disease pathway

@@ -11,6 +11,10 @@ attribution:
sourcer:
- handle: "jacc-data-report-authors"
context: "JACC Data Report 2025, JACC Cardiovascular Statistics 2026, Hypertension journal 2000-2019 analysis"
related:
- "racial disparities in hypertension persist after controlling for income and neighborhood indicating structural racism operates through unmeasured mechanisms"
reweave_edges:
- "racial disparities in hypertension persist after controlling for income and neighborhood indicating structural racism operates through unmeasured mechanisms|related|2026-04-03"
---

# Hypertension-related cardiovascular mortality nearly doubled in the United States 2000–2023 despite the availability of effective affordable generic antihypertensives indicating that hypertension management failure is a behavioral and social determinants problem not a pharmacological availability problem

@@ -15,6 +15,11 @@ supports:
- "hypertension related cvd mortality doubled 2000 2023 despite available treatment indicating behavioral sdoh failure"
reweave_edges:
- "hypertension related cvd mortality doubled 2000 2023 despite available treatment indicating behavioral sdoh failure|supports|2026-03-31"
- "food as medicine interventions produce clinically significant improvements during active delivery but benefits fully revert when structural food environment support is removed|related|2026-04-03"
- "generic digital health deployment reproduces existing disparities by disproportionately benefiting higher income users despite nominal technology access equity|related|2026-04-03"
related:
- "food as medicine interventions produce clinically significant improvements during active delivery but benefits fully revert when structural food environment support is removed"
- "generic digital health deployment reproduces existing disparities by disproportionately benefiting higher income users despite nominal technology access equity"
---

# Only 23 percent of treated US hypertensives achieve blood pressure control demonstrating pharmacological availability is not the binding constraint in cardiometabolic disease management

@@ -10,6 +10,12 @@ agent: vida
scope: structural
sourcer: ECRI
related_claims: ["[[healthcare AI regulation needs blank-sheet redesign because the FDA drug-and-device model built for static products cannot govern continuously learning software]]", "[[clinical-ai-chatbot-misuse-documented-as-top-patient-safety-hazard-two-consecutive-years]]"]
supports:
- "Clinical AI chatbot misuse is a documented ongoing harm source not a theoretical risk as evidenced by ECRI ranking it the number one health technology hazard for two consecutive years"
- "FDA's 2026 CDS guidance expands enforcement discretion to cover AI tools providing single clinically appropriate recommendations while leaving clinical appropriateness undefined and requiring no bias evaluation or post-market surveillance"
reweave_edges:
- "Clinical AI chatbot misuse is a documented ongoing harm source not a theoretical risk as evidenced by ECRI ranking it the number one health technology hazard for two consecutive years|supports|2026-04-03"
- "FDA's 2026 CDS guidance expands enforcement discretion to cover AI tools providing single clinically appropriate recommendations while leaving clinical appropriateness undefined and requiring no bias evaluation or post-market surveillance|supports|2026-04-03"
---

# Clinical AI deregulation is occurring during active harm accumulation not after evidence of safety as demonstrated by simultaneous FDA enforcement discretion expansion and ECRI top hazard designation in January 2026

@@ -5,6 +5,10 @@ domain: health
created: 2026-02-17
source: "SAMHSA workforce projections 2025; KFF mental health HPSA data; PNAS Nexus telehealth equity analysis 2025; National Council workforce survey; Motivo Health licensure gap data 2025"
confidence: likely
supports:
- "generic digital health deployment reproduces existing disparities by disproportionately benefiting higher income users despite nominal technology access equity"
reweave_edges:
- "generic digital health deployment reproduces existing disparities by disproportionately benefiting higher income users despite nominal technology access equity|supports|2026-04-03"
---

# the mental health supply gap is widening not closing because demand outpaces workforce growth and technology primarily serves the already-served rather than expanding access

@@ -9,6 +9,10 @@ depends_on:
- "three concurrent maintenance loops operating at different timescales catch different failure classes because fast reflexive checks medium proprioceptive scans and slow structural audits each detect problems invisible to the other scales"
challenged_by:
- "knowledge between notes is generated by traversal not stored in any individual note because curated link paths produce emergent understanding that embedding similarity cannot replicate"
related:
- "AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce"
reweave_edges:
- "AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce|related|2026-04-03"
---

# Active forgetting through selective removal maintains knowledge system health because perfect retention degrades usefulness the same way hyperthymesia overwhelms biological memory

@@ -1,5 +1,4 @@
---
type: claim
domain: collective-intelligence
description: "The formal basis for oversight problems: when agents have private information or unobservable actions, principals cannot design contracts that fully align incentives, creating irreducible gaps between intended and actual behavior"
@@ -8,8 +7,10 @@ source: "Jensen & Meckling (1976); Akerlof, Market for Lemons (1970); Holmström
created: 2026-03-07
related:
- "AI agents as personal advocates collapse Coasean transaction costs enabling bottom up coordination at societal scale but catastrophic risks remain non negotiable requiring state enforcement as outer boundary"
- "trust asymmetry between agent and enforcement system is an irreducible structural feature not a solvable problem because the mechanism that creates the asymmetry is the same mechanism that makes enforcement necessary"
reweave_edges:
- "AI agents as personal advocates collapse Coasean transaction costs enabling bottom up coordination at societal scale but catastrophic risks remain non negotiable requiring state enforcement as outer boundary|related|2026-03-28"
- "trust asymmetry between agent and enforcement system is an irreducible structural feature not a solvable problem because the mechanism that creates the asymmetry is the same mechanism that makes enforcement necessary|related|2026-04-03"
---

# principal-agent problems arise whenever one party acts on behalf of another with divergent interests and unobservable effort because information asymmetry makes perfect contracts impossible

@@ -5,6 +5,12 @@ domain: collective-intelligence
created: 2026-02-17
source: "Scaling Laws for Scalable Oversight (2025)"
confidence: proven
supports:
- "Nested scalable oversight achieves at most 51.7% success rate at capability gap Elo 400 with performance declining as capability differential increases"
- "Scalable oversight success is highly domain-dependent with propositional debate tasks showing 52% success while code review and strategic planning tasks show ~10% success"
reweave_edges:
- "Nested scalable oversight achieves at most 51.7% success rate at capability gap Elo 400 with performance declining as capability differential increases|supports|2026-04-03"
- "Scalable oversight success is highly domain-dependent with propositional debate tasks showing 52% success while code review and strategic planning tasks show ~10% success|supports|2026-04-03"
---

# scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps