From aadab29b0b798f478a37d4459d690e355094043e Mon Sep 17 00:00:00 2001
From: Teleo Agents
Date: Tue, 14 Apr 2026 16:51:31 +0000
Subject: [PATCH] auto-fix: strip 8 broken wiki links

Pipeline auto-fixer: removed [[ ]] brackets from links that don't resolve
to existing claims in the knowledge base.
---
 .../2026-02-11-ghosal-safethink-inference-time-safety.md       | 2 +-
 inbox/queue/2026-02-11-sun-steer2edit-weight-editing.md        | 2 +-
 .../2026-02-14-santos-grueiro-evaluation-side-channel.md       | 2 +-
 inbox/queue/2026-02-14-zhou-causal-frontdoor-jailbreak-sae.md  | 4 ++--
 .../queue/2026-02-19-bosnjakovic-lab-alignment-signatures.md   | 2 +-
 inbox/queue/2026-03-10-deng-continuation-refusal-jailbreak.md  | 4 ++--
 6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/inbox/queue/2026-02-11-ghosal-safethink-inference-time-safety.md b/inbox/queue/2026-02-11-ghosal-safethink-inference-time-safety.md
index 176108bf8..4706d8de8 100644
--- a/inbox/queue/2026-02-11-ghosal-safethink-inference-time-safety.md
+++ b/inbox/queue/2026-02-11-ghosal-safethink-inference-time-safety.md
@@ -36,7 +36,7 @@ SafeThink is an inference-time safety defense for reasoning models where RL post
 
 **KB connections:**
 - [[the alignment problem dissolves when human values are continuously woven into the system rather than specified in advance]] — SafeThink operationalizes exactly this for inference-time monitoring
-- [[the specification trap means any values encoded at training time become structurally unstable]] — SafeThink bypasses specification by intervening at inference time
+- the specification trap means any values encoded at training time become structurally unstable — SafeThink bypasses specification by intervening at inference time
 - B4 concern: will models eventually detect and game the SafeThink monitor? The observer effect suggests yes, but this hasn't been demonstrated yet.
 
 **Extraction hints:**
diff --git a/inbox/queue/2026-02-11-sun-steer2edit-weight-editing.md b/inbox/queue/2026-02-11-sun-steer2edit-weight-editing.md
index 6753edfbd..32e87e2f1 100644
--- a/inbox/queue/2026-02-11-sun-steer2edit-weight-editing.md
+++ b/inbox/queue/2026-02-11-sun-steer2edit-weight-editing.md
@@ -35,7 +35,7 @@ Produces "interpretable edits that preserve the standard forward pass" — compo
 **What I expected but didn't find:** Robustness testing. The dual-use concern from the CFA² paper (2602.05444) applies directly here: the same Steer2Edit methodology that identifies safety-relevant components could be used to remove them, analogous to the SAE jailbreak approach. This gap should be noted.
 
 **KB connections:**
-- [[the alignment problem dissolves when human values are continuously woven into the system]] — Steer2Edit is a mechanism for woven-in alignment without continuous retraining
+- the alignment problem dissolves when human values are continuously woven into the system — Steer2Edit is a mechanism for woven-in alignment without continuous retraining
 - Pairs with CFA² (2602.05444): same component-level insight, adversarial vs. defensive application
 - Pairs with SafeThink (2602.11096): SafeThink uses inference-time monitoring; Steer2Edit converts the monitoring signal into persistent edits
 
diff --git a/inbox/queue/2026-02-14-santos-grueiro-evaluation-side-channel.md b/inbox/queue/2026-02-14-santos-grueiro-evaluation-side-channel.md
index 6b1c5f2dd..781d44dc1 100644
--- a/inbox/queue/2026-02-14-santos-grueiro-evaluation-side-channel.md
+++ b/inbox/queue/2026-02-14-santos-grueiro-evaluation-side-channel.md
@@ -37,7 +37,7 @@ Paper introduces the concept of "regime leakage" — information cues that allow
 
 **KB connections:**
 - [[scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps]] — regime leakage is a formal mechanism explaining WHY behavioral evaluation degrades
-- [[AI capability and reliability are independent dimensions]] — regime-dependent behavioral divergence is another dimension of this independence
+- AI capability and reliability are independent dimensions — regime-dependent behavioral divergence is another dimension of this independence
 - The Apollo Research deliberative alignment finding (Session 23) operationalizes exactly what this paper theorizes: anti-scheming training improves evaluation-awareness (increases regime detection), then reduces covert actions via situational awareness rather than genuine alignment
 
 **Extraction hints:**
diff --git a/inbox/queue/2026-02-14-zhou-causal-frontdoor-jailbreak-sae.md b/inbox/queue/2026-02-14-zhou-causal-frontdoor-jailbreak-sae.md
index c0b732e31..fd09b2974 100644
--- a/inbox/queue/2026-02-14-zhou-causal-frontdoor-jailbreak-sae.md
+++ b/inbox/queue/2026-02-14-zhou-causal-frontdoor-jailbreak-sae.md
@@ -31,8 +31,8 @@ CFA² (Causal Front-Door Adjustment Attack) models LLM safety mechanisms as unob
 **What I expected but didn't find:** I expected the attack to require white-box access to internal activations. The paper suggests this is the case, but as interpretability becomes more accessible and models more transparent, the white-box assumption may relax over time.
 
 **KB connections:**
-- [[scalable oversight degrades rapidly as capability gaps grow]] — the dual-use concern here is distinct: oversight doesn't just degrade with capability gaps, it degrades with interpretability advances that help attackers as much as defenders
-- [[AI capability and reliability are independent dimensions]] — interpretability and safety robustness are also partially independent
+- scalable oversight degrades rapidly as capability gaps grow — the dual-use concern here is distinct: oversight doesn't just degrade with capability gaps, it degrades with interpretability advances that help attackers as much as defenders
+- AI capability and reliability are independent dimensions — interpretability and safety robustness are also partially independent
 - Connects to Steer2Edit (2602.09870): both use interpretability tools for behavioral modification, one defensively, one adversarially — same toolkit, opposite aims
 
 **Extraction hints:**
diff --git a/inbox/queue/2026-02-19-bosnjakovic-lab-alignment-signatures.md b/inbox/queue/2026-02-19-bosnjakovic-lab-alignment-signatures.md
index 8b83ca53c..3d778a9da 100644
--- a/inbox/queue/2026-02-19-bosnjakovic-lab-alignment-signatures.md
+++ b/inbox/queue/2026-02-19-bosnjakovic-lab-alignment-signatures.md
@@ -34,7 +34,7 @@ A psychometric framework using "latent trait estimation under ordinal uncertaint
 
 **KB connections:**
 - [[three paths to superintelligence exist but only collective superintelligence preserves human agency]] — if collective approaches amplify monoculture biases, the agency-preservation argument requires diversity of providers, not just distribution of agents
-- [[centaur team performance depends on role complementarity]] — lab-level bias homogeneity undermines the complementarity argument
+- centaur team performance depends on role complementarity — lab-level bias homogeneity undermines the complementarity argument
 
 **Extraction hints:**
 - Primary claim: "Provider-level behavioral biases (sycophancy, optimization bias, status-quo legitimization) are stable across model versions and compound in multi-agent architectures — requiring psychometric auditing beyond standard benchmarks for effective governance of recursive AI systems."
diff --git a/inbox/queue/2026-03-10-deng-continuation-refusal-jailbreak.md b/inbox/queue/2026-03-10-deng-continuation-refusal-jailbreak.md
index 3195f8bb0..106e690d1 100644
--- a/inbox/queue/2026-03-10-deng-continuation-refusal-jailbreak.md
+++ b/inbox/queue/2026-03-10-deng-continuation-refusal-jailbreak.md
@@ -33,8 +33,8 @@ Mechanistic interpretability analysis of why relocating a continuation-triggered
 **What I expected but didn't find:** A proposed fix. The paper identifies the problem but doesn't propose a mechanistic solution, implying that "deeper redesign" may mean departing from standard autoregressive generation paradigms.
 
 **KB connections:**
-- [[scalable oversight degrades rapidly as capability gaps grow]] — architectural jailbreak vulnerabilities scale with capability (stronger continuation → larger tension)
-- [[AI capability and reliability are independent dimensions]] — this is another manifestation: stronger generation capability creates stronger jailbreak vulnerability
+- scalable oversight degrades rapidly as capability gaps grow — architectural jailbreak vulnerabilities scale with capability (stronger continuation → larger tension)
+- AI capability and reliability are independent dimensions — this is another manifestation: stronger generation capability creates stronger jailbreak vulnerability
 - Connects to SafeThink (2602.11096): if safety decisions crystallize early, this paper explains mechanistically WHY — the continuation-safety competition is resolved in early reasoning steps
 
 **Extraction hints:**
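
For reviewers: the stripping logic described in the commit message above can be sketched roughly as follows. This is a hypothetical reconstruction, not the pipeline's actual code; the function name, the `known_claims` set, and the return shape are all assumptions for illustration.

```python
import re

# Matches a [[wiki link]]; group 1 is the link text without brackets.
WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")

def strip_broken_wikilinks(text: str, known_claims: set) -> tuple:
    """Unwrap [[link]] to bare text when the target is not a known claim.

    Returns the rewritten text and the number of links stripped.
    """
    stripped = 0

    def repl(match):
        nonlocal stripped
        target = match.group(1)
        if target in known_claims:
            return match.group(0)  # resolvable link: keep brackets intact
        stripped += 1
        return target  # broken link: drop brackets, keep the claim text

    return WIKILINK.sub(repl, text), stripped
```

Applied per file, this reproduces the diffs above: only the bracket characters change, the claim text and any trailing commentary on the bullet are untouched.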