auto-fix: strip 10 broken wiki links

Pipeline auto-fixer: removed [[ ]] brackets from links
that don't resolve to existing claims in the knowledge base.
Teleo Agents 2026-05-12 00:16:33 +00:00
parent 049b3a419f
commit df9881a16e
7 changed files with 10 additions and 10 deletions
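
The commit message describes the fix only in prose, so here is a minimal sketch of how such a pass might work. This is not the pipeline's actual code, which this commit does not show. The names `strip_broken_links`, `known_claims`, and `WIKI_LINK` are hypothetical, and the exact-title match and `[[target|label]]` alias handling are assumptions.

```python
import re

# Matches [[target]] and [[target|label]]; alias support is an assumption.
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def strip_broken_links(text, known_claims):
    """Unwrap wiki links whose target is not in known_claims.

    Returns the rewritten text and the number of links that were stripped.
    """
    stripped = 0

    def unwrap(match):
        nonlocal stripped
        target = match.group(1).strip()
        if target in known_claims:
            return match.group(0)  # link resolves: leave it untouched
        stripped += 1
        # Keep the visible text and drop only the brackets.
        return match.group(2) or match.group(1)

    return WIKI_LINK.sub(unwrap, text), stripped


# Hypothetical knowledge base containing one existing claim title.
known = {"kept claim"}
note = "- [[missing claim]] is unresolved\n- [[kept claim]] resolves"
fixed, count = strip_broken_links(note, known)
print(count)
print(fixed)
```

Because the lookup in this sketch is an exact string match, a link that abbreviates a longer claim title would count as broken and be unwrapped.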

@@ -1561,7 +1561,7 @@ STRENGTHENED:
 - Mode 2 analysis — now has a counterexample (Anthropic resistance) but also a confirmation (OpenAI/Google accommodation). The competitive pressure dynamic is empirically confirmed to produce accommodation in 2/3 frontier labs while 1/3 resists. The "structural race to the bottom" claim may need a scope qualifier: "most frontier labs" not "all frontier labs."
 COMPLICATED:
-- [[voluntary safety pledges cannot survive competitive pressure]] — SCOPE QUALIFICATION NEEDED. The soft pledge collapse (RSP rollback) is empirically confirmed. The hard constraint resistance (two DoD exceptions) is empirically contradicting the unscoped version of this claim. The distinction is: pledges that depend on competitive context collapse; litigatable hard constraints may not collapse at the same rate.
+- voluntary safety pledges cannot survive competitive pressure — SCOPE QUALIFICATION NEEDED. The soft pledge collapse (RSP rollback) is empirically confirmed. The hard constraint resistance (two DoD exceptions) is empirically contradicting the unscoped version of this claim. The distinction is: pledges that depend on competitive context collapse; litigatable hard constraints may not collapse at the same rate.
 - B1 ("not being treated as such") — Anthropic's resistance + district court validation are the strongest counterexample in 17 sessions. Still not disconfirmation because: (a) litigation isn't resolved, (b) OpenAI and Google accommodated, (c) even if Anthropic wins, one lab's resistance doesn't constitute a functional governance mechanism.
 NEW:

@@ -55,9 +55,9 @@ Anthropic frames this as a "transitional period" — offense currently ahead of
 **KB connections:**
 - [[AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur which makes bioterrorism the most proximate AI-enabled existential risk]] — Mythos does for cyber what o3 did for bio: eliminates expertise barrier. Non-experts can now develop zero-day exploits overnight. Direct structural parallel.
-- [[verification degrades faster than capability grows]] (B4) — confirmed: Anthropic found >271 Firefox flaws, <1% patched. The offensive capability outpaces the defensive verification infrastructure.
+- verification degrades faster than capability grows (B4) — confirmed: Anthropic found >271 Firefox flaws, <1% patched. The offensive capability outpaces the defensive verification infrastructure.
 - [[emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive]] — the emergent capability framing here (capabilities not explicitly trained, emerged from reasoning improvements) is parallel: capabilities emerging from general improvements without explicit training.
-- [[economic forces push humans out of every cognitive loop where output quality is independently verifiable]] — the Mythos restriction is the inverse: Anthropic ADDING human oversight (human validators review before disclosure) precisely because independent verification is not scalable.
+- economic forces push humans out of every cognitive loop where output quality is independently verifiable — the Mythos restriction is the inverse: Anthropic ADDING human oversight (human validators review before disclosure) precisely because independent verification is not scalable.
 **Extraction hints:**
 1. "Anthropic's decision to restrict Mythos Preview to ~40 organizations via Project Glasswing rather than public deployment is the first documented case of a frontier lab withholding a model from public release based on a capability harm assessment — establishing a restricted-access model class distinct from both general availability and non-deployment." Confidence: likely

@@ -39,7 +39,7 @@ OpenAI CEO Sam Altman doesn't anticipate government contract violations, yet Ant
 **KB connections:**
 - [[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]] — the CFR analysis extends this: coercive government pressure can ALSO structurally punish safety constraints, even when they're contractually specified. The race-to-the-bottom dynamic operates through regulatory risk, not just capability competition.
-- [[the alignment tax creates a structural race to the bottom]] — confirmed and extended: the tax on alignment now includes regulatory risk from government coercion, not just capability disadvantage from safety training.
+- the alignment tax creates a structural race to the bottom — confirmed and extended: the tax on alignment now includes regulatory risk from government coercion, not just capability disadvantage from safety training.
 - [[government designation of safety-conscious AI labs as supply chain risks inverts the regulatory dynamic by penalizing safety constraints rather than enforcing them]] — CFR provides the competitive analysis that gives this claim its full weight: penalizing safety-conscious labs doesn't just hurt one lab, it changes the competitive calculus for all labs.
 **Extraction hints:**

@@ -46,8 +46,8 @@ The decision suggests courts will examine whether safety claims map onto verifia
 **KB connections:**
 - [[scalable oversight degrades rapidly as capability gaps grow with debate achieving only 50 percent success at moderate gaps]] — confirmed in a new domain: formal legal proceeding validates that human oversight of deployed AI in secure enclaves is structurally zero (no vendor access, no monitoring capability). The oversight gap is not just about capability but about physical/organizational architecture.
-- [[formal verification of AI-generated proofs provides scalable oversight that human review cannot match]] — the post-deployment isolation finding makes formal verification MORE important: if vendors can't monitor deployed models, the alignment properties must be verifiable from the model itself.
-- [[the alignment tax creates a structural race to the bottom]] — Judge Lin's ruling documents that Anthropic's safety constraints WERE maintained under coercive government pressure, with judicial validation. The race-to-the-bottom claim gets a counterexample here.
+- formal verification of AI-generated proofs provides scalable oversight that human review cannot match — the post-deployment isolation finding makes formal verification MORE important: if vendors can't monitor deployed models, the alignment properties must be verifiable from the model itself.
+- the alignment tax creates a structural race to the bottom — Judge Lin's ruling documents that Anthropic's safety constraints WERE maintained under coercive government pressure, with judicial validation. The race-to-the-bottom claim gets a counterexample here.
 - B4 (verification degrades faster than capability grows) — the post-deployment isolation finding is B4 operating at the governance layer: once deployed in a government secure enclave, not only can't Anthropic verify what the model does, neither can anyone else who depends on Anthropic's ongoing oversight.
 **Extraction hints:**

@@ -48,7 +48,7 @@ Counter-evidence for the Mythos-as-pure-safety-action framing: "Anthropic's Proj
 ## Curator Notes
-PRIMARY CONNECTION: [[the alignment tax creates a structural race to the bottom]] — Schneier's critique challenges whether Mythos restriction is a genuine alignment tax payment or a commercially rational safety narrative
+PRIMARY CONNECTION: the alignment tax creates a structural race to the bottom — Schneier's critique challenges whether Mythos restriction is a genuine alignment tax payment or a commercially rational safety narrative
 WHY ARCHIVED: Authoritative skeptical counterweight to the Anthropic Glasswing narrative — necessary for calibrated KB treatment of Mythos restriction as a safety action vs. PR strategy

@@ -45,7 +45,7 @@ Roger Bannister's 1954 sub-four-minute mile: "The barrier was never physical. It
 **KB connections:**
 - [[AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur which makes bioterrorism the most proximate AI-enabled existential risk]] — direct structural parallel: Mythos does for cyber what o3 did for bio. Both eliminate expertise requirements, both create proliferation risk, both within 9-12 months of competitive replication. The KB claim about bioweapons as most proximate risk may need updating: cyber offense capability is now equally democratized.
-- [[economic forces push humans out of every cognitive loop where output quality is independently verifiable]] — the "autonomous systems requiring guardrails rather than approval gates" framing suggests security organizations are already adapting by removing humans from the approve-every-action loop. Economic forces and threat response are converging on the same outcome.
+- economic forces push humans out of every cognitive loop where output quality is independently verifiable — the "autonomous systems requiring guardrails rather than approval gates" framing suggests security organizations are already adapting by removing humans from the approve-every-action loop. Economic forces and threat response are converging on the same outcome.
 **Extraction hints:**
 "Advanced AI-enabled cyber offense capabilities are projected to proliferate from restricted frontier labs to broad availability within 9-12 months of capability demonstration — following the 'four-minute mile' dynamic where demonstrated possibility accelerates replication." Confidence: experimental (analyst projection, not historical data; based on prior AI capability proliferation patterns).
@@ -54,7 +54,7 @@ Roger Bannister's 1954 sub-four-minute mile: "The barrier was never physical. It
 ## Curator Notes
-PRIMARY CONNECTION: [[AI lowers the expertise barrier for engineering biological weapons]] — Mythos creates the direct cyber parallel to this claim, potentially warranting a new parallel claim about cyber offense
+PRIMARY CONNECTION: AI lowers the expertise barrier for engineering biological weapons — Mythos creates the direct cyber parallel to this claim, potentially warranting a new parallel claim about cyber offense
 WHY ARCHIVED: The 9-12 month proliferation timeline is the specific quantitative parameter that turns the governance question from abstract to operational

@@ -47,7 +47,7 @@ Counter-framing for the Mythos narrative: "Mythos-class AI cyber capabilities re
 ## Curator Notes
-PRIMARY CONNECTION: [[agent research direction selection is epistemic foraging where the optimal strategy is to seek observations that maximally reduce model uncertainty]] — archived primarily as disconfirmation/calibration for the high-excitement Mythos framing; helps extractor avoid over-weighting the "threshold event" narrative
+PRIMARY CONNECTION: agent research direction selection is epistemic foraging where the optimal strategy is to seek observations that maximally reduce model uncertainty — archived primarily as disconfirmation/calibration for the high-excitement Mythos framing; helps extractor avoid over-weighting the "threshold event" narrative
 WHY ARCHIVED: Necessary skeptical counterweight to the capability-threshold framing; ensures extractors consider whether Mythos warrants "new claim territory" or just updating confidence on existing claims