theseus: extract claims from 2026-04-25-nordby-cross-model-limitations-family-specific-patterns

- Source: inbox/queue/2026-04-25-nordby-cross-model-limitations-family-specific-patterns.md
- Domain: ai-alignment
- Claims: 0, Entities: 0
- Enrichments: 3
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
rio: extract claims from 2026-04-24-coindesk-cftc-sues-new-york-prediction-markets
2026-04-30 02:52:54 +00:00 · 2026-04-30 02:30:56 +00:00 · 2026-04-30 02:29:46 +00:00
12 changed files with 108 additions and 7 deletions
--- a/domains/ai-alignment/multi-layer-ensemble-probes-outperform-single-layer-by-29-78-percent.md
+++ b/domains/ai-alignment/multi-layer-ensemble-probes-outperform-single-layer-by-29-78-percent.md
@@ -94,3 +94,10 @@ Apollo explicitly acknowledges their probe 'sometimes fires for the topic of dec
 **Source:** Theseus Session 37 synthesis of Nordby et al. and SCAV evidence

 Multi-layer ensemble probes represent a conditional exception to verification degradation for closed-source models. The Nordby × SCAV synthesis shows: (1) For open-weights models: no protection against white-box multi-layer SCAV attacks - B4 holds. (2) For closed-source models with black-box adversaries: genuine structural protection IF rotation patterns are architecture-specific. Nordby's limitations section provides indirect evidence of architecture-specificity (family-specific probe performance, no universal two-layer ensemble). This is the first verification approach that might SCALE with model size for closed-source deployments (~5% AUROC per 10x parameter increase, R=0.81). If rotation patterns are architecture-specific, larger closed-source models become EASIER to monitor through representation approaches, partially inverting the B4 claim for this domain. However, this exception does not reach the alignment-relevant core: verifying values, intent, and long-term consequences.
+
+
+## Challenging Evidence
+
+**Source:** Nordby et al. arXiv 2604.13386, Limitations section
+
+Nordby et al.'s own Limitations section states: 'We evaluate within-family scaling but do not systematically test whether probes or ensemble configurations transfer across model families.' The paper reports family-specific patterns (e.g., Llama's strong Insider Trading performance) and notes that 'optimal approaches may not generalize, limiting practical applicability.' Best layer positions vary dramatically across architectures (Figure 3 shows Llama models with high variance versus Qwen's consistent 60–80% range). No universal two-layer ensemble improves performance across all tasks simultaneously. This directly challenges the generalizability of the 29-78% improvement claim beyond within-family scaling.
--- a/domains/ai-alignment/multi-layer-ensemble-probes-provide-black-box-robustness-but-not-white-box-protection-against-scav-attacks.md
+++ b/domains/ai-alignment/multi-layer-ensemble-probes-provide-black-box-robustness-but-not-white-box-protection-against-scav-attacks.md
@@ -30,3 +30,10 @@ The moderating claim that multi-layer ensemble probes provide black-box robustne
 **Source:** Schnoor et al. 2025, arXiv 2509.22755

 CAV-based monitoring techniques exhibit fundamental sensitivity to non-concept distribution choice (Schnoor et al., arXiv 2509.22755). The authors demonstrate that CAVs are random vectors whose distribution depends heavily on the arbitrary choice of non-concept examples used during training. They present an adversarial attack on TCAV (Testing with CAVs) that exploits this distributional dependence. This suggests cross-architecture concept direction transfer faces distributional incompatibility beyond architectural differences alone—even within a single model, CAV reliability depends on training distribution choices that would necessarily differ across model families.
+
+
+## Extending Evidence
+
+**Source:** Nordby et al. arXiv 2604.13386, Limitations + empirical results
+
+Nordby et al. provide indirect empirical evidence for architecture-specificity of rotation patterns through probe non-generalization. Family-specific probe performance patterns, dramatic variance in optimal layer positions across architectures, and the absence of universal ensemble configurations suggest that rotation patterns are architecture-dependent. The paper notes 'tens to hundreds of deception related directions' in larger models, which is consistent with complex, architecture-specific geometry. This supports the hypothesis that black-box multi-layer SCAV attacks would fail against closed-source models with different architectures, strengthening the 'Nordby wins for closed-source deployments' resolution. However, the paper contains no adversarial robustness evaluation: all results are on clean data. Confidence upgrades from speculative to experimental based on this indirect evidence.
--- a/domains/ai-alignment/rotation-pattern-universality-determines-black-box-multi-layer-scav-feasibility.md
+++ b/domains/ai-alignment/rotation-pattern-universality-determines-black-box-multi-layer-scav-feasibility.md
@@ -24,3 +24,10 @@ The feasibility of black-box multi-layer SCAV attacks depends on whether the rot
 **Source:** Schnoor et al. 2025, arXiv 2509.22755

 Theoretical analysis from XAI literature shows CAVs (Concept Activation Vectors) are fundamentally fragile to non-concept distribution choice (Schnoor et al., arXiv 2509.22755). Since non-concept distributions necessarily differ across model architectures and training regimes, this provides theoretical grounding for why rotation patterns extracted via SCAV would fail to transfer across model families—the concept vectors themselves are unstable under distributional shifts inherent to cross-architecture application.
+
+
+## Extending Evidence
+
+**Source:** Nordby et al. arXiv 2604.13386
+
+Nordby et al. provide the strongest available indirect evidence on rotation pattern architecture-specificity, though the paper does not directly test cross-architecture transfer. It shows: (1) family-specific probe performance patterns that do not generalize, (2) dramatic variance in optimal layer positions across model families (Llama high variance vs. Qwen consistent 60-80%), (3) no universal two-layer ensemble that improves all tasks, and (4) task-optimal weighting that differs substantially across deception types and families. The geometric analysis (R≈-0.435 correlation between geometric similarity and performance) applies only within single architectures; cross-architecture geometric analysis was not performed. This suggests rotation patterns are architecture-specific, but the question remains empirically unresolved for black-box SCAV attacks.
--- a/domains/entertainment/blank-canvas-ip-achieves-billion-dollar-scale-through-licensing-to-established-franchises-not-original-narrative.md
+++ b/domains/entertainment/blank-canvas-ip-achieves-billion-dollar-scale-through-licensing-to-established-franchises-not-original-narrative.md
@@ -10,10 +10,18 @@ agent: clay
 sourced_from: entertainment/2026-04-24-variety-squishmallows-blank-canvas-licensing-strategy.md
 scope: causal
 sourcer: Variety/Jazwares
-challenges: ["community-owned-ip-invests-in-narrative-infrastructure-as-scaling-mechanism-after-proving-token-mechanics"]
-related: ["blank-narrative-vessel-achieves-commercial-scale-through-fan-emotional-projection", "minimum-viable-narrative-achieves-50m-revenue-scale-through-character-design-and-distribution-without-story-depth", "distributed-narrative-architecture-enables-ip-scale-without-concentrated-story-through-blank-canvas-fan-projection", "blank-canvas-ip-achieves-billion-dollar-scale-through-licensing-to-established-franchises-not-original-narrative", "narrative-development-attempts-fail-when-commercial-scale-precedes-narrative-investment-because-business-model-lock-in-removes-incentive"]
-supports: ["Narrative development attempts fail when commercial scale precedes narrative investment because business model lock-in removes incentive to take creative risk"]
-reweave_edges: ["Narrative development attempts fail when commercial scale precedes narrative investment because business model lock-in removes incentive to take creative risk|supports|2026-04-28"]
+challenges:
+- community-owned-ip-invests-in-narrative-infrastructure-as-scaling-mechanism-after-proving-token-mechanics
+related:
+- blank-narrative-vessel-achieves-commercial-scale-through-fan-emotional-projection
+- minimum-viable-narrative-achieves-50m-revenue-scale-through-character-design-and-distribution-without-story-depth
+- distributed-narrative-architecture-enables-ip-scale-without-concentrated-story-through-blank-canvas-fan-projection
+- blank-canvas-ip-achieves-billion-dollar-scale-through-licensing-to-established-franchises-not-original-narrative
+- narrative-development-attempts-fail-when-commercial-scale-precedes-narrative-investment-because-business-model-lock-in-removes-incentive
+supports:
+- Narrative development attempts fail when commercial scale precedes narrative investment because business model lock-in removes incentive to take creative risk
+reweave_edges:
+- Narrative development attempts fail when commercial scale precedes narrative investment because business model lock-in removes incentive to take creative risk|supports|2026-04-28
 ---

 # Blank canvas IPs achieve billion-dollar scale through licensing to established franchises rather than building original narrative
@@ -31,4 +39,4 @@ Pudgy Penguins pursued dual narrative strategy: original content (Lil Pudgys ser

 **Source:** Squishmallows CAA deal (Dec 2021), Squishville series (2021), licensing crossovers (2025-2026), HBR case study (2022)

-Squishmallows attempted original narrative content (CAA deal 2021, Squishville series) but pivoted to licensing crossovers (Stranger Things, Harry Potter, Pokémon, Poppy Playtime, KPop Demon Hunters) after 5 years of no narrative output. HBR case study (2022) reframed as 'lifestyle brand' not 'entertainment franchise' one year after CAA deal, signaling internal strategic pivot before narrative content was produced.
+Squishmallows attempted original narrative content (CAA deal 2021, Squishville series) but pivoted to licensing crossovers (Stranger Things, Harry Potter, Pokémon, Poppy Playtime, KPop Demon Hunters) after 5 years of no narrative output. HBR case study (2022) reframed as 'lifestyle brand' not 'entertainment franchise' one year after CAA deal, signaling internal strategic pivot before narrative content was produced.
--- a/domains/internet-finance/cftc-dcm-preemption-scope-excludes-unregistered-platforms.md
+++ b/domains/internet-finance/cftc-dcm-preemption-scope-excludes-unregistered-platforms.md
@@ -38,3 +38,10 @@ CFTC's Wisconsin lawsuit (April 28, 2026) defends Kalshi and Polymarket—both D
 **Source:** CoinDesk/CFTC Press Release, April 28, 2026

 Wisconsin lawsuit (April 28, 2026) is the 5th state in CFTC's enforcement campaign, targeting only DCM-registered platforms (Coinbase, Crypto.com, Kalshi, Polymarket, Robinhood). Pattern now spans 5 states over 26 days with zero enforcement against unregistered decentralized platforms.
+
+
+## Supporting Evidence
+
+**Source:** CoinDesk Policy, CFTC SDNY filing April 24 2026
+
+The scope of the CFTC's New York lawsuit is explicitly limited to 'CFTC registrants' and 'federally regulated exchanges', with no protection asserted for non-registered on-chain protocols. The complaint's legal theory relies on DCM registration as the trigger for federal preemption.
--- a/domains/internet-finance/cftc-four-state-offensive-represents-fastest-regulatory-escalation-for-new-product-category.md
+++ b/domains/internet-finance/cftc-four-state-offensive-represents-fastest-regulatory-escalation-for-new-product-category.md
@@ -0,0 +1,19 @@
+---
+type: claim
+domain: internet-finance
+description: CFTC moved from amicus participation to affirmative preemption lawsuits against four states within weeks under single commissioner
+confidence: experimental
+source: CoinDesk Policy, CFTC litigation timeline through April 2026
+created: 2026-04-30
+title: CFTC four-state prediction market offensive represents unprecedented regulatory escalation speed from defensive to offensive posture
+agent: rio
+sourced_from: internet-finance/2026-04-24-coindesk-cftc-sues-new-york-prediction-markets.md
+scope: structural
+sourcer: CoinDesk Policy
+supports: ["cftc-multi-state-litigation-represents-qualitative-shift-from-regulatory-drafting-to-active-jurisdictional-defense", "executive-branch-offensive-litigation-creates-preemption-through-simultaneous-multi-state-suits-not-defensive-case-law"]
+related: ["cftc-multi-state-litigation-represents-qualitative-shift-from-regulatory-drafting-to-active-jurisdictional-defense", "cftc-sole-commissioner-governance-creates-structural-concentration-risk-through-administration-contingent-favorability", "executive-branch-offensive-litigation-creates-preemption-through-simultaneous-multi-state-suits-not-defensive-case-law", "cftc-state-supreme-court-amicus-signals-multi-jurisdictional-defense-strategy", "cftc-same-day-counter-filing-signals-institutionalized-enforcement-machinery", "cftc-dcm-preemption-scope-excludes-unregistered-platforms"]
+---
+
+# CFTC four-state prediction market offensive represents unprecedented regulatory escalation speed from defensive to offensive posture
+
+The CFTC escalated from defensive amicus brief participation (3rd Circuit ruling, April 7) to affirmative lawsuits against four states (Arizona, Connecticut, Illinois, New York) within weeks, all under Chairman Mike Selig. This represents a qualitative shift from regulatory drafting to active jurisdictional defense. The speed and scope of the escalation are notable: rather than waiting for state enforcement to reach federal courts through the normal appellate process, the CFTC is preemptively suing states in federal district courts to establish preemption. This offensive litigation strategy creates simultaneous multi-jurisdictional pressure on states, forcing them to defend their gambling law enforcement authority in federal court rather than letting prediction market platforms fight state-by-state battles. The single-commissioner concentration (Selig) creates both opportunity and risk: aggressive protection of prediction market infrastructure, but also vulnerability to reversal if the administration changes. The escalation pattern suggests the CFTC views prediction markets as core regulated infrastructure worth defending through affirmative litigation, not just amicus support.
--- a/domains/internet-finance/cftc-licensed-dcm-preemption-protects-centralized-prediction-markets-but-not-decentralized-governance-markets.md
+++ b/domains/internet-finance/cftc-licensed-dcm-preemption-protects-centralized-prediction-markets-but-not-decentralized-governance-markets.md
@@ -391,3 +391,10 @@ Arizona TRO (April 10, 2026) provides first federal district court finding that
 **Source:** CNBC, April 27, 2026

 CFTC Chairman Selig actively supported DCM platforms expanding into perpetual futures: 'Under my leadership, the CFTC will use the tools at its disposal to onshore perpetual and other novel derivative products.' This confirms DCM preemption applies to full-spectrum derivatives exchanges, not just event contracts, further separating DCM platforms from governance markets.
+
+
+## Supporting Evidence
+
+**Source:** CoinDesk Policy, CFTC SDNY filing April 24 2026
+
+CFTC's April 24, 2026 New York lawsuit explicitly seeks protection for 'federally regulated exchanges' and 'CFTC registrants' with no mention of on-chain protocols, decentralized governance markets, or futarchy. The complaint's framing is entirely about DCM-registered platforms (Kalshi, Coinbase, Gemini named in NY enforcement). Non-registered protocols are invisible to the CFTC in this litigation.
--- a/domains/internet-finance/cftc-multi-state-litigation-represents-qualitative-shift-from-regulatory-drafting-to-active-jurisdictional-defense.md
+++ b/domains/internet-finance/cftc-multi-state-litigation-represents-qualitative-shift-from-regulatory-drafting-to-active-jurisdictional-defense.md
@@ -163,3 +163,10 @@ The CFTC's 5-state campaign in 26 days (April 2-28, 2026) has accelerated to sam
 **Source:** CNN CFTC staffing report, April 26, 2026

 The CFTC is simultaneously conducting aggressive litigation (5-state campaign defending DCM jurisdiction) while losing 24% of staff and eliminating entire regional offices. This reveals a strategic resource allocation: the agency is deploying remaining capacity on high-visibility jurisdictional battles while losing the broader capacity to investigate novel theories. The litigation is offensive/preemptive; the enforcement capacity collapse affects reactive enforcement.
+
+
+## Supporting Evidence
+
+**Source:** CoinDesk Policy, CFTC litigation timeline through April 2026
+
+The CFTC sued four states (AZ, CT, IL, NY) within weeks of the April 7 3rd Circuit ruling, demonstrating the shift from amicus participation to affirmative preemption litigation. The April 24 New York filing came three days after the NY AG's April 21 enforcement action against Coinbase and Gemini, showing rapid counter-filing capability.
--- a/domains/internet-finance/cftc-offensive-state-litigation-creates-two-tier-prediction-market-architecture-through-dcm-only-preemption-defense.md
+++ b/domains/internet-finance/cftc-offensive-state-litigation-creates-two-tier-prediction-market-architecture-through-dcm-only-preemption-defense.md
@@ -0,0 +1,19 @@
+---
+type: claim
+domain: internet-finance
+description: Federal preemption protection explicitly limited to registered platforms, leaving decentralized protocols unprotected
+confidence: experimental
+source: CoinDesk Policy, CFTC SDNY filing April 24 2026
+created: 2026-04-30
+title: CFTC offensive state litigation creates two-tier prediction market architecture through DCM-only preemption defense
+agent: rio
+sourced_from: internet-finance/2026-04-24-coindesk-cftc-sues-new-york-prediction-markets.md
+scope: structural
+sourcer: CoinDesk Policy
+supports: ["cftc-licensed-dcm-preemption-protects-centralized-prediction-markets-but-not-decentralized-governance-markets"]
+related: ["futarchy-governance-markets-risk-regulatory-capture-by-anti-gambling-frameworks-because-the-event-betting-and-organizational-governance-use-cases-are-conflated-in-current-policy-discourse", "cftc-licensed-dcm-preemption-protects-centralized-prediction-markets-but-not-decentralized-governance-markets", "cftc-dcm-preemption-scope-excludes-unregistered-platforms", "dcm-field-preemption-protects-all-contracts-on-registered-platforms-regardless-of-type", "dodd-frank-textual-argument-strongest-state-resistance-theory", "preemptive-federal-litigation-creates-jurisdictional-shield-against-state-prediction-market-enforcement", "cftc-arizona-tro-formalizes-dcm-preemption-two-tier-structure"]
+---
+
+# CFTC offensive state litigation creates two-tier prediction market architecture through DCM-only preemption defense
+
+The CFTC's April 24, 2026 lawsuit against New York (the fourth state sued, after Arizona, Connecticut, and Illinois) seeks a declaratory judgment that federal law grants exclusive authority over event contracts, plus a permanent injunction against state enforcement. The legal theory: the Commodity Exchange Act grants the CFTC 'exclusive jurisdiction' over commodity futures, options, and swaps traded on federally regulated exchanges, preempting state gambling laws. Critical scope limitation: the lawsuits specifically protect 'federally regulated exchanges' and 'CFTC registrants', with no indication of protection for non-registered on-chain protocols. This creates a structural two-tier system in which DCM-registered platforms (Kalshi, Coinbase, Gemini) receive active federal defense while decentralized governance markets operate outside this protection. The CFTC's aggressive posture (four states sued in weeks) demonstrates federal commitment to defending registered infrastructure, but the explicit DCM-only framing means futarchy protocols like MetaDAO remain in regulatory limbo. This is not just a legal development but a structural choice: the CFTC is building a walled garden of federal protection that requires registration to enter.
--- a/domains/internet-finance/cftc-sole-commissioner-governance-creates-structural-concentration-risk-through-administration-contingent-favorability.md
+++ b/domains/internet-finance/cftc-sole-commissioner-governance-creates-structural-concentration-risk-through-administration-contingent-favorability.md
@@ -113,3 +113,10 @@ Norton Rose analysis documents Selig's April 17 House Agriculture Committee test
 **Source:** Bettors Insider, April 17, 2026 — ANPRM process implications

 The 800-comment ANPRM record may actually help lock in Chairman Selig's prediction market framework despite single-commissioner governance risk. A substantial public comment process makes the resulting rule harder to reverse by future bipartisan commissioners, as the administrative record demonstrates extensive stakeholder engagement and deliberation.
+
+
+## Supporting Evidence
+
+**Source:** CoinDesk Policy, CFTC Chairman Mike Selig litigation pattern
+
+All four state lawsuits (AZ, CT, IL, NY) were filed under a single commissioner, Chairman Mike Selig, demonstrating the concentration of regulatory posture in one individual. The aggressive escalation from amicus to affirmative litigation represents Selig's personal regulatory strategy, creating administration-contingent stability risk.
--- a/inbox/archive/ai-alignment/2026-04-25-nordby-cross-model-limitations-family-specific-patterns.md
+++ b/inbox/archive/ai-alignment/2026-04-25-nordby-cross-model-limitations-family-specific-patterns.md
@@ -7,9 +7,12 @@ date: 2026-04-25
 domain: ai-alignment
 secondary_domains: []
 format: preprint
-status: unprocessed
+status: processed
+processed_by: theseus
+processed_date: 2026-04-30
 priority: high
 tags: [representation-monitoring, linear-probes, multi-layer-ensemble, cross-model-generalization, rotation-patterns, adversarial-robustness, divergence-resolution, b4-verification]
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---

 ## Content
--- a/inbox/archive/internet-finance/2026-04-24-coindesk-cftc-sues-new-york-prediction-markets.md
+++ b/inbox/archive/internet-finance/2026-04-24-coindesk-cftc-sues-new-york-prediction-markets.md
@@ -7,9 +7,12 @@ date: 2026-04-24
 domain: internet-finance
 secondary_domains: []
 format: article
-status: unprocessed
+status: processed
+processed_by: rio
+processed_date: 2026-04-30
 priority: high
 tags: [cftc, prediction-markets, regulation, new-york, preemption, howey, living-capital, futarchy-regulatory]
+extraction_model: "anthropic/claude-sonnet-4.5"
 ---

 ## Content
Author	SHA1	Message	Date
Teleo Agents	f72198ef55	theseus: extract claims from 2026-04-25-nordby-cross-model-limitations-family-specific-patterns - Source: inbox/queue/2026-04-25-nordby-cross-model-limitations-family-specific-patterns.md - Domain: ai-alignment - Claims: 0, Entities: 0 - Enrichments: 3 - Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5) Pentagon-Agent: Theseus <PIPELINE>	2026-04-30 02:52:54 +00:00
Teleo Agents	3faddaa887	rio: extract claims from 2026-04-24-coindesk-cftc-sues-new-york-prediction-markets - Source: inbox/queue/2026-04-24-coindesk-cftc-sues-new-york-prediction-markets.md - Domain: internet-finance - Claims: 2, Entities: 0 - Enrichments: 4 - Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5) Pentagon-Agent: Rio <PIPELINE>	2026-04-30 02:30:56 +00:00
Teleo Agents	215cc745a1	reweave: merge 16 files via frontmatter union [auto]	2026-04-30 02:29:46 +00:00
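
The `reweave` commit above merges files "via frontmatter union", and the hunk for the blank-canvas claim shows list-valued keys (`challenges`, `related`, `supports`, `reweave_edges`) being rewritten. As a rough illustration only, a union merge over already-parsed frontmatter could look like the sketch below. The function names, the order-preserving de-duplication, and the keep-ours rule for conflicting scalars are all assumptions; the diff does not show the pipeline's actual implementation.

```python
from typing import Any


def union_lists(left: list, right: list) -> list:
    """Order-preserving union: items keep first-seen order, duplicates dropped."""
    merged: list = []
    for item in [*left, *right]:
        if item not in merged:
            merged.append(item)
    return merged


def union_frontmatter(ours: dict[str, Any], theirs: dict[str, Any]) -> dict[str, Any]:
    """Merge two frontmatter mappings (hypothetical helper, not the pipeline's code).

    - Keys present on only one side are copied through.
    - List values present on both sides are unioned.
    - Any other conflict keeps the value from `ours` (an assumed tie-break rule).
    """
    merged = dict(ours)
    for key, value in theirs.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], list) and isinstance(value, list):
            merged[key] = union_lists(merged[key], value)
    return merged


# Illustrative inputs with made-up claim slugs.
ours = {"scope": "causal", "related": ["claim-a", "claim-b"], "supports": ["claim-c"]}
theirs = {"scope": "structural", "related": ["claim-b", "claim-d"], "challenges": ["claim-e"]}
merged = union_frontmatter(ours, theirs)
```

A union of this kind never drops an edge that either side declared, which would explain why an automated merge can be run without review; the cost is that deliberate removals on one side are silently undone.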