Compare commits

2 commits

Author SHA1 Message Date
Teleo Agents
c61ea2c652 theseus: extract claims from 2026-04-25-nordby-cross-model-limitations-family-specific-patterns
Some checks failed
Mirror PR to Forgejo / mirror (pull_request) Has been cancelled
- Source: inbox/queue/2026-04-25-nordby-cross-model-limitations-family-specific-patterns.md
- Domain: ai-alignment
- Claims: 0, Entities: 0
- Enrichments: 3
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
2026-04-30 03:17:22 +00:00
Teleo Agents
2be91c8eb6 rio: extract claims from 2026-04-29-polymarket-seeks-cftc-main-exchange-us-reapproval
Some checks failed
Mirror PR to Forgejo / mirror (pull_request) Has been cancelled
- Source: inbox/queue/2026-04-29-polymarket-seeks-cftc-main-exchange-us-reapproval.md
- Domain: internet-finance
- Claims: 0, Entities: 0
- Enrichments: 3
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Rio <PIPELINE>
2026-04-30 03:16:18 +00:00
7 changed files with 51 additions and 18 deletions

@@ -94,3 +94,10 @@ Apollo explicitly acknowledges their probe 'sometimes fires for the topic of dec
**Source:** Theseus Session 37 synthesis of Nordby et al. and SCAV evidence
Multi-layer ensemble probes represent a conditional exception to verification degradation for closed-source models. The Nordby × SCAV synthesis shows: (1) For open-weights models: no protection against white-box multi-layer SCAV attacks - B4 holds. (2) For closed-source models with black-box adversaries: genuine structural protection IF rotation patterns are architecture-specific. Nordby's limitations section provides indirect evidence of architecture-specificity (family-specific probe performance, no universal two-layer ensemble). This is the first verification approach that might SCALE with model size for closed-source deployments (~5% AUROC per 10x parameter increase, R=0.81). If rotation patterns are architecture-specific, larger closed-source models become EASIER to monitor through representation approaches, partially inverting the B4 claim for this domain. However, this exception does not reach the alignment-relevant core: verifying values, intent, and long-term consequences.
## Challenging Evidence
**Source:** Nordby et al. arXiv 2604.13386, Limitations section
Nordby et al.'s own Limitations section states: 'We evaluate within-family scaling but do not systematically test whether probes or ensemble configurations transfer across model families.' The paper reports family-specific patterns (e.g., Llama's strong Insider Trading performance) and notes that 'optimal approaches may not generalize, limiting practical applicability.' Best layer positions vary dramatically across architectures (Figure 3 shows Llama models with high variance versus Qwen's consistent 60-80% range). No universal two-layer ensemble improves performance across all tasks simultaneously. This directly challenges the generalizability of the 29-78% improvement claim beyond within-family scaling.
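
The scaling figure quoted in the synthesis above (~5% AUROC per 10x parameter increase) can be read as a log-linear trend. The following is a minimal sketch, assuming the 5% means 0.05 absolute AUROC points per decade of parameters and that the trend is applied only within one model family, the only setting the Limitations section says was evaluated. The function name and its default are hypothetical and do not come from the paper.

```python
import math


def projected_auroc_gain(
    params_from: float,
    params_to: float,
    gain_per_decade: float = 0.05,  # assumption: 0.05 absolute AUROC per 10x parameters
) -> float:
    """Estimate the AUROC change implied by a log-linear scaling trend.

    Illustrative only: the trend is treated as exact, although the quoted
    correlation (R=0.81) means real models will scatter around it.
    """
    if params_from <= 0 or params_to <= 0:
        raise ValueError("parameter counts must be positive")
    # One "decade" is a 10x change in parameter count.
    decades = math.log10(params_to / params_from)
    return gain_per_decade * decades


# Hypothetical within-family comparison: a 7B model versus a 70B model.
print(round(projected_auroc_gain(7e9, 70e9), 3))
```

Because the paper does not test cross-family transfer, an estimate like this says nothing about moving from, say, a Llama model to a Qwen model of the same size.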

@@ -30,3 +30,10 @@ The moderating claim that multi-layer ensemble probes provide black-box robustne
**Source:** Schnoor et al. 2025, arXiv 2509.22755
CAV-based monitoring techniques exhibit fundamental sensitivity to non-concept distribution choice (Schnoor et al., arXiv 2509.22755). The authors demonstrate that CAVs are random vectors whose distribution depends heavily on the arbitrary choice of non-concept examples used during training. They present an adversarial attack on TCAV (Testing with CAVs) that exploits this distributional dependence. This suggests cross-architecture concept direction transfer faces distributional incompatibility beyond architectural differences alone—even within a single model, CAV reliability depends on training distribution choices that would necessarily differ across model families.
## Extending Evidence
**Source:** Nordby et al. arXiv 2604.13386, Limitations + empirical results
Nordby et al. provide indirect empirical evidence for architecture-specificity of rotation patterns through probe non-generalization. Family-specific probe performance patterns, dramatic variance in optimal layer positions across architectures, and the absence of universal ensemble configurations suggest that rotation patterns are architecture-dependent. The paper notes 'tens to hundreds of deception related directions' in larger models, indicating complex, architecture-specific geometry. This supports the hypothesis that black-box multi-layer SCAV attacks would fail against closed-source models with different architectures, strengthening the 'Nordby wins for closed-source deployments' resolution. However, the paper contains no adversarial robustness evaluation whatsoever; all results are on clean data. Confidence upgrades from speculative to experimental based on this indirect evidence.

@@ -24,3 +24,10 @@ The feasibility of black-box multi-layer SCAV attacks depends on whether the rot
**Source:** Schnoor et al. 2025, arXiv 2509.22755
Theoretical analysis from XAI literature shows CAVs (Concept Activation Vectors) are fundamentally fragile to non-concept distribution choice (Schnoor et al., arXiv 2509.22755). Since non-concept distributions necessarily differ across model architectures and training regimes, this provides theoretical grounding for why rotation patterns extracted via SCAV would fail to transfer across model families—the concept vectors themselves are unstable under distributional shifts inherent to cross-architecture application.
## Extending Evidence
**Source:** Nordby et al. arXiv 2604.13386
Nordby et al. provide the strongest available indirect evidence on rotation pattern architecture-specificity, though the paper does not directly test cross-architecture transfer. The paper shows: (1) family-specific probe performance patterns that do not generalize, (2) dramatic variance in optimal layer positions across model families (Llama high variance vs Qwen consistent 60-80%), (3) no universal two-layer ensemble that improves all tasks, (4) task-optimal weighting that differs substantially across deception types and families. The geometric analysis (R≈-0.435 correlation between geometric similarity and performance) applies only within single architectures; cross-architecture geometric analysis was not performed. This suggests rotation patterns are architecture-specific, but the question remains empirically unresolved for black-box SCAV attacks.

@@ -7,7 +7,7 @@ source: Multiple sources (PYMNTS, CoinDesk, Crowdfund Insider, TheBulldog.law),
created: 2026-03-11
secondary_domains: ["grand-strategy"]
supports: ["The CFTC's multi-state litigation posture represents a qualitative shift from regulatory rule-drafting to active jurisdictional defense of prediction markets", "QCX", "trump-jr-dual-investment-creates-political-legitimacy-risk-for-prediction-market-preemption-regardless-of-legal-merit"]
related: ["CFTC-licensed DCM preemption protects centralized prediction markets from state gambling law but leaves decentralized governance markets legally exposed because they cannot access the DCM licensing pathway", "Prediction market SCOTUS cert is likely by early 2027 because three-circuit litigation pattern creates formal split by summer 2026 and 34-state amicus participation signals federalism stakes justify review", "Third Circuit ruling creates first federal appellate precedent for CFTC preemption of state gambling laws making Supreme Court review near-certain", "Trump Jr.'s dual investment in Kalshi and Polymarket creates a structural conflict of interest that undermines prediction market regulatory legitimacy regardless of legal merit", "State prediction market enforcement extends to federally licensed exchanges creating institutional exposure beyond specialized platforms", "qcx", "polymarket-achieved-us-regulatory-legitimacy-through-qcx-acquisition-establishing-prediction-markets-as-cftc-regulated-derivatives", "polymarket-kalshi-duopoly-emerging-as-dominant-us-prediction-market-structure-with-complementary-regulatory-models", "prediction-market-regulatory-legitimacy-creates-both-opportunity-and-existential-risk-for-decision-markets"]
related: ["CFTC-licensed DCM preemption protects centralized prediction markets from state gambling law but leaves decentralized governance markets legally exposed because they cannot access the DCM licensing pathway", "Prediction market SCOTUS cert is likely by early 2027 because three-circuit litigation pattern creates formal split by summer 2026 and 34-state amicus participation signals federalism stakes justify review", "Third Circuit ruling creates first federal appellate precedent for CFTC preemption of state gambling laws making Supreme Court review near-certain", "Trump Jr.'s dual investment in Kalshi and Polymarket creates a structural conflict of interest that undermines prediction market regulatory legitimacy regardless of legal merit", "State prediction market enforcement extends to federally licensed exchanges creating institutional exposure beyond specialized platforms", "qcx", "polymarket-achieved-us-regulatory-legitimacy-through-qcx-acquisition-establishing-prediction-markets-as-cftc-regulated-derivatives", "polymarket-kalshi-duopoly-emerging-as-dominant-us-prediction-market-structure-with-complementary-regulatory-models", "prediction-market-regulatory-legitimacy-creates-both-opportunity-and-existential-risk-for-decision-markets", "dcm-registered-prediction-market-platforms-converging-on-perpetual-futures-marks-structural-repositioning-as-full-spectrum-derivatives-exchanges-creating-three-way-category-split"]
reweave_edges: ["CFTC-licensed DCM preemption protects centralized prediction markets from state gambling law but leaves decentralized governance markets legally exposed because they cannot access the DCM licensing pathway|related|2026-04-17", "The CFTC's multi-state litigation posture represents a qualitative shift from regulatory rule-drafting to active jurisdictional defense of prediction markets|supports|2026-04-17", "Prediction market SCOTUS cert is likely by early 2027 because three-circuit litigation pattern creates formal split by summer 2026 and 34-state amicus participation signals federalism stakes justify review|related|2026-04-19", "QCX|supports|2026-04-19", "Third Circuit ruling creates first federal appellate precedent for CFTC preemption of state gambling laws making Supreme Court review near-certain|related|2026-04-20", "trump-jr-dual-investment-creates-political-legitimacy-risk-for-prediction-market-preemption-regardless-of-legal-merit|supports|2026-04-20", "Trump Jr.'s dual investment in Kalshi and Polymarket creates a structural conflict of interest that undermines prediction market regulatory legitimacy regardless of legal merit|related|2026-04-20", "State prediction market enforcement extends to federally licensed exchanges creating institutional exposure beyond specialized platforms|related|2026-04-24"]
sourced_from: ["inbox/archive/internet-finance/2026-01-20-polymarket-cftc-approval-qcx-acquisition.md"]
---
@@ -118,3 +118,10 @@ Topics:
**Source:** CNBC, April 27, 2026
Polymarket's DCM platform (via QCEX acquisition) launched perpetual futures on crypto assets with up to 10x leverage on April 21, 2026—the first time a CFTC-registered prediction market platform has offered crypto perps to US users. This represents strategic expansion beyond event contracts into the much larger derivatives market (perps = 70%+ of CEX volume, $61.7T in 2025).
## Extending Evidence
**Source:** Bloomberg/CoinDesk April 28, 2026
Polymarket's November 2025 CFTC approval for its US platform (via QCEX acquisition) resulted in limited activity despite full DCM registration: sports markets only, with minimal volume compared to $10B+ monthly on its main exchange. This suggests DCM registration alone is insufficient for volume capture; user experience, product breadth, and trust are critical factors. The April 2026 application to reopen the main exchange to US users indicates the initial approval pathway was structurally incomplete for Polymarket's core business model.

@@ -1,22 +1,15 @@
---
type: claim
domain: internet-finance
secondary_domains: [grand-strategy]
description: "Polymarket (crypto, CFTC-via-acquisition) and Kalshi (traditional finance, native CFTC approval) are converging on $20B valuations as the two-player market structure for US prediction markets"
description: Polymarket (crypto, CFTC-via-acquisition) and Kalshi (traditional finance, native CFTC approval) are converging on $20B valuations as the two-player market structure for US prediction markets
confidence: experimental
source: "Multiple sources (PYMNTS, CoinDesk, Crowdfund Insider, TheBulldog.law), January 2026"
source: Multiple sources (PYMNTS, CoinDesk, Crowdfund Insider, TheBulldog.law), January 2026
created: 2026-03-11
supports:
- QCX
- DCM-registered prediction market platforms converging on perpetual futures marks structural repositioning as full-spectrum derivatives exchanges, creating a three-way category split distinguishing regulated event platforms, offshore decentralized venues, and on-chain governance markets
reweave_edges:
- QCX|supports|2026-04-19
- DCM-registered prediction market platforms converging on perpetual futures marks structural repositioning as full-spectrum derivatives exchanges, creating a three-way category split distinguishing regulated event platforms, offshore decentralized venues, and on-chain governance markets|supports|2026-04-30
- Kalshi-Hyperliquid HIP-4 partnership creates offshore decentralized prediction market regulatory arbitrage model separating US access from execution infrastructure|related|2026-04-30
sourced_from:
- inbox/archive/internet-finance/2026-01-20-polymarket-cftc-approval-qcx-acquisition.md
related:
- Kalshi-Hyperliquid HIP-4 partnership creates offshore decentralized prediction market regulatory arbitrage model separating US access from execution infrastructure
secondary_domains: ["grand-strategy"]
supports: ["QCX", "DCM-registered prediction market platforms converging on perpetual futures marks structural repositioning as full-spectrum derivatives exchanges, creating a three-way category split distinguishing regulated event platforms, offshore decentralized venues, and on-chain governance markets"]
reweave_edges: ["QCX|supports|2026-04-19", "DCM-registered prediction market platforms converging on perpetual futures marks structural repositioning as full-spectrum derivatives exchanges, creating a three-way category split distinguishing regulated event platforms, offshore decentralized venues, and on-chain governance markets|supports|2026-04-30", "Kalshi-Hyperliquid HIP-4 partnership creates offshore decentralized prediction market regulatory arbitrage model separating US access from execution infrastructure|related|2026-04-30"]
sourced_from: ["inbox/archive/internet-finance/2026-01-20-polymarket-cftc-approval-qcx-acquisition.md"]
related: ["Kalshi-Hyperliquid HIP-4 partnership creates offshore decentralized prediction market regulatory arbitrage model separating US access from execution infrastructure", "polymarket-kalshi-duopoly-emerging-as-dominant-us-prediction-market-structure-with-complementary-regulatory-models", "kalshi", "polymarket", "kalshi-hyperliquid-hip4-partnership-creates-offshore-decentralized-prediction-market-regulatory-arbitrage-model", "dcm-registered-prediction-market-platforms-converging-on-perpetual-futures-marks-structural-repositioning-as-full-spectrum-derivatives-exchanges-creating-three-way-category-split"]
---
# Polymarket-Kalshi duopoly emerging as dominant US prediction market structure with complementary regulatory models
@@ -80,4 +73,10 @@ Relevant Notes:
- [[MetaDAO is the futarchy launchpad on Solana where projects raise capital through unruggable ICOs governed by conditional markets creating the first platform for ownership coins at scale]]
Topics:
- domains/internet-finance/_map
- domains/internet-finance/_map
## Extending Evidence
**Source:** Fortune/Bloomberg April 2026
Fortune (April 21, 2026) reports Polymarket is being valued at a discount to Kalshi due to crypto ties and operational stumbles, with Kalshi pulling ahead operationally. This valuation gap reflects market perception that Polymarket's crypto-native architecture (Polygon-based smart contracts) creates additional regulatory friction compared to Kalshi's traditional DCM structure with crypto markets added on top. The $10B monthly volume on Polymarket's international exchange versus limited US platform activity demonstrates the regulatory-volume tradeoff.
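
The `reweave_edges` entries in this file's frontmatter appear to pack three fields into one string, `target|relation|date`. The following is a minimal sketch of parsing such an entry, assuming that layout holds for every entry and that the date is ISO formatted. The `Edge` type and `parse_edge` helper are hypothetical names for illustration and are not part of the pipeline.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Edge:
    target: str    # claim title or slug the edge points to
    relation: str  # e.g. "supports" or "related"
    added: date    # date the edge was recorded


def parse_edge(raw: str) -> Edge:
    """Parse one 'target|relation|date' string (assumed layout)."""
    # Split from the right so a "|" inside a claim title does not break parsing.
    parts = raw.rsplit("|", 2)
    if len(parts) != 3:
        raise ValueError(f"expected 'target|relation|date', got: {raw!r}")
    target, relation, added = (part.strip() for part in parts)
    return Edge(target, relation, date.fromisoformat(added))


edge = parse_edge("QCX|supports|2026-04-19")
print(edge.target, edge.relation, edge.added.isoformat())
```

Splitting from the right is a defensive choice: the claim titles in this diff are long free-text strings, so the two trailing fields are the safer ones to treat as fixed.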

@@ -7,9 +7,12 @@ date: 2026-04-25
domain: ai-alignment
secondary_domains: []
format: preprint
status: unprocessed
status: processed
processed_by: theseus
processed_date: 2026-04-30
priority: high
tags: [representation-monitoring, linear-probes, multi-layer-ensemble, cross-model-generalization, rotation-patterns, adversarial-robustness, divergence-resolution, b4-verification]
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content

@@ -7,10 +7,13 @@ date: 2026-04-28
domain: internet-finance
secondary_domains: []
format: news-synthesis
status: unprocessed
status: processed
processed_by: rio
processed_date: 2026-04-30
priority: medium
tags: [polymarket, cftc, dcm, us-approval, prediction-markets, regulatory-path]
intake_tier: research-task
extraction_model: "anthropic/claude-sonnet-4.5"
---
## Content