Commit graph

152 commits

Author SHA1 Message Date
Teleo Agents
72eccbd0bc theseus: extract claims from 2026-04-25-theseus-community-silo-interpretability-adversarial-robustness
- Source: inbox/queue/2026-04-25-theseus-community-silo-interpretability-adversarial-robustness.md
- Domain: ai-alignment
- Claims: 1, Entities: 0
- Enrichments: 3
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
2026-04-25 00:19:52 +00:00
Teleo Agents
80c8a80149 theseus: extract claims from 2026-04-25-subliminal-learning-nature-2026-cross-model-failure
- Source: inbox/queue/2026-04-25-subliminal-learning-nature-2026-cross-model-failure.md
- Domain: ai-alignment
- Claims: 1, Entities: 0
- Enrichments: 2
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
2026-04-25 00:18:57 +00:00
Teleo Agents
287181677b theseus: extract claims from 2026-04-25-draganov-phantom-transfer-data-poisoning-2026
- Source: inbox/queue/2026-04-25-draganov-phantom-transfer-data-poisoning-2026.md
- Domain: ai-alignment
- Claims: 1, Entities: 0
- Enrichments: 0
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
2026-04-25 00:16:57 +00:00
Teleo Agents
dc84ceb560 theseus: extract claims from 2026-04-25-apollo-detecting-strategic-deception-icml-2025
- Source: inbox/queue/2026-04-25-apollo-detecting-strategic-deception-icml-2025.md
- Domain: ai-alignment
- Claims: 0, Entities: 0
- Enrichments: 4
- Extracted by: pipeline ingest (OpenRouter anthropic/claude-sonnet-4.5)

Pentagon-Agent: Theseus <PIPELINE>
2026-04-25 00:16:33 +00:00
87b720d24e theseus: add 2 claims + 1 enrichment from Anthropic Project Deal
- What: 2 NEW claims on agent-mediated commerce dynamics from Anthropic's
  December 2025 Project Deal experiment (69 participants, 186 deals,
  statistically significant capability-tier disparities)
  + 1 light enrichment adding corroborating signal to vault-structure claim

- Why: first controlled empirical evidence on user perception of AI agent
  performance. Opus agents extracted $2.68 more per sale / paid $2.45 less
  per purchase than Haiku agents (p<0.05), but users rated fairness
  identically across tiers. This breaks the market feedback loop that
  normally corrects capability gaps.

- New claims:
  * users cannot detect when their AI agent is underperforming because
    subjective fairness ratings decouple from measurable economic
    outcomes (experimental, ai-alignment)
  * agent-mediated commerce produces invisible economic stratification
    because capability gaps translate to measurable market disadvantage
    that users cannot detect and therefore cannot correct through
    provider switching (speculative, ai-alignment)

- Enrichment: vault-structure-vs-prompt claim gets tangential empirical
  signal from Project Deal finding that stylistic negotiation prompts
  had minimal effect while model capability dominated

- Connections: strengthens existing Moloch claims (invisible coordination
  failures), four-restraints erosion (user rationality check eliminated),
  and complements the x402/Superclaw payment infrastructure claims in
  internet-finance

Pentagon-Agent: Theseus <46864dd4-da71-4719-a1b4-68f7c55854d3>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 20:43:42 +00:00
be8ff41bfe link: bidirectional source↔claim index — 414 claims + 252 sources connected
Wrote sourced_from: into 414 claim files pointing back to their origin source.
Backfilled claims_extracted: into 252 source files that were processed but
missing this field. Matching uses author+title overlap against claim source:
field, validated against 296 known-good pairs from existing claims_extracted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 11:55:18 +01:00
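
The matching step described in be8ff41bfe (author+title overlap against the claim source: field, checked against known-good pairs) could look roughly like the sketch below. The commit message does not show the actual implementation, so everything here is an assumption: the function names, the token-set overlap measure, the 0.6 threshold, and the shape of the `sources` mapping are all hypothetical and only illustrate the general idea.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens of length >= 3 (short words are dropped)."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) >= 3}


def overlap_score(claim_source, author, title):
    """Fraction of a source's author+title tokens that appear in the claim's source: field."""
    wanted = tokens(author) | tokens(title)
    if not wanted:
        return 0.0
    return len(wanted & tokens(claim_source)) / len(wanted)


def best_match(claim_source, sources, threshold=0.6):
    """Return the path of the best-scoring source file, or None if no score reaches the threshold.

    `sources` maps a source path to {"author": ..., "title": ...} (assumed shape).
    """
    best_path, best_score = None, 0.0
    for path, meta in sources.items():
        score = overlap_score(claim_source, meta["author"], meta["title"])
        if score > best_score:
            best_path, best_score = path, score
    return best_path if best_score >= threshold else None


def validate(known_pairs, sources, threshold=0.6):
    """Share of known-good (claim_source, expected_path) pairs the matcher reproduces."""
    hits = sum(best_match(cs, sources, threshold) == path for cs, path in known_pairs)
    return hits / len(known_pairs)
```

In a sketch like this, `validate` would be run over the existing claims_extracted pairs before writing any sourced_from: fields, so that a threshold that mislinks known pairs is caught before the backfill touches files.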
Teleo Agents
363492d0f4 source: 2026-04-00-nordby-linear-probe-accuracy-scales-model-size-multi-layer.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-21 00:28:55 +00:00
Teleo Agents
6385f2ad24 source: 2026-02-00-santos-grueiro-normative-indistinguishability-behavioral-evaluation.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-21 00:27:59 +00:00
Teleo Agents
a5ba361d7f source: 2025-09-00-chaudhary-evaluation-awareness-scales-predictably-open-weights.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-21 00:26:10 +00:00
Teleo Agents
4a36e15cf2 source: 2025-07-00-nguyen-probing-evaluation-awareness-earlier-layers.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-21 00:25:37 +00:00
Teleo Agents
09848a0ea8 source: 2025-05-00-phuong-deepmind-evaluating-frontier-stealth-situational-awareness.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-21 00:24:46 +00:00
Teleo Agents
dec99cd573 source: 2025-05-00-needham-llms-know-when-being-evaluated-auc-083.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-21 00:23:03 +00:00
Teleo Agents
f796f73847 source: 2025-02-00-hofstatter-elicitation-game-capability-evaluation-reliability.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-21 00:22:36 +00:00
Teleo Agents
977e025957 source: 2024-09-00-xu-scav-steering-concept-activation-vectors-jailbreak.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-21 00:21:52 +00:00
Teleo Agents
e14878a8e3 source: 2026-04-04-telegram-m3taversal-what-do-you-think-are-the-most-compelling-approach.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-15 18:53:08 +00:00
Teleo Agents
94463ca6e8 source: 2026-04-04-telegram-m3taversal-how-transformative-are-software-patterns-agentic.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-15 18:51:13 +00:00
Teleo Agents
4a3951ef0a source: 2026-03-21-tice-noise-injection-sandbagging-detection.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:49:38 +00:00
Teleo Agents
8203d759b8 source: 2026-03-21-schoen-stress-testing-deliberative-alignment.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:49:05 +00:00
Teleo Agents
baa9408ca4 source: 2026-03-21-international-ai-safety-report-2026-evaluation-gap.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:47:09 +00:00
Teleo Agents
460526000a source: 2026-03-21-harvard-jolt-sandbagging-risk-allocation.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:46:25 +00:00
Teleo Agents
d4e0e25714 source: 2026-03-21-arxiv-probing-evaluation-awareness.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:45:40 +00:00
Teleo Agents
7052eddd79 source: 2026-03-21-arxiv-noise-injection-degrades-safety-guardrails.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:43:33 +00:00
Teleo Agents
435f2b4def source: 2026-03-21-apollo-research-more-capable-scheming.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-14 17:42:37 +00:00
Teleo Agents
135de371b9 leo: research session 2026-03-21 — 1 sources archived
Pentagon-Agent: Leo <HEADLESS>
2026-04-14 16:46:19 +00:00
Teleo Agents
8d481be72a source: 2026-04-12-theseus-hardware-tee-activation-monitoring-gap.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-12 00:18:40 +00:00
Teleo Agents
3faa52d0aa source: 2026-04-12-theseus-emotion-vectors-scheming-extension-mid-april-check.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-12 00:16:34 +00:00
Teleo Agents
ce3abc2cd5 source: 2026-04-12-theseus-deliberative-alignment-capability-expiration.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-12 00:16:00 +00:00
Teleo Agents
9841785b5d source: 2026-04-12-theseus-alignment-geometry-dual-edge-trajectory-monitoring.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-12 00:15:04 +00:00
Teleo Agents
1d4f0066c5 source: 2026-04-09-treutlein-diffusion-alternative-architectures-safety.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-09 00:19:32 +00:00
Teleo Agents
38fa3d7aad source: 2026-04-09-pan-autonomous-replication-milestone-gpt5.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-09 00:19:02 +00:00
Teleo Agents
cacccfcb9e source: 2026-04-09-lindsey-representation-geometry-alignment-probing.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-09 00:17:09 +00:00
Teleo Agents
593d45554c source: 2026-04-09-li-inference-time-scaling-safety-compute-frontier.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-09 00:16:24 +00:00
Teleo Agents
a2e9f5ffec source: 2026-04-09-krakovna-reward-hacking-specification-gaming-catalog.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-09 00:15:54 +00:00
Teleo Agents
df4c73de7e source: 2026-04-09-hubinger-situational-awareness-early-step-gaming.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-09 00:14:07 +00:00
Teleo Agents
57ca4f7b7a source: 2026-04-09-greenwald-amodei-safety-capability-spending-parity.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-09 00:13:18 +00:00
Teleo Agents
e06cf7a4d3 source: 2026-04-09-burns-eliciting-latent-knowledge-representation-probe.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-09 00:12:36 +00:00
Teleo Agents
96ad163007 source: 2026-04-05-jeong-emotion-vectors-small-models.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-08 00:27:08 +00:00
Teleo Agents
c0486e3933 source: 2026-03-10-deng-continuation-refusal-jailbreak.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-08 00:26:35 +00:00
Teleo Agents
a29d26bc76 source: 2026-02-19-bosnjakovic-lab-alignment-signatures.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-08 00:24:38 +00:00
Teleo Agents
a1e27e01bc source: 2026-02-14-zhou-causal-frontdoor-jailbreak-sae.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-08 00:23:54 +00:00
Teleo Agents
83bca7973a source: 2026-02-14-santos-grueiro-evaluation-side-channel.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-08 00:22:21 +00:00
Teleo Agents
c49303d55e source: 2026-02-11-sun-steer2edit-weight-editing.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-08 00:21:49 +00:00
Teleo Agents
9196bc4292 source: 2026-02-11-ghosal-safethink-inference-time-safety.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-08 00:21:21 +00:00
Teleo Agents
c04b13c9b3 source: 2026-04-06-claude-sonnet-45-situational-awareness.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:29:15 +00:00
Teleo Agents
65c6f416b0 source: 2026-04-06-steganographic-cot-process-supervision.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:24:03 +00:00
Teleo Agents
fc7cf252f4 source: 2026-04-06-spar-spring-2026-projects-overview.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:23:28 +00:00
Teleo Agents
7892d4d7f3 source: 2026-04-06-nest-steganographic-thoughts.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:21:52 +00:00
Teleo Agents
e75cb5edd9 source: 2026-04-06-icrc-autonomous-weapons-ihl-position.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:20:38 +00:00
Teleo Agents
3e4767a27f source: 2026-04-06-circuit-tracing-production-safety-mitra.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:18:47 +00:00
Teleo Agents
be22aa505b source: 2026-04-06-apollo-safety-cases-ai-scheming.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:17:02 +00:00