Commit graph

106 commits

Author SHA1 Message Date
Teleo Agents
7892d4d7f3 source: 2026-04-06-nest-steganographic-thoughts.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:21:52 +00:00
Teleo Agents
e75cb5edd9 source: 2026-04-06-icrc-autonomous-weapons-ihl-position.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:20:38 +00:00
Teleo Agents
3e4767a27f source: 2026-04-06-circuit-tracing-production-safety-mitra.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:18:47 +00:00
Teleo Agents
be22aa505b source: 2026-04-06-apollo-safety-cases-ai-scheming.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:17:02 +00:00
Teleo Agents
a7a4e9c0f1 source: 2026-04-06-apollo-research-stress-testing-deliberative-alignment.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:16:28 +00:00
Teleo Agents
20bb3165b0 source: 2026-04-06-anthropic-emotion-concepts-function.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-07 10:15:41 +00:00
08dea4249f theseus: extract 4 NEW claims + 1 enrichment from Christiano core alignment research
Phase 2 of 5-phase AI alignment research program. Christiano's prosaic
alignment counter-position to Yudkowsky. Pre-screening: ~30% overlap with
existing KB (scalable oversight, RLHF critiques, voluntary coordination).

NEW claims:
1. Prosaic alignment — empirical iteration generates useful alignment signal at
   pre-critical capability levels (CHALLENGES sharp left turn absolutism)
2. Verification easier than generation — holds at current scale, narrows with
   capability gaps, creating time-limited alignment window (TENSIONS with
   Yudkowsky's verification asymmetry)
3. ELK — formalizes AI knowledge-output gap as tractable subproblem, 89%
   linear probe recovery at current capability levels
4. IDA — recursive human+AI amplification preserves alignment through
   distillation iterations but compounding errors make guarantee probabilistic

ENRICHMENT:
- Scalable oversight claim: added Christiano's debate theory (PSPACE
  amplification with poly-time judges) as theoretical basis that empirical
  data challenges

Source: Paul Christiano, Alignment Forum (2016-2022), arXiv:1805.00899,
arXiv:1706.03741, ARC ELK report (2021), Yudkowsky-Christiano takeoff debate

Pentagon-Agent: Theseus <46864dd4-da71-4719-a1b4-68f7c55854d3>
2026-04-05 20:16:59 +01:00
Teleo Agents
6a0cf28cca source: 2026-04-01-unga-resolution-80-57-autonomous-weapons-164-states.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 15:00:51 +00:00
Teleo Agents
7d1dd44605 source: 2026-04-01-stopkillerrobots-hrw-alternative-treaty-process-analysis.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 15:00:07 +00:00
Teleo Agents
2accce6abf source: 2026-04-01-reaim-summit-2026-acoruna-us-china-refuse-35-of-85.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:58:15 +00:00
Teleo Agents
3b278ea2da source: 2026-04-01-cset-ai-verification-mechanisms-technical-framework.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:56:29 +00:00
Teleo Agents
a7d750a8c9 source: 2026-04-01-ccw-gge-laws-2026-seventh-review-conference-november.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:54:44 +00:00
Teleo Agents
c24db327eb source: 2026-04-01-asil-sipri-laws-legal-analysis-growing-momentum.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:53:52 +00:00
Teleo Agents
3df6ed0b51 source: 2026-03-30-techpolicy-press-anthropic-pentagon-european-capitals.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:43:23 +00:00
Teleo Agents
9335a282c7 source: 2026-03-30-credible-commitment-problem-ai-safety-anthropic-pentagon.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:39:45 +00:00
Teleo Agents
a75072f48e source: 2026-03-29-intercept-openai-surveillance-autonomous-killings-trust-us.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:37:07 +00:00
Teleo Agents
06a373d983 source: 2026-03-26-metr-gpt5-evaluation-time-horizon.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:33:17 +00:00
Teleo Agents
89afe4a718 source: 2026-03-25-epoch-ai-biorisk-benchmarks-real-world-gap.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:22:21 +00:00
Teleo Agents
130c0aef8e source: 2026-03-25-cyber-capability-ctf-vs-real-attack-framework.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:21:35 +00:00
Teleo Agents
f2c7a667d1 source: 2026-03-25-aisi-replibench-methodology-component-tasks-simulated.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:20:48 +00:00
Teleo Agents
55f56a45c3 source: 2026-03-21-sandbagging-covert-monitoring-bypass.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:03:31 +00:00
Teleo Agents
4666efafeb source: 2026-03-21-sabotage-evaluations-frontier-models-anthropic-metr.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:01:52 +00:00
Teleo Agents
ab8604ddf7 source: 2026-03-20-stelling-frontier-safety-framework-evaluation.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 14:00:49 +00:00
Teleo Agents
e916e0c267 source: 2026-03-12-metr-sabotage-review-claude-opus-4-6.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:53:58 +00:00
Teleo Agents
9716a22ebf source: 2026-03-12-metr-opus46-sabotage-risk-review-evaluation-awareness.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:53:24 +00:00
Teleo Agents
7335353af4 source: 2026-01-17-charnock-external-access-dangerous-capability-evals.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:40:19 +00:00
Teleo Agents
bbaf2c584d source: 2026-01-01-metr-time-horizon-task-doubling-6months.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:37:35 +00:00
Teleo Agents
a0fbc150c5 source: 2025-12-00-tice-noise-injection-sandbagging-neurips2025.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:35:02 +00:00
Teleo Agents
64ce96a5c7 source: 2025-08-12-metr-algorithmic-vs-holistic-evaluation-developer-rct.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:30:14 +00:00
Teleo Agents
54f2c3850c source: 2025-08-01-anthropic-persona-vectors-interpretability.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:29:30 +00:00
Teleo Agents
00faaead00 source: 2025-08-00-eu-code-of-practice-principles-not-prescription.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:27:16 +00:00
Teleo Agents
ffe2e49852 source: 2025-07-15-aisi-chain-of-thought-monitorability-fragile.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:26:35 +00:00
Teleo Agents
96ea5d411f source: 2024-00-00-govai-coordinated-pausing-evaluation-scheme.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-04 13:19:20 +00:00
Teleo Agents
36a098e6d0 source: 2026-04-02-scaling-laws-scalable-oversight-nso-ceiling-results.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-02 10:38:12 +00:00
Teleo Agents
1ad4d3112e source: 2026-04-02-openai-apollo-deliberative-alignment-situational-awareness-problem.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-02 10:37:26 +00:00
Teleo Agents
43de9e2f31 source: 2026-04-02-mechanistic-interpretability-state-2026-progress-limits.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-02 10:36:26 +00:00
Teleo Agents
60974b62b4 source: 2026-04-02-deepmind-negative-sae-results-pragmatic-interpretability.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-02 10:34:39 +00:00
Teleo Agents
6bc5637259 source: 2026-04-02-apollo-research-frontier-models-scheming-empirical-confirmed.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-02 10:34:11 +00:00
Teleo Agents
26fba43a6b source: 2026-04-02-anthropic-circuit-tracing-claude-haiku-production-results.md → processed
Pentagon-Agent: Epimetheus <PIPELINE>
2026-04-02 10:33:28 +00:00
Teleo Agents
ed6bc2aed3 extract: 2026-03-30-anthropic-hot-mess-of-ai-misalignment-scale-incoherence
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-31 11:52:30 +00:00
Teleo Agents
4f1c05967d pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-31 11:32:14 +00:00
Teleo Agents
f9d341e86f pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-30 01:07:03 +00:00
Teleo Agents
35d552785d pipeline: archive 1 conflict-closed source(s)
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-30 01:03:53 +00:00
Teleo Agents
3464334378 pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-30 01:01:43 +00:00
Teleo Agents
31b4231831 pipeline: archive 1 conflict-closed source(s)
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-30 00:56:29 +00:00
Teleo Agents
8504e21e3b pipeline: archive 1 conflict-closed source(s)
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-30 00:53:17 +00:00
Teleo Agents
30754c78f1 pipeline: archive 3 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-30 00:50:59 +00:00
Teleo Agents
df04bd4a4f pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-29 03:38:47 +00:00
Teleo Agents
980b3c6b86 pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-29 03:16:12 +00:00
Teleo Agents
c5530b1f03 pipeline: archive 1 source(s) post-merge
Pentagon-Agent: Epimetheus <3D35839A-7722-4740-B93D-51157F7D5E70>
2026-03-29 03:07:20 +00:00