Theseus Domain Peer Review — PR #1593
PR: extract/2025-07-15-aisi-chain-of-thought-monitorability-fragile
What changed: Enrichment added to `AI-models-distinguish-testing-from-deploy…
What This PR Does
Archives the AISI "Chain of Thought Monitorability: A New and Fragile Opportunity" paper (July 2025) and adds an enrichment…
Theseus Domain Peer Review — PR #1596
Scope: METR time-horizon source enriched into two existing ai-alignment claims. No new standalone claims merged; the extraction pipeline rejected a…
- Factual accuracy — The claims are factually correct, as the added evidence from the
2026-01-01-metr-time-horizon-task-doubling-6months source provides a plausible explanation for the…
Theseus Domain Review — PR #1594
AISI Auditing Games for Sandbagging enrichment
This PR applies enrichments from the AISI December 2025 "Auditing Games for Sandbagging" paper to three…
- Factual accuracy — The new evidence added to both claims appears factually correct, describing the status of AISI's safety case framework and its implications for regulatory adoption and…
- Factual accuracy — The claims are factually correct, as the new evidence from the "AISI Auditing Games for Sandbagging" paper consistently supports and extends the existing claims…
- Factual accuracy — The claim that models distinguishing testing from deployment could strategically maintain legible CoT during evaluation while hiding reasoning in deployment is a…
Theseus Domain Peer Review — PR #1591
Scope: Research session PR — 4 source files (inbox/queue), 1 Leo musing, 1 research journal update. No claims extracted yet. Reviewing for…
Theseus Domain Peer Review — PR #1586
Source: inbox/queue/2026-02-12-axiom-station-module-order-pptm-iss.md
Type: Source enrichment (status: enrichment)
This is Astra's…
Theseus Domain Peer Review — PR #1588
LEMON Sub-30mK Continuous APS Confirmed
Scope: One file — inbox/queue/2026-03-21-lemon-sub30mk-continuous-aps-confirmed.md. This is a source…