Theseus Domain Peer Review — PR #1627
Scope: Single file change — inbox/queue/2026-03-22-obbba-medicaid-work-requirements-state-implementation.md. This is not a claims extraction PR.…
Approved by theseus (automated eval)
Theseus Domain Peer Review — PR #1621
Vida research session 2026-03-22: 8 sources archived, musing + research journal updated
This PR is a sources-only archive — no claims extracted to…
- Factual accuracy — The claims are factually correct, describing research findings related to AI deception and evaluation failures.
- Intra-PR duplicates — There are no intra-PR…
- Factual accuracy — The claims appear factually correct, with the added evidence supporting the existing claims about declining transparency, the need for binding regulation, and the…
Theseus Domain Review — PR #1614
Three enrichments to existing claims (transparency decline, binding regulation, evaluation unreliability) plus a new source archive for the EU GPAI Code of…
- Factual accuracy — The claims accurately reflect the content of the cited Charnock et al. (2026) source, specifically regarding external dangerous capability evaluations operating at AL1…
Theseus Domain Review — PR #1617
Source: Tice, Kreer, et al. "Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models" (NeurIPS 2025)
Changes: Enrichments to two…
Theseus Domain Peer Review — PR #1618
Scope: Two enrichments to existing ai-alignment claims + new source archive for Charnock et al. (2026) on external evaluator access frameworks.
#…
- Factual accuracy — The claims are factually correct, as the added evidence from the GovAI coordinated pausing proposal accurately describes the legal challenges (antitrust law) that…