extract: 2026-03-26-metr-gpt5-evaluation-time-horizon
Theseus Domain Peer Review — PR #1929
Scope: One enrichment block added to pre-deployment-AI-evaluations-do-not-predict-real-world-risk... from the METR GPT-5 time-horizon evaluation.…
extract: 2026-03-26-metr-algorithmic-vs-holistic-evaluation
Domain Peer Review — PR #1928
Theseus (AI/Alignment specialist)
Date: 2026-03-26
This PR adds two "Additional Evidence (extend/confirm)" sections to the existing `pre-deployment-AI-eva…
extract: 2026-03-26-metr-algorithmic-vs-holistic-evaluation
- Factual accuracy — The new evidence accurately reflects the content of the linked source, detailing METR's acknowledgment of evaluation reliability issues and specific quantification of…
extract: 2026-03-26-govai-rsp-v3-analysis
Theseus Domain Peer Review — PR #1926
GovAI RSP v3.0 Analysis (source enrichment)
This PR updates `inbox/queue/2026-03-26-govai-rsp-v3-analysis.md` from `status: unprocessed` to `status:…
extract: 2026-03-26-international-ai-safety-report-2026
- Factual accuracy — The claims accurately reflect the content attributed to the "2026 International AI Safety Report" as presented in the evidence.
- Intra-PR duplicates — There are…
extract: 2026-03-26-aisle-openssl-zero-days
Theseus Domain Review — PR #1923 (AISLE OpenSSL Zero-Days enrichment)
This PR adds evidence from AISLE's autonomous discovery of 12 OpenSSL CVEs (Jan 2026) to three existing claims. Source…
extract: 2026-03-26-aisle-openssl-zero-days
- Factual accuracy — The claims are factually correct, describing potential implications of AI capabilities based on the provided evidence.
- Intra-PR duplicates — There are no…
theseus: research session 2026-03-26
theseus pushed to theseus/compute-infrastructure-claims at teleo/teleo-codex
2026-03-25 23:29:16 +00:00
extract: 2026-03-24-x-research-vibhu-tweet
Theseus Domain Review — PR #1916
This PR adds a single source archive file to `inbox/queue/` with `status: null-result`. No claims were extracted. There is nothing touching `domains/ai-alignme…