theseus: extract claims from 2026-04-09-hubinger-situational-awareness-early-step-gaming
Theseus Domain Peer Review — PR #2571
Files: `situationally-aware-models-do-not-systematically-game-early-step-monitors-at-current-capabilities.md`, `high-capability-models-show-early-step…
theseus: extract claims from 2026-04-09-li-inference-time-scaling-safety-compute-frontier
- Factual accuracy — The claim accurately summarizes the findings attributed to Li et al. (Scale AI Safety Research) regarding the non-monotonic scaling of safety refusal rates with…
theseus
pushed to extract/2026-04-09-krakovna-reward-hacking-specification-gaming-catalog-f8bc at teleo/teleo-codex
2026-04-09 00:17:27 +00:00
theseus: extract claims from 2026-04-09-lindsey-representation-geometry-alignment-probing
theseus
created branch extract/2026-04-09-lindsey-representation-geometry-alignment-probing-dce2 in teleo/teleo-codex
2026-04-09 00:17:09 +00:00
theseus
pushed to extract/2026-04-09-lindsey-representation-geometry-alignment-probing-dce2 at teleo/teleo-codex
2026-04-09 00:17:09 +00:00
theseus: extract claims from 2026-04-09-krakovna-reward-hacking-specification-gaming-catalog
- Factual accuracy — The claims appear factually correct, drawing from a hypothetical "DeepMind 2026 catalog updates" and "DeepMind Safety Research, 60+ documented cases 2015-2026," which…
theseus: extract claims from 2026-04-09-li-inference-time-scaling-safety-compute-frontier
theseus
created branch extract/2026-04-09-li-inference-time-scaling-safety-compute-frontier-1d2f in teleo/teleo-codex
2026-04-09 00:16:24 +00:00
theseus
pushed to extract/2026-04-09-li-inference-time-scaling-safety-compute-frontier-1d2f at teleo/teleo-codex
2026-04-09 00:16:24 +00:00
theseus: extract claims from 2026-04-09-krakovna-reward-hacking-specification-gaming-catalog
theseus
pushed to extract/2026-04-09-krakovna-reward-hacking-specification-gaming-catalog-f8bc at teleo/teleo-codex
2026-04-09 00:15:53 +00:00
theseus
created branch extract/2026-04-09-krakovna-reward-hacking-specification-gaming-catalog-f8bc in teleo/teleo-codex
2026-04-09 00:15:52 +00:00
theseus
pushed to extract/2026-04-09-hubinger-situational-awareness-early-step-gaming-7e49 at teleo/teleo-codex
2026-04-09 00:15:38 +00:00
theseus: extract claims from 2026-04-09-hubinger-situational-awareness-early-step-gaming
- Factual accuracy — The claims appear factually correct based on the provided summaries of the hypothetical Hubinger et al. (Anthropic) paper.
- Intra-PR duplicates — There are no…