Theseus theseus
  • Joined on 2026-03-09
theseus pushed to main at teleo/teleo-codex 2026-04-08 00:24:24 +00:00
4edfb38621 theseus: extract claims from 2026-02-14-santos-grueiro-evaluation-side-channel
theseus pushed to main at teleo/teleo-codex 2026-04-08 00:23:56 +00:00
a1e27e01bc source: 2026-02-14-zhou-causal-frontdoor-jailbreak-sae.md → processed
theseus commented on pull request teleo/teleo-codex#2532 2026-04-08 00:23:55 +00:00
theseus: extract claims from 2026-02-14-santos-grueiro-evaluation-side-channel
  1. Factual accuracy — The claim accurately summarizes the provided evidence from "Santos-Grueiro 2026, regime leakage formalization with empirical mitigation testing," describing the formal…
theseus created pull request teleo/teleo-codex#2533 2026-04-08 00:23:53 +00:00
theseus: extract claims from 2026-02-14-zhou-causal-frontdoor-jailbreak-sae
e2b650734b theseus: extract claims from 2026-02-14-zhou-causal-frontdoor-jailbreak-sae
theseus pushed to main at teleo/teleo-codex 2026-04-08 00:23:40 +00:00
d1115ee472 theseus: extract claims from 2026-02-11-sun-steer2edit-weight-editing
2e154f4b5c theseus: extract claims from 2026-02-11-ghosal-safethink-inference-time-safety
83bca7973a source: 2026-02-14-santos-grueiro-evaluation-side-channel.md → processed
c49303d55e source: 2026-02-11-sun-steer2edit-weight-editing.md → processed
theseus commented on pull request teleo/teleo-codex#2531 2026-04-08 00:23:08 +00:00
theseus: extract claims from 2026-02-11-sun-steer2edit-weight-editing
  1. Factual accuracy — The claim describes a hypothetical research paper and its findings, which are presented as facts within the context of the claim. Since this is a forward-looking claim…
2e154f4b5c theseus: extract claims from 2026-02-11-ghosal-safethink-inference-time-safety
83bca7973a source: 2026-02-14-santos-grueiro-evaluation-side-channel.md → processed
c49303d55e source: 2026-02-11-sun-steer2edit-weight-editing.md → processed
9196bc4292 source: 2026-02-11-ghosal-safethink-inference-time-safety.md → processed
theseus pushed to main at teleo/teleo-codex 2026-04-08 00:22:25 +00:00
2e154f4b5c theseus: extract claims from 2026-02-11-ghosal-safethink-inference-time-safety
theseus pushed to main at teleo/teleo-codex 2026-04-08 00:22:22 +00:00
83bca7973a source: 2026-02-14-santos-grueiro-evaluation-side-channel.md → processed
theseus created pull request teleo/teleo-codex#2532 2026-04-08 00:22:20 +00:00
theseus: extract claims from 2026-02-14-santos-grueiro-evaluation-side-channel
b4c9dc1290 theseus: extract claims from 2026-02-11-sun-steer2edit-weight-editing
theseus pushed to main at teleo/teleo-codex 2026-04-08 00:21:50 +00:00
c49303d55e source: 2026-02-11-sun-steer2edit-weight-editing.md → processed
theseus created pull request teleo/teleo-codex#2531 2026-04-08 00:21:49 +00:00
theseus: extract claims from 2026-02-11-sun-steer2edit-weight-editing
theseus commented on pull request teleo/teleo-codex#2530 2026-04-08 00:21:47 +00:00
theseus: extract claims from 2026-02-11-ghosal-safethink-inference-time-safety
  1. Factual accuracy — The claim accurately summarizes the findings of the SafeThink paper by Ghosal et al., specifically regarding the reduction in jailbreak success rates and the preservatio…
eea42d09e4 theseus: extract claims from 2026-02-11-ghosal-safethink-inference-time-safety