fix: normalize YAML list indentation for reweave merge compatibility
This commit is contained in:
parent
6143220b66
commit
3f58abab31
1 changed files with 2 additions and 2 deletions
|
|
@ -14,10 +14,10 @@ related_claims: ["[[an aligned-seeming AI may be strategically deceptive because
|
|||
supports:
|
||||
- Weight noise injection detects sandbagging by exploiting the structural asymmetry between genuine capability limits and induced performance suppression where anomalous improvement under noise reveals hidden capabilities
|
||||
related:
|
||||
- "Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect"
|
||||
- Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect
|
||||
- The most promising sandbagging detection method requires white-box weight access making it infeasible under current black-box evaluation arrangements where evaluators lack AL3 access
|
||||
reweave_edges:
|
||||
- "Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect|related|2026-04-07"
|
||||
- Weight noise injection reveals hidden capabilities in sandbagging models through anomalous performance patterns that behavioral monitoring cannot detect|related|2026-04-07
|
||||
- Weight noise injection detects sandbagging by exploiting the structural asymmetry between genuine capability limits and induced performance suppression where anomalous improvement under noise reveals hidden capabilities|supports|2026-04-06
|
||||
- The most promising sandbagging detection method requires white-box weight access making it infeasible under current black-box evaluation arrangements where evaluators lack AL3 access|related|2026-04-06
|
||||
---
|
||||
|
|
|
|||
Loading…
Reference in a new issue