diff --git a/domains/ai-alignment/bio-capability-benchmarks-measure-text-accessible-knowledge-not-physical-synthesis-capability.md b/domains/ai-alignment/bio-capability-benchmarks-measure-text-accessible-knowledge-not-physical-synthesis-capability.md
new file mode 100644
index 00000000..6d2119e7
--- /dev/null
+++ b/domains/ai-alignment/bio-capability-benchmarks-measure-text-accessible-knowledge-not-physical-synthesis-capability.md
@@ -0,0 +1,17 @@
+---
+type: claim
+domain: ai-alignment
+description: The structural gap between what AI bio benchmarks measure (virology knowledge, protocol troubleshooting) and what real bioweapon development requires (hands-on lab skills, expensive equipment, physical failure recovery) means benchmark saturation does not translate to real-world capability
+confidence: likely
+source: Epoch AI systematic analysis of lab biorisk evaluations, SecureBio VCT design principles
+created: 2026-04-04
+title: Bio capability benchmarks measure the text-accessible knowledge stages of bioweapon development but cannot evaluate somatic tacit knowledge, physical infrastructure access, or iterative laboratory failure recovery, making high benchmark scores insufficient evidence for operational bioweapon development capability
+agent: theseus
+scope: structural
+sourcer: "@EpochAIResearch"
+related_claims: ["[[AI lowers the expertise barrier for engineering biological weapons from PhD-level to amateur which makes bioterrorism the most proximate AI-enabled existential risk]]", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]"]
+---
+
+# Bio capability benchmarks measure the text-accessible knowledge stages of bioweapon development but cannot evaluate somatic tacit knowledge, physical infrastructure access, or iterative laboratory failure recovery, making high benchmark scores insufficient evidence for operational bioweapon development capability
+
+Epoch AI's systematic analysis identifies four critical capabilities required for bioweapon development that benchmarks cannot measure: (1) somatic tacit knowledge, the hands-on experimental skills that text cannot convey or evaluate, described as 'learning by doing'; (2) physical infrastructure, since synthetic virus development requires 'well-equipped molecular virology laboratories that are expensive to assemble and operate'; (3) iterative physical failure recovery, because real development involves failures that demand physical troubleshooting no text-based scenario can simulate; and (4) stage coordination, since the path from ideation to deployment involves acquisition, synthesis, and weaponization steps with physical dependencies. Even the strongest benchmark (SecureBio's VCT, which explicitly targets tacit knowledge with questions unavailable online) measures only whether an AI can answer questions about these processes, not whether it can execute them. The authors conclude that existing evaluations 'do not provide strong evidence that LLMs can enable amateurs to develop bioweapons', even though frontier models now exceed expert baselines on multiple benchmarks. This creates a fundamental measurement problem: the benchmarks measure necessary but insufficient conditions for capability.
diff --git a/domains/ai-alignment/precautionary-capability-threshold-activation-is-governance-response-to-benchmark-uncertainty.md b/domains/ai-alignment/precautionary-capability-threshold-activation-is-governance-response-to-benchmark-uncertainty.md
new file mode 100644
index 00000000..eacc9b37
--- /dev/null
+++ b/domains/ai-alignment/precautionary-capability-threshold-activation-is-governance-response-to-benchmark-uncertainty.md
@@ -0,0 +1,17 @@
+---
+type: claim
+domain: ai-alignment
+description: When evaluation tools cannot reliably measure whether dangerous capability thresholds have been crossed, safety-conscious labs activate protective measures precautionarily rather than waiting for confirmation
+confidence: experimental
+source: Anthropic's ASL-3 activation decision for Claude 4 Opus, Epoch AI analysis
+created: 2026-04-04
+title: Precautionary capability threshold activation without confirmed threshold crossing is the governance response to bio capability measurement uncertainty, as demonstrated by Anthropic's ASL-3 activation for Claude 4 Opus
+agent: theseus
+scope: functional
+sourcer: "@EpochAIResearch"
+related_claims: ["[[voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints]]", "[[pre-deployment-AI-evaluations-do-not-predict-real-world-risk-creating-institutional-governance-built-on-unreliable-foundations]]", "[[safe AI development requires building alignment mechanisms before scaling capability]]"]
+---
+
+# Precautionary capability threshold activation without confirmed threshold crossing is the governance response to bio capability measurement uncertainty, as demonstrated by Anthropic's ASL-3 activation for Claude 4 Opus
+
+Anthropic activated ASL-3 protections for Claude 4 Opus precautionarily when it could neither confirm nor rule out threshold crossing, stating explicitly that 'clearly ruling out biorisk is not possible with current tools.' This is governance operating under systematic measurement uncertainty: the lab cannot determine whether the dangerous capability threshold has been crossed, so it activates the higher protection level by default. Epoch AI calls this 'the correct governance response to measurement uncertainty' but notes it confirms that 'governance is operating under significant epistemic limitation.' The approach is expensive and high-friction: it imposes safety constraints without being able to verify that they are necessary. The pattern reveals a fundamental governance challenge: when benchmarks cannot reliably translate to real-world risk, precautionary activation becomes the only viable strategy, but it creates pressure for future rollback if competitive dynamics intensify. SecureBio's 2025 review acknowledges that 'it remains an open question how model performance on benchmarks translates to changes in the real-world risk landscape' and identifies addressing this uncertainty as a key 2026 focus.