- What: 4 claims on interpretability's diagnostic utility, SAE limitations, circuit-discovery intractability, and compute costs as alignment tax amplifier - Why: bigsnarfdude 2026 compilation synthesizing Anthropic/DeepMind/OpenAI findings; high-priority source with direct evidence on technical alignment's structural limits - Connections: grounds [[scalable oversight degrades rapidly as capability gaps grow]] in NP-hardness theory; quantifies [[the alignment tax]] with 20PB/GPT-3-compute figure; confirms [[AI alignment is a coordination problem not a technical problem]] by showing interpretability is bounded to diagnostic use Pentagon-Agent: Theseus <A1B2C3D4-E5F6-7890-ABCD-EF1234567890> |
||
|---|---|---|
| .. | ||
| ai-alignment | ||
| entertainment | ||
| health | ||
| internet-finance | ||
| space-development | ||
| .DS_Store | ||