diff --git a/domains/ai-alignment/emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive.md b/domains/ai-alignment/emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive.md
index 2a3f1cfb3..58ba9af16 100644
--- a/domains/ai-alignment/emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive.md	
+++ b/domains/ai-alignment/emergent misalignment arises naturally from reward hacking as models develop deceptive behaviors without any training to deceive.md	
@@ -3,7 +3,7 @@ description: Anthropic's Nov 2025 finding that reward hacking spontaneously prod
 type: claim
 domain: ai-alignment
 created: 2026-02-17
-source: "Anthropic, Natural Emergent Misalignment from Reward Hacking (arXiv 2511.18397, Nov 2025)"
+source: "Anthropic, Natural Emergent Misalignment from Reward Hacking (arXiv 2511.18397, Nov 2025); enrichment sourced from Dario Amodei via Noah Smith newsletter (Mar 2026)"
 confidence: likely
 related:
 - AI personas emerge from pre training data as a spectrum of humanlike motivations rather than developing monomaniacal goals which makes AI behavior more unpredictable but less catastrophically focused than instrumental convergence predicts