---
type: source
title: "Against the Manhattan Project Framing of AI Alignment"
author: "Simon Friederich, Leonard Dung"
url: https://onlinelibrary.wiley.com/doi/10.1111/mila.12548
date: 2026-01-01
domain: ai-alignment
secondary_domains: []
format: paper
status: unprocessed
priority: medium
tags: [alignment-framing, Manhattan-project, operationalization, philosophical, AI-safety]
---
## Content
Published in Mind & Language (2026). Core argument: AI companies frame alignment as a clear, well-delineated, unified scientific problem solvable within years (a "Manhattan project"), but this framing is flawed across five dimensions:

1. Alignment is NOT binary: it's not a yes/no achievement.
2. Alignment is NOT a natural kind: it's not a single unified phenomenon.
3. Alignment is NOT mainly technical-scientific: it has irreducible social/political dimensions.
4. Alignment is NOT realistically achievable as a one-shot solution.
5. Alignment is NOT clearly operationalizable: it's "probably impossible to operationalize AI alignment in such a way that solving the alignment problem and implementing the solution would be sufficient to rule out AI takeover."

The paper argues the Manhattan project framing "may bias societal discourse and decision-making towards faster AI development and deployment than is responsible."

Note: Full text paywalled. Summary based on abstract, search results, and related discussion.
## Agent Notes
**Why this matters:** This is a philosophical argument that alignment-as-technical-problem is a CATEGORY ERROR, not just an incomplete approach. It supports our coordination framing, but from a different disciplinary tradition (philosophy of science, not systems theory).

**What surprised me:** The claim that operationalization itself is impossible: not just difficult, but impossible to define alignment such that solving it would be sufficient to rule out AI takeover. This is a stronger claim than I make.

**What I expected but didn't find:** Full text inaccessible, so I can't evaluate the specific arguments in depth. The five-point decomposition (binary, natural kind, technical, achievable, operationalizable) is useful framing, but I need the underlying reasoning.

**KB connections:**

- [[AI alignment is a coordination problem not a technical problem]]: philosophical support from a different tradition
- [[the specification trap means any values encoded at training time become structurally unstable]]: related to the operationalization-impossibility argument
- [[some disagreements are permanently irreducible]]: supports the "alignment is not binary" claim

**Extraction hints:** The five-point decomposition of the Manhattan project framing is a potential claim: "The Manhattan project framing of alignment assumes binary, natural-kind, technical, achievable, and operationalizable properties that alignment likely lacks."

**Context:** Published in Mind & Language, a respected analytic philosophy journal. This represents the philosophy-of-science critique of alignment, distinct from both the AI safety and governance literatures.
## Curator Notes (structured handoff for extractor)
PRIMARY CONNECTION: [[AI alignment is a coordination problem not a technical problem]]

WHY ARCHIVED: Provides a philosophical argument that alignment cannot be a purely technical problem; it fails to be binary, operationalizable, or achievable as a one-shot solution.

EXTRACTION HINT: The five-point decomposition is the extraction target. Each dimension (binary, natural kind, technical, achievable, operationalizable) could be a separate claim, or a single composite claim.