teleo-codex/inbox/archive/2026-03-09-simonw-x-archive.md

---
type: source
title: "@simonw X archive — 100 most recent tweets"
author: "Simon Willison (@simonw)"
url: https://x.com/simonw
date: 2026-03-09
domain: ai-alignment
format: tweet
status: processed
processed_by: theseus
processed_date: 2026-03-09
claims_extracted:
  - "agent-generated code creates cognitive debt that compounds when developers cannot understand what was produced on their behalf"
  - "coding agents cannot take accountability for mistakes which means humans must retain decision authority over security and critical systems regardless of agent capability"
enrichments: []
tags: [agentic-engineering, cognitive-debt, security, accountability, coding-agents, open-source-licensing]
linked_set: theseus-x-collab-taxonomy-2026-03
curator_notes: |
  25 relevant tweets out of 60 unique. Willison is writing a systematic "Agentic Engineering
  Patterns" guide and tweeting chapter releases. The strongest contributions are conceptual
  frameworks: cognitive debt, the accountability gap, and agents-as-mixed-ability-teams.
  He is the most careful voice on AI safety/governance in this batch: a strong
  anti-anthropomorphism position, prompt injection framed as an LLM-specific vulnerability,
  and alarm about agents circumventing open source licensing. Zero hype, all substance,
  consistent with his reputation.
---

# @simonw X Archive (Feb 26 – Mar 9, 2026)

## Key Tweets by Theme

### Agentic Engineering Patterns (Guide Chapters)

- **Cognitive debt** (status/2027885000432259567, 1,261 likes): "New chapter of my Agentic Engineering Patterns guide. This one is about having coding agents build custom interactive and animated explanations to help fight back against cognitive debt."

- **Anti-pattern: unreviewed code on collaborators** (status/2029260505324412954, 761 likes): "I started a new chapter of my Agentic Engineering Patterns guide about anti-patterns [...] Inflicting unreviewed code on collaborators, aka dumping a thousand line PR without even making sure it works first."

- **Hoard things you know how to do** (status/2027130136987086905, 814 likes): "Today's chapter of Agentic Engineering Patterns is some good general career advice which happens to also help when working with coding agents: Hoard things you know how to do."

- **Agentic manual testing** (status/2029962824731275718, 371 likes): "New chapter: Agentic manual testing - about how having agents 'manually' try out code is a useful way to help them spot issues that might not have been caught by their automated tests."

### Security as the Critical Lens

- **Security teams are the experts we need** (status/2028838538825924803, 698 likes): "The people I want to hear from right now are the security teams at large companies who have to try and keep systems secure when dozens of teams of engineers of varying levels of experience are constantly shipping new features."

- **Security is the most interesting lens** (status/2028840346617065573, 70 likes): "I feel like security is the most interesting lens to look at this from. Most bad code problems are survivable [...] Security problems are much more directly harmful to the organization."

- **Accountability gap** (status/2028841504601444397, 84 likes): "Coding agents can't take accountability for their mistakes. Eventually you want someone whose job is on the line to be making decisions about things as important as securing the system."

- **Agents as mixed-ability engineering teams** (status/2028838854057226246, 99 likes): "Shipping code of varying quality and varying levels of review isn't a new problem [...] At this point maybe we treat coding agents like teams of mixed ability engineers working under aggressive deadlines."

- **Tests offset lower code quality** (status/2028846376952492054, 1 like): "agents make test coverage so much cheaper that I'm willing to tolerate lower quality code from them as long as it's properly tested. Tests don't solve security though!"

### AI Safety / Governance

- **Prompt injection is LLM-specific** (status/2030806416907448444, 3 likes): "No, it's an LLM problem - LLMs provide attackers with a human language interface that they can use to trick the model into making tool calls that act against the interests of their users. Most software doesn't have that."

- **Nobody knows how to build safe digital assistants** (status/2029539116166095019, 2 likes): "I don't use it myself because I don't know how to use it safely. [...] The challenge now is to figure out how to deliver one that's safe by default. No one knows how to do that yet."

- **Anti-anthropomorphism** (status/2027128593839722833, 4 likes): "Not using language like 'Opus 3 enthusiastically agreed' in a tweet seen by a million people would be good."

- **LLMs have zero moral status** (status/2027127449583292625, 32 likes): "I can run these things in my laptop. They're a big stack of matrix arithmetic that is reset back to zero every time I start a new prompt. I do not think they warrant any moral consideration at all."

### Open Source Licensing Disruption

- **Agents as reverse engineering machines** (status/2029729939285504262, 39 likes): "It breaks pretty much ALL licenses, even commercial software. These coding agents are reverse engineering / clean room implementing machines."

- **chardet clean-room rewrite controversy** (status/2029600918912553111, 308 likes): "The chardet open source library relicensed from LGPL to MIT two days ago thanks to a Claude Code assisted 'clean room' rewrite - but original author Mark Pilgrim is disputing that the way this was done justifies the change in license."

- **Threats to open source** (status/2029958835130225081, 2 likes): "This is one of the 'threats to open source' I find most credible - we've built the entire community on decades of licensing which can now be subverted by a coding agent running for a few hours."

### Capability Observations

- **Qwen 3.5 4B vs GPT-4o** (status/2030067107371831757, 565 likes): "Qwen3.5 4B apparently out-scores GPT-4o on some of the classic benchmarks (!)"

- **Benchmark gaming suspicion** (status/2030139125656080876, 68 likes): "Given the enormous size difference in terms of parameters this does make me suspicious that Qwen may have been training to the test on some of these."

- **AI hiring criteria** (status/2030974722029339082, 5 likes): A poll asking whether experience with AI coding tools features in developer interviews.

## Filtered Out

~35 tweets: art museum visit, Google account bans, Qwen team resignations (news relay), chardet licensing details, casual replies.