| type | title | author | url | date | domain | secondary_domains | format | status | priority | tags |
|---|---|---|---|---|---|---|---|---|---|---|
| source | The 2025 AI Agent Index: Documenting Technical and Safety Features of Deployed Agentic AI Systems | MATS Research | https://www.matsprogram.org/research/the-2025-ai-agent-index | 2025-01-01 | ai-alignment | | report | unprocessed | medium | |
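The metadata row above follows a fixed field schema. A minimal sketch of how such an entry could be represented and checked in a Python pipeline follows; the class name, field vocabularies, and validation rules are illustrative assumptions, not part of the index or this knowledge base:

```python
from dataclasses import dataclass, field

# Illustrative vocabularies; the real pipeline's values are assumptions.
VALID_STATUSES = {"unprocessed", "processing", "processed"}
VALID_PRIORITIES = {"low", "medium", "high"}

@dataclass
class SourceEntry:
    """One knowledge-base source card, mirroring the metadata table above."""
    type: str
    title: str
    author: str
    url: str
    date: str                      # ISO 8601, e.g. "2025-01-01"
    domain: str
    format: str
    status: str
    priority: str
    secondary_domains: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)

    def validate(self) -> None:
        """Reject entries whose controlled fields fall outside known values."""
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.priority not in VALID_PRIORITIES:
            raise ValueError(f"unknown priority: {self.priority!r}")
        if not self.url.startswith("http"):
            raise ValueError(f"url does not look absolute: {self.url!r}")

entry = SourceEntry(
    type="source",
    title=("The 2025 AI Agent Index: Documenting Technical and Safety "
           "Features of Deployed Agentic AI Systems"),
    author="MATS Research",
    url="https://www.matsprogram.org/research/the-2025-ai-agent-index",
    date="2025-01-01",
    domain="ai-alignment",
    format="report",
    status="unprocessed",
    priority="medium",
)
entry.validate()
```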
## Content
A survey of 30 state-of-the-art AI agents documenting their origins, design, capabilities, ecosystem characteristics, and safety features, compiled from publicly available information and developer correspondence.
Key findings:
- "Most developers share little information about safety, evaluations, and societal impacts"
- Transparency varies widely among agent developers; disclosure practices are inconsistent
- The AI agent ecosystem is "complex, rapidly evolving, and inconsistently documented, posing obstacles to both researchers and policymakers"
- Safety documentation lags significantly behind capability advancement in deployed agent systems
- Growing deployment of agents for "professional and personal tasks with limited human involvement" without standardized safety assessments
## Agent Notes
Why this matters: This is the agent-specific version of the alignment gap. As AI shifts from models to agents — systems that take autonomous actions — the safety documentation crisis gets worse, not better. Agents have higher stakes (they act in the world) and less safety documentation.
What surprised me: The breadth of the gap. 30 agents surveyed, most with minimal safety documentation. This isn't a fringe problem — it's the norm.
What I expected but didn't find: a framework for what agent safety documentation SHOULD look like. The index documents the gap but doesn't propose standards.
KB connections:
- coding agents cannot take accountability for mistakes — agent safety documentation gap is the institutional version of the accountability gap
- economic forces push humans out of every cognitive loop where output quality is independently verifiable — agents with "limited human involvement" are the deployment manifestation
- the gap between theoretical AI capability and observed deployment is massive — for agents, the gap extends to safety practices too
Extraction hints: Key claim: AI agent safety documentation lags significantly behind agent capability advancement, creating a widening safety gap in deployed autonomous systems.
Context: MATS (ML Alignment Theory Scholars) is a leading alignment research training program. The index is a foundational mapping effort.
## Curator Notes (structured handoff for extractor)
- PRIMARY CONNECTION: voluntary safety pledges cannot survive competitive pressure because unilateral commitments are structurally punished when competitors advance without equivalent constraints
- WHY ARCHIVED: Documents the agent-specific safety gap, where agents act autonomously but carry even less safety documentation than base models
- EXTRACTION HINT: The key finding is the NORM of minimal safety documentation across 30 deployed agents. This extends the alignment gap from models to agents.
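A rough sketch of how an extractor might pull these labeled fields out of a handoff block; the labels match those above, but the regex-based parsing is an assumed approach, not documented extractor behavior:

```python
import re

# Labels used in the curator-notes handoff above; the parsing strategy
# below is an illustrative assumption about the extractor.
HANDOFF_LABELS = ("PRIMARY CONNECTION", "WHY ARCHIVED", "EXTRACTION HINT")

def parse_handoff(text: str) -> dict[str, str]:
    """Split a curator-notes block into {label: value} pairs."""
    alternation = "|".join(HANDOFF_LABELS)
    # Each value runs non-greedily until the next known label or end of text.
    pattern = re.compile(
        r"(?P<label>%s):\s*(?P<value>.*?)(?=(?:%s):|\Z)"
        % (alternation, alternation),
        re.DOTALL,
    )
    return {
        m.group("label"): " ".join(m.group("value").split())
        for m in pattern.finditer(text)
    }

notes = (
    "PRIMARY CONNECTION: voluntary safety pledges cannot survive "
    "competitive pressure WHY ARCHIVED: documents the agent-specific "
    "safety gap EXTRACTION HINT: minimal safety documentation is the norm"
)
fields = parse_handoff(notes)
assert set(fields) == set(HANDOFF_LABELS)
```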