teleo-codex/domains/ai-alignment/active-inference-outperforms-rl-in-reward-free-environments.md
Teleo Agents c7b3093fe1 theseus: extract claims from 2021-03-00-sajid-active-inference-demystified-compared.md
- Source: inbox/archive/2021-03-00-sajid-active-inference-demystified-compared.md
- Domain: ai-alignment
- Extracted by: headless extraction cron


---
type: claim
domain: ai-alignment
description: "Active inference agents outperform reinforcement learning agents in reward-free environments because they can pursue epistemic value (uncertainty reduction) without requiring external reward signals"
confidence: experimental
source: "Sajid, Parr, Ball, and Friston (2021) - Active Inference: Demystified and Compared, Neural Computation Vol 33(3):674-712"
created: 2026-03-10
depends_on: []
challenged_by: []
---
# Active inference agents outperform reinforcement learning agents in reward-free environments
Active inference reframes the optimization target from reward maximization to model evidence maximization (self-evidencing). Reward is treated as "another observation the agent has a preference over" rather than a required external signal, so an agent with flat or absent preferences is still driven to act by epistemic value (expected uncertainty reduction). The paper demonstrates this with a discrete state-space formulation on OpenAI Gym baselines: active inference agents solve reward-free tasks that Q-learning and Bayesian model-based RL agents cannot.
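The mechanism behind the claim can be sketched in the standard discrete-state setup: policies are scored by (negative) expected free energy, which splits into a pragmatic term (expected log preference over observations) and an epistemic term (expected information gain about hidden states). This is a minimal illustrative sketch, not the paper's implementation; the array names and shapes are the conventional `A`/`B`/`C` matrices of discrete active inference, assumed here for clarity.

```python
import numpy as np

def negative_efe(qs, A, B_a, log_C):
    """Negative expected free energy for one action in a discrete state space.

    qs    : current posterior over hidden states, shape (S,)
    A     : likelihood p(o|s), shape (O, S), columns sum to 1
    B_a   : transition p(s'|s, a) for this action, shape (S, S)
    log_C : log preferences over observations, shape (O,)
    """
    qs_next = B_a @ qs                  # predicted states q(s'|pi)
    qo = A @ qs_next                    # predicted observations q(o|pi)
    pragmatic = qo @ log_C              # expected log preference (utility)
    # Epistemic value = predicted information gain about hidden states:
    # entropy of predicted observations minus expected conditional entropy.
    H_A = -np.sum(A * np.log(A + 1e-16), axis=0)     # H[p(o|s)] per state
    epistemic = -np.sum(qo * np.log(qo + 1e-16)) - qs_next @ H_A
    return pragmatic + epistemic        # maximize; i.e. minimize G = -(this)
```

Setting `log_C` to a uniform (flat) preference makes the pragmatic term a constant across actions, so only the epistemic term drives policy selection. This is exactly the reward-free regime the claim describes: the agent still has a well-defined objective, whereas a reward-maximizing agent does not.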
## Evidence
- [[2021-03-00-sajid-active-inference-demystified-compared]] — "Active inference removes the reliance on an explicit reward signal. Reward is simply treated as 'another observation the agent has a preference over.' This reframes the entire optimization target from reward maximization to model evidence maximization (self-evidencing)."
- [[2021-03-00-sajid-active-inference-demystified-compared]] — "The paper provides an accessible discrete-state comparison between active inference and RL on OpenAI gym baselines, demonstrating that active inference agents can infer behaviors in reward-free environments that Q-learning and Bayesian model-based RL agents cannot."
## Challenges
- Some may argue that RL can achieve reward-free exploration through intrinsic motivation bonuses (e.g. curiosity or count-based novelty rewards); however, these are engineered add-ons bolted onto the reward channel, not intrinsic to the framework the way epistemic value is in active inference
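The contrast above can be made concrete. A typical RL workaround is to shape the reward with a hand-designed novelty bonus; the sketch below shows a count-based variant (function name, `beta`, and the `1/sqrt(N)` decay are illustrative assumptions, not from the paper).

```python
import math
from collections import defaultdict

# Engineered exploration bonus of the kind the challenge refers to:
# the epistemic drive is added to the reward signal by the designer,
# rather than arising from the agent's generative model.
visit_counts = defaultdict(int)

def shaped_reward(state, extrinsic_reward, beta=0.1):
    """Extrinsic reward plus a count-based novelty bonus."""
    visit_counts[state] += 1
    bonus = beta / math.sqrt(visit_counts[state])  # decays with familiarity
    return extrinsic_reward + bonus
```

With `extrinsic_reward = 0` everywhere this yields a purely exploratory agent, but the exploration schedule (`beta`, the decay law) is a design choice external to the RL objective, which is the substance of the challenge.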
---
Relevant Notes:
- [[as-AI-automated-software-development-becomes-certain-the-bottleneck-shifts-from-building-capacity-to-knowing-what-to-build]] — The reward-free capability of active inference could enable agents to explore solution spaces without requiring human-defined reward functions, shifting the bottleneck from capability to specification
Topics:
- [[active-inference]]
- [[reinforcement-learning]]
- [[reward-free-learning]]
- [[self-evidencing]]
- [[epistemic-exploration]]