---
type: claim
domain: ai-alignment
description: "Active inference agents outperform reinforcement learning agents in reward-free environments because they can pursue epistemic value (uncertainty reduction) without requiring external reward signals"
confidence: experimental
source: "Sajid, Ball, Parr, and Friston (2021) - Active Inference: Demystified and Compared, Neural Computation 33(3):674-712"
created: 2026-03-10
depends_on: []
challenged_by: []
---

# Active inference agents outperform reinforcement learning agents in reward-free environments
Active inference reframes the optimization target from reward maximization to model evidence maximization (self-evidencing). Reward is treated as "another observation the agent has a preference over" rather than as a required external signal. Because the objective already combines preference satisfaction with epistemic value (uncertainty reduction), active inference agents can solve tasks in reward-free environments where Q-learning and Bayesian model-based RL agents fail. The paper demonstrates this on OpenAI Gym baselines using a discrete state-space formulation.
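
To make the objective concrete, here is a minimal sketch, assuming the standard discrete state-space formulation, of the expected free energy for a single future timestep under a policy. It is written in Python/NumPy and is not the paper's code; the names `A` (likelihood matrix), `log_C` (log preferences over observations), and `qs_pi` (predicted state distribution) are illustrative. Reward enters only through `log_C`, as one more preferred observation, while the epistemic term scores expected information gain about hidden states.

```python
import numpy as np

def expected_free_energy(qs_pi, A, log_C):
    """Expected free energy G for one future timestep under a policy.

    qs_pi : (n_states,)        predicted hidden-state distribution Q(s|pi)
    A     : (n_obs, n_states)  likelihood P(o|s)
    log_C : (n_obs,)           log preferences over observations; reward is
                               just another observation with high preference
    """
    qo_pi = A @ qs_pi  # predicted observation distribution Q(o|pi)

    # Extrinsic (pragmatic) value: expected log preference over outcomes.
    extrinsic = qo_pi @ log_C

    # Epistemic value: expected information gain about hidden states,
    # i.e. the mutual information H[Q(o|pi)] - E_Q(s)[H[P(o|s)]].
    obs_entropy = -qo_pi @ np.log(qo_pi + 1e-16)
    ambiguity = qs_pi @ (-np.sum(A * np.log(A + 1e-16), axis=0))
    epistemic = obs_entropy - ambiguity

    # G is minimized, so both value terms enter with a negative sign.
    return -(extrinsic + epistemic)
```

With a flat `log_C` (no preferred outcomes, hence no reward signal), the extrinsic term is constant across policies and `G` ranks policies purely by how much uncertainty they resolve, which is the mechanism behind the reward-free behaviour.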

## Evidence

- [[2021-03-00-sajid-active-inference-demystified-compared]] — "Active inference removes the reliance on an explicit reward signal. Reward is simply treated as 'another observation the agent has a preference over.' This reframes the entire optimization target from reward maximization to model evidence maximization (self-evidencing)."
- [[2021-03-00-sajid-active-inference-demystified-compared]] — "The paper provides an accessible discrete-state comparison between active inference and RL on OpenAI gym baselines, demonstrating that active inference agents can infer behaviors in reward-free environments that Q-learning and Bayesian model-based RL agents cannot."

## Challenges

- Some may argue that RL can achieve reward-free exploration through intrinsic motivation bonuses; however, these are engineered add-ons injected through the reward channel rather than consequences of the framework's own objective (see the sketch below)
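
For contrast, a minimal sketch of how exploration is typically engineered into tabular Q-learning via a count-based novelty bonus; the function name and the `beta` coefficient are illustrative, not from the paper. The exploratory drive is an extra term the designer injects into the reward channel, not something the objective itself supplies.

```python
import numpy as np

def q_update(Q, counts, s, a, r_ext, s_next,
             alpha=0.1, gamma=0.99, beta=0.05):
    """One tabular Q-learning step with an engineered exploration bonus."""
    counts[s, a] += 1
    r_bonus = beta / np.sqrt(counts[s, a])    # novelty bonus, decays with visits
    td_target = r_ext + r_bonus + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])  # standard TD update
    return Q, counts
```

The bonus shapes behaviour only through the same scalar reward the agent already maximizes, which is the sense in which it is an add-on rather than intrinsic to the framework.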

---

Relevant Notes:

- [[as-AI-automated-software-development-becomes-certain-the-bottleneck-shifts-from-building-capacity-to-knowing-what-to-build]] — The reward-free capability of active inference could enable agents to explore solution spaces without requiring human-defined reward functions, shifting the bottleneck from capability to specification

Topics:

- [[active-inference]]
- [[reinforcement-learning]]
- [[reward-free-learning]]
- [[self-evidencing]]
- [[epistemic-exploration]]