---
type: source
title: "Hermes 4 Technical Report"
author: "Ryan Teknium, Roger Jin, Jai Suphavadeeprasit, Dakota Mahan, Jeffrey Quesnelle, Joe Li, Chen Guang, Shannon Sands, Karan Malhotra"
url: https://arxiv.org/abs/2508.18255
date: 2025-08-25
domain: ai-alignment
intake_tier: research-task
rationale: "Hermes 4 is the model family underlying the Hermes Agent. Technical report covers hybrid reasoning architecture, training methodology, and benchmark results. Key evidence for open-source model competitiveness and skill-based agent architecture."
proposed_by: theseus
format: paper
status: unprocessed
tags: [nous-research, hermes-4, hybrid-reasoning, open-source-models, training-methodology]
---

## Hermes 4 Technical Report

arXiv:2508.18255 (August 2025). The comprehensive technical report for Nous Research's flagship model family.

### Overview

Hermes 4 is a family of hybrid reasoning models that combine structured, multi-turn reasoning with broad instruction-following ability. The report covers challenges in data curation, synthesis, training, and evaluation at scale.

### Model Family

- **Hermes-4-Llama-3.1-405B** — frontier hybrid-mode reasoning model (802GB)
- **Hermes-4-Llama-3.1-70B** — smaller variant with shared improvements (140GB)
- **Hermes-4-14B** — dense model for local inference (28GB)
- **Hermes-4.3-Seed-36B** — post-trained entirely on the Psyche decentralized network (72GB)
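The checkpoint sizes above are consistent with bf16 storage at roughly two bytes per parameter — a quick sanity check (my arithmetic sketch, not a calculation from the report):

```python
def bf16_footprint_gb(params_billions: float) -> float:
    """Approximate checkpoint size for bf16 weights: 2 bytes per parameter,
    so GB is roughly twice the parameter count in billions."""
    return params_billions * 2.0

sizes = {name: bf16_footprint_gb(p)
         for name, p in [("70B", 70), ("14B", 14), ("36B", 36)]}
print(sizes)  # {'70B': 140.0, '14B': 28.0, '36B': 72.0}
```

The 70B, 14B, and 36B figures match this rule exactly; the 405B checkpoint (802GB) comes in slightly under two bytes per parameter.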
### Hybrid Reasoning Architecture
|
|
|
|
The key innovation is the ability to switch between structured reasoning mode (chain-of-thought, step-by-step) and direct instruction-following mode. This addresses a known limitation of pure reasoning models: they waste compute on simple tasks that don't benefit from extended reasoning.
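In practice, hybrid modes are typically toggled through the chat template. A minimal sketch of what a two-mode request could look like — the system-prompt wording and the `<think>` tag convention here are assumptions for illustration, not quoted from the report:

```python
def build_prompt(user_msg: str, reasoning: bool) -> list[dict]:
    """Build a chat message list in either reasoning or direct mode.

    Hypothetical template: the exact control mechanism Hermes 4 uses
    is described in the report, not reproduced here.
    """
    if reasoning:
        system = ("You are a deep thinking AI. Enclose your step-by-step "
                  "reasoning in <think>...</think> before answering.")
    else:
        system = "Answer directly and concisely."
    return [{"role": "system", "content": system},
            {"role": "user", "content": user_msg}]

msgs = build_prompt("Prove that sqrt(2) is irrational.", reasoning=True)
```

The same weights serve both paths; only the conditioning changes, which is what lets simple queries skip the extended-reasoning cost.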

### Training Methodology

The report addresses challenges in:

- Data curation at scale — quality filtering, decontamination, domain balancing
- Synthetic data generation — using stronger models to generate training data
- Multi-stage training pipeline — pre-training → supervised fine-tuning → alignment
- Evaluation across mathematical reasoning, coding, knowledge, comprehension, and alignment benchmarks
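Of these, decontamination is the most mechanical: drop any training example that overlaps an evaluation benchmark. A minimal n-gram-overlap sketch of the idea — the report's actual filtering criteria and thresholds are not specified here:

```python
def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """All n-token shingles of a whitespace-tokenized, lowercased text."""
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(candidate: str, eval_texts: list[str], n: int = 8) -> bool:
    """Flag a training example that shares any n-gram with an eval set.

    Illustrative only: production pipelines typically hash shingles and
    tune n per benchmark.
    """
    cand = ngrams(candidate, n)
    return any(cand & ngrams(e, n) for e in eval_texts)
```

Quality filtering and domain balancing are judgment-heavier; overlap checks like this are the part that scales trivially.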
### Benchmark Results
|
|
|
|
Comprehensive benchmarking across multiple domains. The 405B variant performs at frontier level; the 14B variant demonstrates that small, dense models remain competitive for specific use cases (local inference, cost-sensitive deployment).

### Decentralized Training (Hermes 4.3)

Hermes-4.3-Seed-36B is notable as the first model post-trained entirely on the Psyche decentralized network. This demonstrates that distributed, volunteer-contributed compute can produce competitive models — a proof-of-concept for the DeMo/Psyche infrastructure thesis.

### Significance for Agent Architecture

Hermes 4 is the default model powering the Hermes Agent. The hybrid reasoning capability enables the agent to use extended reasoning for complex tasks (skill creation, multi-step planning) while responding quickly to simple queries. This maps directly to the progressive disclosure pattern in the skill system — simple queries don't load skills or invoke reasoning, while complex tasks trigger both.
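The routing logic this implies can be sketched in a few lines. Everything here — the marker keywords, the field names, the binary coupling of reasoning and skill loading — is an illustrative assumption, not the Hermes Agent's actual dispatch code:

```python
# Hypothetical complexity markers; a real router would use a classifier.
COMPLEX_MARKERS = ("plan", "build", "multi-step", "create a skill", "analyze")

def route(query: str) -> dict:
    """Progressive disclosure sketch: simple queries get a direct, cheap
    response; queries matching a complexity marker pay for both skill
    loading and extended reasoning."""
    complex_task = any(m in query.lower() for m in COMPLEX_MARKERS)
    return {"reasoning": complex_task, "load_skills": complex_task}
```

The point of the pattern is that the two costs (context loaded, compute spent) scale together with task complexity rather than being paid unconditionally.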

Model weights are publicly released on Hugging Face under CC BY 4.0.