Leaked benchmark data shows Claude Mythos outperforming Opus 4.6 by 16–39% in relative terms across six benchmarks, including 18% on coding and 27% on math reasoning. Here's the full breakdown.
Based on leaked internal evaluation data corroborated by multiple independent sources (April 2026):
| Benchmark | Claude Opus 4.6 | Claude Mythos | Relative change |
|---|---|---|---|
| HumanEval (coding) | 82.4% | 97.1% | +18% |
| MATH (reasoning) | 71.3% | 90.7% | +27% |
| MMLU Pro | 78.9% | 91.2% | +16% |
| SWE-bench (agentic) | 49.3% | 68.4% | +39% |
| GPQA Diamond | 74.1% | 86.3% | +16% |
| BrowseComp | 42.7% | 59.2% | +39% |
Key takeaway: Mythos shows its biggest relative gains on agentic tasks (SWE-bench +39%, BrowseComp +39%), i.e., tasks that require multi-step planning and real-world tool use. This appears to be where Anthropic has focused its training innovations.
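The percentages in the last column are relative improvements, not percentage-point differences, and the two are easy to confuse. The short Python sketch below recomputes both from the scores in the table; the function and variable names are illustrative only and not part of any leaked material.

```python
# Scores copied from the benchmark table above: (Opus 4.6, Mythos), in percent.
SCORES = {
    "HumanEval": (82.4, 97.1),
    "MATH": (71.3, 90.7),
    "MMLU Pro": (78.9, 91.2),
    "SWE-bench": (49.3, 68.4),
    "GPQA Diamond": (74.1, 86.3),
    "BrowseComp": (42.7, 59.2),
}


def point_gain(old: float, new: float) -> float:
    """Absolute difference in percentage points."""
    return new - old


def relative_gain(old: float, new: float) -> float:
    """Improvement as a percentage of the old score."""
    return (new - old) / old * 100


def summarize(scores: dict) -> dict:
    """Map each benchmark to (point gain, relative gain), rounded to 1 decimal."""
    return {
        name: (round(point_gain(old, new), 1), round(relative_gain(old, new), 1))
        for name, (old, new) in scores.items()
    }


if __name__ == "__main__":
    for name, (points, relative) in summarize(SCORES).items():
        print(f"{name:<14} +{points:.1f} pts   +{relative:.1f}% relative")
```

For example, SWE-bench moves from 49.3% to 68.4%, which is a 19.1-point gain but a 38.7% relative gain, because the starting score is low. HumanEval's 14.7-point gain is only 17.8% in relative terms.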
| Feature | Opus 4.6 | Claude Mythos |
|---|---|---|
| Context Window | 200K tokens | ~400K tokens (est.) |
| Vision (images) | ✓ | ✓ (enhanced) |
| Audio Input | ✗ | ✓ (new) |
| Extended Thinking | ✓ | ✓ (deeper) |
| Computer Use | Beta | GA |
| Tool/Function Calling | ✓ | ✓ (parallel) |
| Available Now | ✓ Yes | ✗ (Q2–Q3 2026) |
Larger models are often slower. Based on architectural hints in the leak, Mythos uses a mixture-of-experts variant that should maintain throughput comparable to Opus.
Speed scores are relative estimates (tok/sec) based on architecture analysis; official numbers are pending release.
Choose Opus 4.6 if: You need a production-ready model today. Cost is a factor. Your tasks fit within 200K tokens of context. You're building products that need stable API availability.
Choose Claude Mythos if: You're building cutting-edge agentic systems. You need more than 200K tokens of context. Audio understanding is critical. You want the highest possible coding benchmark performance.
Opus 4.6 is currently priced at $15/M input tokens and $75/M output tokens. Claude Mythos is expected to launch at a premium: analyst estimates range from $25 to $40/M input tokens, based on the capability jump over Opus.
For most production workloads, Claude Sonnet or Opus will remain more cost-effective. Mythos will justify its price for tasks requiring frontier-level reasoning.
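To get a feel for what the premium means in practice, the sketch below compares input-token spend at the prices quoted above. The Mythos figures are the unconfirmed analyst range, no Mythos output price has been reported so output cost is left out, and the 50M-token monthly workload is a made-up example.

```python
# Input prices in USD per million tokens, as quoted in the text above.
OPUS_INPUT_USD_PER_M = 15.0
# Analyst estimate range for Mythos; unconfirmed and subject to change.
MYTHOS_INPUT_USD_PER_M_RANGE = (25.0, 40.0)


def input_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at the given rate."""
    return tokens / 1_000_000 * usd_per_million


def compare_input_cost(tokens: int) -> dict:
    """Input-only cost for Opus 4.6 and the low/high Mythos estimates."""
    low, high = MYTHOS_INPUT_USD_PER_M_RANGE
    return {
        "opus": input_cost(tokens, OPUS_INPUT_USD_PER_M),
        "mythos_low": input_cost(tokens, low),
        "mythos_high": input_cost(tokens, high),
    }


if __name__ == "__main__":
    monthly_input_tokens = 50_000_000  # hypothetical workload
    costs = compare_input_cost(monthly_input_tokens)
    for label, usd in costs.items():
        print(f"{label:<12} ${usd:,.2f}")
```

At 50M input tokens per month, this works out to $750 on Opus 4.6 versus $1,250 to $2,000 at the estimated Mythos rates, roughly 1.7x to 2.7x the input spend.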