Model selection 2026: pick the right Claude for the job.
There is no single best model in 2026. There is the right model for the call you are making. This guide walks through the practical decision logic across Claude Fable 5, Opus 4.8, and Haiku 4.5, with concrete examples from the workflows IG ships in production.
Anthropic now offers three publicly available tiers of Claude: Fable 5 (the most capable, Mythos-class public model), Opus 4.8 (the prior flagship and the fallback for sensitive Fable 5 queries), and Haiku 4.5 (the high-throughput, cost-efficient workhorse). The teams that ship the best work are not the ones using the most capable model on every call. They are the ones routing the right call to the right model. This guide is the decision framework IG uses inside client engagements.
The three models, in one paragraph each
Claude Fable 5
The most capable publicly available Claude model. 1M token context, 128K max output, $10 / $50 per million tokens. Best on long-horizon agentic work, multi-step reasoning, complex coding, multimodal tasks, and any workflow that benefits from carrying significant state. Routes roughly 5% of sensitive queries to Opus 4.8 automatically. The right choice when capability matters more than latency or cost.
Claude Opus 4.8
The previous flagship public model and Fable 5’s safety fallback. Slightly lower benchmark performance than Fable 5 but with a different safety surface and well-understood behavior across regulated domains. Pricing remains higher than Fable 5 in most use cases. The right choice when you need predictable behavior in a domain Fable 5 is conservative about, or when integration code is already wired against Opus 4.8 and the migration cost outweighs the capability lift.
Claude Haiku 4.5
The high-throughput, low-cost workhorse model. Significantly faster and significantly cheaper than Fable 5 or Opus 4.8. Performs well on classification, extraction, simple transformations, and high-volume operations. The right choice for any workflow where throughput matters more than reasoning depth, or where the same call is run thousands of times per day at low individual stakes.
The decision tree
For any new AI call your team is architecting, walk this sequence.
Step 1: Is the workflow agentic or single-prompt?
Agentic = multi-step, tool-using, long-horizon. The model carries state across decisions and may take actions in external systems.
Single-prompt = one input in, one output out. No chained reasoning. No tool use.
If single-prompt: continue to step 2.
If agentic: default to Fable 5. The capability lift on multi-step work justifies the cost. Skip to step 4 for safety considerations.
Step 2: Is the call high-volume or low-volume?
High-volume = more than 100 calls per day on this exact pattern, with low stakes per call.
Low-volume = 100 calls per day or fewer, or any call where stakes-per-call are high.
If low-volume: continue to step 3.
If high-volume: default to Haiku 4.5. The cost-per-call difference compounds at volume. Only escalate to a more capable model if Haiku 4.5 fails specific quality thresholds.
Step 3: Does the workflow involve sensitive domains?
Sensitive domains include cybersecurity research, biology, chemistry, defense, financial regulation, healthcare diagnostics, and model distillation. Fable 5’s safety classifiers route these to Opus 4.8 automatically.
If sensitive and you want predictable behavior: Opus 4.8 directly, or Mythos 5 if you have Glasswing access and a compliance reason.
If non-sensitive: Fable 5 is your default.
Step 4: Architectural safety design
For any agentic workflow, the question is not just which model. It is also which human-in-the-loop checkpoints exist, what the fallback path looks like if Fable 5 routes to Opus 4.8 mid-task, and how observability is wired. This is the architectural layer where most production agentic workflows fail when they fail.
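The four steps above can be sketched as a small routing function. This is a minimal illustration rather than IG's production router: the `Call` and `Route` types, the field names, and the model label strings are all hypothetical (the labels are not API identifiers), and treating exactly 100 calls per day as low-volume is an assumption.

```python
from dataclasses import dataclass

# Threshold from step 2 of the guide.
HIGH_VOLUME_CALLS_PER_DAY = 100


@dataclass(frozen=True)
class Call:
    """Hypothetical description of one AI call pattern."""
    agentic: bool
    calls_per_day: int
    high_stakes: bool = False
    sensitive_domain: bool = False


@dataclass(frozen=True)
class Route:
    model: str                 # illustrative label, not an API model ID
    needs_safety_design: bool  # True when step 4 applies
    reason: str


def choose_model(call: Call) -> Route:
    # Step 1: agentic work defaults to Fable 5 and goes straight to step 4.
    if call.agentic:
        return Route("fable-5", True, "agentic: default to Fable 5, then step 4")

    # Step 2: high-volume, low-stakes single-prompt calls default to Haiku 4.5.
    # High stakes per call counts as low-volume, as in the guide.
    if call.calls_per_day > HIGH_VOLUME_CALLS_PER_DAY and not call.high_stakes:
        return Route("haiku-4.5", False, "high-volume single-prompt call")

    # Step 3: sensitive domains go to Opus 4.8 directly for predictable behavior.
    if call.sensitive_domain:
        return Route("opus-4.8", False, "low-volume call in a sensitive domain")

    return Route("fable-5", False, "low-volume, non-sensitive call")
```

For example, `choose_model(Call(agentic=False, calls_per_day=5000))` returns the Haiku 4.5 route, while the same call with `high_stakes=True` falls through to Fable 5. The sketch follows the order of the steps as written, so a high-volume call is routed before the sensitive-domain check is reached.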
Common patterns and the model they map to
Campaign brief to launch automation: Fable 5 (agentic, long-horizon)
Customer support ticket triage: Haiku 4.5 (high-volume, low-stakes)
Sales call summary generation: Haiku 4.5 (single-prompt, high-volume)
Multi-source customer intelligence synthesis: Fable 5 (long context, multi-step)
Regulated content review (finance/health): Opus 4.8 (predictable safety surface)
Real-time personalization decisions: Haiku 4.5 (latency matters more than capability)
Long-document analysis and summarization: Fable 5 (1M context)
Code generation for production systems: Fable 5 (SWE-bench Verified 95%)
Internal data classification at scale: Haiku 4.5 (cost per call dominates)
Cybersecurity research and pentesting analysis: Mythos 5 (Glasswing-only) or external tool
The cost framework
Most teams overspend on model calls because they default-up rather than default-right. The cost framework that works:
Estimate token volume per workflow per month. Multiply expected tokens per call by daily call count by 30. For each model, calculate the monthly cost at that volume. The numbers usually surprise teams.
Run a 7-day shadow comparison. For each new workflow, run the same calls against Fable 5 and Haiku 4.5 in parallel for a week. Measure user-visible outcome quality. Many teams find Haiku 4.5 produces acceptable quality at 10% of the cost.
Architect the router, not the model. The system that decides which model handles which call is more important than which model handles any single call. Building this layer once pays back in every subsequent workflow.
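The volume estimate in the first item can be worked through in a few lines. This is a sketch under stated assumptions: the `Price` type and `monthly_cost` function are hypothetical names, the $10 / $50 figure quoted for Fable 5 is read here as input / output price per million tokens, and the Haiku 4.5 price is a placeholder derived from the guide's "10% of the cost" figure, not a published rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per million tokens."""
    input_per_mtok: float
    output_per_mtok: float


def monthly_cost(price: Price,
                 input_tokens_per_call: int,
                 output_tokens_per_call: int,
                 calls_per_day: int,
                 days: int = 30) -> float:
    """Tokens per call x daily call count x 30, priced per million tokens."""
    calls = calls_per_day * days
    input_cost = calls * input_tokens_per_call / 1_000_000 * price.input_per_mtok
    output_cost = calls * output_tokens_per_call / 1_000_000 * price.output_per_mtok
    return input_cost + output_cost


# Assumption: the guide's "$10 / $50" is input / output.
FABLE_5 = Price(input_per_mtok=10.0, output_per_mtok=50.0)
# Placeholder only: 10% of the Fable 5 figures, per the guide's rough ratio.
HAIKU_4_5_ILLUSTRATIVE = Price(input_per_mtok=1.0, output_per_mtok=5.0)

fable_monthly = monthly_cost(FABLE_5, 2_000, 500, calls_per_day=1_000)
haiku_monthly = monthly_cost(HAIKU_4_5_ILLUSTRATIVE, 2_000, 500, calls_per_day=1_000)
```

With these illustrative inputs (2,000 input tokens and 500 output tokens per call, 1,000 calls per day), the Fable 5 estimate comes to $1,350 per month and the placeholder Haiku 4.5 estimate to $135. Substitute current published prices and your own measured token counts before drawing conclusions.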
Frequently asked questions
Do I need to use all three models in production?
For most teams, yes. A tier-aware model stack is the architecture pattern that produces the best cost-to-capability ratio. Even small teams running 2-3 production workflows will see meaningful savings by routing low-stakes calls to Haiku 4.5 while keeping Fable 5 for the workflows that benefit from its full capability.
How do I know when Haiku 4.5 is good enough?
Run a 7-day shadow comparison against Fable 5 on the exact prompts and inputs your workflow uses. Measure user-visible quality outcomes, not synthetic benchmarks. If Haiku 4.5 quality stays within the bounds you set in advance, the roughly 10x cost difference makes it the better choice for that workflow.
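One way to turn the shadow comparison into a decision is to score both models' outputs on the same inputs and compare the averages. The sketch below assumes you already have a quality score per output on a 0 to 1 scale (from human review or a user-visible metric); the function name, the score scale, and the 0.05 default tolerance are all hypothetical choices, not values from this guide.

```python
from statistics import mean


def shadow_summary(pairs, max_quality_drop=0.05):
    """Summarize paired quality scores from a shadow run.

    pairs: list of (baseline_score, candidate_score) tuples, one per input,
           e.g. (Fable 5 score, Haiku 4.5 score) for the same prompt.
    max_quality_drop: largest acceptable drop in mean score (an assumption
           to set per workflow before looking at the results).
    """
    if not pairs:
        raise ValueError("need at least one scored pair")
    baseline = mean(b for b, _ in pairs)
    candidate = mean(c for _, c in pairs)
    drop = baseline - candidate
    return {
        "baseline_mean": baseline,
        "candidate_mean": candidate,
        "quality_drop": drop,
        "candidate_acceptable": drop <= max_quality_drop,
    }


# Illustrative scores only, not measured results.
summary = shadow_summary([(0.90, 0.88), (0.80, 0.79), (1.00, 0.94)])
```

Fixing `max_quality_drop` before the week starts keeps the comparison honest: the threshold reflects what the workflow can tolerate, not what the cheaper model happened to achieve.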
What is the Opus 4.8 use case in 2026 if Fable 5 is more capable?
Three primary uses. First, automatic fallback when Fable 5 routes sensitive queries away. Second, predictable behavior in regulated domains where Fable 5's safety classifiers introduce variability. Third, migration cost: teams whose existing Opus 4.8 integrations perform reliably often defer the Fable 5 migration for those workflows.
Can I mix Anthropic models with OpenAI or Google models?
Yes, and many production architectures do. The decision logic in this guide applies to any tier-aware model stack. The specific model names change. The routing principles do not.
Do agentic workflows always need Fable 5?
No. Many agentic workflows are short-horizon (3-5 steps) and tool-light. These often run well on Haiku 4.5 with the right scaffolding. The Fable 5 advantage is most pronounced on workflows with more than 7 sequential decisions or significant state-carrying requirements.