Llama 4 Maverick
Released in April 2025, Llama 4 Maverick is a 400B-parameter mixture-of-experts (MoE) model with 17B active parameters and 128 experts, and it fits on a single H100 node. It belongs to Meta's first MoE-based Llama generation, trained on 30T tokens.
At a glance
- Context window: 1M tokens
- Max output: 1M tokens
- Knowledge cutoff: Apr 2025
- Modalities: Text, Image → Text
Capabilities
Function calling
Connect to external tools, APIs, and systems.
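As a rough illustration of what function calling involves on the application side, here is a minimal sketch. It assumes the provider accepts an OpenAI-style `tools` schema and returns tool calls as a name plus JSON-encoded arguments; that format is an assumption, not something this page specifies. `get_weather`, `TOOL_SCHEMAS`, and `dispatch` are hypothetical names, and the model's tool call is simulated rather than fetched from a real endpoint.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local tool: returns canned data, no real API call."""
    return {"city": city, "forecast": "sunny"}


# Registry mapping tool names to the Python callables that implement them.
TOOLS = {"get_weather": get_weather}

# Tool description sent along with the prompt (assumed OpenAI-style layout).
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]


def dispatch(tool_call: dict) -> str:
    """Run the tool the model asked for and return its result as JSON text."""
    name = tool_call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    args = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    return json.dumps(TOOLS[name](**args))


# Simulated model output; a real response would come from the provider's API.
simulated_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}
print(dispatch(simulated_call))
```

The string returned by `dispatch` would normally be sent back to the model as a tool-result message so it can compose the final answer.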
Pricing by provider
| Provider | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
|  | $0.25 | $0.95 |
|  | $0.27 | $0.85 |
|  | $0.27 | $0.85 |
|  | Self-hosted | Self-hosted |
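To make the per-million-token rates concrete, the sketch below estimates the cost of a single workload at the two distinct price points in the table above. The provider names are not given in the table, so `provider_a` and `provider_b` are hypothetical labels, and `Rate` and `estimate_cost` are illustrative helpers rather than any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices copied from the table; the keys are hypothetical labels.
RATES = {
    "provider_a": Rate(0.25, 0.95),
    "provider_b": Rate(0.27, 0.85),
}


def estimate_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    total = (input_tokens * rate.input_per_million
             + output_tokens * rate.output_per_million)
    return total / 1_000_000


# Example workload: 2M input tokens and 500k output tokens.
for name, rate in RATES.items():
    print(f"{name}: ${estimate_cost(rate, 2_000_000, 500_000):.4f}")
# provider_a: $0.9750
# provider_b: $0.9650
```

Which price point is cheaper depends on the input-to-output ratio: the lower output rate wins for output-heavy workloads, while the lower input rate wins for input-heavy ones.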
Heads up: We do our best to keep these specs & prices accurate. However, cloud costs may fluctuate based on region, usage, and other factors not listed here. These are estimates based on common setups and are for informational purposes only. Always verify current rates & exact specs with the provider before provisioning.
Estimated prices shown. Actual costs may vary based on context length, batch size, caching, and provider-specific pricing tiers.