Scenarios
Four common workloads, priced honestly.
14,300 in · 500 out · 60% cached
The hidden iceberg: a 10-token question costs 14,800 tokens
The user types 10 tokens. The system prompt, few-shot examples, conversation history, RAG chunks, and tool definitions silently bring the total to 14,300 input tokens; add a 500-token reply and the request is billed for 14,800 tokens. This is the bill no one sees.
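As a rough sketch of what that hidden context does to the bill, the snippet below prices the full request against the 10-token question on its own. The rates ($3 and $15 per million input and output tokens, cached input billed at 10% of the input rate) and the `request_cost` helper are illustrative assumptions, not any provider's actual pricing.

```python
def request_cost(tokens_in, tokens_out, cached_fraction,
                 price_in=3.00, price_out=15.00, cached_discount=0.10):
    """Cost in USD of one request.

    Prices are hypothetical and expressed per million tokens.
    cached_discount is the assumed fraction of the input price
    charged for input tokens served from cache.
    """
    cached = tokens_in * cached_fraction
    uncached = tokens_in - cached
    input_cost = (uncached * price_in
                  + cached * price_in * cached_discount) / 1_000_000
    output_cost = tokens_out * price_out / 1_000_000
    return input_cost + output_cost


# What the user thinks they sent: 10 tokens in, 500 out, nothing cached.
visible = request_cost(10, 500, 0.0)

# What is actually billed: 14,300 tokens in, 500 out, 60% cached.
iceberg = request_cost(14_300, 500, 0.60)

print(f"visible question: ${visible:.5f}")
print(f"full request:     ${iceberg:.5f}")
print(f"ratio:            {iceberg / visible:.1f}x")
```

Under these assumed rates the full request costs roughly 3.6 times the visible question, even with 60% of the input served from cache.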
8,000 in · 300 out · 85% cached
RAG customer support chatbot
High cache hit rate, short output. With 85% of the input served from cache, the cache discount makes or breaks the unit economics.
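To illustrate how sensitive this workload is to the cache hit rate, the sketch below prices the same 8,000-in, 300-out request at three hit rates. The prices and the `support_cost` helper are hypothetical: $3 and $15 per million input and output tokens, with cached input assumed to cost 10% of the input rate.

```python
# Hypothetical prices in USD per million tokens (assumptions, not real rates).
PRICE_IN = 3.00
PRICE_OUT = 15.00
CACHED_DISCOUNT = 0.10  # cached input assumed to cost 10% of PRICE_IN

TOKENS_IN = 8_000
TOKENS_OUT = 300


def support_cost(hit_rate):
    """Cost in USD of one support request at a given cache hit rate."""
    cached = TOKENS_IN * hit_rate
    uncached = TOKENS_IN - cached
    input_cost = (uncached + cached * CACHED_DISCOUNT) * PRICE_IN / 1_000_000
    output_cost = TOKENS_OUT * PRICE_OUT / 1_000_000
    return input_cost + output_cost


costs = {rate: support_cost(rate) for rate in (0.0, 0.5, 0.85)}

for rate, cost in costs.items():
    print(f"{rate:>4.0%} cached: ${cost:.5f} per request")
```

With these assumptions, losing the cache entirely makes each request nearly three times as expensive as it is at an 85% hit rate.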
60,000 in · 3,000 out · 70% cached
Coding agent with tools
Long context, heavy tool definitions, medium output. Context window is the enemy.
80,000 in · 400 out · 0% cached
Long-document summarization
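Massive input, tiny output. Counter-intuitively, it is not as expensive as it looks: input tokens are typically priced well below output tokens, and almost everything here is input.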
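The arithmetic behind that can be sketched as follows, splitting the bill into its input and output parts and comparing it with what the same token count would cost at the output rate. The $3 and $15 per million token prices are illustrative assumptions only.

```python
# Hypothetical prices in USD per million tokens (assumptions, not real rates).
PRICE_IN = 3.00
PRICE_OUT = 15.00

tokens_in = 80_000   # the document plus instructions, nothing cached
tokens_out = 400     # the summary

input_cost = tokens_in * PRICE_IN / 1_000_000
output_cost = tokens_out * PRICE_OUT / 1_000_000
total = input_cost + output_cost

# Upper bound for comparison: every token billed at the output rate.
all_at_output_rate = (tokens_in + tokens_out) * PRICE_OUT / 1_000_000

print(f"input:  ${input_cost:.3f}")
print(f"output: ${output_cost:.3f}")
print(f"total:  ${total:.3f}")
print(f"same tokens at output rate: ${all_at_output_rate:.3f}")
```

Under these assumed rates one summary costs about $0.25, roughly a fifth of what 80,400 tokens would cost if they were all billed as output.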