Last updated: April 5, 2026 · Pricing & Deployment · by Daniel Ashford
What is Output Tokens?
The tokens the model generates in its response — the most expensive part of API usage.
Definition
Output tokens are the tokens generated by the model as its response. Each must be predicted one at a time, requiring full computation across the vocabulary for every token. This makes output tokens significantly more expensive.
How It Works
Output tokens typically cost 3-5x more than input. Controlling output length is one of the most effective cost optimization strategies. Techniques include requesting concise responses and setting max_tokens parameters.
Example
A model generating a 500-word response uses approximately 650-700 output tokens. At Claude Opus 4 rates ($75/M), that single response costs about $0.05.
Related Terms
See How Models Compare
Understanding output tokens is important when choosing the right AI model. See how 12 models compare on our leaderboard.