Last updated: April 5, 2026 · Pricing & Deployment · by Daniel Ashford

What Are Output Tokens?

QUICK ANSWER

The tokens the model generates in its response — the most expensive part of API usage.

Definition

Output tokens are the tokens the model generates as its response. They are produced autoregressively, one at a time, and each new token requires a full forward pass through the model, whereas input tokens can be processed in parallel. This is why output tokens are significantly more expensive than input tokens.

How It Works

Output tokens typically cost 3-5x more than input tokens, so controlling output length is one of the most effective cost-optimization strategies. Common techniques include prompting the model for concise responses and setting a `max_tokens` parameter to cap how long the response can be.
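As a sketch of why a `max_tokens` cap matters for budgeting: it puts a hard upper bound on the output spend of any single request. The function below is a hypothetical helper (not part of any provider's SDK) that computes that worst-case cost from a cap and a per-million-token rate.

```python
def max_output_cost(max_tokens: int, price_per_million: float) -> float:
    """Worst-case output spend for one request, in dollars.

    The model may stop earlier, but it can never generate more than
    max_tokens, so this bounds the output-side cost of the call.
    """
    return max_tokens * price_per_million / 1_000_000


# Capping responses at 1,000 tokens at a $75/M output rate
# bounds each request's output cost to $0.075.
print(max_output_cost(1_000, 75.0))
```

Without a cap, a runaway or unexpectedly verbose response is billed in full, so the cap is both a cost control and a safety net.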

Example

A model generating a 500-word response uses approximately 650-700 output tokens. At Claude Opus 4 rates ($75/M), that single response costs about $0.05.
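The arithmetic behind that estimate can be reproduced directly, using the rough rule of thumb (stated under Related Terms below) that one token is about 3/4 of a word:

```python
words = 500
tokens = round(words / 0.75)          # ~667 tokens at ~3/4 word per token
price_per_million = 75.0              # Claude Opus 4 output rate, $/M tokens

cost = tokens * price_per_million / 1_000_000
print(f"{tokens} tokens -> ${cost:.4f}")  # roughly $0.05 per response
```

The exact token count depends on the tokenizer and the text itself; this is an order-of-magnitude estimate, not a billing calculation.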

Related Terms

Input Tokens
The tokens in your prompt that the model reads — cheaper than output tokens.
Tokens
The basic units of text that LLMs process — roughly 3/4 of a word.
LLM API Pricing
The cost of using language models, typically measured in dollars per million tokens.
Max Tokens
An API parameter that limits how long the model response can be.

See How Models Compare

Understanding output tokens is important when choosing the right AI model. See how 12 models compare on our leaderboard.
