Last updated: April 5, 2026 · Prompting & Usage · by Daniel Ashford
What is Max Tokens?
An API parameter that caps how long the model's response can be.
Definition
Max tokens sets the maximum number of tokens the model may generate in a single response. Once the limit is reached, generation stops immediately, even mid-sentence. The parameter matters for cost control (output tokens are typically billed per token) and for keeping response sizes predictable.
How It Works
If the limit is set too low, responses may be cut off before they are complete. If it is set too high, it does little to rein in a verbose model. A more reliable way to get short answers is to ask for brevity in the prompt itself, which yields complete responses at the desired length; max tokens then serves as a hard safety cap rather than a styling tool.
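The cap behaves like a hard cutoff on the output token stream. A minimal sketch (a simulation, not a real API call; the `generate` helper and its return shape are illustrative assumptions):

```python
# Illustrative sketch: simulate how a max_tokens limit truncates
# generation once the token budget is exhausted. This is a mock of
# the behavior, not any provider's actual API.

def generate(tokens, max_tokens):
    """Return up to max_tokens tokens plus a finish reason."""
    if len(tokens) <= max_tokens:
        return tokens, "stop"             # model finished naturally
    return tokens[:max_tokens], "length"  # budget hit: cut off mid-answer

full_answer = "The capital of France is Paris .".split()
out, reason = generate(full_answer, max_tokens=5)
print(" ".join(out), "| finish_reason:", reason)
```

With a limit of 5 the sketch stops after "The capital of France is", dropping the actual answer, which is exactly the mid-sentence cutoff described above.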
Example
Setting max_tokens=100 for a chatbot caps responses at roughly 75 words, keeping them concise. But an answer that genuinely needs 150 tokens will be cut off at token 100, mid-thought.
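In practice, callers detect this case and react (raise the limit, retry, or continue the generation). A hedged sketch, assuming the common convention where a chat API reports a finish reason of "length" when the token limit stopped generation; the response dict here is a mock, not real API output:

```python
# Hedged sketch: detecting a truncated response. Many chat APIs
# report a finish reason such as "length" when max_tokens was hit.
# The `mock` dict below stands in for an API response.

def was_truncated(response):
    # "length" conventionally means the token limit stopped generation;
    # "stop" means the model ended the response on its own.
    return response["finish_reason"] == "length"

mock = {"text": "A complex answer that needed 150 tokens but",
        "finish_reason": "length"}
print(was_truncated(mock))  # True: caller should raise max_tokens or retry
```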