Last updated: April 5, 2026 · Core Concepts · by Daniel Ashford
What Is a Context Window?
The maximum amount of text an LLM can process in a single request.
Definition
The context window is the maximum number of tokens an LLM can process in a single interaction. This budget covers both the input (your prompt, system instructions, and any attached documents) and the output (the model's response). Anything beyond the context window is invisible to the model: it is not "forgotten," it was simply never seen.
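The key budgeting rule above, that input and reserved output must fit together, can be sketched as follows. The 4-characters-per-token heuristic is an assumption for illustration; production code would use the provider's tokenizer (e.g. `tiktoken` for OpenAI models) for exact counts.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # A real tokenizer gives exact counts; this is only an estimate.
    return max(1, len(text) // 4)

def fits_in_window(prompt: str, max_output_tokens: int, context_window: int) -> bool:
    # Prompt tokens plus the reserved output budget must fit together,
    # because the window covers both input and output.
    return estimate_tokens(prompt) + max_output_tokens <= context_window

print(fits_in_window("Summarize this report.", max_output_tokens=1024, context_window=4096))
```

In practice, APIs reject or truncate requests that exceed the window, so a pre-flight check like this helps you decide what to trim before sending.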
How It Works
Context windows have grown dramatically: GPT-3 shipped with roughly 2K tokens, GPT-4 launched with 8K (and a 32K variant), GPT-4 Turbo extended that to 128K, and Gemini 1.5 Pro supports up to 2 million tokens, enough to hold an entire book. Larger context windows enable long-document analysis, multi-turn conversations with full history, and whole-repository code understanding. However, larger contexts cost more per request, and models can recall information less reliably when it sits far from the end of the context.
Example
With a 200K context window (Claude Opus 4), you can paste an entire 300-page document and ask questions about it. With a 4K context window, you could only fit about 6 pages.
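The page counts in the example follow from simple arithmetic. Assuming roughly 500 words per page and about 1.3 tokens per word (both are assumptions for illustration, not fixed constants), a page is around 650 tokens:

```python
TOKENS_PER_PAGE = 650  # assumption: ~500 words/page at ~1.3 tokens per word

def pages_that_fit(context_window: int) -> int:
    # Integer division: whole pages that fit before the window is exhausted.
    return context_window // TOKENS_PER_PAGE

print(pages_that_fit(200_000))  # ~300 pages in a 200K window
print(pages_that_fit(4_096))    # ~6 pages in a 4K window
```

Actual capacity varies with formatting, language, and tokenizer, so treat these as order-of-magnitude estimates.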
See How Models Compare
Understanding the context window is important when choosing the right AI model. See how 12 models compare on our leaderboard.