Last updated: April 5, 2026 · Model Architecture · by Daniel Ashford
What Is the Attention Mechanism?
The core technique that lets LLMs model relationships between words in a text.
Definition
The attention mechanism is the fundamental operation inside Transformer models that allows each token to dynamically focus on other relevant tokens in the input sequence. It computes weighted relationships between all positions in the text.
How It Works
Self-attention operates through queries (Q), keys (K), and values (V). For each token, a query is compared against all keys to produce attention weights, which then weight the values. Multi-head attention runs this process multiple times in parallel with different learned projections, allowing the model to attend to different types of relationships simultaneously.
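The computation above can be sketched in a few lines of NumPy. This is a minimal single-head illustration, not a production implementation: the token count, embedding size, and random projection matrices are all illustrative stand-ins for what a trained Transformer would learn.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # scores[i, j]: similarity of token i's query to token j's key,
    # scaled by sqrt(d_k) to keep gradients stable.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # weights[i, j]: how much token i attends to token j (each row sums to 1).
    weights = softmax(scores, axis=-1)
    # Each token's output is a weighted average of all value vectors.
    return weights @ V, weights

# Toy setup: 4 tokens with embedding dim 8, random learned projections.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                        # token embeddings
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
print(out.shape)  # → (4, 8)
```

Multi-head attention simply repeats this routine several times with separate `W_q`, `W_k`, `W_v` projections per head, then concatenates the per-head outputs and applies one final learned projection.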
Example
In "The bank by the river had a steep bank," multi-head attention helps disambiguate the two meanings of "bank" by attending to surrounding context.