Last updated: April 5, 2026 · Evaluation & Benchmarks · by Daniel Ashford

What is MMLU / MMLU-Pro?

QUICK ANSWER

MMLU is a multiple-choice benchmark testing broad academic knowledge across 57 subjects; MMLU-Pro is its harder successor.

Definition

MMLU (Massive Multitask Language Understanding), introduced in 2020, tests language models across 57 academic subjects spanning STEM, the humanities, the social sciences, and professional fields. MMLU-Pro, released in 2024, is a harder successor.

How It Works

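The original MMLU contains nearly 16,000 multiple-choice questions with four answer options each, and a model's score is the percentage it answers correctly. Frontier models have saturated it, scoring above 88%, which leaves little room to tell the strongest models apart. MMLU-Pro addresses this with harder, more reasoning-heavy questions and up to ten answer options per question, which lowers the random-guess baseline from 25% to 10%. Top MMLU-Pro scores in 2026 are around 79-81%.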
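
Scoring on both benchmarks reduces to multiple-choice accuracy: the fraction of questions where the model's chosen letter matches the answer key. The Python sketch below illustrates that idea, assuming four answer options for MMLU and up to ten for MMLU-Pro. The function names and sample records are hypothetical and are not taken from any official evaluation harness.

```python
from collections import defaultdict


def accuracy(predictions, answers):
    """Fraction of questions where the predicted letter matches the key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def chance_baseline(num_options):
    """Expected accuracy of uniform random guessing."""
    return 1.0 / num_options


def accuracy_by_subject(records):
    """records: iterable of (subject, predicted_letter, correct_letter)."""
    totals = defaultdict(lambda: [0, 0])  # subject -> [correct, seen]
    for subject, pred, gold in records:
        totals[subject][0] += int(pred == gold)
        totals[subject][1] += 1
    return {s: c / n for s, (c, n) in totals.items()}


# Illustrative data only, not real benchmark results.
records = [
    ("professional_law", "B", "B"),
    ("professional_law", "C", "A"),
    ("college_physics", "D", "D"),
    ("college_physics", "A", "A"),
]

overall = accuracy([r[1] for r in records], [r[2] for r in records])
print(overall)                                   # 0.75
print(chance_baseline(4), chance_baseline(10))   # 0.25 0.1
print(accuracy_by_subject(records))
```

Comparing a score against the chance baseline for the same number of options gives a fairer sense of how far a model is above guessing on each benchmark.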

Example

A sample MMLU question in professional law: "Under the Fourth Amendment, which of the following would most likely constitute an unreasonable search?" followed by four legal scenarios.
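
A question like this is typically presented to a model as a lettered prompt, and the model's reply is mapped back to one of the letters. The Python sketch below shows one way this could be done. The option texts are placeholders rather than real benchmark content, and both helper functions are hypothetical; the parsing rule (accept the reply only if it begins with a valid letter) is a deliberately strict assumption, not how any particular harness works.

```python
import string


def format_question(question, options):
    """Render a question and its options as a lettered multiple-choice prompt."""
    if not 2 <= len(options) <= 26:
        raise ValueError("expected between 2 and 26 options")
    lines = [question]
    for letter, option in zip(string.ascii_uppercase, options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer:")
    return "\n".join(lines)


def parse_answer(response, num_options):
    """Return the reply's leading letter if it names a valid option, else None."""
    valid = string.ascii_uppercase[:num_options]
    text = response.strip().upper()
    if text and text[0] in valid:
        return text[0]
    return None


# Placeholder options; the real scenarios are not reproduced here.
question = (
    "Under the Fourth Amendment, which of the following would most "
    "likely constitute an unreasonable search?"
)
options = [f"[legal scenario {i}]" for i in range(1, 5)]

prompt = format_question(question, options)
print(prompt)
print(parse_answer(" c. because ...", len(options)))  # C
```

The same formatter handles MMLU-Pro's longer option lists, since the letters simply continue past D.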

Related Terms

Benchmark
A standardized test used to measure and compare LLM capabilities.
GPQA Diamond
A graduate-level science benchmark with questions written by PhD experts.

See How Models Compare

Understanding MMLU and MMLU-Pro is important when choosing the right AI model. See how 12 models compare on our leaderboard.

Daniel Ashford
Founder & Lead Evaluator · 200+ models evaluated