xAIJuly 9, 2025· Multimodal

Grok 4

Name: Grok 4
Price: 3 USD
Author: xAI

Deep reasoning model, strongest on abstract math at launch

Singularity Index

62.9

RANK 25 / 56

Reasoning

30%

Coding

25%

Agentic

25%

Multimodal

10%

General

10%

BENCHMARKS

Benchmark	Score	Rank
HumanEval Coding ability - generating correct Python functions	95.2%	#9 / 50
HellaSwag Common sense reasoning about everyday situations	96.4%	#9 / 36
MMLU Tests knowledge across 57 subjects from STEM to humanities	92.4%	#10 / 54
ARC-C Grade-school science questions requiring reasoning	98.1%	#10 / 40
ARC-AGIARC Prize Novel reasoning tasks requiring fluid intelligence	21.2%	#17 / 26
MATH Competition-level mathematics problems	94%	#20 / 50
SWE-bench Real-world GitHub issue resolution	68.4%	#26 / 39
Arena Elo Human preference ranking via blind comparisons	1408	#28 / 51
GPQA PhD-level science questions even experts struggle with	88%	#31 / 73
LiveCodeBenchvals.ai Contamination-free competitive programming (filtered by cutoff date)	79%	#34 / 48
TerminalArtificial Analysis Agentic terminal coding tasks requiring multi-step execution	37.9%	#35 / 56
hleArtificial Analysis	23.9%	#36 / 61
MMMUvals.ai College-level multimodal reasoning across 30+ disciplines	72.8%	#37 / 46
MMLU-Provals.ai Harder 10-option successor to MMLU; more reasoning-focused	79.7%	#40 / 45

Input

per 1M tokens

Output

$15

per 1M tokens

Context

256K

tokens

Speed

40.3

tokens / sec