AnthropicOctober 22, 2024· Multimodal

Claude 3.5 Sonnet

Name: Claude 3.5 Sonnet
Price: 3 USD
Author: Anthropic

Best coding model of 2024, dominated SWE-bench for months

Singularity Index

68.1

RANK 38 / 71

Reasoning

30%

Coding

25%

Agentic

—

25%

Multimodal

10%

General

10%

BENCHMARKS

Benchmark	Score	Rank
HumanEval Coding ability - generating correct Python functions	93.7%	#17 / 50
ARC-C Grade-school science questions requiring reasoning	96.7%	#23 / 40
HellaSwag Common sense reasoning about everyday situations	89%	#27 / 36
MMLU Tests knowledge across 57 subjects from STEM to humanities	88.7%	#28 / 54
MATH Competition-level mathematics problems	78.3%	#33 / 50
MMMUvals.ai College-level multimodal reasoning across 30+ disciplines	68.8%	#33 / 39
MMLU-Provals.ai Harder 10-option successor to MMLU; more reasoning-focused	78.4%	#36 / 38
SWE-bench Real-world GitHub issue resolution	49%	#37 / 40
LiveCodeBenchvals.ai Contamination-free competitive programming (filtered by cutoff date)	49.6%	#37 / 40
Arena Elo Human preference ranking via blind comparisons	1280	#44 / 52
GPQA PhD-level science questions even experts struggle with	65%	#50 / 64

Input

per 1M tokens

Output

$15

per 1M tokens

Context

200K

tokens