Anthropic, July 11, 2023

Claude 2

Name: Claude 2
Author: Anthropic

First Claude model with a 100K-token context window; established Anthropic as a frontier lab.

BENCHMARKS

Benchmark	Description	Score	Rank
HellaSwag	Common-sense reasoning about everyday situations	89.1%	#26 / 36
ARC-C	Grade-school science questions requiring reasoning	93.2%	#34 / 40
HumanEval	Coding ability: generating correct Python functions	71.2%	#43 / 50
MATH	Competition-level mathematics problems	42.6%	#47 / 50
MMLU	Knowledge across 57 subjects, from STEM to humanities	78.5%	#49 / 54
GPQA (Artificial Analysis)	PhD-level science questions that even experts struggle with	34.4%	#73 / 73