Benchmarks
See how DeepSeek V4 performs across industry-standard benchmarks and real-world tasks
DeepSeek V4 Performance Benchmarks
Open Source Spirit. Commercial Grade Power.
DeepSeek V4 scores higher than both anonymized competitor models on each of the three benchmarks below: SWE-bench, HumanEval, and MATH.
SWE-bench (Software Engineering)
DeepSeek V4: 42.5%
Competitor O*: 22.1%
Competitor A*: 20.3%
HumanEval (Coding)
DeepSeek V4: 95.2%
Competitor O*: 90.1%
Competitor A*: 88%
MATH (Reasoning)
DeepSeek V4: 88.9%
Competitor O*: 84.5%
Competitor A*: 82.1%
All benchmark results are fully reproducible using the published evaluation code.