Extended Benchmarks

Detailed benchmark scores across SWE-bench, HumanEval, LiveCodeBench, AIME 2024, MATH-500, and IFEval.

# | Model | Provider | Score | Bar