LLM Speed Leaderboard | Live Latency & Throughput Benchmarks

#	Model	Provider	Tokens/sec	TTFT	Latency	Throughput Bar

What do these metrics mean?

Tokens/sec: the number of tokens the model generates per second. Higher is better, because it determines how quickly the full response arrives.

TTFT (time to first token): how long after the request is sent the model starts generating its response. Lower is better, which matters most for interactive chat and streaming applications.

Latency: the total round-trip time for a simple request, including network overhead and processing queue time. Lower is better.
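
To make the three metrics concrete, here is a minimal Python sketch of how they could be derived from a single streamed response. It is not this leaderboard's measurement code. The names `measure_stream` and `SpeedMetrics` are hypothetical, the stream is assumed to be a lazy iterable that yields one token per item, and tokens/sec is assumed to be computed over the decode window (first token to last token), which is only one of several conventions in use.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class SpeedMetrics:
    tokens: int
    ttft_s: Optional[float]          # request start -> first token; None if no tokens
    latency_s: float                 # request start -> end of stream
    tokens_per_sec: Optional[float]  # None when fewer than two tokens arrived


def measure_stream(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> SpeedMetrics:
    """Consume a token stream and time it (hypothetical helper)."""
    start = clock()
    first_at: Optional[float] = None
    last_at: Optional[float] = None
    tokens = 0

    for _token in stream:
        last_at = clock()
        if first_at is None:
            first_at = last_at  # arrival time of the first token
        tokens += 1

    end = clock()

    ttft = None if first_at is None else first_at - start

    # Assumption: throughput covers only the decode window, so the first
    # token (whose wait is already counted as TTFT) is excluded.
    tokens_per_sec = None
    if tokens >= 2 and last_at > first_at:
        tokens_per_sec = (tokens - 1) / (last_at - first_at)

    return SpeedMetrics(tokens, ttft, end - start, tokens_per_sec)
```

Passing any iterator of streamed tokens to `measure_stream` returns all three values from one request. The `clock` parameter exists so that a fake clock can be injected for repeatable checks. A real benchmark would also repeat the request several times and report a median, since a single run is sensitive to network jitter and queueing.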