Speed Leaderboard

Live latency and throughput benchmarks, measured every 12 hours against real API endpoints.

Table columns: #, Model, Provider, Tokens/sec, TTFT, Latency, Throughput Bar

What do these metrics mean?

Tokens/sec (TPS)

The number of tokens the model generates per second. Higher is better, since this directly determines how quickly you receive the model's response.
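As a rough illustration of the arithmetic, the sketch below divides a token count by the elapsed generation time. The function name `tokens_per_second` and the choice of timing window are assumptions made for this example; the page does not state exactly how the leaderboard times generation.

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Average generation speed: tokens produced divided by seconds elapsed.

    Hypothetical helper for illustration; the leaderboard's exact
    measurement window is not documented on this page.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


# 480 tokens generated over 6 seconds
print(tokens_per_second(480, 6.0))  # 80.0
```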

Time to First Token (TTFT)

How long it takes, after a request is sent, for the model to start generating its response. Lower is better, and it matters most for interactive chat and streaming applications.
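One way to measure this on a streamed response is to record the time before the request starts and again when the first non-empty chunk arrives. The sketch below is a minimal version under stated assumptions: `measure_ttft` and its `start_stream` argument are hypothetical names, and `start_stream` stands in for whatever client call returns an iterable of text chunks.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def measure_ttft(
    start_stream: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float]:
    """Return (ttft_seconds, total_seconds) for one streamed response.

    start_stream is a hypothetical zero-argument callable that issues the
    request and yields text chunks. ttft_seconds is None if no non-empty
    chunk ever arrives.
    """
    sent_at = clock()
    first_at: Optional[float] = None
    for chunk in start_stream():
        # Only the first non-empty chunk marks the start of generation.
        if first_at is None and chunk:
            first_at = clock()
    done_at = clock()
    ttft = None if first_at is None else first_at - sent_at
    return ttft, done_at - sent_at
```

The clock is injectable so the function can be checked against fixed timestamps; in real use the default `time.perf_counter` is a reasonable choice because it is monotonic.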

Latency

Total round-trip time for a simple request, including network overhead and time spent in the processing queue. Lower is better.
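To see how the three metrics relate, a back-of-the-envelope estimate can combine them: total time is roughly the wait for the first token plus the time to generate the remaining output. The sketch below assumes that TPS is measured over the generation phase only and that the reported latency covers the whole response; neither is confirmed by this page, and `estimate_total_seconds` is a hypothetical helper.

```python
def estimate_total_seconds(
    ttft_seconds: float, output_tokens: int, tokens_per_second: float
) -> float:
    """Rough response time: time to first token plus generation time.

    Assumes tokens_per_second describes the generation phase only
    (an assumption for this sketch, not a documented definition).
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


# A model with a 0.4 s TTFT producing 100 tokens at 50 tokens/sec
estimate = estimate_total_seconds(0.4, 100, 50.0)
print(f"{estimate:.1f} s")  # 2.4 s
```

Under these assumptions, a model with a high TPS but slow TTFT can still feel sluggish on short replies, while TPS dominates for long outputs.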