
Ollama Cloud tokens per second — live benchmark

Real inference speed, measured continuously. Each row is a live Ollama Cloud model on a given plan (Free or Pro), sorted by tokens per second and benchmarked roughly every 10 minutes.
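
The page does not say how its throughput and latency figures are derived. As a rough illustration only, the sketch below assumes throughput is output tokens divided by the time between the first and last generated token, and latency is the time from sending the request to the first token. The `RunTiming` type, its field names, and both helper functions are hypothetical and are not taken from the benchmark itself.

```python
from dataclasses import dataclass


@dataclass
class RunTiming:
    """Timings for one hypothetical benchmark request, in seconds."""
    request_sent: float   # when the request was sent
    first_token: float    # when the first output token arrived
    last_token: float     # when the final output token arrived
    output_tokens: int    # number of tokens generated


def tokens_per_second(run: RunTiming) -> float:
    """Assumed definition: output tokens / time from first to last token."""
    duration = run.last_token - run.first_token
    if duration <= 0:
        raise ValueError("generation time must be positive")
    return run.output_tokens / duration


def latency_ms(run: RunTiming) -> float:
    """Assumed definition: time from request to first token, in milliseconds."""
    return (run.first_token - run.request_sent) * 1000.0


# Example: 400 tokens streamed over 2.0 s, first token after 0.5 s.
run = RunTiming(request_sent=0.0, first_token=0.5, last_token=2.5, output_tokens=400)
print(tokens_per_second(run))  # 200.0
print(latency_ms(run))         # 500.0
```

Under these assumptions a slow first token does not lower the tokens-per-second figure, which would be one way a model could pair high throughput with a multi-second latency.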

Model | Plan | Tokens/s | Trend 24h | Latency | Success | Intelligence Index | Last benchmark
GLM 4.7 | Pro | 265.8 | 122.4 | 418ms | 100% | 33.8 | 14m ago
Nemotron 3 Nano 30B (non-reasoning) | Free | 261.5 | 223.2 | 427ms | 100% | 7.4 | 4m ago
Ministral 3 3B (non-reasoning) | Free | 221.5 | 204.0 | 474ms | 100% | 5.6 | 5m ago
Ministral 3 3B (non-reasoning) | Pro | 217.8 | 211.4 | 444ms | 100% | 5.6 | 12m ago
GLM 4.7 | Free | 195.6 | 109.1 | 448ms | 100% | 33.8 | 7m ago
Ministral 3 8B (non-reasoning) | Free | 171.6 | 132.8 | 453ms | 100% | 8.9 | 5m ago
Qwen3 Coder 480B (non-reasoning) | Pro | 162.7 | 108.6 | 784ms | 100% | 18 | 9m ago
Qwen3 Coder 480B (non-reasoning) | Free | 147.8 | 129.2 | 841ms | 100% | 18 | 3m ago
Gemini 3 Flash Preview | Pro | 145.0 | 114.5 | 1.7s | 100% | 37.8 | 15m ago
Ministral 3 8B (non-reasoning) | Pro | 140.9 | 129.5 | 591ms | 100% | 8.9 | 12m ago
MiniMax M2.1 | Free | 140.1 | 126.2 | 1.4s | 100% | 31.4 | 7m ago
DeepSeek V4 Pro | Pro | 137.7 | 152.8 | 656ms | 100% | 44.3 | 16m ago
RNJ 1 8B | Free | 126.1 | 126.0 | 327ms | 100% | | 3m ago
RNJ 1 8B | Pro | 125.8 | 126.2 | 347ms | 100% | | 9m ago
DeepSeek V4 Flash | Pro | 125.2 | 214.1 | 557ms | 100% | 40.3 | 16m ago
MiniMax M2.1 | Pro | 123.6 | 125.9 | | 100% | 31.4 | 13m ago
Nemotron 3 Super | Free | 106.3 | 55.8 | 609ms | 100% | 25.4 | 4m ago
Kimi K2.7 Code | Pro | 105.1 | 149.0 | 929ms | 100% | 41.9 | 13m ago
Qwen3 Coder Next (non-reasoning) | Pro | 103.2 | 89.7 | 335ms | 100% | 21.2 | 9m ago
Qwen3 Coder Next (non-reasoning) | Free | 98.5 | 88.7 | 374ms | 100% | 21.2 | 3m ago
Nemotron 3 Nano 30B (non-reasoning) | Pro | 96.3 | 215.5 | 439ms | 100% | 7.4 | 12m ago
GLM 5.1 | Pro | 94.8 | 135.3 | 938ms | 100% | 40.2 | 14m ago
Ministral 3 14B (non-reasoning) | Pro | 88.9 | 104.8 | 488ms | 100% | 10 | 13m ago
MiniMax M2.5 | Pro | 88.3 | 84.9 | 296ms | 100% | 33.7 | 13m ago
Gemma4 31B | Pro | 85.2 | 117.9 | 391ms | 100% | 29.4 | 15m ago
Devstral 2 123B (non-reasoning) | Pro | 82.4 | 58.4 | 547ms | 100% | 15.5 | 16m ago
Devstral Small 2 24B (non-reasoning) | Pro | 82.3 | 58.9 | 484ms | 100% | 13.1 | 16m ago
Ministral 3 14B (non-reasoning) | Free | 79.5 | 105.0 | 548ms | 100% | 10 | 5m ago
Devstral Small 2 24B (non-reasoning) | Free | 78.6 | 68.2 | 504ms | 100% | 13.1 | 8m ago
GLM 5.2 | Pro | 78.5 | 106.2 | 566ms | 100% | 50.7 | 14m ago
Devstral 2 123B (non-reasoning) | Free | 77.1 | 66.2 | 530ms | 100% | 15.5 | 8m ago
Kimi K2.6 | Pro | 71.6 | 92.0 | 645ms | 93% | 42.8 | 13m ago
GPT-OSS 20B | Free | 68.0 | 92.6 | 543ms | 100% | 14.9 | 7m ago
Mistral Large 3 675B (non-reasoning) | Pro | 64.5 | 57.8 | 715ms | 100% | 16.2 | 12m ago
Nemotron 3 Super | Pro | 63.7 | 62.9 | 705ms | 100% | 25.4 | 12m ago
GPT-OSS 120B | Pro | 63.5 | 132.6 | 449ms | 100% | 23.8 | 14m ago
GPT-OSS 20B | Pro | 55.8 | 105.3 | 454ms | 100% | 14.9 | 14m ago
MiniMax M3 | Pro | 53.4 | 58.7 | 1.6s | 100% | 44.4 | 13m ago
Gemma3 4B (non-reasoning) | Pro | 53.0 | 51.8 | 544ms | 99% | 1.1 | 15m ago
GPT-OSS 120B | Free | 52.4 | 131.0 | 450ms | 100% | 23.8 | 7m ago
Gemma4 31B | Free | 51.9 | 101.4 | 422ms | 100% | 29.4 | 7m ago
Kimi K2.5 | Pro | 48.7 | 134.7 | 1.0s | 100% | 38.1 | 14m ago
MiniMax M3 | Free | 47.9 | 58.1 | 1.3s | 100% | 44.4 | 5m ago
Gemma3 4B (non-reasoning) | Free | 46.4 | 49.0 | 557ms | 100% | 1.1 | 8m ago
Qwen3.5 397B | Pro | 42.3 | 86.0 | 8.3s | 100% | 33.7 | 9m ago
Gemma3 12B (non-reasoning) | Pro | 36.6 | 39.4 | 972ms | 100% | 3.4 | 15m ago
MiniMax M2.7 | Pro | 34.8 | 39.4 | 2.1s | 100% | 38.1 | 13m ago
GLM 5 | Pro | 31.9 | 112.0 | 656ms | 100% | 39.5 | 14m ago
Gemma3 12B (non-reasoning) | Free | 29.3 | 39.1 | 669ms | 100% | 3.4 | 8m ago
Nemotron 3 Ultra | Free | 21.6 | 30.1 | 28.2s | 93% | 37.8 | 4m ago
Gemma3 27B (non-reasoning) | Pro | 16.7 | 16.1 | 574ms | 100% | 4.8 | 15m ago
DeepSeek V3.2 | Pro | 15.8 | 33.7 | 732ms | 100% | 33.4 | 18m ago
Nemotron 3 Ultra | Pro | 14.8 | 19.5 | 23.4s | 98% | 37.8 | 10m ago
Gemma3 27B (non-reasoning) | Free | 13.6 | 16.5 | 752ms | 100% | 4.8 | 8m ago
DeepSeek V3.1 671B (non-reasoning) | Pro | 11.6 | 9.1 | 1.5s | 100% | 21 | 19m ago
MiniMax M2.5 | Free | 3.9 | 85.0 | 28.7s | 100% | 33.7 | 7m ago

Intelligence Index scores from Artificial Analysis.