Ollama Cloud tokens per second — live benchmark
Real inference speed, measured continuously. Every row is a live Ollama Cloud model; rows are sorted by tokens per second, and each model is benchmarked roughly every 10 minutes.
● live, last benchmark 52s ago
| Model and tier | Tokens/s | | Latency | | Intelligence Index | Trend 24h | Last benchmark |
|---|---|---|---|---|---|---|---|
| GLM 4.7 Pro | 265.8 | 122.4 | 418ms | 100% | 33.8 | | 14m ago |
| Nemotron 3 Nano 30B (non-reasoning) Free | 261.5 | 223.2 | 427ms | 100% | 7.4 | | 4m ago |
| Ministral 3 3B (non-reasoning) Free | 221.5 | 204.0 | 474ms | 100% | 5.6 | | 5m ago |
| Ministral 3 3B (non-reasoning) Pro | 217.8 | 211.4 | 444ms | 100% | 5.6 | | 12m ago |
| GLM 4.7 Free | 195.6 | 109.1 | 448ms | 100% | 33.8 | | 7m ago |
| Ministral 3 8B (non-reasoning) Free | 171.6 | 132.8 | 453ms | 100% | 8.9 | | 5m ago |
| Qwen3 Coder 480B (non-reasoning) Pro | 162.7 | 108.6 | 784ms | 100% | 18 | | 9m ago |
| Qwen3 Coder 480B (non-reasoning) Free | 147.8 | 129.2 | 841ms | 100% | 18 | | 3m ago |
| Gemini 3 Flash Preview Pro | 145.0 | 114.5 | 1.7s | 100% | 37.8 | | 15m ago |
| Ministral 3 8B (non-reasoning) Pro | 140.9 | 129.5 | 591ms | 100% | 8.9 | | 12m ago |
| MiniMax M2.1 Free | 140.1 | 126.2 | 1.4s | 100% | 31.4 | | 7m ago |
| DeepSeek V4 Pro Pro | 137.7 | 152.8 | 656ms | 100% | 44.3 | | 16m ago |
| RNJ 1 8B Free | 126.1 | 126.0 | 327ms | 100% | — | | 3m ago |
| RNJ 1 8B Pro | 125.8 | 126.2 | 347ms | 100% | — | | 9m ago |
| DeepSeek V4 Flash Pro | 125.2 | 214.1 | 557ms | 100% | 40.3 | | 16m ago |
| MiniMax M2.1 Pro | 123.6 | 125.9 | — | 100% | 31.4 | | 13m ago |
| Nemotron 3 Super Free | 106.3 | 55.8 | 609ms | 100% | 25.4 | | 4m ago |
| Kimi K2.7 Code Pro | 105.1 | 149.0 | 929ms | 100% | 41.9 | | 13m ago |
| Qwen3 Coder Next (non-reasoning) Pro | 103.2 | 89.7 | 335ms | 100% | 21.2 | | 9m ago |
| Qwen3 Coder Next (non-reasoning) Free | 98.5 | 88.7 | 374ms | 100% | 21.2 | | 3m ago |
| Nemotron 3 Nano 30B (non-reasoning) Pro | 96.3 | 215.5 | 439ms | 100% | 7.4 | | 12m ago |
| GLM 5.1 Pro | 94.8 | 135.3 | 938ms | 100% | 40.2 | | 14m ago |
| Ministral 3 14B (non-reasoning) Pro | 88.9 | 104.8 | 488ms | 100% | 10 | | 13m ago |
| MiniMax M2.5 Pro | 88.3 | 84.9 | 296ms | 100% | 33.7 | | 13m ago |
| Gemma4 31B Pro | 85.2 | 117.9 | 391ms | 100% | 29.4 | | 15m ago |
| Devstral 2 123B (non-reasoning) Pro | 82.4 | 58.4 | 547ms | 100% | 15.5 | | 16m ago |
| Devstral Small 2 24B (non-reasoning) Pro | 82.3 | 58.9 | 484ms | 100% | 13.1 | | 16m ago |
| Ministral 3 14B (non-reasoning) Free | 79.5 | 105.0 | 548ms | 100% | 10 | | 5m ago |
| Devstral Small 2 24B (non-reasoning) Free | 78.6 | 68.2 | 504ms | 100% | 13.1 | | 8m ago |
| GLM 5.2 Pro | 78.5 | 106.2 | 566ms | 100% | 50.7 | | 14m ago |
| Devstral 2 123B (non-reasoning) Free | 77.1 | 66.2 | 530ms | 100% | 15.5 | | 8m ago |
| Kimi K2.6 Pro | 71.6 | 92.0 | 645ms | 93% | 42.8 | | 13m ago |
| GPT-OSS 20B Free | 68.0 | 92.6 | 543ms | 100% | 14.9 | | 7m ago |
| Mistral Large 3 675B (non-reasoning) Pro | 64.5 | 57.8 | 715ms | 100% | 16.2 | | 12m ago |
| Nemotron 3 Super Pro | 63.7 | 62.9 | 705ms | 100% | 25.4 | | 12m ago |
| GPT-OSS 120B Pro | 63.5 | 132.6 | 449ms | 100% | 23.8 | | 14m ago |
| GPT-OSS 20B Pro | 55.8 | 105.3 | 454ms | 100% | 14.9 | | 14m ago |
| MiniMax M3 Pro | 53.4 | 58.7 | 1.6s | 100% | 44.4 | | 13m ago |
| Gemma3 4B (non-reasoning) Pro | 53.0 | 51.8 | 544ms | 99% | 1.1 | | 15m ago |
| GPT-OSS 120B Free | 52.4 | 131.0 | 450ms | 100% | 23.8 | | 7m ago |
| Gemma4 31B Free | 51.9 | 101.4 | 422ms | 100% | 29.4 | | 7m ago |
| Kimi K2.5 Pro | 48.7 | 134.7 | 1.0s | 100% | 38.1 | | 14m ago |
| MiniMax M3 Free | 47.9 | 58.1 | 1.3s | 100% | 44.4 | | 5m ago |
| Gemma3 4B (non-reasoning) Free | 46.4 | 49.0 | 557ms | 100% | 1.1 | | 8m ago |
| Qwen3.5 397B Pro | 42.3 | 86.0 | 8.3s | 100% | 33.7 | | 9m ago |
| Gemma3 12B (non-reasoning) Pro | 36.6 | 39.4 | 972ms | 100% | 3.4 | | 15m ago |
| MiniMax M2.7 Pro | 34.8 | 39.4 | 2.1s | 100% | 38.1 | | 13m ago |
| GLM 5 Pro | 31.9 | 112.0 | 656ms | 100% | 39.5 | | 14m ago |
| Gemma3 12B (non-reasoning) Free | 29.3 | 39.1 | 669ms | 100% | 3.4 | | 8m ago |
| Nemotron 3 Ultra Free | 21.6 | 30.1 | 28.2s | 93% | 37.8 | | 4m ago |
| Gemma3 27B (non-reasoning) Pro | 16.7 | 16.1 | 574ms | 100% | 4.8 | | 15m ago |
| DeepSeek V3.2 Pro | 15.8 | 33.7 | 732ms | 100% | 33.4 | | 18m ago |
| Nemotron 3 Ultra Pro | 14.8 | 19.5 | 23.4s | 98% | 37.8 | | 10m ago |
| Gemma3 27B (non-reasoning) Free | 13.6 | 16.5 | 752ms | 100% | 4.8 | | 8m ago |
| DeepSeek V3.1 671B (non-reasoning) Pro | 11.6 | 9.1 | 1.5s | 100% | 21 | | 19m ago |
| MiniMax M2.5 Free | 3.9 | 85.0 | 28.7s | 100% | 33.7 | | 7m ago |
Intelligence Index scores from Artificial Analysis.
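
The page does not say how its tokens-per-second figures are computed. As a rough sketch of one way to derive such a number, the snippet below assumes the `eval_count` (generated tokens) and `eval_duration` (nanoseconds) fields that Ollama's generate and chat responses report. The helper name `tokens_per_second` and the sample values are hypothetical and are not taken from the table.

```python
from typing import Mapping

NANOSECONDS_PER_SECOND = 1_000_000_000


def tokens_per_second(response: Mapping[str, int]) -> float:
    """Return decode throughput for one Ollama-style response.

    Assumption: `eval_count` is the number of generated tokens and
    `eval_duration` is the generation time in nanoseconds. Whether this
    leaderboard uses the same definition is not stated on the page.
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    # Multiply before dividing so round numbers stay exact.
    return eval_count * NANOSECONDS_PER_SECOND / eval_duration_ns


# Hypothetical sample response: 512 tokens generated in 2 seconds.
sample = {"eval_count": 512, "eval_duration": 2_000_000_000}
print(f"{tokens_per_second(sample):.1f} tok/s")
```

This figure covers generation time only. Prompt processing and network time are excluded, so a number computed this way may not match a rate measured end to end by a client.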