Apples-to-apples LLM inference benchmarks: the same model run on General Compute and on the GPU cloud baseline, measured across context sizes.
Think this is impressive?
This is running on 4-year-old tech. We're going to increase this advantage 5× in the coming months.
Reach out to the founders directly with questions about our methodology, results, or running a benchmark against your own workload.