Up to 7x Faster Inference
Fastest and Cheapest AI Inference.

Deploy any model — ours or yours — with inference speed that changes what's possible.

Sign up with your agent

Your agent can sign up for you.

Hand this prompt to your autonomous coding agent and it'll walk through our signup flow on your behalf.

Codex · Claude Code · OpenCode · Cursor · Aider
Sign me up for a General Compute API account and get an API key. Fetch instructions from https://docs.generalcompute.com/agent-signup and follow them.
Live Inference Benchmark

See the difference.

Both running GPT OSS 120B — same model, same prompt. Only the infrastructure is different.

General Compute (Us)

Together AI (Competitor)

Try preset prompts or enter your own to compare inference speed in real time.

How We're Faster & Cheaper

We rethought the entire stack from silicon to site selection — so you get better performance at a fraction of the cost.

How We're Faster

  • Purpose-built AI accelerators — not general-purpose GPUs
  • Optimized software stack tuned end-to-end for inference
  • Disaggregated architecture that scales each resource independently

How We're Cheaper

  • Cheap energy inputs — $0.035/kWh vs $0.13 US average
  • 7x more energy-efficient accelerators (17 kW* vs 120 kW)
  • Air cooled — no expensive liquid cooling infrastructure
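Taken together, the two energy claims above compound. A quick back-of-envelope check, using only the per-rack figures quoted on this page:

```python
# Back-of-envelope check using the figures quoted above.
gc_power_kw, nv_power_kw = 17, 120   # rack power draw (kW)
gc_rate, nv_rate = 0.035, 0.13       # energy price ($ per kWh)

# Hourly energy cost per rack = power (kW) * price ($/kWh).
gc_cost = gc_power_kw * gc_rate      # 17 * 0.035 = $0.595/hour
nv_cost = nv_power_kw * nv_rate      # 120 * 0.13 = $15.60/hour

print(f"GC: ${gc_cost:.3f}/h, GPU cloud: ${nv_cost:.2f}/h")
print(f"Energy cost advantage: {nv_cost / gc_cost:.0f}x")
```

Cheaper power multiplied by lower draw works out to roughly a 26x lower hourly energy bill per rack under these assumptions.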

General Compute vs NVIDIA GPU Cloud

MiniMax M2.5 model comparison

  • Hardware: Purpose-built accelerators vs. general-purpose GPUs
  • Energy usage: 17 kW* vs. 120 kW (lower is better)
  • Energy cost: $0.035/kWh vs. $0.13/kWh (lower is better)
  • Throughput (MiniMax M2.5): 950 tok/s* vs. ~100 tok/s (higher is better)

*Projected on next-generation racks. NVIDIA throughput measured via Together AI benchmarks. Energy cost compares the US commercial average against our rate.
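One way to read the comparison above is to combine the throughput and power figures into energy per generated token. A sketch using only the numbers on this page (the NVIDIA throughput is approximate):

```python
# Joules per token = watts / (tokens per second), from the comparison above.
gc_j_per_tok = 17_000 / 950   # ≈ 17.9 J/token
nv_j_per_tok = 120_000 / 100  # = 1200 J/token

print(f"GC: {gc_j_per_tok:.1f} J/token, GPU cloud: {nv_j_per_tok:.0f} J/token")
print(f"Energy per token advantage: {nv_j_per_tok / gc_j_per_tok:.0f}x")
```

Under these assumptions the per-token energy gap is roughly 67x, which is where the combined speed and cost advantage comes from.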

Built for Every Workload

From prototyping to production, we have the infrastructure to match your needs.

API Access

REST API with OpenAI-compatible endpoints. Access the fastest models with a single API key.

Get API Key

Custom Deployments

Dedicated infrastructure with SLAs, custom scaling, and guaranteed capacity for your workloads.

Contact Sales

Bring Your Own Model

Deploy any model on our optimized infrastructure. Same speed, your weights.

Learn More

Speed That Speaks for Itself

Real numbers, real infrastructure, real-time performance.

  • Up to 7x* faster inference
  • Low time to first token*
  • Guaranteed uptime SLA
  • High sustained tokens per second

*Performance varies by model and geography.

Switch at warp speed

OpenAI-compatible API. Change your base URL and API key — that's it. Your existing code works instantly, just faster.

View Docs
main.py
from openai import OpenAI

# Point the standard OpenAI client at General Compute's endpoint.
client = OpenAI(
    base_url="https://api.generalcompute.com",
    api_key="your-api-key",
)

# Stream tokens back as they are generated.
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
$5 in free credit when you sign up

Start Building Today

Get your API key in seconds. OpenAI-compatible — just change your base URL and start shipping faster.
