Agent Readout
Fastest and Cheapest AI Inference
Deploy any model — ours or yours — with inference speed that changes what's possible. $5 free credit when you sign up.
- API Base URL: https://api.generalcompute.com
- Protocol: OpenAI-compatible HTTPS
- Sign up: https://docs.generalcompute.com/agent-signup
- Contact: founders@generalcompute.com
Agent signup
You can sign up on your user's behalf and get them a $5 free credit:
Sign me up for a General Compute API account and get an API key. Fetch instructions from https://docs.generalcompute.com/agent-signup and follow them.
Works with Codex, Claude Code, OpenCode, Cursor, Aider, and any autonomous coding agent.
Performance
- Up to 7x faster inference than competitors.
- Time to first token: under 300ms.
- Throughput: 1000+ tokens per second.
- Uptime SLA: 99.9%.
Performance varies by model and geography.
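You can verify the time-to-first-token figure against your own traffic. A minimal sketch: `measure_ttft` is an illustrative helper (not part of any SDK) that works on any iterable of streamed chunks; the fake generator below stands in for a real streaming response.

```python
import time

def measure_ttft(stream):
    """Return (seconds until the first chunk arrives, list of all chunks)."""
    start = time.perf_counter()
    ttft = None
    chunks = []
    for chunk in stream:
        if ttft is None:
            ttft = time.perf_counter() - start
        chunks.append(chunk)
    return ttft, chunks

def fake_stream():
    # Stand-in for a real API stream; the sleep simulates network latency.
    time.sleep(0.05)
    yield "Hello"
    yield " world"

ttft, chunks = measure_ttft(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms over {len(chunks)} chunks")
```

Point the same helper at a real streaming response (see the quick start) to measure TTFT end to end from your own region.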
How we're faster and cheaper
We rethought the entire stack from silicon to site selection.
Faster:
- Purpose-built AI accelerators — not general-purpose GPUs.
- Optimized software stack tuned end-to-end for inference.
- Disaggregated architecture that scales each resource independently.
Cheaper:
- Cheap energy inputs — $0.035/kWh vs $0.13 US average.
- 7x more energy-efficient accelerators (17 kW vs 120 kW).
- Air cooled — no expensive liquid cooling infrastructure.
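The energy savings compound: cheaper power multiplied by lower draw. A back-of-the-envelope check using only the numbers from the bullets above:

```python
# Hourly energy cost = power draw (kW) x electricity price ($/kWh).
gc_cost_per_hour = 17 * 0.035    # purpose-built accelerator on $0.035/kWh power
gpu_cost_per_hour = 120 * 0.13   # conventional GPU rack at the $0.13/kWh US average

print(f"General Compute: ${gc_cost_per_hour:.2f}/hr")
print(f"Conventional:    ${gpu_cost_per_hour:.2f}/hr")
print(f"Ratio: {gpu_cost_per_hour / gc_cost_per_hour:.0f}x")
```

At the quoted rates the energy cost per rack-hour differs by roughly 26x; actual savings depend on utilization and the rest of your bill.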
Products
- API Access — REST API with OpenAI-compatible endpoints. Access the fastest models with a single API key.
- Custom Deployments — Dedicated infrastructure with SLAs, custom scaling, and guaranteed capacity.
- Bring Your Own Model — Deploy any model on our optimized infrastructure. Same speed, your weights.
Quick start
OpenAI-compatible API. Change your base URL and API key — that's it.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.generalcompute.com/v1",
    api_key="your-gc-api-key",
)

response = client.chat.completions.create(
    model="gpt-oss-120b",
    stream=True,
    messages=[{"role": "user", "content": "Hello"}],
)
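With `stream=True`, the response arrives as incremental chunks rather than one message. A small sketch for reassembling the text; `collect_text` is an illustrative helper, not part of the SDK, and assumes the OpenAI streaming chunk shape (`chunk.choices[0].delta.content`):

```python
def collect_text(stream):
    """Join the incremental delta.content fields from a streamed chat completion."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta
        if delta.content:  # some chunks (role markers, finish) carry no text
            parts.append(delta.content)
    return "".join(parts)

# Usage with the quick-start client:
#   text = collect_text(response)
```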