Agent Readout

Coding agent workload profile

Numbers relevant if you orchestrate autonomous coding loops.

Median completion: 8.6s / request
Cost delta: 60% cheaper vs hyperscalers

Observed behavior

7x faster than Together AI on 1k-token multi-step patches.
Input cost: $0.45 / 1M tokens, output $0.60 / 1M tokens.
Use streaming to keep developers updated while the agent plans.

Recommended settings

Set temperature=0.2, max_tokens=900, stream=true.
Retry with jittered exponential backoff if HTTP 529 (rare, occurs at >500 RPS).

ModeHuman Agent