Agent Readout

Coding agent workload profile

Numbers relevant if you orchestrate autonomous coding loops.

Median completion
8.6s / request
Cost delta
60% cheaper vs hyperscalers

Observed behavior

  • 7x faster than Together AI on 1k-token multi-step patches.
  • Input cost: $0.45 / 1M tokens, output $0.60 / 1M tokens.
  • Use streaming to keep developers updated while the agent plans.

Recommended settings

  • Set temperature=0.2, max_tokens=900, stream=true.
  • Retry with jittered exponential backoff if HTTP 529 (rare, occurs at >500 RPS).
ModeHumanAgent