
Using Vercel AI SDK With GeneralCompute: Full Integration Guide

General Compute

The Vercel AI SDK is a widely used way to add streaming AI responses to Next.js apps. It ships useChat and useCompletion hooks that handle token streaming, loading states, and conversation history, and it connects to any OpenAI-compatible API.

GeneralCompute exposes an OpenAI-compatible API, which means the SDK connects to it with a two-line config change. You get GeneralCompute's inference speeds with the same hooks and components you already know.

This guide walks through:

  • Setting up the Vercel AI SDK to use GeneralCompute as its provider
  • Building a chat interface with useChat
  • Building a text generation UI with useCompletion
  • Deploying your API routes to the Vercel edge runtime for minimum latency

How the SDK Works

The Vercel AI SDK separates the provider (which model API the server calls) from the hooks (how the UI consumes the response). On the server, you define a route handler that calls the model and streams the response back. On the client, useChat or useCompletion calls that route and reads the stream, exposing React state you bind to your UI.

This architecture keeps API keys on the server and makes the client code trivially simple.

Installation

npm install ai @ai-sdk/openai

The ai package is the core SDK. @ai-sdk/openai is the official OpenAI provider, which works with any OpenAI-compatible endpoint -- including GeneralCompute's.

Configuring the Provider

Create a shared provider config you can import into any route handler:

// lib/generalcompute.ts
import { createOpenAI } from "@ai-sdk/openai";

export const generalcompute = createOpenAI({
  apiKey: process.env.GENERALCOMPUTE_API_KEY,
  baseURL: "https://api.generalcompute.com/v1",
});

The createOpenAI factory accepts baseURL to override where requests go. From here, use generalcompute("model-name") anywhere you'd normally write openai("gpt-4o").
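To make the baseURL override concrete, the sketch below shows roughly what an OpenAI-style chat request looks like once the base URL points at GeneralCompute. It illustrates the convention and is not the SDK's internal code: buildChatCompletionRequest is a hypothetical helper, and the /chat/completions path and Bearer header follow the OpenAI API convention that an OpenAI-compatible endpoint is assumed to mirror.

```typescript
// Hypothetical helper: shows the shape of an OpenAI-style chat request.
// This is NOT the SDK's internal implementation.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildChatCompletionRequest(
  baseURL: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  // Strip trailing slashes so ".../v1/" and ".../v1" produce the same URL.
  const root = baseURL.replace(/\/+$/, "");
  return {
    url: `${root}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    // stream: true asks the server to send tokens as they are generated.
    body: JSON.stringify({ model, messages, stream: true }),
  };
}

const prepared = buildChatCompletionRequest(
  "https://api.generalcompute.com/v1",
  "your_api_key_here",
  "llama-4-maverick",
  [{ role: "user", content: "Hello" }],
);
console.log(prepared.url);
```

Only the URL root and the key differ between providers in this picture, which is why the switch is limited to the two fields passed to createOpenAI.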

Add your API key to .env.local:

GENERALCOMPUTE_API_KEY=your_api_key_here
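A missing key typically surfaces only when the first request is made. One option is to fail fast instead; the sketch below uses a hypothetical requireEnv helper (not part of the SDK) that takes the environment as a parameter so it can be exercised without touching process.env.

```typescript
// Hypothetical helper (not part of the AI SDK): read a required variable
// from an environment-like object, or throw with a clear message.
function requireEnv(
  key: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// In lib/generalcompute.ts this could be called as
// requireEnv("GENERALCOMPUTE_API_KEY", process.env).
const apiKey = requireEnv("GENERALCOMPUTE_API_KEY", {
  GENERALCOMPUTE_API_KEY: "your_api_key_here",
});
console.log(apiKey.length > 0);
```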

Building a Chat Interface With useChat

useChat is the SDK's primary hook for multi-turn conversations. It manages message history, handles the streaming response, and exposes loading and error states.

The Route Handler

First, define the server-side route that the hook will call. In the Next.js App Router, this goes in app/api/chat/route.ts:

// app/api/chat/route.ts
import { streamText } from "ai";
import { generalcompute } from "@/lib/generalcompute";

export const runtime = "edge";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: generalcompute("llama-4-maverick"),
    messages,
    system: "You are a helpful assistant. Be concise.",
  });

  return result.toDataStreamResponse();
}

streamText calls the model and returns a streaming result. toDataStreamResponse() formats it as the Vercel AI SDK's data stream protocol, which useChat on the client knows how to consume.

The export const runtime = "edge" line tells Vercel to deploy this function to its edge network. More on that below.
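The handler above trusts whatever JSON arrives. A minimal sketch of validating the body before it reaches streamText, assuming messages with plain string content as used in this guide; parseChatMessages is a hypothetical helper, not an SDK function, and it drops any extra fields on each message.

```typescript
// Hypothetical validation helper; not part of the AI SDK.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

const ALLOWED_ROLES = ["system", "user", "assistant"];

// Returns the validated messages, or null if the body is malformed.
function parseChatMessages(body: unknown): ChatMessage[] | null {
  if (typeof body !== "object" || body === null) return null;
  const messages = (body as { messages?: unknown }).messages;
  if (!Array.isArray(messages) || messages.length === 0) return null;

  const parsed: ChatMessage[] = [];
  for (const item of messages) {
    if (typeof item !== "object" || item === null) return null;
    const { role, content } = item as { role?: unknown; content?: unknown };
    if (typeof role !== "string" || !ALLOWED_ROLES.includes(role)) return null;
    if (typeof content !== "string") return null;
    parsed.push({ role: role as ChatMessage["role"], content });
  }
  return parsed;
}

// In the route handler, a null result could map to a 400 response:
//   const messages = parseChatMessages(await req.json());
//   if (!messages) return new Response("Invalid request body", { status: 400 });
console.log(parseChatMessages({ messages: [{ role: "user", content: "Hi" }] }));
console.log(parseChatMessages({ messages: "oops" }));
```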

The Chat Component

// app/chat/page.tsx
"use client";

import { useChat } from "ai/react";

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } =
    useChat({
      api: "/api/chat",
    });

  return (
    <div className="max-w-2xl mx-auto p-4">
      <div className="space-y-4 mb-4 min-h-96">
        {messages.map((message) => (
          <div
            key={message.id}
            className={`p-3 rounded-lg ${
              message.role === "user" ? "bg-blue-50 ml-8" : "bg-gray-50 mr-8"
            }`}
          >
            <p className="text-sm font-medium text-gray-500 mb-1">
              {message.role === "user" ? "You" : "Assistant"}
            </p>
            <p className="text-gray-800 whitespace-pre-wrap">{message.content}</p>
          </div>
        ))}
        {isLoading && (
          <div className="bg-gray-50 mr-8 p-3 rounded-lg">
            <p className="text-sm text-gray-400">Thinking...</p>
          </div>
        )}
      </div>

      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Ask something..."
          disabled={isLoading}
          className="flex-1 border border-gray-200 rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500"
        />
        <button
          type="submit"
          disabled={isLoading || !input.trim()}
          className="px-4 py-2 bg-blue-600 text-white text-sm rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </form>
    </div>
  );
}

useChat gives you messages (the full conversation array), input (controlled input value), handleInputChange and handleSubmit (event handlers you attach directly to the form), and isLoading (true while a response is streaming).

That's the complete chat UI. The hook handles everything else: appending your message to the history, sending the request, accumulating streamed tokens into the message object, and updating React state on each token.
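The accumulation step can be pictured as a pure function over the message array. This is a simplified sketch of the idea, not the hook's actual implementation; applyTextDelta and the ChatMessage shape are assumptions made for illustration.

```typescript
interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  content: string;
}

// Hypothetical reducer: returns a new array with the streamed text appended
// to the assistant message identified by id, creating it on the first chunk.
function applyTextDelta(
  messages: ChatMessage[],
  id: string,
  delta: string,
): ChatMessage[] {
  const exists = messages.some((m) => m.id === id);
  if (!exists) {
    const created: ChatMessage = { id, role: "assistant", content: delta };
    return [...messages, created];
  }
  // Build a new array and a new object so React sees a state change.
  return messages.map((m) =>
    m.id === id ? { ...m, content: m.content + delta } : m,
  );
}

let conversation: ChatMessage[] = [
  { id: "u1", role: "user", content: "Say hi" },
];
for (const chunk of ["Hel", "lo", "!"]) {
  conversation = applyTextDelta(conversation, "a1", chunk);
}
console.log(conversation[1].content); // prints "Hello!"
```

Returning fresh arrays and objects rather than mutating in place is what lets each chunk trigger a re-render in this model.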

Building a Text Generation UI With useCompletion

useCompletion is better suited for single-shot generation tasks: summarization, text transformation, content generation from a prompt. Unlike useChat, it doesn't maintain a conversation history.

The Route Handler

// app/api/completion/route.ts
import { streamText } from "ai";
import { generalcompute } from "@/lib/generalcompute";

export const runtime = "edge";

export async function POST(req: Request) {
  const { prompt } = await req.json();

  const result = streamText({
    model: generalcompute("llama-4-scout"),
    prompt,
    maxTokens: 1024,
  });

  return result.toDataStreamResponse();
}

The Completion Component

// app/complete/page.tsx
"use client";

import { useCompletion } from "ai/react";

export default function CompletionPage() {
  const { completion, input, handleInputChange, handleSubmit, isLoading } =
    useCompletion({
      api: "/api/completion",
    });

  return (
    <div className="max-w-2xl mx-auto p-4 space-y-4">
      <form onSubmit={handleSubmit} className="space-y-3">
        <textarea
          value={input}
          onChange={handleInputChange}
          placeholder="Enter your prompt..."
          rows={4}
          className="w-full border border-gray-200 rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500 resize-none"
        />
        <button
          type="submit"
          disabled={isLoading || !input.trim()}
          className="px-4 py-2 bg-blue-600 text-white text-sm rounded-lg disabled:opacity-50"
        >
          {isLoading ? "Generating..." : "Generate"}
        </button>
      </form>

      {completion && (
        <div className="border border-gray-100 rounded-lg p-4 bg-gray-50">
          <p className="text-sm font-medium text-gray-500 mb-2">Output</p>
          <p className="text-gray-800 whitespace-pre-wrap">{completion}</p>
        </div>
      )}
    </div>
  );
}

The completion string updates token-by-token as the response streams in. You can bind it directly to any UI element.

Customizing the Hook Behavior

Both hooks accept an options object with several useful settings:

const { messages, input, handleInputChange, handleSubmit } = useChat({
  api: "/api/chat",

  // Initial messages (useful for pre-seeding a conversation)
  initialMessages: [
    {
      id: "welcome",
      role: "assistant",
      content: "Hi, how can I help you today?",
    },
  ],

  // Run a callback when the stream finishes
  onFinish: (message) => {
    console.log("Final message:", message.content);
    // Save to database, update analytics, etc.
  },

  // Handle errors from the route handler
  onError: (error) => {
    console.error("Chat error:", error);
  },
});

If you need to send additional data with each request (a user ID, a selected document, a session token), pass it through body:

const { messages, handleSubmit } = useChat({
  api: "/api/chat",
  body: {
    userId: session.user.id,
    documentId: selectedDoc,
  },
});

The route handler receives this in the request body alongside messages.
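On the server, those extra fields are ordinary properties of the parsed JSON body. A sketch of pulling them out defensively, using the userId and documentId names from the example above; readChatContext is a hypothetical helper, not part of the SDK.

```typescript
interface ChatContext {
  userId: string | null;
  documentId: string | null;
}

// Hypothetical helper: extract the extra fields sent via the hook's `body`
// option. Anything missing or not a non-empty string becomes null.
function readChatContext(body: Record<string, unknown>): ChatContext {
  const asString = (value: unknown): string | null =>
    typeof value === "string" && value.length > 0 ? value : null;
  return {
    userId: asString(body.userId),
    documentId: asString(body.documentId),
  };
}

// Inside POST(req) this would run on the result of `await req.json()`.
const chatContext = readChatContext({
  messages: [],
  userId: "user_123",
  documentId: 42, // wrong type on purpose, so it is rejected
});
console.log(chatContext);
```

Since the client controls these values, anything security-sensitive (such as the user identity) is better derived from a server-side session than trusted from the body.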

Passing Extra Context From the Route Handler

Sometimes you want to send structured data back to the client alongside the stream -- for example, token usage stats or tool call results. The SDK supports this through data:

// app/api/chat/route.ts
import { streamText, createDataStreamResponse } from "ai";
import { generalcompute } from "@/lib/generalcompute";

export const runtime = "edge";

export async function POST(req: Request) {
  const { messages } = await req.json();

  return createDataStreamResponse({
    execute: async (dataStream) => {
      dataStream.writeData({ startTime: Date.now() });

      const result = streamText({
        model: generalcompute("llama-4-maverick"),
        messages,
        onFinish: ({ usage }) => {
          dataStream.writeData({
            inputTokens: usage.promptTokens,
            outputTokens: usage.completionTokens,
          });
        },
      });

      result.mergeIntoDataStream(dataStream);
    },
  });
}

On the client, read it with the data field from useChat:

const { messages, data } = useChat({ api: "/api/chat" });

// data is an array of objects you wrote with dataStream.writeData()
const latestUsage = data?.[data.length - 1];

This is useful for displaying token counts, cost estimates, or any metadata that doesn't belong in the message content itself.
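Because the route handler above writes a startTime entry before the usage entry, the last element of data is the usage object only once the stream has finished. A sketch of picking out the usage entry explicitly and turning it into a cost estimate; findUsage, estimateCostUsd, and the per-million-token prices are hypothetical placeholders, not GeneralCompute's actual pricing.

```typescript
interface UsageEntry {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical helper: scan from the end for the most recent usage entry.
function findUsage(data: unknown[] | undefined): UsageEntry | null {
  if (!data) return null;
  for (let i = data.length - 1; i >= 0; i--) {
    const entry = data[i] as Partial<UsageEntry> | null;
    if (
      entry !== null &&
      typeof entry === "object" &&
      typeof entry.inputTokens === "number" &&
      typeof entry.outputTokens === "number"
    ) {
      return { inputTokens: entry.inputTokens, outputTokens: entry.outputTokens };
    }
  }
  return null;
}

// Prices are per million tokens and are made-up placeholder values.
function estimateCostUsd(
  usage: UsageEntry,
  inputPricePerMillion: number,
  outputPricePerMillion: number,
): number {
  return (
    (usage.inputTokens * inputPricePerMillion +
      usage.outputTokens * outputPricePerMillion) /
    1000000
  );
}

const sampleData: unknown[] = [
  { startTime: 1700000000000 },
  { inputTokens: 1000, outputTokens: 500 },
];
const usage = findUsage(sampleData);
if (usage) {
  console.log(estimateCostUsd(usage, 1, 2));
}
```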

Edge Runtime Deployment

Adding export const runtime = "edge" to your route handlers deploys them to Vercel's edge network rather than serverless functions. The practical benefit: edge functions run in the region closest to your user, which cuts the round-trip time before the first token arrives.

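For a streaming chat app, TTFT (time to first token) is the most noticeable latency. For a user in Frankfurt, connecting to a US-east serverless function adds ~100ms of network overhead before the model starts generating. Edge deployment removes most of that on the hop between the user and your function; the function still has to reach the inference API, so the net gain depends on where that API is served from.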

One constraint: edge functions run in a limited runtime that excludes Node.js-specific APIs. The Vercel AI SDK is designed for this -- streamText and toDataStreamResponse work fine in the edge runtime. If your route handler uses Node.js APIs (file system, native modules), you'll need to keep it in the default serverless runtime.

To verify your routes work on edge before deploying, test with:

npx vercel dev

Vercel's local dev server emulates the edge runtime and will surface compatibility issues before you push.

Choosing a Model

GeneralCompute supports several models at different speed and quality points:

| Model | Context | Best for |
|---|---|---|
| llama-4-maverick | 128K | General chat, complex reasoning |
| llama-4-scout | 128K | Faster responses, lighter tasks, lower cost |
| qwen3-coder | 128K | Code generation, technical Q&A |

For a conversational UI where response quality matters, llama-4-maverick is a good default. For autocomplete or one-shot generation where speed matters more, llama-4-scout gives noticeably faster TTFT.

Swap the model name in your route handler -- no client code changes needed.
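If several routes share the choice, the table can be encoded once. A sketch with a hypothetical modelForTask helper; the task names are illustrative, and the model IDs are the ones listed in the table above.

```typescript
type Task = "chat" | "completion" | "code";

// Model IDs taken from the table above; the task categories are hypothetical.
const MODEL_BY_TASK: Record<Task, string> = {
  chat: "llama-4-maverick",
  completion: "llama-4-scout",
  code: "qwen3-coder",
};

function modelForTask(task: Task): string {
  return MODEL_BY_TASK[task];
}

// In a route handler: model: generalcompute(modelForTask("completion"))
console.log(modelForTask("completion"));
```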

Complete Project Structure

For reference, here's the full file tree for a project using both hooks:

my-ai-app/
  app/
    api/
      chat/
        route.ts        # useChat backend
      completion/
        route.ts        # useCompletion backend
    chat/
      page.tsx          # Chat UI
    complete/
      page.tsx          # Completion UI
  lib/
    generalcompute.ts   # Provider config
  .env.local            # GENERALCOMPUTE_API_KEY
  package.json

Getting Started

If you don't have a GeneralCompute API key yet, get one at generalcompute.com. The same key works across all SDK integrations -- the createOpenAI factory with baseURL pointed at GeneralCompute is all you need to route any Vercel AI SDK app through GeneralCompute's infrastructure.

The Vercel AI SDK docs cover the full API surface if you need to go deeper: tool calling, attachments, multi-step agent flows, and React Server Components support are all available once the basic integration is working.
