Structured Output with Zod: Type-Safe LLM Responses Across Providers
You already use Zod to validate API responses. Now it's the contract layer between your app and LLMs. One schema gives you runtime validation, TypeScript inference, AND structured output from GPT, Gemini, and Claude: enforced at the token level on OpenAI and Gemini, validated after the fact on Claude. Without it you're calling JSON.parse() on free text and praying the model didn't sneak in a markdown fence or an apologetic preamble.
The Problem — Free Text at the LLM Boundary
LLMs return strings. Your app needs typed objects. That mismatch used to mean regex extraction, retry loops, and prompt hacks like "respond ONLY in JSON, no other text." All of it fragile. One model update, one edge-case input, and your parser breaks at 3 AM.
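To make that fragility concrete, here is a minimal sketch of the kind of hand-rolled extraction described above. The helper name `extractJsonFromFreeText` and the sample replies are hypothetical, not taken from any SDK. It skips a preamble or markdown fence by grabbing the outermost braces, then hopes `JSON.parse` succeeds.

```typescript
// Hypothetical helper: pull a JSON object out of a free-text model reply.
// This is the fragile pre-structured-output approach, shown for contrast.
function extractJsonFromFreeText(raw: string): unknown {
  // Take everything from the first "{" to the last "}" to skip
  // preambles ("Sure! Here is the JSON:") and markdown fences.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model reply");
  }
  return JSON.parse(raw.slice(start, end + 1));
}

// A reply wrapped in a preamble and a markdown fence (fence built at runtime).
const fence = "`".repeat(3);
const reply = `Sure! Here is the JSON:\n${fence}json\n{"name": "Widget", "rating": 4}\n${fence}`;

const extracted = extractJsonFromFreeText(reply) as { name: string; rating: number };
console.log(extracted.name, extracted.rating);

// The same helper breaks as soon as the prose itself contains braces.
let failed = false;
try {
  extractJsonFromFreeText("I could not find a product in {this review}.");
} catch {
  failed = true;
}
console.log(failed);
```

The first call works, the second throws, and nothing in the helper can tell you in advance which kind of reply the model will send.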
What changed: providers now offer constrained decoding. The model's token sampling is masked at generation time so it literally cannot produce tokens that violate your schema. The output is structurally valid by construction, not by hope.
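The masking idea can be sketched with a toy decoder. This is an illustration only, not any provider's implementation: the vocabulary, the fixed scores, and the names `isLegal` and `constrainedDecode` are all made up for the example. The "schema" is a three-value enum, and a token is only eligible if it keeps the text a prefix of an allowed value.

```typescript
// Toy illustration of constrained decoding; not any provider's real implementation.
const ALLOWED_VALUES = ["positive", "negative", "mixed"];
const VOCAB = ["maybe", "pos", "neg", "mix", "itive", "ative", "ed", "!"];

// A token is legal if appending it keeps the text a prefix of some allowed value.
function isLegal(prefix: string, token: string): boolean {
  return ALLOWED_VALUES.some((value) => value.startsWith(prefix + token));
}

// Stand-in for a model: returns one score per vocabulary entry.
type ScoreFn = (prefix: string) => number[];

function constrainedDecode(score: ScoreFn): string {
  let text = "";
  while (!ALLOWED_VALUES.includes(text)) {
    const scores = score(text);
    let best = -1;
    for (let i = 0; i < VOCAB.length; i++) {
      // Masking step: illegal tokens are never considered, whatever their score.
      if (!isLegal(text, VOCAB[i])) continue;
      if (best === -1 || scores[i] > scores[best]) best = i;
    }
    if (best === -1) throw new Error("No legal continuation");
    text += VOCAB[best];
  }
  return text;
}

// This fake model scores "maybe" and "!" above everything else.
const biasedModel: ScoreFn = () => [9, 1, 2, 3, 1, 1, 1, 8];
const decoded = constrainedDecode(biasedModel);
console.log(decoded); // "mixed"
```

Even though the fake model prefers `"maybe"` and `"!"`, the mask removes them at every step, so the result is always one of the allowed values.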
The missing piece was a shared schema language that works at compile time (TypeScript types), runtime (validation), and wire time (JSON Schema for providers). Zod 4 fills that role.
How Each Provider Does It
| Provider | Mechanism | Guarantee | Zod Support |
|---|---|---|---|
| OpenAI | Constrained decoding | 100% structural compliance | zodResponseFormat() in SDK |
| Google Gemini | Constrained generation | 100% structural compliance | Native Zod in SDK |
| Anthropic Claude | Forced tool use | Very high (instruction-based) | Manual via z.toJSONSchema() |
OpenAI — Constrained Decoding
OpenAI masks output tokens at each generation step so only schema-valid continuations are possible. The Node SDK provides zodResponseFormat() to bridge Zod → JSON Schema:
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";
const ProductExtraction = z.object({
  name: z.string(),
  sentiment: z.enum(["positive", "negative", "mixed"]),
  rating: z.number().min(1).max(5),
  pros: z.array(z.string()),
  cons: z.array(z.string()),
});
const client = new OpenAI();
const completion = await client.chat.completions.create({
  model: "gpt-4o-2024-08-06",
  messages: [
    { role: "system", content: "Extract product data from the review." },
    { role: "user", content: reviewText },
  ],
  response_format: zodResponseFormat(ProductExtraction, "product_extraction"),
});
const result = ProductExtraction.parse(
  JSON.parse(completion.choices[0].message.content!),
);
// Structurally guaranteed by constrained decoding; .parse() gives you the typed variable
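The structural guarantee holds for every response the model completes, so malformed JSON needs no retry loop. Two cases still need handling: a refusal, and output cut off by the token limit, which can leave the JSON incomplete.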
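As a hedged sketch of guarding the `JSON.parse` call in the example above: the helper `readStructuredChoice` and the `ChoiceLike` shape are hypothetical and declared locally so the snippet runs without the SDK. They mirror the `finish_reason`, `message.content`, and `message.refusal` fields of a chat completion choice.

```typescript
// Minimal local shape of one chat completion choice; only the fields used here.
interface ChoiceLike {
  finish_reason: string;
  message: { content: string | null; refusal?: string | null };
}

type ChoiceResult =
  | { ok: true; json: unknown }
  | { ok: false; reason: "refusal" | "truncated" | "empty"; detail: string };

// Hypothetical helper: decide whether a choice is safe to JSON.parse.
function readStructuredChoice(choice: ChoiceLike): ChoiceResult {
  if (choice.message.refusal) {
    return { ok: false, reason: "refusal", detail: choice.message.refusal };
  }
  if (choice.finish_reason === "length") {
    // The token limit was hit, so the JSON may stop mid-object.
    return { ok: false, reason: "truncated", detail: "Output hit the token limit" };
  }
  if (choice.message.content === null) {
    return { ok: false, reason: "empty", detail: "No content returned" };
  }
  return { ok: true, json: JSON.parse(choice.message.content) };
}

const okChoice: ChoiceLike = {
  finish_reason: "stop",
  message: { content: '{"name":"Widget"}', refusal: null },
};
const cutOffChoice: ChoiceLike = {
  finish_reason: "length",
  message: { content: '{"name":"Wid' },
};

const okResult = readStructuredChoice(okChoice);
const cutOffResult = readStructuredChoice(cutOffChoice);
console.log(okResult.ok, cutOffResult.ok); // true false
```

With the real SDK you would pass `completion.choices[0]` and hand `json` to `ProductExtraction.parse()` only when `ok` is true.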
Anthropic — Tool Use Pattern
Claude doesn't have constrained decoding for arbitrary schemas. Instead, you define a "tool" with your schema and force the model to call it. The response arrives in a tool_use content block:
import Anthropic from "@anthropic-ai/sdk";
import { z } from "zod";
const ProductExtraction = z.object({
  name: z.string(),
  sentiment: z.enum(["positive", "negative", "mixed"]),
  rating: z.number().min(1).max(5),
  pros: z.array(z.string()),
  cons: z.array(z.string()),
});
const client = new Anthropic();
const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  tools: [
    {
      name: "extract_product",
      description: "Extract structured product data from a review.",
      input_schema: z.toJSONSchema(ProductExtraction),
    },
  ],
  tool_choice: { type: "tool", name: "extract_product" },
  messages: [{ role: "user", content: reviewText }],
});
const toolBlock = response.content.find((b) => b.type === "tool_use");
const result = toolBlock!.input; // typed as unknown, so validate with Zod
const parsed = ProductExtraction.parse(result);
This works well in practice — Claude's instruction-following reliability is very high. But it's not a token-level constraint. The model could theoretically produce invalid output. Always validate.
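A minimal sketch of that "always validate" step, written so it runs without the SDK. The block shapes are declared locally and only mirror the `type`, `name`, and `input` fields used above. `readToolInput` and `validateSentimentOnly` are hypothetical names, and the hand-written validator is a stand-in for a Zod `.parse()` call.

```typescript
// Minimal local shapes of response content blocks; only the fields used here.
type ToolUseLike = { type: "tool_use"; id: string; name: string; input: unknown };
type ContentBlockLike = { type: "text"; text: string } | ToolUseLike;

// Hypothetical helper: find the forced tool call and validate its input.
function readToolInput<T>(
  content: ContentBlockLike[],
  toolName: string,
  validate: (input: unknown) => T,
): T {
  const block = content.find(
    (b): b is ToolUseLike => b.type === "tool_use" && b.name === toolName,
  );
  if (!block) throw new Error(`Model did not call ${toolName}`);
  // `input` is unknown until the validator has accepted it.
  return validate(block.input);
}

// Stand-in validator; in real code this would be the Zod schema's parse.
const SENTIMENTS = ["positive", "negative", "mixed"];
function validateSentimentOnly(input: unknown): { sentiment: string } {
  const value = (input as { sentiment?: unknown } | null)?.sentiment;
  if (typeof value !== "string" || !SENTIMENTS.includes(value)) {
    throw new Error("Invalid sentiment");
  }
  return { sentiment: value };
}

const responseContent: ContentBlockLike[] = [
  { type: "text", text: "Calling the tool." },
  { type: "tool_use", id: "toolu_example", name: "extract_product", input: { sentiment: "mixed" } },
];
const toolResult = readToolInput(responseContent, "extract_product", validateSentimentOnly);
console.log(toolResult.sentiment); // "mixed"
```

With the schema from the example above, the third argument would be `(input) => ProductExtraction.parse(input)`, and a thrown error is the signal to retry or fall back.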
Google Gemini — Native Zod in the SDK
The @google/genai SDK accepts Zod schemas directly. No manual JSON Schema conversion needed:
import { GoogleGenAI } from "@google/genai";
import { z } from "zod";
const ProductExtraction = z.object({
  name: z.string(),
  sentiment: z.enum(["positive", "negative", "mixed"]),
  rating: z.number().min(1).max(5),
  pros: z.array(z.string()),
  cons: z.array(z.string()),
});
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY! });
const response = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents: [{ role: "user", parts: [{ text: reviewText }] }],
  config: {
    responseMimeType: "application/json",
    responseSchema: ProductExtraction,
  },
});
const result = ProductExtraction.parse(JSON.parse(response.text!));
// Constrained generation keeps the JSON structurally valid; .parse() gives you the typed variable
Gemini uses constrained generation internally. Same guarantee as OpenAI, with the added benefit of native Zod support — no adapter function needed.
Zod 4 as the Universal Schema Layer
Zod 4 ships with built-in JSON Schema conversion via z.toJSONSchema(). The old zod-to-json-schema package is no longer needed.
import { z } from "zod";
const schema = z.object({
  name: z.string().meta({ description: "Product name" }),
  rating: z.number().min(1).max(5),
});
const jsonSchema = z.toJSONSchema(schema);
// { type: "object", properties: {...}, required: ["name", "rating"], additionalProperties: false }
The defaults align perfectly with OpenAI's structured output constraints: all fields are required by default in z.object(), and additionalProperties: false is set in the generated JSON Schema. No manual tweaking needed.
.describe("...") is shorthand for .meta({ description: "..." }); use whichever reads better in your schema. The SDK examples below use .describe() for brevity.
Define once, use everywhere. The same Zod schema validates your form input, types your API response, and constrains your LLM output. One source of truth for your data shape across the entire stack.
Abstraction Layers
If you'd rather not deal with provider-specific wiring, abstraction layers handle it for you.
Vercel AI SDK v6
The AI SDK's Output.object() wraps provider differences into a single interface:
import { generateText, Output } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
const ProductExtraction = z.object({
  name: z.string().describe("Product name mentioned in the review"),
  sentiment: z.enum(["positive", "negative", "mixed"]),
  rating: z.number().min(1).max(5),
  pros: z.array(z.string()),
  cons: z.array(z.string()),
});
const { output } = await generateText({
  model: openai("gpt-4o"),
  prompt: `Extract product data from this review:\n\n${reviewText}`,
  output: Output.object({ schema: ProductExtraction }),
});
// output is fully typed as z.infer<typeof ProductExtraction>
console.log(output.name, output.sentiment);
Swap openai("gpt-4o") for google("gemini-2.5-flash") or anthropic("claude-sonnet-4-20250514") — same code, same types. The SDK picks the right mechanism per provider.
For streaming, use streamText() with the same Output.object() config. The SDK throws NoObjectGeneratedError if the model fails to produce valid output.
TanStack AI
TanStack AI provides a useGeneration() hook for streaming structured output directly into React state. It's provider-portable via the AG-UI protocol. See the TanStack AI Beta post for the full API and setup guide.
Pitfalls
.optional() vs .nullable(): OpenAI requires every field to be present in the output. If a field may not have a value, use .nullable() (outputs null) instead of .optional() (omits the key). An optional field drops out of the JSON Schema's required list, and OpenAI's structured output rejects a schema like that.
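One way to catch this before a request goes out is to lint the generated JSON Schema for properties that are missing from `required`. The sketch below is an assumption-laden illustration: `findOptionalProperties` is a hypothetical helper, and the two schemas are hand-written approximations of what an optional and a nullable field look like after conversion.

```typescript
// Just enough of JSON Schema for this check.
interface JsonSchemaNode {
  type?: string;
  properties?: Record<string, JsonSchemaNode>;
  required?: string[];
  items?: JsonSchemaNode;
  anyOf?: JsonSchemaNode[];
}

// Hypothetical lint: list the paths of properties that are not in `required`.
function findOptionalProperties(schema: JsonSchemaNode, path = "$"): string[] {
  const problems: string[] = [];
  const required = new Set(schema.required ?? []);
  for (const [key, child] of Object.entries(schema.properties ?? {})) {
    const childPath = `${path}.${key}`;
    if (!required.has(key)) problems.push(childPath);
    problems.push(...findOptionalProperties(child, childPath));
  }
  if (schema.items) {
    problems.push(...findOptionalProperties(schema.items, `${path}[]`));
  }
  for (const option of schema.anyOf ?? []) {
    problems.push(...findOptionalProperties(option, path));
  }
  return problems;
}

// Roughly what an optional `brand` looks like: the key is absent from `required`.
const optionalBrand: JsonSchemaNode = {
  type: "object",
  properties: { name: { type: "string" }, brand: { type: "string" } },
  required: ["name"],
};

// Roughly what a nullable `brand` looks like: required, but null is allowed.
const nullableBrand: JsonSchemaNode = {
  type: "object",
  properties: {
    name: { type: "string" },
    brand: { anyOf: [{ type: "string" }, { type: "null" }] },
  },
  required: ["name", "brand"],
};

console.log(findOptionalProperties(optionalBrand)); // ["$.brand"]
console.log(findOptionalProperties(nullableBrand)); // []
```

In a real project you would feed it the output of `z.toJSONSchema(yourSchema)` in a unit test rather than hand-written objects.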
Schema complexity limits. OpenAI enforces limits on schema size, documented in its structured output guide, including maximum property counts, nesting depth, and enum values. In practice, keep schemas focused on the task. If you're hitting limits, you probably need to split the extraction into multiple calls.
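A rough way to keep an eye on schema size is to measure the generated JSON Schema yourself. The sketch below is illustrative only: `measureSchema` is a hypothetical helper, the way it counts depth (root object is level 1) is an assumption, and it deliberately hard-codes no limits, since the actual numbers belong to the provider's documentation.

```typescript
// Just enough of JSON Schema to walk the tree.
interface SchemaNode {
  properties?: Record<string, SchemaNode>;
  items?: SchemaNode;
  anyOf?: SchemaNode[];
  enum?: unknown[];
}

interface SchemaStats {
  propertyCount: number;
  maxDepth: number;
  enumValueCount: number;
}

// Hypothetical helper: count properties, enum values, and nesting depth.
function measureSchema(schema: SchemaNode, depth = 1): SchemaStats {
  const stats: SchemaStats = {
    propertyCount: 0,
    maxDepth: depth,
    enumValueCount: schema.enum?.length ?? 0,
  };
  const absorb = (child: SchemaStats): void => {
    stats.propertyCount += child.propertyCount;
    stats.enumValueCount += child.enumValueCount;
    stats.maxDepth = Math.max(stats.maxDepth, child.maxDepth);
  };
  for (const child of Object.values(schema.properties ?? {})) {
    stats.propertyCount += 1;
    absorb(measureSchema(child, depth + 1));
  }
  if (schema.items) absorb(measureSchema(schema.items, depth + 1));
  // Union branches sit at the same level as the union itself.
  for (const option of schema.anyOf ?? []) absorb(measureSchema(option, depth));
  return stats;
}

const reviewSchema: SchemaNode = {
  properties: {
    name: {},
    sentiment: { enum: ["positive", "negative", "mixed"] },
    pros: { items: {} },
    seller: { properties: { name: {}, country: {} } },
  },
};

const schemaStats = measureSchema(reviewSchema);
console.log(schemaStats); // { propertyCount: 6, maxDepth: 3, enumValueCount: 3 }
```

Comparing these numbers against the documented limits in a test gives an early warning before a schema grows past what the API accepts.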
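Root-level unions not allowed. You can't use z.union([...]) at the top level with OpenAI; the root of the schema must be an object. Wrap the union in an object property:
// Won't work as response_format
const Bad = z.union([SchemaA, SchemaB]);
// Works: the root is an object and the union lives in a property.
// SchemaA and SchemaB must each declare a literal `type` field to discriminate on,
// for example type: z.literal("a") and type: z.literal("b").
const Good = z.object({
  data: z.discriminatedUnion("type", [SchemaA, SchemaB]),
});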
Token overhead. The schema is injected into the model's context window. Complex schemas with many descriptions eat tokens. Keep schemas lean: add .meta({ description }) only where the model needs guidance to fill the field correctly. This matters more as you scale calls. See Copilot Token Efficiency for related context-budget thinking.
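To get a feel for what descriptions cost, you can compare a schema with and without them. This is a back-of-the-envelope sketch: `estimateTokens` uses the common rule of thumb of about four characters per token, which is an assumption and not any model's real tokenizer, and `stripDescriptions` is a hypothetical helper.

```typescript
// Rough heuristic only: about four characters per token.
// Real tokenizers differ by model; use the provider's tokenizer for exact numbers.
function estimateTokens(value: unknown): number {
  return Math.ceil(JSON.stringify(value).length / 4);
}

// Recursively drop string-valued "description" keys (the JSON Schema annotation).
// A property that happens to be named "description" has an object value and is kept.
function stripDescriptions(node: unknown): unknown {
  if (Array.isArray(node)) return node.map(stripDescriptions);
  if (node !== null && typeof node === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(node as Record<string, unknown>)) {
      if (key === "description" && typeof child === "string") continue;
      out[key] = stripDescriptions(child);
    }
    return out;
  }
  return node;
}

const describedSchema = {
  type: "object",
  properties: {
    name: { type: "string", description: "Exact product name as written in the review" },
    rating: { type: "number", description: "Inferred 1-5 rating" },
  },
  required: ["name", "rating"],
};

const withDescriptions = estimateTokens(describedSchema);
const withoutDescriptions = estimateTokens(stripDescriptions(describedSchema));
console.log({ withDescriptions, withoutDescriptions });
```

The absolute numbers are only estimates; the useful signal is the gap between the two, multiplied by the number of calls you make.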
Structural guarantee ≠ semantic guarantee. With constrained decoding, the model will always produce valid JSON matching your schema. It can still hallucinate a product name that doesn't exist or assign a rating that contradicts the review text. Constrained decoding solves the parsing problem, not the accuracy problem. You still need domain validation for critical fields.
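Domain validation can be as simple as a second pass over the parsed object. The checks below are a sketch under stated assumptions: `findSemanticIssues` is a hypothetical helper, and the thresholds for a "contradictory" rating are arbitrary choices for illustration.

```typescript
interface ExtractedReview {
  name: string;
  sentiment: "positive" | "negative" | "mixed";
  rating: number;
}

// Hypothetical domain checks that run after schema validation has passed.
function findSemanticIssues(review: ExtractedReview, sourceText: string): string[] {
  const issues: string[] = [];
  // Grounding check: the product name should literally appear in the review.
  if (!sourceText.toLowerCase().includes(review.name.toLowerCase())) {
    issues.push(`Product name "${review.name}" does not appear in the source text`);
  }
  // Consistency checks: rating and sentiment should not contradict each other.
  if (review.sentiment === "positive" && review.rating <= 2) {
    issues.push("Positive sentiment with a rating of 2 or lower");
  }
  if (review.sentiment === "negative" && review.rating >= 4) {
    issues.push("Negative sentiment with a rating of 4 or higher");
  }
  return issues;
}

const sourceText = "I've been using the Sony WH-1000XM5 for three months and love it.";

const grounded = findSemanticIssues(
  { name: "Sony WH-1000XM5", sentiment: "positive", rating: 5 },
  sourceText,
);
const hallucinated = findSemanticIssues(
  { name: "Bose QC45", sentiment: "positive", rating: 1 },
  sourceText,
);
console.log(grounded.length, hallucinated.length); // 0 2
```

Both objects are structurally valid; only the second pass tells them apart.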
End-to-End Example
A complete flow: define a schema, call the model, get a typed result with zero casts.
import { generateText, Output } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
// 1. Define the schema: single source of truth
const ProductReview = z.object({
  productName: z.string().describe("Exact product name from the review"),
  brand: z.string().nullable().describe("Brand if mentioned, null otherwise"),
  sentiment: z.enum(["positive", "negative", "mixed"]),
  rating: z.number().min(1).max(5).describe("Inferred 1-5 rating"),
  pros: z.array(z.string()).describe("Key positive points"),
  cons: z.array(z.string()).describe("Key negative points"),
  recommendsBuying: z.boolean(),
});
type ProductReview = z.infer<typeof ProductReview>;
// 2. Call the model with structured output
async function extractReview(reviewText: string): Promise<ProductReview> {
  const { output } = await generateText({
    model: openai("gpt-4o"),
    prompt: `Extract structured product data from the following review.
Be precise: only include information explicitly stated or clearly implied.
Review:
${reviewText}`,
    output: Output.object({ schema: ProductReview }),
  });
  return output;
}
// 3. Use it: fully typed, no parsing gymnastics
const review = await extractReview(`
I've been using the Sony WH-1000XM5 for three months.
The noise cancellation is incredible and battery life exceeds expectations.
My only gripe is the carrying case is bulkier than the XM4's.
Would absolutely recommend for frequent flyers.
`);
console.log(review.productName); // "Sony WH-1000XM5"
console.log(review.sentiment); // "positive"
console.log(review.pros); // ["incredible noise cancellation", "battery life exceeds expectations"]
No JSON.parse(). No type assertions. No retry loops. The schema is the contract, Zod enforces it at every layer, and the provider guarantees structural compliance at generation time.
Where to Go From Here
- TanStack AI Beta: if you want provider-portable structured output with React hooks, start here.
- AG-UI Protocol: the event-based wire format that makes TanStack AI's provider portability possible.
- Copilot Token Efficiency: schema injection costs tokens. This post covers how to think about context budgets.
- Zod 4 JSON Schema docs: the full reference for z.toJSONSchema() options, targets, and edge cases.
- OpenAI Structured Outputs guide: schema constraints, supported types, and refusal handling.
- Vercel AI SDK, Generating Structured Data: Output.object(), Output.array(), streaming, and error handling.
- Google Gemini Structured Output: native Zod support, streaming JSON, and combining with tools.
