Getting Started with LLM JSON Mode: First API Calls
Your first JSON Mode API call is simpler than you think. Modern LLM APIs abstract away the complexity; you define a schema and the API handles the rest. This guide walks you through real, runnable examples with OpenAI and Anthropic, so you can see structured output in action within five minutes.
Prerequisites
You need:
- An API key from OpenAI (gpt-4o or gpt-4o-mini; the json_schema response format used below is not accepted by gpt-4-turbo) or Anthropic (Claude 3.5).
- Python 3.8+ with the openai or anthropic package installed.
- Basic JSON and Python dict familiarity.
Install dependencies:
pip install openai anthropic
Example 1: OpenAI JSON Mode with Response Format
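OpenAI exposes structured output through the response_format parameter. Setting its type to json_schema (the feature OpenAI calls Structured Outputs) constrains the reply to a schema you supply; the older json_object type, plain JSON mode, only guarantees syntactically valid JSON. Here's a complete example: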
import json
import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Define the schema as a Python dict matching the JSON Schema spec.
# Strict mode requires additionalProperties: False and every property in "required".
# Range and length keywords (minimum, maximum, maxItems) may be rejected in strict
# mode by older model snapshots; drop them if the API returns a schema error.
schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        "key_phrases": {"type": "array", "items": {"type": "string"}, "maxItems": 5},
    },
    "required": ["sentiment", "confidence", "key_phrases"],
    "additionalProperties": False,
}

# Make the API call (json_schema needs gpt-4o-2024-08-06 or later, or gpt-4o-mini)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "Analyze the sentiment of: 'I love this product, works perfectly!'",
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "SentimentAnalysis",
            "schema": schema,
            "strict": True,  # Enforce strict schema compliance
        },
    },
)

# Parse the response
result = json.loads(response.choices[0].message.content)
print(f"Sentiment: {result['sentiment']}")
print(f"Confidence: {result['confidence']}")
print(f"Key phrases: {result['key_phrases']}")
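Key points:
- "strict": True constrains generation so that a complete response matches the schema (no extra fields, no missing required fields).
- While decoding, the model cannot emit a token sequence that violates the schema.
- Some error handling is still needed: the model can refuse (message.refusal is set and content is empty), and a reply cut off at the token limit (finish_reason of "length") may not be valid JSON.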
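Because a reply can be a refusal or can stop at the token limit, a small guard in front of json.loads is cheap insurance. The following is a minimal sketch, assuming the OpenAI chat-completions response shape (message.content, message.refusal, choice.finish_reason); the helper name parse_structured_reply is hypothetical and not part of any SDK.

```python
import json
from typing import Optional


def parse_structured_reply(content: Optional[str],
                           refusal: Optional[str],
                           finish_reason: str) -> dict:
    """Parse a structured-output reply, failing loudly on known edge cases.

    Hypothetical helper: pass in message.content, message.refusal and
    choice.finish_reason taken from a chat-completions response.
    """
    if refusal:
        # The model declined to answer, so there is no JSON payload to parse.
        raise ValueError(f"Model refused: {refusal}")
    if finish_reason == "length":
        # Output hit the token limit, so the JSON is probably cut off.
        raise ValueError("Reply truncated at the token limit")
    if not content:
        raise ValueError("Reply has no content")
    return json.loads(content)
```

With a real response, this would be called as parse_structured_reply(choice.message.content, choice.message.refusal, choice.finish_reason), where choice is response.choices[0].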
Example 2: Anthropic Structured Output
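The Anthropic Messages API does not take a response_format parameter. On Claude 3.5 Sonnet and later, the established way to get schema-shaped JSON is a forced tool call: declare a tool whose input_schema is your schema, then set tool_choice so the model must call it. The tool name record_product below is illustrative. The arguments come back as an already-parsed dict, so there is no json.loads step. They follow the schema closely but are not formally guaranteed to, so validate anything critical.
import os

from anthropic import Anthropic

client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "category": {"type": "string", "enum": ["electronics", "clothing", "food", "other"]},
        "price_usd": {"type": "number", "minimum": 0},
        "in_stock": {"type": "boolean"},
    },
    "required": ["product_name", "category", "price_usd"],
}

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[
        {
            "name": "record_product",
            "description": "Record product details extracted from the text.",
            "input_schema": schema,
        }
    ],
    # Force the model to call the tool instead of replying in prose
    tool_choice={"type": "tool", "name": "record_product"},
    messages=[
        {
            "role": "user",
            "content": "Extract product info from: 'AirPods Pro, cost $249, available now.'",
        }
    ],
)

# The tool arguments arrive as a dict on the tool_use content block
tool_call = next(block for block in response.content if block.type == "tool_use")
result = tool_call.input
print(f"Product: {result['product_name']}")
print(f"Category: {result['category']}")
print(f"Price: ${result['price_usd']}")
print(f"In stock: {result.get('in_stock', True)}")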
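However the JSON is obtained, it can be worth checking the parsed dict against the schema before using it. The sketch below uses only the standard library; check_required_and_enums is a hypothetical helper that covers just the required and enum keywords, not the full JSON Schema specification.

```python
def check_required_and_enums(data: dict, schema: dict) -> list:
    """Return a list of human-readable problems (an empty list means OK)."""
    problems = []
    # Every key named in "required" must be present in the data.
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required field: {key}")
    # Any present key with an "enum" must hold one of the allowed values.
    for key, spec in schema.get("properties", {}).items():
        if key in data and "enum" in spec and data[key] not in spec["enum"]:
            problems.append(f"{key}={data[key]!r} is not one of {spec['enum']}")
    return problems


# Same shape as the product schema above, trimmed to the checked keywords.
product_schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "category": {"type": "string", "enum": ["electronics", "clothing", "food", "other"]},
        "price_usd": {"type": "number"},
    },
    "required": ["product_name", "category", "price_usd"],
}

for problem in check_required_and_enums(
    {"product_name": "AirPods Pro", "category": "audio"}, product_schema
):
    print(problem)
```

For full validation, a dedicated JSON Schema validator library is the better choice; this sketch only illustrates the idea.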
Example 3: Minimal Schema (Beginner-Friendly)
Not all fields need constraints. A simple schema with just type checking:
import json

from openai import OpenAI

client = OpenAI()  # Reads OPENAI_API_KEY from the environment

# Minimal schema: just declare the structure
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string"},
    },
    "required": ["name", "age", "email"],
    "additionalProperties": False,  # Required by strict mode
}

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "Extract the person's details from: 'My name is Alice, I'm 30, and alice@example.com is my email.'",
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "PersonInfo",
            "schema": schema,
            "strict": True,
        },
    },
)

person = json.loads(response.choices[0].message.content)
print(person)  # e.g. {'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}
Testing and Debugging
When first building with JSON Mode, print the raw response to verify correctness:
response = client.chat.completions.create(...)
raw_json = response.choices[0].message.content
print("Raw JSON:", raw_json)
result = json.loads(raw_json)
print("Parsed result:", result)
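If parsing fails (rare with a strict schema), check:
- Is strict: True enabled? (It should be.)
- Does the model support json_schema? It requires gpt-4o-2024-08-06 or later, or gpt-4o-mini; gpt-4-turbo and gpt-3.5-turbo accept only the json_object type.
- Was the reply cut off (finish_reason of "length") or refused (message.refusal set)?
- Does your schema have syntax errors? (Use a JSON Schema validator at https://jsonschema.org/validator.)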
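When json.loads does fail, the exception already says where. The sketch below turns that into a short message; explain_json_error is a hypothetical helper name, while msg, lineno, colno and pos are standard attributes of json.JSONDecodeError.

```python
import json


def explain_json_error(raw: str, context: int = 20) -> str:
    """Return 'ok', or a message locating the first JSON syntax error."""
    try:
        json.loads(raw)
        return "ok"
    except json.JSONDecodeError as err:
        # Show a few characters either side of the failure position.
        start = max(err.pos - context, 0)
        snippet = raw[start:err.pos + context]
        return f"{err.msg} at line {err.lineno}, column {err.colno}: ...{snippet}..."


print(explain_json_error('{"sentiment": "positive"}'))
print(explain_json_error('{"sentiment": "positive", "confidence": 0.9'))
```

The second call simulates a reply that was cut off before the closing brace, which is the typical shape of a truncated response.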
Common Schema Patterns
Required fields only (no optional)
{
  "type": "object",
  "properties": {
    "title": {"type": "string"},
    "category": {"type": "string"}
  },
  "required": ["title", "category"]
}
Optional fields
{
  "type": "object",
  "properties": {
    "title": {"type": "string"},
    "description": {"type": "string"},
    "tags": {"type": "array", "items": {"type": "string"}}
  },
  "required": ["title"]
}
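One caveat for the optional-fields pattern: OpenAI's strict mode expects every property to be listed in required and additionalProperties to be false, so an optional field is usually expressed there as a type that also allows null. The following sketch shows one way to derive such a schema from the pattern above. The helper to_strict_schema is hypothetical and handles only top-level properties, not nested objects.

```python
import copy


def to_strict_schema(schema: dict) -> dict:
    """Return a copy of a flat object schema adjusted for strict mode."""
    strict = copy.deepcopy(schema)
    originally_required = set(strict.get("required", []))
    for name, spec in strict.get("properties", {}).items():
        if name in originally_required:
            continue
        current = spec.get("type")
        if current is None:
            continue  # No type keyword to widen; leave the field unchanged.
        # Optional field: allow null instead of omitting the key.
        types = list(current) if isinstance(current, list) else [current]
        if "null" not in types:
            spec["type"] = types + ["null"]
    # Strict mode: all properties required, no extra keys allowed.
    strict["required"] = list(strict.get("properties", {}))
    strict["additionalProperties"] = False
    return strict


optional_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title"],
}

print(to_strict_schema(optional_schema))
```

The model then returns null for fields it cannot fill, rather than leaving the key out.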
Constrained numbers
{
  "type": "object",
  "properties": {
    "rating": {"type": "number", "minimum": 0, "maximum": 5},
    "count": {"type": "integer", "minimum": 1}
  },
  "required": ["rating", "count"]
}
Key Takeaways
- On OpenAI, schema-constrained output is enabled via response_format={"type": "json_schema", "json_schema": {...}}. The Anthropic Messages API has no response_format parameter; there, force a tool call whose input_schema is your schema.
- Define your schema as a JSON Schema object; the API handles enforcement.
- Set strict: True so that complete responses match your schema. In strict mode every object needs additionalProperties: false and every property listed in required.
- Use json.loads() to parse, after checking for a refusal or a truncated reply.
- Start with minimal schemas (type + required fields) and add constraints (enums, min/max, patterns) as needed.
Frequently Asked Questions
Why does my JSON Mode request fail?
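Common causes: (1) the model does not support json_schema (it requires gpt-4o-2024-08-06 or later, or gpt-4o-mini; gpt-4-turbo and gpt-3.5-turbo accept only json_object), (2) schema syntax error (validate at https://jsonschema.org/validator), (3) missing response_format parameter, (4) a schema that breaks strict-mode rules: with strict: True, every object needs additionalProperties: false and every property must appear in required.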
Can I use JSON Mode with streaming responses?
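Yes. Both OpenAI and Anthropic can stream structured output. The chunks are fragments of a single JSON document and are not valid JSON on their own, so accumulate them and parse once the stream has finished. If the stream stops at the token limit, the accumulated text may still be incomplete, so check the stop reason before parsing.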
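The accumulate-then-parse approach can be sketched without any network call. Here the list of fragments stands in for the text deltas a streaming client would yield, and collect_streamed_json is a hypothetical helper name.

```python
import json
from typing import Iterable, Optional


def collect_streamed_json(chunks: Iterable[Optional[str]]) -> dict:
    """Join streamed text fragments and parse them once the stream ends."""
    buffer = []
    for piece in chunks:
        if piece:  # Some stream events carry no text.
            buffer.append(piece)
    return json.loads("".join(buffer))


# Simulated fragments standing in for streamed deltas.
fragments = ['{"senti', 'ment": "pos', None, 'itive", "confidence": 0.9', "}"]
print(collect_streamed_json(fragments))
```

No single fragment here parses on its own; only the joined text is a complete JSON document.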
What if my schema has a typo and the model returns truncated output?
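These are two separate problems. A malformed schema is normally rejected by the API with an error before any text is generated. Truncated output usually means the reply hit the token limit (finish_reason of "length" on OpenAI, stop_reason of "max_tokens" on Anthropic); raise the limit or simplify the schema. Validate your schema syntax first and test with a simple prompt before using it in production.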
Is there a latency penalty for using JSON Mode?
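There can be, and it varies by provider, model, and schema. OpenAI documents extra latency on the first request that uses a new schema, while the schema is processed; later requests with the same schema do not pay that cost again. Measure it in your own setup rather than relying on a fixed figure. Any overhead is typically small compared to the latency saved by eliminating retry logic for invalid responses.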
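Since the overhead depends on your model and schema, timing it directly is more reliable than any quoted number. This is a minimal sketch of such a measurement; median_latency is a hypothetical helper, and the callable passed in would wrap your own API call.

```python
import statistics
import time
from typing import Callable


def median_latency(call: Callable[[], object], runs: int = 5) -> float:
    """Run `call` several times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload; replace the lambda with a function that makes your request.
print(f"{median_latency(lambda: sum(range(100_000))):.6f} s")
```

Running it once with response_format set and once without, on the same prompt, gives a like-for-like comparison. Treat the first call with a new schema separately, since it may include one-time schema processing.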
Can I nest objects and arrays deeply?
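Yes. Schemas can nest objects inside objects, arrays of objects, and so on. Nesting is not unlimited, though: providers cap nesting depth and overall schema size (OpenAI documents such limits for strict mode), so check the current limits before designing very deep structures. See Handling Nested Objects and Arrays for complex examples.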
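To see how deep a schema actually nests before sending it, a small recursive check is enough. The sketch below counts levels of object nesting; schema_depth and the order schema are illustrative, and the function only follows plain "object" and "array" types (not unions, $ref, or anyOf).

```python
def schema_depth(schema: dict) -> int:
    """Count levels of object nesting in a JSON Schema dict (root object = 1)."""
    kind = schema.get("type")
    if kind == "object":
        children = schema.get("properties", {}).values()
        return 1 + max((schema_depth(child) for child in children), default=0)
    if kind == "array":
        # Arrays add no object level themselves; look at their items.
        return schema_depth(schema.get("items", {}))
    return 0


order_schema = {
    "type": "object",
    "properties": {
        "customer": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "address": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                },
            },
        },
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "sku": {"type": "string"},
                    "quantity": {"type": "integer"},
                },
            },
        },
    },
}

print(schema_depth(order_schema))  # 3: order -> customer -> address
```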