Advanced JSON Schema: Enums, Conditionals, and Composition

Beyond basic types and constraints, JSON Schema supports powerful composition and conditional patterns. Enums enforce fixed vocabularies, discriminators route complex types, conditionals enable field dependencies, and definitions eliminate duplication. These advanced patterns let you express subtle domain constraints and make your schemas self-documenting.

Enums: Controlled Vocabularies

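Enums restrict a field to a fixed set of values. When the schema is enforced, either by constrained decoding or by validating the output, out-of-vocabulary strings are rejected, so downstream code receives only the values it expects.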

Simple enum:

{
  "type": "object",
  "properties": {
    "status": {
      "type": "string",
      "enum": ["pending", "approved", "rejected", "archived"]
    }
  }
}

Numeric enum (for codes):

{
  "type": "object",
  "properties": {
    "http_status": {
      "type": "integer",
      "enum": [200, 201, 400, 404, 500]
    }
  }
}

Pydantic enum pattern (Python):

from enum import Enum
from pydantic import BaseModel

class DocumentStatus(str, Enum):
    DRAFT = "draft"
    REVIEW = "review"
    PUBLISHED = "published"
    ARCHIVED = "archived"

class Document(BaseModel):
    title: str
    status: DocumentStatus

# LLM output (client is an already-configured chat client; request arguments are elided)
response = client.chat.completions.create(...)
doc = Document.model_validate_json(response.choices[0].message.content)
# doc.status is one of the four values; any other string raises ValidationError

Zod enum pattern (TypeScript):

import { z } from "zod";

const DocumentSchema = z.object({
  title: z.string(),
  status: z.enum(["draft", "review", "published", "archived"])
});

type Document = z.infer<typeof DocumentSchema>;
// status is "draft" | "review" | "published" | "archived"
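
To make the rejection path concrete, here is a minimal sketch that uses only Python's standard-library enum, without Pydantic. The helper name parse_status is hypothetical and exists only for illustration; it shows that constructing an enum member from a raw string either yields a known member or fails.

```python
from enum import Enum
from typing import Optional


class DocumentStatus(str, Enum):
    DRAFT = "draft"
    REVIEW = "review"
    PUBLISHED = "published"
    ARCHIVED = "archived"


def parse_status(raw: str) -> Optional[DocumentStatus]:
    """Return the matching member, or None if raw is outside the vocabulary.

    parse_status is a hypothetical helper, not part of any library.
    """
    try:
        # Lookup by value; raises ValueError for unknown strings.
        return DocumentStatus(raw)
    except ValueError:
        return None


accepted = parse_status("review")    # a known value
rejected = parse_status("deleted")   # not in the vocabulary
wrong_case = parse_status("Review")  # lookup is case-sensitive
```

Lookup is exact, so a differently cased string is rejected too. Normalize case before parsing if the model's casing is unreliable.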

Discriminated Unions: Type-Based Routing

A discriminated union uses one field (the discriminator) to determine which schema applies to the rest of the object. The pattern is common in workflows whose output shape depends on an outcome, such as success versus error.

Example: Different response types based on query result

{
  "type": "object",
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "result_type": {"type": "string", "const": "success"},
        "data": {"type": "string"},
        "confidence": {"type": "number"}
      },
      "required": ["result_type", "data", "confidence"]
    },
    {
      "type": "object",
      "properties": {
        "result_type": {"type": "string", "const": "error"},
        "error_code": {"type": "integer"},
        "message": {"type": "string"}
      },
      "required": ["result_type", "error_code", "message"]
    },
    {
      "type": "object",
      "properties": {
        "result_type": {"type": "string", "const": "unknown"},
        "reason": {"type": "string"}
      },
      "required": ["result_type", "reason"]
    }
  ]
}

Pydantic discriminated union:

from typing import Union, Literal
from pydantic import BaseModel, Field

class SuccessResult(BaseModel):
    result_type: Literal["success"]
    data: str
    confidence: float

class ErrorResult(BaseModel):
    result_type: Literal["error"]
    error_code: int
    message: str

class UnknownResult(BaseModel):
    result_type: Literal["unknown"]
    reason: str

QueryResult = Union[SuccessResult, ErrorResult, UnknownResult]

class QueryResponse(BaseModel):
    response: QueryResult = Field(..., discriminator="result_type")

# Usage
response_data = {
    "response": {
        "result_type": "success",
        "data": "Found 42 results",
        "confidence": 0.95
    }
}

response = QueryResponse.model_validate(response_data)
if isinstance(response.response, SuccessResult):
    print(f"Success: {response.response.data}")

Zod discriminated union:

import { z } from "zod";

const QueryResultSchema = z.discriminatedUnion("result_type", [
  z.object({
    result_type: z.literal("success"),
    data: z.string(),
    confidence: z.number().min(0).max(1)
  }),
  z.object({
    result_type: z.literal("error"),
    error_code: z.number().int(),
    message: z.string()
  }),
  z.object({
    result_type: z.literal("unknown"),
    reason: z.string()
  })
]);

type QueryResult = z.infer<typeof QueryResultSchema>;

const result: QueryResult = {
  result_type: "success",
  data: "Found results",
  confidence: 0.9
};

Conditionals: Dependent Fields

Use if/then/else to express field dependencies: when one field has a particular value, other fields become required or must satisfy a different subschema.

Example: If escalation_required is true, escalation_reason becomes required

{
  "type": "object",
  "properties": {
    "action": {"type": "string"},
    "escalation_required": {"type": "boolean"},
    "escalation_reason": {"type": "string"}
  },
  "required": ["action", "escalation_required"],
  "if": {
    "properties": {
      "escalation_required": {"const": true}
    }
  },
  "then": {
    "required": ["escalation_reason"]
  }
}

Pydantic model_validator for conditional logic:

from pydantic import BaseModel, model_validator

class ActionPlan(BaseModel):
    action: str
    escalation_required: bool
    escalation_reason: str = ""

    # A model-level validator also runs when escalation_reason is omitted;
    # a field_validator is skipped for a field that falls back to its default.
    @model_validator(mode="after")
    def check_escalation_reason(self):
        if self.escalation_required and not self.escalation_reason:
            raise ValueError("escalation_reason required when escalation_required is true")
        return self
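
The if/then semantics of the JSON Schema above can be traced by hand. The following is a sketch in plain Python, not a real validator: it checks only field presence (type checks are omitted), and the function names if_matches and validate_action are hypothetical. It illustrates one subtlety: properties constrains only keys that are present, so an object that lacks escalation_required also satisfies the if subschema.

```python
def if_matches(obj: dict) -> bool:
    """Evaluate the `if` subschema: properties.escalation_required is const true.

    A missing key passes, because `properties` says nothing about absent keys.
    """
    return "escalation_required" not in obj or obj["escalation_required"] is True


def validate_action(obj: dict) -> list:
    """Return presence errors for the schema above (hypothetical helper)."""
    errors = []
    # Top-level "required" array.
    for key in ("action", "escalation_required"):
        if key not in obj:
            errors.append(f"missing required field: {key}")
    # "then" applies only when "if" matched.
    if if_matches(obj) and "escalation_reason" not in obj:
        errors.append("missing required field: escalation_reason")
    return errors


no_escalation = validate_action({"action": "resolve", "escalation_required": False})
missing_reason = validate_action({"action": "escalate", "escalation_required": True})
empty_reason = validate_action(
    {"action": "escalate", "escalation_required": True, "escalation_reason": ""}
)
```

The schema's required keyword checks only that the key exists, so an empty escalation_reason string passes here. Adding "minLength": 1 to escalation_reason would reject it at the schema level.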

Schema Definitions and Reuse

Use $defs to define reusable schemas and reference them with $ref, eliminating duplication:

{
  "$defs": {
    "Address": {
      "type": "object",
      "properties": {
        "street": {"type": "string"},
        "city": {"type": "string"},
        "country": {"type": "string"},
        "zip": {"type": "string"}
      },
      "required": ["street", "city", "country"]
    },
    "Contact": {
      "type": "object",
      "properties": {
        "email": {"type": "string", "format": "email"},
        "phone": {"type": "string"}
      },
      "required": ["email"]
    }
  },
  "type": "object",
  "properties": {
    "billing_address": {"$ref": "#/$defs/Address"},
    "shipping_address": {"$ref": "#/$defs/Address"},
    "contact": {"$ref": "#/$defs/Contact"}
  },
  "required": ["billing_address", "contact"]
}
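
To show what a reference such as "#/$defs/Address" actually points at, here is a small sketch that resolves local references by walking the schema as a nested dictionary. The function resolve_ref is a hypothetical helper, handles only same-document object paths (no remote URIs, no array indices), and is not how any particular validator is implemented.

```python
def resolve_ref(root: dict, ref: str) -> dict:
    """Follow a local JSON Pointer reference such as '#/$defs/Address'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled in this sketch: {ref}")
    node = root
    for token in ref[2:].split("/"):
        # JSON Pointer escapes: "~1" stands for "/", "~0" stands for "~".
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


schema = {
    "$defs": {
        "Address": {
            "type": "object",
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
                "country": {"type": "string"},
            },
            "required": ["street", "city", "country"],
        }
    },
    "type": "object",
    "properties": {
        "billing_address": {"$ref": "#/$defs/Address"},
        "shipping_address": {"$ref": "#/$defs/Address"},
    },
}

billing = resolve_ref(schema, schema["properties"]["billing_address"]["$ref"])
shipping = resolve_ref(schema, schema["properties"]["shipping_address"]["$ref"])
```

Both properties resolve to the same definition object, which is why a change to Address takes effect everywhere it is referenced.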

Pydantic nested models for composition:

from typing import Optional
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    country: str
    zip: str = ""

class Contact(BaseModel):
    email: str
    phone: str = ""

class Order(BaseModel):
    billing_address: Address
    # Optional here because the JSON Schema above does not list it as required
    shipping_address: Optional[Address] = None
    contact: Contact

# Reuse schemas across multiple models
class Vendor(BaseModel):
    name: str
    contact: Contact
    address: Address

Recursive Schemas: Trees and Graphs

Define schemas that reference themselves to model hierarchical data:

{
  "$defs": {
    "Document": {
      "type": "object",
      "properties": {
        "title": {"type": "string"},
        "content": {"type": "string"},
        "sections": {
          "type": "array",
          "items": {"$ref": "#/$defs/Document"}
        }
      }
    }
  },
  "$ref": "#/$defs/Document"
}

Pydantic recursive model:

from pydantic import BaseModel

class Document(BaseModel):
    title: str
    content: str
    sections: list["Document"] = []

# Resolve the self-reference explicitly (Pydantic v2 usually does this
# automatically, so this call is a harmless safeguard)
Document.model_rebuild()

# Usage: Document can contain nested Documents
doc = Document(
    title="Main",
    content="...",
    sections=[
        Document(title="Section 1", content="..."),
        Document(title="Section 2", content="...", sections=[
            Document(title="Subsection", content="...")
        ])
    ]
)

Constrained Arrays with Item Dependencies

Constrain the shape of each item and bound the length of the array:

{
  "type": "object",
  "properties": {
    "ordered_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "product_id": {"type": "string"},
          "quantity": {"type": "integer", "minimum": 1},
          "price_per_unit": {"type": "number", "minimum": 0}
        },
        "required": ["product_id", "quantity", "price_per_unit"]
      },
      "minItems": 1,
      "maxItems": 100,
      "uniqueItems": false
    }
  }
}

Pydantic with array constraints:

from pydantic import BaseModel, Field

class OrderItem(BaseModel):
    product_id: str
    quantity: int = Field(..., ge=1)
    price_per_unit: float = Field(..., ge=0)

class Order(BaseModel):
    # Field name matches the "ordered_items" property in the JSON Schema above
    ordered_items: list[OrderItem] = Field(..., min_length=1, max_length=100)

    @property
    def total(self) -> float:
        return sum(item.quantity * item.price_per_unit for item in self.ordered_items)
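
Relationships between items usually need a code-level check. The uniqueItems keyword compares whole items, so two order lines with the same product_id but different quantities still count as distinct. The sketch below, using a hypothetical helper duplicate_product_ids and made-up sample data, shows one way to detect repeated product IDs after parsing.

```python
from collections import Counter


def duplicate_product_ids(items: list) -> list:
    """Return product_ids that appear on more than one order line, sorted."""
    counts = Counter(item["product_id"] for item in items)
    return sorted(pid for pid, n in counts.items() if n > 1)


# Illustrative data only; the first two lines differ in quantity,
# so uniqueItems: true would not flag them.
order_items = [
    {"product_id": "ABC-000001", "quantity": 1, "price_per_unit": 9.5},
    {"product_id": "ABC-000001", "quantity": 2, "price_per_unit": 9.5},
    {"product_id": "XYZ-000002", "quantity": 1, "price_per_unit": 3.0},
]

duplicates = duplicate_product_ids(order_items)
```

A check like this could run after schema validation, with the caller deciding whether to reject the order or merge the duplicate lines.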

Pattern Validation: Format and Regex

Use format for standard patterns (email, URI, date) or pattern for custom regex:

{
  "type": "object",
  "properties": {
    "email": {
      "type": "string",
      "format": "email"
    },
    "website": {
      "type": "string",
      "format": "uri"
    },
    "phone": {
      "type": "string",
      "pattern": "^\\+?[1-9]\\d{1,14}$"
    },
    "sku": {
      "type": "string",
      "pattern": "^[A-Z]{3}-\\d{6}$"
    }
  }
}

Pydantic patterns:

# EmailStr needs the optional email-validator package: pip install "pydantic[email]"
from pydantic import BaseModel, EmailStr, Field
from typing import Annotated

class Product(BaseModel):
    sku: Annotated[str, Field(pattern=r"^[A-Z]{3}-\d{6}$")]
    email: EmailStr
    phone: Annotated[str, Field(pattern=r"^\+?[1-9]\d{1,14}$")]
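
Patterns are worth exercising against sample values before they go into a schema. JSON Schema does not anchor pattern implicitly, which is why the expressions above carry ^ and $. The sketch below uses Python's re module with search to mimic that unanchored behaviour; the helper name matches and the sample values are illustrative assumptions.

```python
import re

# Same expressions as in the schema above.
PHONE = re.compile(r"^\+?[1-9]\d{1,14}$")
SKU = re.compile(r"^[A-Z]{3}-\d{6}$")
# The SKU pattern without anchors, to show what the anchors prevent.
SKU_UNANCHORED = re.compile(r"[A-Z]{3}-\d{6}")


def matches(pattern: re.Pattern, value: str) -> bool:
    """True if the pattern is found anywhere in value (unanchored search)."""
    return pattern.search(value) is not None


phone_ok = matches(PHONE, "+14155552671")       # plus sign, then 11 digits
phone_bad = matches(PHONE, "0123")              # leading zero is not allowed
sku_ok = matches(SKU, "ABC-123456")
sku_lowercase = matches(SKU, "abc-123456")      # letters must be uppercase
sku_embedded = matches(SKU, "xxABC-123456yy")   # anchors reject surrounding text
sku_embedded_unanchored = matches(SKU_UNANCHORED, "xxABC-123456yy")
```

The last two lines differ only in the anchors: without them, a valid SKU buried inside a longer string is accepted.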

Composition Example: Complex Pricing Model

Combine multiple patterns into a sophisticated schema:

from typing import Union, Literal
from pydantic import BaseModel, Field

class FixedPrice(BaseModel):
    price_type: Literal["fixed"]
    amount: float = Field(..., gt=0)

class PercentageDiscount(BaseModel):
    price_type: Literal["percentage"]
    base_amount: float = Field(..., gt=0)
    discount_percent: int = Field(..., ge=1, le=100)

class TieredPrice(BaseModel):
    price_type: Literal["tiered"]
    tiers: list[dict] = Field(
        ...,
        description="List of {quantity_min, quantity_max, price_per_unit}"
    )

PricingStrategy = Union[FixedPrice, PercentageDiscount, TieredPrice]

class Product(BaseModel):
    name: str
    pricing: PricingStrategy = Field(..., discriminator="price_type")

# The LLM now chooses a pricing strategy type and fills in the corresponding fields

Testing Advanced Schemas

Test discriminated unions and conditional fields:

import pytest
from pydantic import ValidationError

# Assumes QueryResponse, SuccessResult, ErrorResult, and ActionPlan from the
# earlier examples are defined in, or imported into, this test module.

def test_discriminated_union():
    """Test that discriminator selects the correct schema."""
    # Valid success response
    success = QueryResponse.model_validate({
        "response": {
            "result_type": "success",
            "data": "Results found",
            "confidence": 0.9
        }
    })
    assert isinstance(success.response, SuccessResult)

    # Valid error response
    error = QueryResponse.model_validate({
        "response": {
            "result_type": "error",
            "error_code": 404,
            "message": "Not found"
        }
    })
    assert isinstance(error.response, ErrorResult)

    # Invalid discriminator value
    with pytest.raises(ValidationError):
        QueryResponse.model_validate({
            "response": {
                "result_type": "invalid",
                "data": "Something"
            }
        })

def test_conditional_validation():
    """Test dependent field validation."""
    # Valid: escalation not required, no reason needed
    action1 = ActionPlan(
        action="resolve",
        escalation_required=False
    )
    assert action1.escalation_reason == ""

    # Valid: escalation required, reason provided
    action2 = ActionPlan(
        action="escalate",
        escalation_required=True,
        escalation_reason="Needs approval"
    )
    assert action2.escalation_reason == "Needs approval"

    # Invalid: escalation required, but no reason
    with pytest.raises(ValidationError):
        ActionPlan(
            action="escalate",
            escalation_required=True,
            escalation_reason=""
        )

Key Takeaways

  • Enums enforce vocabularies and reject out-of-vocabulary values at validation time; use them wherever valid values are finite.
  • Discriminated unions (oneOf with a const-tagged field) route complex types based on a single field.
  • Conditional schemas (if/then) express field dependencies and make relationships explicit.
  • $defs and $ref eliminate schema duplication and improve maintainability.
  • Recursive schemas model trees, hierarchies, and nested structures naturally.
  • Combine patterns (enums + conditionals + composition) to express sophisticated domain constraints.
  • Test all schema patterns; discriminators and conditionals are easy to get wrong.

Frequently Asked Questions

When should I use oneOf vs. discriminatedUnion?

Use a discriminated union (Zod's discriminatedUnion, or Pydantic's Field(discriminator=...)) when one field determines the schema. Use plain oneOf for unions without a discriminator (rarer for LLM outputs). Discriminators are more efficient because the validator can read the tag and select the right branch immediately instead of trying every branch.
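
The difference can be sketched in plain Python. This is an illustration of the two strategies, not the code of any real validator: branches are reduced to sets of required field names taken from the query-result example, and BRANCHES, match_one_of, and match_by_tag are hypothetical names.

```python
# Required fields per variant, keyed by the result_type tag.
BRANCHES = {
    "success": {"result_type", "data", "confidence"},
    "error": {"result_type", "error_code", "message"},
    "unknown": {"result_type", "reason"},
}


def branch_matches(tag: str, payload: dict) -> bool:
    """A branch matches when the tag agrees and all its required fields exist."""
    return payload.get("result_type") == tag and BRANCHES[tag].issubset(payload)


def match_one_of(payload: dict) -> str:
    """oneOf strategy: test every branch and require exactly one match."""
    matched = [tag for tag in BRANCHES if branch_matches(tag, payload)]
    if len(matched) != 1:
        raise ValueError(f"expected exactly one matching branch, got {matched}")
    return matched[0]


def match_by_tag(payload: dict) -> str:
    """Discriminator strategy: read the tag, then check only that branch."""
    tag = payload.get("result_type")
    if not isinstance(tag, str) or tag not in BRANCHES:
        raise ValueError(f"unknown result_type: {tag!r}")
    missing = sorted(BRANCHES[tag] - set(payload))
    if missing:
        raise ValueError(f"{tag!r} result is missing fields: {missing}")
    return tag


payload = {"result_type": "error", "error_code": 404, "message": "Not found"}
via_one_of = match_one_of(payload)
via_tag = match_by_tag(payload)
```

Both functions agree on valid input. On invalid input the tag-based version can name the intended variant and its missing fields, while the oneOf version can only report that no branch matched.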

Can I have optional fields within a discriminated union?

Yes. Each union variant defines its own required array. Some variants can have fewer required fields.

from typing import Literal
from pydantic import BaseModel

class SimplePath(BaseModel):
    path_type: Literal["simple"]
    steps: list[str]

class DetailedPath(BaseModel):
    path_type: Literal["detailed"]
    steps: list[str]
    estimated_time: str = ""  # Optional
    complexity: str = ""  # Optional

Do LLMs understand recursive schemas?

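Yes, but keep recursion shallow (2–3 levels); very deep nesting tends to confuse LLMs. JSON Schema has no keyword that caps recursion depth, so state the limit in field descriptions or unroll the schema to a fixed number of levels, and check the depth in code after parsing.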
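
A depth check after parsing is short to write. The sketch below works on plain dictionaries shaped like the recursive Document schema (a sections list of nested documents); nesting_depth and check_depth are hypothetical helper names, and the default limit of 3 is an assumption taken from the guideline above.

```python
def nesting_depth(doc: dict) -> int:
    """Depth of a document tree: 1 for a document with no sections."""
    sections = doc.get("sections", [])
    return 1 + max((nesting_depth(s) for s in sections), default=0)


def check_depth(doc: dict, max_depth: int = 3) -> int:
    """Return the depth, or raise ValueError if it exceeds max_depth."""
    depth = nesting_depth(doc)
    if depth > max_depth:
        raise ValueError(f"document nests {depth} levels deep, limit is {max_depth}")
    return depth


tree = {
    "title": "Main",
    "content": "...",
    "sections": [
        {"title": "Section 1", "content": "..."},
        {
            "title": "Section 2",
            "content": "...",
            "sections": [{"title": "Subsection", "content": "..."}],
        },
    ],
}

depth = check_depth(tree)  # Main -> Section 2 -> Subsection
```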

How do I handle schema evolution (adding fields)?

Use additionalProperties: true during a transition period, then tighten to false. For required fields, add them as optional first, then make them required in the next version.
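
The transition can be mirrored in application code. The sketch below is an assumption-laden illustration, not a JSON Schema implementation: KNOWN_FIELDS, unexpected_fields, and validate_fields are hypothetical names, and the strict flag plays the role of additionalProperties: false.

```python
# Fields the current schema version knows about (illustrative).
KNOWN_FIELDS = {"title", "status"}


def unexpected_fields(payload: dict, known: set) -> list:
    """Return the sorted names of fields the schema does not know about."""
    return sorted(set(payload) - known)


def validate_fields(payload: dict, known: set, strict: bool) -> list:
    """Lenient mode reports extra fields; strict mode rejects them."""
    extra = unexpected_fields(payload, known)
    if strict and extra:
        raise ValueError(f"unexpected fields: {extra}")
    return extra  # in lenient mode, log these to see what producers send


payload = {"title": "Q3 report", "status": "draft", "priority": 2}
extras_during_transition = validate_fields(payload, KNOWN_FIELDS, strict=False)
```

Logging the extras during the lenient phase shows when producers have stopped sending unknown fields, which is a reasonable signal for switching to strict mode.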

Can I validate cross-field constraints (e.g., startDate < endDate)?

Yes, with a Pydantic @model_validator (the v2 replacement for the deprecated @root_validator) or other model-level custom logic. JSON Schema's conditional support is limited; use code validation for complex constraints.

from datetime import date
from pydantic import BaseModel, model_validator

class DateRange(BaseModel):
    # date fields are parsed from ISO 8601 strings and compare chronologically
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def check_dates(self):
        if self.start_date > self.end_date:
            raise ValueError("start_date must not be after end_date")
        return self

Further Reading