Type-Safe Tool Wrappers for LLMs
A tool wrapper is an abstraction layer between the LLM and your actual function. It takes the raw JSON the model sends, validates it against the schema, coerces types if needed, and passes a typed object to your function. A good wrapper is transparent to the LLM (it just works) but provides three guarantees to your code: (1) the schema is enforced before your function runs, (2) types are guaranteed correct, and (3) errors are logged and handled gracefully. This article teaches you to design wrappers that make tool calling reliable and maintainable.
The Problem: Untyped Tool Calling
Without a wrapper, you handle tool calls like this:
# Raw LLM tool call (JSON)
tool_call = {
    "name": "search",
    "arguments": {
        "query": "python async",
        "limit": 25
    }
}
# You manually extract and coerce types
query = tool_call["arguments"]["query"] # Hope it's a string
limit = tool_call["arguments"].get("limit", 10) # Hope it's an int
# What if query is None? What if limit is "not a number"?
result = search(query, limit) # Runtime error if types are wrong
This is fragile. If the LLM sends a malformed call, your code crashes.
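To make the failure concrete, here is a minimal sketch. It assumes a hypothetical `search` that slices an in-memory list; `CORPUS` and `call_unvalidated` are illustrative names, not part of any real service. A string `limit` only fails deep inside the function, far from the point where the bad value entered.

```python
# Hypothetical stand-in for a real search backend.
CORPUS = ["asyncio basics", "async generators", "await patterns"]

def search(query: str, limit: int = 10) -> dict:
    matches = [doc for doc in CORPUS if query in doc]
    return {"results": matches[:limit]}  # slicing requires an int

def call_unvalidated(tool_call: dict) -> dict:
    """Forward raw arguments with no validation, as in the snippet above."""
    args = tool_call["arguments"]
    return search(args["query"], args.get("limit", 10))

good = call_unvalidated(
    {"name": "search", "arguments": {"query": "async", "limit": 2}}
)

try:
    call_unvalidated(
        {"name": "search", "arguments": {"query": "async", "limit": "2"}}
    )
    failure = None
except TypeError as exc:
    failure = str(exc)  # raised inside search(), not at the call boundary
```

The call with `"limit": "2"` raises a `TypeError` from the slice expression, which tells the LLM nothing about which argument was wrong.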
The Solution: Typed Wrappers
A wrapper validates and coerces before calling your function:
from typing import Any, Callable
from pydantic import BaseModel, ValidationError
class SearchParams(BaseModel):
    query: str
    limit: int = 10
def create_tool_wrapper(fn: Callable[..., Any],
                        params_model: type[BaseModel]) -> Callable[[dict], Any]:
    """Create a wrapper that validates inputs against a Pydantic model."""
    def wrapper(raw_args: dict) -> Any:
        try:
            # Validate and coerce types (params_model is the model class)
            params = params_model(**raw_args)
        except ValidationError as e:
            return {"error": f"Invalid parameters: {e}"}
        # Call the function with typed arguments; errors raised by fn
        # itself are not reported as parameter errors
        return fn(**params.model_dump())
    return wrapper
# Define your actual function
def search(query: str, limit: int = 10) -> dict:
    # Now you can rely on types being correct
    results = fetch_results(query)  # fetch_results is your own data-access helper
    return {"results": results[:limit]}
# Wrap it
wrapped_search = create_tool_wrapper(search, SearchParams)
# LLM sends this
tool_call = {"query": "python", "limit": "25"} # limit is a string!
# Wrapper handles it gracefully
result = wrapped_search(tool_call)
# Returns {"results": [...]} with limit coerced to int 25
Wrapper Pattern 1: Pydantic-Based (Python)
This is the most robust pattern for Python. Define your parameters as a Pydantic model, then use that one model for both schema generation and validation:
from typing import Any, Callable
from pydantic import BaseModel, Field, ValidationError
class DatabaseQueryParams(BaseModel):
    """Parameters for the database_query tool."""
    # No extra config is needed for coercion: Pydantic's default (lax) mode
    # already converts str "42" to int 42 for int fields
    table: str = Field(..., description="Table name")
    filters: dict = Field(default_factory=dict,
                          description="WHERE clause conditions")
    limit: int = Field(default=10, ge=1, le=1000,
                       description="Max rows to return")
class ToolWrapper:
    """Wraps a function with schema validation and error handling."""
    def __init__(self, fn: Callable[..., Any], params_model: type[BaseModel],
                 name: str, description: str):
        self.fn = fn
        self.params_model = params_model  # The model class, not an instance
        self.name = name
        self.description = description
    @property
    def schema(self) -> dict:
        """Get the JSON Schema for the LLM."""
        schema = self.params_model.model_json_schema()
        return {
            "name": self.name,
            "description": self.description,
            "parameters": schema
        }
    def call(self, raw_arguments: dict) -> dict:
        """Call the function with validation."""
        try:
            # Validate and coerce
            params = self.params_model(**raw_arguments)
            # Call the actual function
            result = self.fn(**params.model_dump())
            return {
                "status": "success",
                "result": result
            }
        except ValidationError as e:
            # Schema validation failed
            return {
                "status": "error",
                "error": f"Invalid parameters: {e.json()}"
            }
        except Exception as e:
            # Function raised an exception
            return {
                "status": "error",
                "error": f"Function error: {str(e)}"
            }
# Define the actual function
def database_query(table: str, filters: dict, limit: int) -> list:
    """Query the database."""
    # Your implementation here
    results = fetch_from_db(table, filters, limit)
    return results
# Create the wrapper
query_wrapper = ToolWrapper(
    fn=database_query,
    params_model=DatabaseQueryParams,
    name="database_query",
    description="Query the database by table name and optional filters"
)
# The LLM sees this schema
print(query_wrapper.schema)
# When the LLM calls it:
llm_args = {
    "table": "users",
    "filters": {"status": "active"},
    "limit": "50"  # String, but will be coerced to int
}
result = query_wrapper.call(llm_args)
# {"status": "success", "result": [...]}
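Once there are several wrappers, something has to route each tool call to the right one. The sketch below is one way to do that, under two assumptions that may not match your provider: the model's arguments arrive as a JSON string, and each registered entry is a callable such as `query_wrapper.call`. `REGISTRY`, `register`, and `dispatch` are hypothetical names, and a lambda stands in for a real wrapper so the example runs without a database.

```python
import json
from typing import Callable

# Maps tool name -> callable that takes raw arguments and returns a result dict.
REGISTRY: dict[str, Callable[[dict], dict]] = {}

def register(name: str, call: Callable[[dict], dict]) -> None:
    REGISTRY[name] = call

def dispatch(tool_name: str, arguments_json: str) -> dict:
    """Route a tool call to its wrapper, using the same result format."""
    call = REGISTRY.get(tool_name)
    if call is None:
        return {"status": "error", "error": f"Unknown tool: {tool_name}"}
    try:
        raw_arguments = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return {"status": "error",
                "error": f"Arguments are not valid JSON: {exc}"}
    if not isinstance(raw_arguments, dict):
        return {"status": "error", "error": "Arguments must be a JSON object"}
    return call(raw_arguments)

# Stand-in for query_wrapper.call (assumption for the demo).
register("echo", lambda args: {"status": "success", "result": args})

ok = dispatch("echo", '{"table": "users"}')
unknown = dispatch("missing", "{}")
malformed = dispatch("echo", "{not json")
```

Unknown tool names and unparseable arguments come back in the same `status`/`error` shape as validation failures, so the calling code only has to handle one format.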
Wrapper Pattern 2: Generic with Type Hints (TypeScript)
In TypeScript, use Zod for runtime validation and type inference:
import { z, ZodSchema } from "zod";
interface ToolDefinition {
  name: string;
  description: string;
  parameters: ZodSchema;
}
interface ToolResult {
  status: "success" | "error";
  result?: any;
  error?: string;
}
function createToolWrapper<T>(
  fn: (params: T) => any | Promise<any>,
  definition: ToolDefinition
): { schema: ToolDefinition; call: (args: unknown) => Promise<ToolResult> } {
  return {
    schema: definition,
    call: async (args: unknown): Promise<ToolResult> => {
      try {
        // Validate (and coerce, where the schema asks for it) with Zod
        const params = definition.parameters.parse(args);
        // Call the function
        const result = await fn(params);
        return {
          status: "success",
          result: result
        };
      } catch (error) {
        if (error instanceof z.ZodError) {
          return {
            status: "error",
            error: `Validation failed: ${error.message}`
          };
        }
        return {
          status: "error",
          error: `Function error: ${error}`
        };
      }
    }
  };
}
// Define parameters with Zod
const SearchParamsSchema = z.object({
  query: z.string().min(1).max(100),
  // z.coerce.number() converts "25" to 25; plain z.number() rejects strings
  limit: z.coerce.number().int().min(1).max(100).default(10),
  sort_by: z.enum(["relevance", "date"]).default("relevance")
});
type SearchParams = z.infer<typeof SearchParamsSchema>;
// Define the actual function
function search(params: SearchParams): SearchResult[] {
  const { query, limit, sort_by } = params;
  // Your implementation
  return fetch_results(query, limit, sort_by);
}
// Create the wrapper
const searchWrapper = createToolWrapper(search, {
  name: "search",
  description: "Search for items",
  parameters: SearchParamsSchema
});
// Register with LLM
console.log(searchWrapper.schema);
// When LLM calls it
const result = await searchWrapper.call({
  query: "typescript",
  limit: "25", // String, coerced to number by z.coerce
  sort_by: "date"
});
Wrapper Pattern 3: Async and Streaming Results
For long-running tools, support async execution and result streaming:
import asyncio
from typing import Any, AsyncIterator, Callable
from pydantic import BaseModel, ValidationError
class AsyncToolWrapper:
    """Supports async functions and streaming results."""
    def __init__(self, fn: Callable[..., Any], params_model: type[BaseModel],
                 name: str):
        self.fn = fn
        self.params_model = params_model
        self.name = name
    async def call(self, raw_arguments: dict) -> dict:
        """Call async function with validation."""
        try:
            params = self.params_model(**raw_arguments)
            result = await self.fn(**params.model_dump())
            return {"status": "success", "result": result}
        except ValidationError as e:
            return {"status": "error", "error": str(e)}
        except Exception as e:
            return {"status": "error", "error": str(e)}
    async def stream_call(self, raw_arguments: dict) -> AsyncIterator[dict]:
        """Call with streaming results."""
        try:
            params = self.params_model(**raw_arguments)
            async for chunk in self.fn(**params.model_dump()):
                yield {"status": "streaming", "chunk": chunk}
            yield {"status": "complete"}
        except Exception as e:
            yield {"status": "error", "error": str(e)}
# Async generator function. It must accept every field of SearchParams
# (query and limit), because the wrapper passes them all as keyword arguments
async def stream_search_results(query: str, limit: int = 10) -> AsyncIterator[dict]:
    sent = 0
    async for result in search_service(query):
        if sent >= limit:
            break
        yield result
        sent += 1
async_wrapper = AsyncToolWrapper(
    fn=stream_search_results,
    params_model=SearchParams,  # Model defined in the earlier section
    name="stream_search"
)
# Usage: `async for` has to run inside a coroutine
async def main() -> None:
    async for event in async_wrapper.stream_call({"query": "python"}):
        if event["status"] == "streaming":
            print(event["chunk"])
asyncio.run(main())
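Long-running tools usually also need an upper bound on how long a call may take. The following is a minimal sketch using `asyncio.wait_for` from the standard library; `call_with_timeout` and `slow_lookup` are hypothetical names introduced for illustration, and the timeout values are arbitrary.

```python
import asyncio
from typing import Any, Awaitable, Callable

async def call_with_timeout(
    tool: Callable[..., Awaitable[Any]],
    arguments: dict,
    timeout_seconds: float,
) -> dict:
    """Run an async tool, converting a timeout into an error result."""
    try:
        result = await asyncio.wait_for(tool(**arguments),
                                        timeout=timeout_seconds)
        return {"status": "success", "result": result}
    except asyncio.TimeoutError:
        return {"status": "error",
                "error": f"Tool timed out after {timeout_seconds} seconds"}

# Hypothetical slow tool used only for demonstration.
async def slow_lookup(key: str, delay: float = 0.0) -> str:
    await asyncio.sleep(delay)
    return key.upper()

fast = asyncio.run(
    call_with_timeout(slow_lookup, {"key": "a"}, timeout_seconds=1.0)
)
slow = asyncio.run(
    call_with_timeout(slow_lookup, {"key": "a", "delay": 0.5},
                      timeout_seconds=0.05)
)
```

A timeout is reported in the same `status`/`error` shape as any other failure, so the caller does not need a separate code path for it.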
Best Practices for Robust Wrappers
1. Detailed Error Messages
When validation fails, provide helpful feedback so the LLM (or user debugging the tool) understands what went wrong:
except ValidationError as e:
    # Join the full location path so nested fields read as "filters.status"
    formatted = "\n".join(
        f"  {'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
        for err in e.errors()
    )
    return {
        "status": "error",
        "error": f"Invalid parameters:\n{formatted}"
    }
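As a runnable illustration of the fragment above, the sketch below puts the formatting in a helper and triggers two failures at once. `QueryParams` and `format_validation_error` are hypothetical names, and the wording of each message comes from Pydantic, so it can differ between versions.

```python
from pydantic import BaseModel, Field, ValidationError

class QueryParams(BaseModel):  # hypothetical model for illustration
    table: str
    limit: int = Field(default=10, ge=1, le=1000)

def format_validation_error(exc: ValidationError) -> str:
    """Render one line per failed field, using the dotted location path."""
    lines = []
    for err in exc.errors():
        location = ".".join(str(part) for part in err["loc"]) or "(root)"
        lines.append(f"  {location}: {err['msg']}")
    return "Invalid parameters:\n" + "\n".join(lines)

try:
    QueryParams(limit=5000)  # table is missing, limit exceeds the maximum
    message = ""
except ValidationError as exc:
    message = format_validation_error(exc)

print(message)
```

Both problems are reported in one response, so the LLM can fix them together instead of discovering them one call at a time.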
2. Type Coercion
Coerce types when safe (e.g., string "42" to int 42), but fail explicitly on unsafe coercions:
from pydantic import BaseModel, ValidationError
class Params(BaseModel):
    # Pydantic's default (lax) mode coerces str "42" to int 42;
    # no extra configuration is required
    count: int
# This works
params = Params(count="42")  # count is int 42
# This fails
try:
    params = Params(count="not_a_number")  # Raises ValidationError
except ValidationError:
    pass
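To see where the line between safe and unsafe coercion falls, the sketch below compares Pydantic v2's default lax mode with `ConfigDict(strict=True)`. `LaxParams`, `StrictParams`, and `accepts` are hypothetical names for illustration.

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class LaxParams(BaseModel):
    count: int  # default lax mode

class StrictParams(BaseModel):
    model_config = ConfigDict(strict=True)  # no string-to-int coercion
    count: int

def accepts(model: type[BaseModel], value: object) -> bool:
    """Return True if the model accepts value for its count field."""
    try:
        model(count=value)
        return True
    except ValidationError:
        return False

outcomes = {
    "lax '42'": accepts(LaxParams, "42"),        # numeric string is coerced
    "lax 42.0": accepts(LaxParams, 42.0),        # whole-number float is coerced
    "lax 42.5": accepts(LaxParams, 42.5),        # fractional part would be lost
    "strict '42'": accepts(StrictParams, "42"),  # strict mode rejects strings
    "strict 42": accepts(StrictParams, 42),
}
```

Lax mode is usually the better fit for LLM input, since models often quote numbers. Strict mode is there if you would rather reject such calls.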
3. Logging
Log every tool call for debugging and monitoring:
import logging
logger = logging.getLogger(__name__)
def call(self, raw_arguments: dict) -> dict:
    logger.info(f"Tool {self.name} called with: {raw_arguments}")
    try:
        params = self.params_model(**raw_arguments)
        result = self.fn(**params.model_dump())
        logger.info(f"Tool {self.name} succeeded")
        return {"status": "success", "result": result}
    except Exception as e:
        logger.error(f"Tool {self.name} failed: {e}", exc_info=True)
        return {"status": "error", "error": str(e)}
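Logging raw arguments can leak secrets if a tool accepts credentials. One possible mitigation, sketched here, is to mask known sensitive keys before logging. The key list and the names `SENSITIVE_KEYS`, `redact`, and `log_tool_call` are assumptions for illustration, and only top-level keys are masked.

```python
import logging

logger = logging.getLogger("tools")

# Hypothetical list of argument names that should never reach the logs.
SENSITIVE_KEYS = {"password", "api_key", "token"}

def redact(arguments: dict) -> dict:
    """Return a copy of the arguments with sensitive top-level values masked."""
    return {
        key: "***" if key.lower() in SENSITIVE_KEYS else value
        for key, value in arguments.items()
    }

def log_tool_call(name: str, raw_arguments: dict) -> None:
    # Lazy %-formatting defers string building until the record is emitted
    logger.info("Tool %s called with: %s", name, redact(raw_arguments))

safe = redact({"query": "python", "api_key": "sk-123"})
log_tool_call("search", {"query": "python", "api_key": "sk-123"})
```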
4. Rate Limiting and Quotas
Enforce rate limits to prevent abuse:
from datetime import datetime, timedelta
from pydantic import ValidationError
class RateLimitedToolWrapper:
    def __init__(self, fn, params_model, max_calls_per_minute=10):
        self.fn = fn
        self.params_model = params_model
        self.max_calls = max_calls_per_minute
        self.calls = []
    def call(self, raw_arguments: dict) -> dict:
        # Drop timestamps older than one minute
        now = datetime.now()
        self.calls = [t for t in self.calls
                      if now - t < timedelta(minutes=1)]
        # Check limit
        if len(self.calls) >= self.max_calls:
            return {
                "status": "error",
                "error": f"Rate limit exceeded: {self.max_calls} calls per minute"
            }
        self.calls.append(now)
        # Validate, then call the function
        try:
            params = self.params_model(**raw_arguments)
        except ValidationError as e:
            return {"status": "error", "error": str(e)}
        return {"status": "success", "result": self.fn(**params.model_dump())}
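The limiting logic can also be separated from the wrapper so it is easy to test. The sketch below is one possible variant, not the article's implementation: it uses `time.monotonic`, which is unaffected by system clock changes, and takes an injectable clock so tests need not sleep. `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import deque
from typing import Callable

class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds interval."""

    def __init__(self, max_calls: int, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have left the window
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True

# Demonstration with a fake clock
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=2, window_seconds=60.0,
                               clock=lambda: fake_now[0])
first, second, third = limiter.allow(), limiter.allow(), limiter.allow()
fake_now[0] = 61.0
after_window = limiter.allow()
```

A wrapper would call `limiter.allow()` before validation and return the rate-limit error when it is `False`.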
Key Takeaways
- A tool wrapper validates inputs against the schema before calling your function.
- Use Pydantic (Python) or Zod (TypeScript) for runtime validation and type coercion.
- Wrappers should return a consistent response format with status, result, and error fields.
- Log every tool call for debugging and monitoring.
- Support async functions and streaming results for long-running tools.
Frequently Asked Questions
Should I use one wrapper class for all tools or one per tool?
Use one wrapper class and instantiate it once per tool, for example search_wrapper = ToolWrapper(search, SearchParams, ...). This avoids code duplication and keeps your system composable.
What if the LLM sends extra fields not in the schema?
By default, Pydantic ignores extra fields. To reject unknown fields instead, set model_config = ConfigDict(extra="forbid") on the model.
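A short sketch of both behaviours, assuming Pydantic v2; `LenientParams` and `StrictFieldsParams` are hypothetical model names.

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class LenientParams(BaseModel):
    query: str  # default behaviour: unknown fields are dropped

class StrictFieldsParams(BaseModel):
    model_config = ConfigDict(extra="forbid")
    query: str

raw = {"query": "python", "page": 2}  # "page" is not in the schema

lenient = LenientParams(**raw).model_dump()

try:
    StrictFieldsParams(**raw)
    rejected = []
except ValidationError as exc:
    # Each error records which field was rejected
    rejected = [err["loc"] for err in exc.errors()]
```

Forbidding extras is useful when a silently dropped argument, such as a misspelled filter name, would change what the tool does.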
Can I cache wrapper schemas for performance?
Yes. Compute the schema once and cache it:
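class ToolWrapper:
    def __init__(self, fn, params_model, name, description):
        self.fn = fn
        self.params_model = params_model
        self.name = name
        self.description = description
        self._schema = None  # Filled in on first access
    @property
    def schema(self):
        if self._schema is None:
            self._schema = {"name": self.name, "description": self.description,
                            "parameters": self.params_model.model_json_schema()}
        return self._schema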
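The standard library's `functools.cached_property` gives the same compute-once behaviour with less bookkeeping. In this sketch, `CachedSchemaTool` and `build_parameters` are hypothetical; in practice the builder would be something like a Pydantic model's `model_json_schema`.

```python
from functools import cached_property

class CachedSchemaTool:
    """Hypothetical wrapper whose schema is built at most once."""

    def __init__(self, name: str, build_parameters):
        self.name = name
        self._build_parameters = build_parameters

    @cached_property
    def schema(self) -> dict:
        # Runs on first access; the result is then stored on the instance
        return {"name": self.name, "parameters": self._build_parameters()}

builds = []  # records each time the schema is actually built

def build_parameters() -> dict:
    builds.append(1)
    return {"type": "object", "properties": {"query": {"type": "string"}}}

tool = CachedSchemaTool("search", build_parameters)
first = tool.schema
second = tool.schema
```

Both accesses return the same dictionary object, so treat the cached schema as read-only.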
Should I retry tool calls if validation fails?
Not blindly. A validation failure means the LLM sent malformed data, so re-running the same call with identical arguments will fail again. Log the failure and return the error to the LLM instead; a clear error message gives it a chance to correct the arguments on its next attempt.
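A bounded feedback loop along these lines might look like the sketch below. `run_with_feedback`, `fake_model`, and `fake_tool` are hypothetical stand-ins; a real loop would send the error back through your LLM client as a tool result.

```python
from typing import Callable, Optional

def run_with_feedback(
    ask_model: Callable[[Optional[str]], dict],
    call_tool: Callable[[dict], dict],
    max_attempts: int = 3,
) -> dict:
    """Ask the model for arguments, feeding back the previous error, if any."""
    last_error: Optional[str] = None
    result: dict = {"status": "error", "error": "no attempts made"}
    for _ in range(max_attempts):
        arguments = ask_model(last_error)  # the model sees what went wrong
        result = call_tool(arguments)
        if result["status"] == "success":
            return result
        last_error = result["error"]
    return result  # give up and surface the final error

# Hypothetical stand-ins for the model and a validating wrapper.
def fake_model(previous_error: Optional[str]) -> dict:
    return {"limit": 5} if previous_error else {"limit": "many"}

def fake_tool(arguments: dict) -> dict:
    if not isinstance(arguments.get("limit"), int):
        return {"status": "error", "error": "limit: must be an integer"}
    return {"status": "success", "result": arguments["limit"]}

outcome = run_with_feedback(fake_model, fake_tool)
```

The attempt cap matters: without it, a model that keeps sending the same bad arguments would loop forever.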
Can I use dataclasses instead of Pydantic?
You can, but Pydantic is better: it has built-in validation, type coercion, and JSON Schema generation. Dataclasses require extra work for these features.
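To give a sense of the extra work, here is a minimal sketch of a dataclass that validates and coerces by hand. `SearchArgs` is a hypothetical equivalent of `SearchParams`, and it still lacks JSON Schema generation.

```python
from dataclasses import dataclass

@dataclass
class SearchArgs:  # hypothetical dataclass equivalent of SearchParams
    query: str
    limit: int = 10

    def __post_init__(self) -> None:
        # Dataclasses do not check annotations, so every rule is written by hand
        if not isinstance(self.query, str):
            raise TypeError("query must be a string")
        if isinstance(self.limit, str):
            try:
                self.limit = int(self.limit)  # manual coercion
            except ValueError:
                raise TypeError("limit must be an integer") from None
        if isinstance(self.limit, bool) or not isinstance(self.limit, int):
            raise TypeError("limit must be an integer")

coerced = SearchArgs(query="python", limit="25")

try:
    SearchArgs(query="python", limit="lots")
    error = None
except TypeError as exc:
    error = str(exc)
```

Each new field needs its own checks in `__post_init__`, which is the bookkeeping Pydantic does for you.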