OpenAI Agents SDK and Tool Use Patterns (2026)
OpenAI's Agents SDK is a thin, opinionated layer atop function calling (also called tool use). Unlike LangGraph's explicit state graphs or AutoGen's conversation loops, OpenAI's approach is direct: define tools, invoke the model, let it call tools, and loop. It's lightweight and integrates tightly with OpenAI's GPT models. The examples below implement that loop by hand with the Chat Completions API so that each step is visible. This article covers tool binding, parallel execution, structured outputs, and building resilient agent loops.
Core Pattern: Tool Use Loop
The OpenAI Agents loop is:
- Define tools: Describe what the agent can do (functions, APIs).
- Invoke model: Send tools and a message to the LLM.
- Parse response: The model returns either a final answer or a tool call request.
- Execute tools: Run the requested tools.
- Loop: Send tool results back to the model; it decides next steps.
import json
from datetime import datetime

from openai import OpenAI

# "..." is a placeholder; with no argument the client reads OPENAI_API_KEY
client = OpenAI(api_key="...")

# Define tools (functions the agent can call)
tools = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for information.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Get the current time.",
            "parameters": {"type": "object", "properties": {}}
        }
    }
]

# Tool implementations
def web_search(query: str) -> str:
    """Simulate web search."""
    return f"Search results for '{query}': AI safety is critical..."

def get_current_time() -> str:
    """Get current time."""
    return datetime.now().isoformat()

# Execute tool
def call_tool(tool_name: str, tool_input: dict) -> str:
    if tool_name == "web_search":
        return web_search(tool_input["query"])
    elif tool_name == "get_current_time":
        return get_current_time()
    return "Unknown tool"
This is minimal boilerplate. The model decides when to call tools; you execute them and loop back.
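As the tool list grows, the if/elif chain in call_tool gets harder to maintain. One common alternative is a registry that maps tool names to callables. The sketch below is only an illustration: TOOL_REGISTRY, register_tool, and dispatch_tool are hypothetical names, not part of any OpenAI API. It also turns malformed arguments into an error string instead of raising.

```python
import json
from datetime import datetime
from typing import Callable

# Hypothetical registry: maps tool names to Python callables
TOOL_REGISTRY: dict[str, Callable[..., str]] = {}

def register_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Decorator that registers a function under its own name."""
    TOOL_REGISTRY[func.__name__] = func
    return func

@register_tool
def web_search(query: str) -> str:
    return f"Search results for '{query}': ..."

@register_tool
def get_current_time() -> str:
    return datetime.now().isoformat()

def dispatch_tool(tool_name: str, raw_arguments: str) -> str:
    """Look up a tool by name and call it with JSON-encoded arguments."""
    func = TOOL_REGISTRY.get(tool_name)
    if func is None:
        return f"Unknown tool: {tool_name}"
    try:
        kwargs = json.loads(raw_arguments or "{}")
        return func(**kwargs)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, or missing/unexpected parameters
        return f"Invalid arguments for {tool_name}: {exc}"
```

Adding a tool then means writing one decorated function; the dispatcher itself does not change.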
Agentic Loop
The actual agent loop is straightforward:
messages = [
    {"role": "user", "content": "What time is it and what are the latest AI safety developments?"}
]

# Loop until the model stops calling tools
while True:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=messages,
        tools=tools,
        tool_choice="auto"  # Let the model decide when to call tools
    )
    choice = response.choices[0]
    message = choice.message

    # Check if the model called a tool or returned a final answer
    if choice.finish_reason == "tool_calls":
        # Record the assistant turn first; it carries the tool calls that
        # the tool messages below refer to by ID
        messages.append(message)

        # Execute each tool and add its result to messages
        for tool_call in message.tool_calls:
            tool_name = tool_call.function.name
            tool_input = json.loads(tool_call.function.arguments)
            result = call_tool(tool_name, tool_input)
            messages.append({
                "tool_call_id": tool_call.id,
                "role": "tool",
                "content": result
            })
    else:
        # Model returned a final answer
        print("Final answer:", message.content)
        break

The loop terminates when the model returns a final answer (finish_reason != "tool_calls"). The finish reason and the message both live on response.choices[0]. The assistant message that contains the tool calls must be appended before the tool results, because the API rejects a tool message that does not follow an assistant message with a matching tool call ID.
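To exercise the loop without network access or API cost, the model call can be swapped for a scripted stand-in. The sketch below is a test harness built on assumptions, not part of the OpenAI SDK: fake_model, run_loop, and the dict-shaped responses are hypothetical and only mimic the fields the loop reads.

```python
import json
from typing import Optional

def get_current_time() -> str:
    return "2026-01-01T12:00:00"  # fixed value keeps the run deterministic

def call_tool(name: str, args: dict) -> str:
    return get_current_time() if name == "get_current_time" else "Unknown tool"

def fake_model(messages: list) -> dict:
    """Hypothetical stand-in for the API: requests one tool, then answers."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {
            "finish_reason": "tool_calls",
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [{
                    "id": "call_1",
                    "type": "function",
                    "function": {"name": "get_current_time", "arguments": "{}"},
                }],
            },
        }
    return {
        "finish_reason": "stop",
        "message": {"role": "assistant",
                    "content": f"It is {tool_results[-1]['content']}."},
    }

def run_loop(messages: list, max_iterations: int = 5) -> Optional[str]:
    for _ in range(max_iterations):
        response = fake_model(messages)
        message = response["message"]
        if response["finish_reason"] != "tool_calls":
            return message["content"]
        messages.append(message)  # assistant turn goes in first
        for tc in message["tool_calls"]:
            args = json.loads(tc["function"]["arguments"])
            result = call_tool(tc["function"]["name"], args)
            messages.append(
                {"role": "tool", "tool_call_id": tc["id"], "content": result}
            )
    return None  # iteration budget exhausted

history = [{"role": "user", "content": "What time is it?"}]
answer = run_loop(history)
```

After the run, history holds the user message, the assistant's tool request, and the tool result, in that order, which is the ordering the real API expects.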
Parallel Tool Execution
If the model requests multiple tools, execute them in parallel for speed:
import asyncio

async def call_tool_async(tool_name: str, tool_input: dict) -> str:
    """Async version of tool execution."""
    # Simulate async I/O
    await asyncio.sleep(0.1)
    if tool_name == "web_search":
        return f"Results for {tool_input['query']}"
    return "Tool result"

async def agentic_loop_parallel(messages, tools):
    """Execute tools in parallel."""
    while True:
        # This synchronous call blocks the event loop while it waits;
        # the SDK's AsyncOpenAI client avoids that
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        choice = response.choices[0]
        message = choice.message
        if choice.finish_reason == "tool_calls":
            # The assistant turn must precede its tool results
            messages.append(message)
            # Execute all tools concurrently; gather preserves input order
            tasks = [
                call_tool_async(tc.function.name, json.loads(tc.function.arguments))
                for tc in message.tool_calls
            ]
            results = await asyncio.gather(*tasks)
            # Pair each result with the call that produced it
            messages.extend(
                {"tool_call_id": tc.id, "role": "tool", "content": result}
                for tc, result in zip(message.tool_calls, results)
            )
        else:
            print("Final answer:", message.content)
            break

# Run the parallel loop
asyncio.run(agentic_loop_parallel(messages, tools))
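Parallel execution reduces latency when the model calls multiple tools (e.g., web_search and get_current_time). The gain comes from overlapping I/O waits, so it applies to network-bound tools rather than CPU-bound ones.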
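Two properties of asyncio.gather matter for this pattern: results come back in the order the awaitables were passed, and with return_exceptions=True a failing tool yields its exception as a value instead of propagating out of gather. The sketch below demonstrates both; slow_tool and run_all are hypothetical names and the delays are simulated.

```python
import asyncio
import time

async def slow_tool(name: str, delay: float, fail: bool = False) -> str:
    """Hypothetical tool that waits on simulated I/O."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name} done"

async def run_all() -> tuple:
    start = time.perf_counter()
    outcomes = await asyncio.gather(
        slow_tool("a", 0.2),
        slow_tool("b", 0.1, fail=True),
        slow_tool("c", 0.2),
        return_exceptions=True,  # failures come back as values
    )
    elapsed = time.perf_counter() - start
    # Convert exceptions into strings the model can read
    contents = [
        f"Tool error: {o}" if isinstance(o, Exception) else o
        for o in outcomes
    ]
    return contents, elapsed

contents, elapsed = asyncio.run(run_all())
print(contents)
print(f"elapsed: {elapsed:.2f}s")
```

Run one after another, the three calls would take about 0.5 seconds; run concurrently, the total is close to the slowest single call.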
Structured Outputs
For reliable parsing, use OpenAI's structured outputs (JSON schema validation):
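from openai import OpenAI
from pydantic import BaseModel, ConfigDict

client = OpenAI()

class SearchResult(BaseModel):
    """Structured result from a search."""
    # Strict mode requires "additionalProperties": false in the schema;
    # extra="forbid" makes Pydantic emit it
    model_config = ConfigDict(extra="forbid")

    query: str
    summary: str
    sources: list[str]

# Tool the model may call before it produces its structured final answer
tools_structured = [
    {
        "type": "function",
        "function": {
            "name": "web_search_structured",
            "description": "Search the web and return structured results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    }
]

response = client.chat.completions.create(
    # json_schema response formats require gpt-4o-2024-08-06 or later;
    # gpt-4-turbo only supports plain JSON mode
    model="gpt-4o",
    messages=[{"role": "user", "content": "Search for AI safety frameworks."}],
    tools=tools_structured,
    tool_choice="auto",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "SearchResult",
            "strict": True,
            "schema": SearchResult.model_json_schema()
        }
    }
)

# response_format constrains the final message, not tool calls; content is
# empty when the model chose to call a tool instead
content = response.choices[0].message.content
if content:
    result = SearchResult.model_validate_json(content)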
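Structured outputs guarantee that the response parses and matches the schema, which makes downstream handling reliable. They do not check facts: a schema-valid summary can still be wrong, so validate the content separately where accuracy matters.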
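Strict mode places requirements on the schema itself: every object must set additionalProperties to false and list all of its properties under required. When a schema is written by hand rather than generated, a small helper can enforce this. The sketch below assumes a plain JSON Schema dict; make_strict is a hypothetical name, and the helper only walks nested objects and array items, not $ref or anyOf.

```python
import copy

def make_strict(schema: dict) -> dict:
    """Return a copy of `schema` adjusted for strict structured outputs."""
    schema = copy.deepcopy(schema)  # leave the caller's schema untouched

    def visit(node: dict) -> None:
        if node.get("type") == "object":
            props = node.get("properties", {})
            node["additionalProperties"] = False
            node["required"] = list(props)  # strict mode: every key required
            for child in props.values():
                visit(child)
        elif node.get("type") == "array" and isinstance(node.get("items"), dict):
            visit(node["items"])

    visit(schema)
    return schema

search_result_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "summary": {"type": "string"},
        "sources": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["query"],
}
strict_schema = make_strict(search_result_schema)
```

Note that fields which were optional become required. To keep a field optional in practice, its type can allow null, so the model can return an explicit null.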
Error Handling and Retries
Real agents encounter failures. Handle them gracefully:
def agentic_loop_with_errors(messages, tools, max_iterations=10):
    """Agent loop with error handling."""
    iterations = 0
    while iterations < max_iterations:
        iterations += 1
        try:
            response = client.chat.completions.create(
                model="gpt-4",
                messages=messages,
                tools=tools,
                tool_choice="auto"
            )
        except Exception as e:
            # Retry on API error; each retry counts toward max_iterations
            print(f"API error: {e}. Retrying...")
            continue

        choice = response.choices[0]
        message = choice.message
        if choice.finish_reason == "tool_calls":
            # The assistant turn must precede its tool results
            messages.append(message)
            for tool_call in message.tool_calls:
                try:
                    result = call_tool(tool_call.function.name, json.loads(tool_call.function.arguments))
                except Exception as e:
                    # Tool failed or its arguments were not valid JSON; inform the model
                    result = f"Tool error: {str(e)}"
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result
                })
        else:
            print("Final answer:", message.content)
            return message.content

    print("Max iterations reached.")
    return None
Catch tool errors and return them to the model as context. The model can retry or try a different approach.
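Retrying an API call immediately tends to hit the same rate limit or outage again, so retries are usually spaced out with exponential backoff. The sketch below shows one way to wrap a call; with_backoff is a hypothetical helper, and catching Exception broadly is a simplification (production code would normally retry only transient errors such as timeouts or rate limits).

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on exceptions with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # 0.5s, 1s, 2s, ... plus up to 10% random jitter
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")

# Usage: a flaky call that succeeds on its third attempt
calls = {"n": 0}

def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

delays = []  # record the delays instead of actually sleeping
result = with_backoff(flaky, sleep=delays.append)
```

In the agent loop, the model call would be passed in as a zero-argument callable, for example a lambda that wraps client.chat.completions.create.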
Building an Agent with the OpenAI SDK
A complete example: a data analysis agent.
def data_analysis_agent(max_iterations=10):
    """Agent that analyzes CSV files."""
    tools = [
        {
            "type": "function",
            "function": {
                "name": "load_csv",
                "description": "Load a CSV file and return its contents.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "file_path": {"type": "string"}
                    },
                    "required": ["file_path"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "analyze_data",
                "description": "Analyze data and return statistics.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "data": {"type": "string"},
                        "analysis_type": {"type": "string", "enum": ["descriptive", "correlation", "anomaly"]}
                    },
                    "required": ["data", "analysis_type"]
                }
            }
        }
    ]

    def load_csv(file_path: str) -> str:
        # Load CSV and return as JSON string (stubbed here)
        return '{"rows": [...], "columns": [...]}'

    def analyze_data(data: str, analysis_type: str) -> str:
        # Perform analysis (stubbed here)
        return f"Analysis ({analysis_type}): Mean=50, Median=48, StdDev=5"

    messages = [
        {"role": "user", "content": "Analyze the sales_data.csv file. Show me the mean and correlation."}
    ]

    # Bounded loop so a confused model cannot spin forever
    for _ in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        choice = response.choices[0]
        message = choice.message
        if choice.finish_reason == "tool_calls":
            # Append the assistant turn once, before any tool results
            messages.append(message)
            for tool_call in message.tool_calls:
                tool_name = tool_call.function.name
                tool_input = json.loads(tool_call.function.arguments)
                if tool_name == "load_csv":
                    result = load_csv(tool_input["file_path"])
                elif tool_name == "analyze_data":
                    result = analyze_data(tool_input["data"], tool_input["analysis_type"])
                else:
                    result = f"Unknown tool: {tool_name}"
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result
                })
        else:
            print(message.content)
            break

data_analysis_agent()
This agent loads a CSV and analyzes it autonomously.
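The two tool bodies above are stubs. One possible way to fill them in with only the standard library is sketched below. This is an illustrative assumption, not the article's implementation: it handles only the descriptive analysis type and skips any column that is not fully numeric.

```python
import csv
import json
import statistics

def load_csv(file_path: str) -> str:
    """Read a CSV file and return its columns and rows as a JSON string."""
    with open(file_path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return json.dumps({"columns": columns, "rows": rows})

def analyze_data(data: str, analysis_type: str) -> str:
    """Compute descriptive statistics for every fully numeric column."""
    if analysis_type != "descriptive":
        return f"Analysis type '{analysis_type}' is not implemented in this sketch"
    table = json.loads(data)
    stats = {}
    for col in table["columns"]:
        try:
            values = [float(row[col]) for row in table["rows"]]
        except (TypeError, ValueError):
            continue  # skip columns with text or missing values
        if values:
            stats[col] = {
                "mean": statistics.mean(values),
                "median": statistics.median(values),
                "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
            }
    return json.dumps(stats)
```

Passing the whole table through the model as a string gets expensive for large files. A variant that keeps the data server-side and hands the model only a handle or a summary would use far fewer tokens.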
Key Takeaways
- Tool use loop is OpenAI's core pattern: define tools, invoke model, execute tools, loop.
- Parallel execution speeds up workflows when multiple tools are called.
- Structured outputs improve reliability for complex responses.
- Error handling is essential; wrap tool calls in try/except and feed errors back to the model.
- Max iterations prevent infinite loops; always set a limit.
Frequently Asked Questions
Can I use OpenAI's SDK with Claude or other models?
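Partly. The openai client talks to OpenAI's API by default, but it accepts a custom base_url, so it also works with providers that expose an OpenAI-compatible endpoint. Claude supports tool use natively through the Anthropic SDK with a very similar pattern. AutoGen and LangGraph support multiple models.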
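The tool definitions translate almost mechanically between the two APIs: Anthropic's tool format uses name, description, and input_schema at the top level instead of nesting them under function and parameters. The converter below is a minimal sketch; to_anthropic_tool is a hypothetical helper name, and it covers only the definition, not the differences in how tool calls and results are exchanged.

```python
def to_anthropic_tool(openai_tool: dict) -> dict:
    """Convert an OpenAI-style function tool definition to Anthropic's shape."""
    fn = openai_tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        # Anthropic calls the JSON Schema for the arguments "input_schema"
        "input_schema": fn.get(
            "parameters", {"type": "object", "properties": {}}
        ),
    }

openai_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for information.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"}
            },
            "required": ["query"],
        },
    },
}
anthropic_tool = to_anthropic_tool(openai_tool)
```

The loop itself keeps the same shape: send the tools, check whether the reply requests a tool, run it, and send the result back.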
How many tools can an agent have?
No hard limit, but more tools increase token usage and reasoning overhead. Aim for 5-10 tools per agent; beyond that, consider role-based agents (like CrewAI).
What's the difference between function calling and tool use?
In OpenAI parlance, "function calling" is the technical capability; "tool use" is the concept. They're the same thing—the model generates structured function calls that you execute.
Can I use OpenAI's Agents SDK with custom models?
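It depends on where the model is hosted. A fine-tuned OpenAI model is addressed by its model ID like any other model. For models served elsewhere, use the function calling API directly against an endpoint that implements the same tool-calling format (same pattern, more control).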
How much does an agent loop cost?
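A 5-step loop (5 model calls, each 1,500 tokens in and 500 tokens out) consumes 7,500 input tokens and 2,500 output tokens, and costs roughly $0.03–0.05 at GPT-4 pricing (2026). Cheaper with GPT-3.5. In practice, input tokens grow with each step because the full message history is resent, so treat a flat per-call estimate as a lower bound and check current per-token prices for your model. Optimize by reducing tool call count and using cheaper models when appropriate.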
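The arithmetic behind such an estimate is simple enough to script. In the sketch below, loop_cost is a hypothetical helper, and the prices in the usage lines are placeholders chosen for illustration, not quoted rates. The resend_history option is a rough model of context growth that adds each step's output to the input of later steps.

```python
def loop_cost(
    steps: int,
    input_tokens_per_call: int,
    output_tokens_per_call: int,
    input_price_per_million: float,
    output_price_per_million: float,
    resend_history: bool = False,
) -> float:
    """Estimate the dollar cost of an agent loop."""
    total_in = 0
    total_out = 0
    carried = 0  # tokens from earlier turns resent as context
    for _ in range(steps):
        total_in += input_tokens_per_call + carried
        total_out += output_tokens_per_call
        if resend_history:
            carried += output_tokens_per_call
    return (
        total_in * input_price_per_million
        + total_out * output_price_per_million
    ) / 1_000_000

# Placeholder prices (dollars per million tokens), for illustration only
flat = loop_cost(5, 1500, 500, 2.50, 10.00)
growing = loop_cost(5, 1500, 500, 2.50, 10.00, resend_history=True)
print(f"flat: ${flat:.4f}, with history growth: ${growing:.4f}")
```

With these placeholder prices, the flat estimate is 7,500 × $2.50 plus 2,500 × $10.00 per million tokens. The second figure is higher because the later calls carry the earlier outputs as extra input.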