Agent Sandboxing: Secure Tool Execution Boundaries

An agent can call any tool you provide. If you allow it to call delete_database, a model behaving unexpectedly could delete your database. Sandboxing is the practice of isolating tool execution within strict boundaries: limiting filesystem access, capping API calls, restricting network destinations, and auditing every invocation. Proper sandboxing turns a potentially dangerous system into a trustworthy one. Organizations using agent sandboxing report a 95%+ reduction in unintended side effects.

Sandboxing Principles

Principle 1: Least privilege. Give tools only the permissions they need. A tool that fetches user data does not need write permissions.

Principle 2: Isolation. Run each tool in its own process or container so a crash or exploit in one tool does not affect others.

Principle 3: Audit trails. Log every tool call with its timestamp, arguments, result, and outcome (success or error). This is critical for compliance and debugging.

Principle 4: Rate limiting. Cap how many times the agent can call each tool per minute/hour to prevent resource exhaustion.

Principle 5: Whitelist, not blacklist. Explicitly allow specific actions; forbid everything else by default.

Strategy 1: Permission Wrappers

Wrap tool implementations with permission checks. Before execution, verify the arguments are safe.

from typing import Any, Callable, Dict
import time


class PermissionWrapper:
    def __init__(self, impl: Callable, permissions: Dict[str, Any]):
        self.impl = impl
        self.permissions = permissions
        self.call_count = 0
        # Monotonic clock: unaffected by system clock adjustments
        self.last_reset = time.monotonic()

    def check_rate_limit(self):
        """Raise PermissionError if the per-minute rate limit is exceeded."""
        rate_limit = self.permissions.get("rate_limit_per_minute", 60)
        now = time.monotonic()

        # Start a new window once a full minute has elapsed
        if now - self.last_reset >= 60:
            self.call_count = 0
            self.last_reset = now

        if self.call_count >= rate_limit:
            raise PermissionError(
                f"Rate limit exceeded: {rate_limit} calls per minute"
            )

        self.call_count += 1

    def check_arguments(self, args: Dict[str, Any]):
        """Validate arguments against the whitelist."""
        allowed_fields = self.permissions.get("allowed_arguments", {})

        for key, value in args.items():
            if key not in allowed_fields:
                raise PermissionError(f"Argument {key} is not allowed")

            # A list restricts the argument to specific values;
            # any other entry allows every value for that argument
            allowed_values = allowed_fields[key]
            if isinstance(allowed_values, list) and value not in allowed_values:
                raise PermissionError(
                    f"Value {value} not in whitelist for {key}"
                )

    def __call__(self, **kwargs):
        """Execute the tool with permission checks."""
        # Validate arguments first so rejected calls do not consume the rate limit
        self.check_arguments(kwargs)
        self.check_rate_limit()
        return self.impl(**kwargs)


# Usage
def delete_user(user_id: int):
    """Delete a user from the database."""
    print(f"Deleting user {user_id}...")
    return {"status": "deleted"}


# Wrap with strict permissions
wrapped_delete = PermissionWrapper(
    delete_user,
    permissions={
        "rate_limit_per_minute": 1,  # Max 1 deletion per minute
        "allowed_arguments": {
            "user_id": [123, 456, 789]  # Only allow deleting specific users
        }
    }
)

# This succeeds
try:
    result = wrapped_delete(user_id=123)
    print("✓", result)
except PermissionError as e:
    print("✗", e)

# This fails (user_id not whitelisted)
try:
    result = wrapped_delete(user_id=999)
except PermissionError as e:
    print("✗", e)

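Permission wrappers act as a first line of defense. They are cheap to implement and reject out-of-policy calls before any side effect occurs. Because they run in the same process as the tool, they validate requests but do not isolate the tool itself.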
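The wrapper above counts calls in a fixed one-minute window, which can admit a burst of up to twice the limit around a window boundary. One possible refinement is a sliding window, sketched below. SlidingWindowLimiter and its clock parameter are hypothetical names introduced for illustration; they are not part of the wrapper above.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls in any window_seconds span (illustrative sketch)."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self.calls: Deque[float] = deque()

    def check(self) -> None:
        """Record a call, or raise PermissionError if the limit is reached."""
        now = self.clock()
        # Drop timestamps that have aged out of the window
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            raise PermissionError(
                f"Rate limit exceeded: {self.max_calls} calls per {self.window_seconds}s"
            )
        self.calls.append(now)
```

The trade-off is memory: the limiter stores one timestamp per call within the window, whereas the fixed-window counter stores a single integer.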

Strategy 2: Audit Logging

Every tool call must be logged, with enough detail to reconstruct what happened.

import datetime
import json
import logging
from typing import Callable

# format="%(message)s" writes the bare JSON line without the "INFO:root:" prefix
logging.basicConfig(
    filename="agent_audit.log", level=logging.INFO, format="%(message)s"
)


class AuditedToolWrapper:
    def __init__(self, tool_name: str, impl: Callable):
        self.tool_name = tool_name
        self.impl = impl

    def __call__(self, **kwargs):
        """Execute the tool and log the call and its outcome."""

        # Log the call
        log_entry = {
            "timestamp": datetime.datetime.now().isoformat(),
            "tool": self.tool_name,
            "arguments": kwargs,
            "status": "started"
        }
        # default=str keeps values that are not JSON-serializable loggable
        logging.info(json.dumps(log_entry, default=str))

        try:
            result = self.impl(**kwargs)
        except Exception as e:
            # Log failure with a fresh timestamp, then re-raise
            log_entry["timestamp"] = datetime.datetime.now().isoformat()
            log_entry["status"] = "error"
            log_entry["error"] = str(e)
            logging.error(json.dumps(log_entry, default=str))
            raise

        # Log success with a fresh timestamp
        log_entry["timestamp"] = datetime.datetime.now().isoformat()
        log_entry["status"] = "success"
        log_entry["result"] = result
        logging.info(json.dumps(log_entry, default=str))
        return result


# Usage (delete_user is defined in Strategy 1)
audited_delete = AuditedToolWrapper("delete_user", delete_user)
audited_delete(user_id=123)

# Log output:
# {"timestamp": "2026-06-02T10:30:45.123456", "tool": "delete_user", "arguments": {"user_id": 123}, "status": "started"}
# {"timestamp": "2026-06-02T10:30:45.234567", "tool": "delete_user", "arguments": {"user_id": 123}, "status": "success", "result": {...}}

Audit logs are invaluable for compliance, debugging, and security incident response. Treat them as immutable records.
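One way to approximate immutability in application code is to chain entries by hash, so that any later edit to a record becomes detectable. The following is a minimal in-memory sketch under that assumption. HashChainedLog, append, and verify are hypothetical names introduced here, and a real deployment would pair this with append-only or write-once storage.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


class HashChainedLog:
    """Tamper-evident log: each entry's hash covers the previous entry's hash."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    @staticmethod
    def _digest(record: Dict[str, Any], prev_hash: str) -> str:
        # sort_keys gives a stable serialization; default=str handles odd values
        body = json.dumps(record, sort_keys=True, default=str)
        return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

    def append(self, record: Dict[str, Any]) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        self.entries.append({
            "record": record,
            "prev_hash": prev_hash,
            "hash": self._digest(record, prev_hash),
        })

    def verify(self) -> bool:
        """Recompute the chain and report whether every entry is intact."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != self._digest(entry["record"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True
```

Hash chaining detects modification but does not prevent it. An attacker who can rewrite the whole log can also recompute the chain, so the latest hash should be stored somewhere the agent cannot write.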

Strategy 3: Containerized Execution

For untrusted tools or high-risk operations, run them in isolated containers or VMs.

import json
import subprocess
import uuid
from typing import Any, Dict


def execute_tool_in_container(
    tool_name: str,
    arguments: Dict[str, Any],
    docker_image: str = "agent-tools:latest",
):
    """Execute a tool in a Docker container."""

    # Prepare the tool call as JSON
    payload = {
        "tool": tool_name,
        "arguments": arguments
    }

    # Name the container so it can be killed if the call times out
    container_name = f"agent-tool-{uuid.uuid4().hex}"

    # Run in container with strict limits
    try:
        result = subprocess.run(
            [
                "docker", "run",
                "--rm",
                "--name", container_name,
                "--memory=512m",   # 512 MB RAM limit
                "--cpus=1",        # 1 CPU limit
                "--read-only",     # Filesystem is read-only
                "--network=none",  # No network access
                docker_image,
                json.dumps(payload)
            ],
            capture_output=True,
            text=True,
            timeout=30  # 30 second timeout
        )
    except subprocess.TimeoutExpired:
        # The timeout only kills the docker CLI process; stop the container too
        subprocess.run(["docker", "kill", container_name], capture_output=True)
        raise

    if result.returncode != 0:
        raise RuntimeError(f"Tool failed: {result.stderr}")

    return json.loads(result.stdout)


# Usage
result = execute_tool_in_container(
    "fetch_data",
    {"query": "SELECT * FROM users"},
    docker_image="agent-tools:v1"
)

Containerized execution is heavyweight but provides strong isolation. Use it for high-risk tools or untrusted tool code.
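The function above passes the JSON payload as the container's command argument and expects a JSON result on stdout, so the image needs an entrypoint that honors that contract. The source does not show one. The sketch below is a hypothetical version: TOOL_REGISTRY, run_payload, main, and the echo tool are all assumptions made for illustration.

```python
import json
import sys
from typing import Any, Callable, Dict, List


def echo(**kwargs: Any) -> Dict[str, Any]:
    """Placeholder tool used only for illustration."""
    return {"echo": kwargs}


# Hypothetical registry: only tools listed here can run inside the container
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"echo": echo}


def run_payload(raw_payload: str) -> str:
    """Parse a {"tool": ..., "arguments": ...} payload and return the result as JSON."""
    payload = json.loads(raw_payload)
    tool = TOOL_REGISTRY.get(payload.get("tool"))
    if tool is None:
        raise ValueError(f"Unknown tool: {payload.get('tool')}")
    return json.dumps(tool(**payload.get("arguments", {})))


def main(argv: List[str]) -> int:
    """Entrypoint body: print the result to stdout, errors to stderr."""
    try:
        print(run_payload(argv[1]))
        return 0
    except Exception as exc:
        print(exc, file=sys.stderr)
        return 1  # non-zero exit makes the caller see a failed tool call
```

Inside the image, the script would end with sys.exit(main(sys.argv)), so that a non-zero return code and the stderr text reach the caller's error check.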

Strategy 4: Network Sandboxing

Restrict which URLs/IPs tools can reach.

import requests
from urllib.parse import urlparse

ALLOWED_DOMAINS = [
    "api.example.com",
    "data.example.com",
    "public-api.github.com"
]


def fetch_url_safely(url: str):
    """Fetch a URL, allowing only whitelisted domains."""

    parsed = urlparse(url)
    # hostname excludes any port or credentials, unlike netloc
    host = parsed.hostname
    if parsed.scheme not in ("http", "https") or host not in ALLOWED_DOMAINS:
        raise PermissionError(f"Domain {host} is not whitelisted")

    # Fetch with timeout; do not follow redirects, since a whitelisted
    # host could redirect to a domain that is not whitelisted
    response = requests.get(url, timeout=10, allow_redirects=False)
    return response.text


# This succeeds
try:
    result = fetch_url_safely("https://api.example.com/data")
    print("✓", result)
except PermissionError as e:
    print("✗", e)

# This fails (domain not whitelisted)
try:
    result = fetch_url_safely("https://attacker.com/malware")
except PermissionError as e:
    print("✗", e)

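Network sandboxing limits a tool's ability to exfiltrate data or reach internal services. An application-level check like this one only covers requests made through the wrapper, so for tools you do not trust, enforce the same policy at the network layer as well (for example, with a firewall or egress proxy).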
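A hostname whitelist does not by itself stop a permitted name from resolving to an internal address. A complementary check is to resolve the host and reject loopback, private, and link-local addresses. The sketch below uses only the standard library; is_internal_address and resolve_public_addresses are hypothetical helper names.

```python
import ipaddress
import socket
from typing import List


def is_internal_address(ip: str) -> bool:
    """Return True for loopback, private, link-local, and other non-public addresses."""
    addr = ipaddress.ip_address(ip.split("%")[0])  # drop any IPv6 zone ID
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_reserved
        or addr.is_unspecified
    )


def resolve_public_addresses(hostname: str, port: int = 443) -> List[str]:
    """Resolve hostname; raise PermissionError if any resolved address is internal."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    for ip in addresses:
        if is_internal_address(ip):
            raise PermissionError(f"{hostname} resolves to internal address {ip}")
    return addresses
```

This check is advisory unless the connection is then made to one of the addresses that were checked. If the HTTP client resolves the name again, the answer can differ between the check and the request.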

Strategy 5: Timeout and Resource Limits

Prevent tools from consuming infinite resources.

import signal
import time


class ToolTimeoutError(TimeoutError):
    """Raised when a tool exceeds its time limit.

    Subclasses the built-in TimeoutError instead of shadowing it.
    """


def execute_with_timeout(func, timeout_seconds=10, **kwargs):
    """Execute a function with a timeout (Unix only, main thread only)."""

    def timeout_handler(signum, frame):
        raise ToolTimeoutError(
            f"Tool execution exceeded {timeout_seconds}s limit"
        )

    # Install the handler, remembering the previous one so it can be restored
    previous_handler = signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(timeout_seconds)

    try:
        return func(**kwargs)
    finally:
        signal.alarm(0)  # Always cancel the alarm, even if func raises
        signal.signal(signal.SIGALRM, previous_handler)


# Usage
def slow_operation(n):
    time.sleep(n)
    return {"status": "done"}


try:
    result = execute_with_timeout(slow_operation, timeout_seconds=5, n=10)
except TimeoutError as e:
    print("✗", e)

Timeouts prevent runaway tools from blocking the agent.
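signal.alarm is available only on Unix and only in the main thread. Where that is a constraint, one portable alternative is to run the tool on a worker thread and wait for it with a deadline. This is a sketch, and call_with_deadline is a hypothetical name. Note the important limitation: the worker thread is not stopped when the deadline passes, so this bounds how long the agent waits, not how long the tool runs.

```python
import concurrent.futures
from typing import Any, Callable


def call_with_deadline(
    func: Callable[..., Any], timeout_seconds: float, **kwargs: Any
) -> Any:
    """Run func(**kwargs) on a worker thread; raise TimeoutError after the deadline."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, **kwargs)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # Before Python 3.11 this is a distinct class from the built-in TimeoutError
        raise TimeoutError(
            f"Tool execution exceeded {timeout_seconds}s limit"
        ) from None
    finally:
        # Return immediately instead of waiting for a worker that may still be running
        executor.shutdown(wait=False)
```

For tools that must actually be terminated, a separate process or the container approach from Strategy 3 is the safer choice, because a process can be killed while a Python thread cannot.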

Comprehensive Sandbox Example

from typing import Any, Dict

# Assumes PermissionWrapper, AuditedToolWrapper, and delete_user from the sections above


class SandboxedAgent:
    def __init__(self, tools_config):
        self.tools = {}

        for tool_name, cfg in tools_config.items():
            # Wrap with permissions
            wrapped = PermissionWrapper(cfg["impl"], cfg["permissions"])
            # Wrap with audit logging on the outside so denied calls are logged too
            audited = AuditedToolWrapper(tool_name, wrapped)
            self.tools[tool_name] = audited

    def call_tool(self, tool_name: str, arguments: Dict[str, Any]):
        """Call a tool within the sandbox."""

        # Default deny: only registered tools can be called
        if tool_name not in self.tools:
            raise ValueError(f"Unknown tool: {tool_name}")

        try:
            result = self.tools[tool_name](**arguments)
            return {"status": "success", "data": result}
        except PermissionError as e:
            return {"status": "permission_denied", "error": str(e)}
        except Exception as e:
            return {"status": "error", "error": str(e)}


# Usage
tools_config = {
    "delete_user": {
        "impl": delete_user,
        "permissions": {
            "rate_limit_per_minute": 1,
            "allowed_arguments": {"user_id": [123, 456]}
        }
    }
}

sandbox = SandboxedAgent(tools_config)
result = sandbox.call_tool("delete_user", {"user_id": 123})

Key Takeaways

  • Apply least-privilege: tools only get permissions they need.
  • Audit every tool call with timestamp, arguments, and result.
  • Use permission wrappers to validate arguments before execution.
  • Implement rate limits to prevent resource exhaustion.
  • For high-risk tools, use containerized or network-sandboxed execution.

Frequently Asked Questions

Should I sandbox all tools or just risky ones?

Sandbox all tools. The risk is not just from malicious models, but from unexpected behaviors. A tool that reads from a database could accidentally read the entire table; sandboxing limits the damage.

How do I balance security and usability?

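Start restrictive (deny by default, as in Principle 5) and loosen permissions only when a tool demonstrably needs more. Monitor audit logs for denied calls and anomalies: repeated denials of a legitimate action point to a permission worth adding, while unexpected calls point to one worth tightening.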

Can sandboxing slow down tools?

Yes, slightly (5–15% overhead for permission checks and logging). For most agents, this is acceptable. For performance-critical tools, use lighter-weight sandboxing (audit logging only, no containerization).

What permissions should I set by default?

Start with: rate_limit=10/minute, read-only access, no network access to internal IPs, 10s timeout. Expand based on tool needs.
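These defaults can be captured in one place so that every tool starts from the same baseline and only deviates explicitly. The sketch below is one way to express that. ToolPolicy, its field names, and report_export_policy are hypothetical, chosen to mirror the defaults listed above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToolPolicy:
    """Baseline permissions for a tool; frozen so a policy cannot be mutated at runtime."""

    rate_limit_per_minute: int = 10
    read_only: bool = True
    allow_internal_network: bool = False
    timeout_seconds: int = 10


DEFAULT_POLICY = ToolPolicy()

# Expand only what a specific tool needs; every other field keeps its default
report_export_policy = replace(DEFAULT_POLICY, timeout_seconds=60)
```

Deriving each tool's policy from the default with replace keeps every deviation visible in one line, which makes permission reviews easier.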

How do I handle a tool that needs to read files?

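Use a whitelist such as "allowed_paths": ["/data/public/*", "/logs/*"] and forbid everything else, including sensitive paths such as /etc and /var/secrets. Resolve each requested path before checking it, so that ".." segments and symlinks cannot escape the allowed directories.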
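A path whitelist is only as good as its comparison: a plain string-prefix test can be bypassed with ".." segments or with sibling directories that share a prefix. The sketch below compares resolved paths instead. is_path_allowed is a hypothetical helper, and it takes allowed root directories rather than glob patterns.

```python
from pathlib import Path
from typing import Iterable


def is_path_allowed(requested: str, allowed_roots: Iterable[str]) -> bool:
    """Return True if requested resolves to a location inside an allowed root."""
    # resolve() normalizes ".." segments and follows symlinks that exist
    resolved = Path(requested).resolve()
    for root in allowed_roots:
        root_path = Path(root).resolve()
        # Compare whole path components, so "/data/public-evil" does not match "/data/public"
        if resolved == root_path or root_path in resolved.parents:
            return True
    return False
```

The check should run immediately before the file is opened, since a symlink created between the check and the open could still redirect the access.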

Further Reading