AI Mock Generation: Dependency Testing Strategies

Mock generation is the process of creating test doubles (stubs, mocks, fakes) for external dependencies such as APIs, databases, and third-party services. Done manually, mocking is tedious: you must understand the interface, anticipate every interaction, and write boilerplate. AI streamlines this by analyzing function signatures and contracts to generate mocks automatically, which can reduce hand-written mocking code by an estimated 60–75% while keeping tests isolated.

I spent three weeks writing mock database clients for a microservices test suite until I started asking AI to generate them. A single well-scoped prompt now produces a complete mock interface with all methods, correct return types, and verification hooks in under two minutes. This guide teaches the principles of AI-assisted mocking and how to generate maintainable test doubles.

Why Mocking Matters and When AI Helps Most

Mocking isolates the code under test from external dependencies, allowing you to: (1) test offline (no real API calls), (2) test error paths (simulate timeouts, 500s), (3) verify interactions (ensure your code calls the dependency correctly), and (4) speed up tests (mocks are ~100x faster than network calls). Manual mocking requires you to understand every method, anticipate call sequences, and write repetitive setup code.
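Points (2) and (3) can be illustrated with the standard library alone. The following is a minimal sketch, assuming a hypothetical fetch_profile function and a client object with a get method; neither comes from a real library.

```python
from unittest.mock import Mock


def fetch_profile(client, user_id: str) -> dict:
    """Hypothetical code under test: returns a fallback when the API times out."""
    try:
        return client.get(f"/users/{user_id}")
    except TimeoutError:
        return {"id": user_id, "status": "unavailable"}


# Error path: the mock client raises instead of touching the network.
client = Mock()
client.get.side_effect = TimeoutError("simulated timeout")
result = fetch_profile(client, "42")

# Interaction check: the dependency was called exactly once with the expected path.
client.get.assert_called_once_with("/users/42")
```

The same test would be slow and flaky against a real service, because a timeout is hard to trigger on demand.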

AI is well suited to generating mocks because an interface definition is a compact, explicit specification. Show a language model a class signature or protocol definition, and it can quickly generate a matching mock with all methods, correct argument types, and sensible return values.

Technique 1: Generating Mocks from Interface Definitions

The simplest approach: provide an interface definition to your AI, and ask it to generate a corresponding mock class.

from typing import Protocol
from datetime import datetime

class EmailServiceInterface(Protocol):
    """Send emails and track delivery."""

    def send_email(
        self,
        to: str,
        subject: str,
        body: str,
        cc: list[str] | None = None
    ) -> dict[str, str]:
        """
        Send an email.

        Returns:
            {"message_id": "msg_123", "status": "sent" | "queued"}

        Raises:
            ValueError: If email is invalid.
            TimeoutError: If service is unreachable.
        """
        ...

    def get_delivery_status(self, message_id: str) -> dict[str, str | bool]:
        """
        Check email delivery status.

        Returns:
            {"status": "delivered" | "bounced" | "pending", "delivered_at": "2026-06-02T10:00:00Z"}
        """
        ...

Prompt the AI:

Generate a mock class MockEmailService that implements EmailServiceInterface. It must: (1) store sent emails in memory, (2) simulate send_email returning valid message IDs, (3) track calls for verification, (4) allow tests to configure return values and exceptions.

class MockEmailService:
    """In-memory test double for EmailServiceInterface."""

    def __init__(self):
        self.sent_emails = []
        self.delivery_statuses = {}
        self.call_count = 0
        self._should_fail = False
        self._failure_exception = None

    def send_email(
        self,
        to: str,
        subject: str,
        body: str,
        cc: list[str] | None = None
    ) -> dict[str, str]:
        self.call_count += 1

        # A configured failure takes priority over validation.
        if self._should_fail:
            raise self._failure_exception or TimeoutError("Service unreachable")

        if not self._is_valid_email(to):
            raise ValueError(f"Invalid email: {to}")

        # Deterministic ID; every third call reports "queued" so both statuses get exercised.
        message_id = f"msg_{self.call_count}_{''.join(to.split('@'))}"
        status = "queued" if self.call_count % 3 == 0 else "sent"

        self.sent_emails.append({
            "to": to,
            "subject": subject,
            "body": body,
            "cc": cc or [],
            "message_id": message_id
        })
        self.delivery_statuses[message_id] = "pending"

        return {"message_id": message_id, "status": status}

    def get_delivery_status(self, message_id: str) -> dict[str, str | bool]:
        status = self.delivery_statuses.get(message_id, "unknown")
        return {
            "status": status,
            "delivered_at": "2026-06-02T10:00:00Z" if status == "delivered" else None
        }

    def configure_failure(self, exception: Exception):
        """Make every subsequent send_email call raise the given exception."""
        self._should_fail = True
        self._failure_exception = exception

    def reset(self):
        """Clear recorded state and any configured failure."""
        self.sent_emails = []
        self.delivery_statuses = {}
        self.call_count = 0
        self._should_fail = False
        self._failure_exception = None

    def _is_valid_email(self, email: str) -> bool:
        # Deliberately loose check: one "@" followed by a domain containing a dot.
        return "@" in email and "." in email.split("@")[1]

Now tests can use this mock without hitting a real email service:

def test_notification_service_sends_email_on_signup():
    # NotificationService is the code under test; it receives the mock by injection.
    mock_email = MockEmailService()
    notifier = NotificationService(email_service=mock_email)

    # Placeholder address; the original was lost to email obfuscation on the page.
    notifier.on_user_signup("john@example.com", "John")

    assert mock_email.call_count == 1
    assert mock_email.sent_emails[0]["to"] == "john@example.com"
    assert "John" in mock_email.sent_emails[0]["body"]
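A generated mock is only useful while it matches the interface it stands in for. One way to catch drift is to compare method signatures with the standard inspect module. The sketch below uses a small hypothetical GreeterInterface instead of the email classes so that it stays self-contained, and signature_mismatches is an illustrative helper, not a library function.

```python
import inspect
from typing import Protocol


class GreeterInterface(Protocol):
    def greet(self, name: str, punctuation: str = "!") -> str: ...


class MockGreeter:
    def greet(self, name: str, punctuation: str = "!") -> str:
        return f"Hello, {name}{punctuation}"


class DriftedMockGreeter:
    def greet(self, name: str) -> str:  # missing the punctuation parameter
        return f"Hello, {name}"


def signature_mismatches(interface: type, mock_cls: type) -> list[str]:
    """Return public interface methods that the mock lacks or declares differently."""
    problems = []
    for name, member in vars(interface).items():
        if name.startswith("_") or not inspect.isfunction(member):
            continue
        mock_member = getattr(mock_cls, name, None)
        if mock_member is None:
            problems.append(name)
            continue
        expected = list(inspect.signature(member).parameters.items())
        actual = list(inspect.signature(mock_member).parameters.items())
        if expected != actual:
            problems.append(name)
    return problems


matching = signature_mismatches(GreeterInterface, MockGreeter)
drifted = signature_mismatches(GreeterInterface, DriftedMockGreeter)
```

Running a check like this in the test suite flags a mock whose parameters no longer line up with the protocol, which is easy to miss when the mock was generated once and the interface changed later.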

Technique 2: Behavior Specification and Verification Hooks

Ask the AI to generate mocks with configurable behavior and verification methods. This lets tests assert not just results, but how the mock was called.

from typing import Protocol

class PaymentGatewayInterface(Protocol):
    def charge_card(self, amount: float, card_token: str) -> str:
        """Charge a card. Returns transaction_id."""
        ...

Prompt:

Generate MockPaymentGateway with: (1) ability to set return value or exception per call, (2) methods to query call history (was_called_with, call_count, last_call_args), (3) support for with_side_effect to simulate multiple calls.

class MockPaymentGateway:
    """Test double for PaymentGatewayInterface with scripted results and call history."""

    def __init__(self):
        self.calls = []
        self._side_effects = []
        self._effect_index = 0
        self._default_return = None

    def charge_card(self, amount: float, card_token: str) -> str:
        self.calls.append({"amount": amount, "card_token": card_token})

        # Scripted side effects are consumed in order, one per call.
        if self._side_effects and self._effect_index < len(self._side_effects):
            effect = self._side_effects[self._effect_index]
            self._effect_index += 1
            if isinstance(effect, Exception):
                raise effect
            return effect

        # Compare against None so that a falsy configured value is still honored.
        if self._default_return is not None:
            return self._default_return

        return f"txn_{len(self.calls)}"

    def set_return_value(self, value: str):
        self._default_return = value

    def with_side_effect(self, *effects):
        """Queue return values or exceptions for successive calls."""
        self._side_effects = list(effects)
        self._effect_index = 0
        return self

    def was_called_with(self, amount: float, card_token: str) -> bool:
        return {"amount": amount, "card_token": card_token} in self.calls

    @property
    def call_count(self) -> int:
        return len(self.calls)

    @property
    def last_call_args(self) -> dict | None:
        return self.calls[-1] if self.calls else None

    def reset(self):
        """Clear call history and all configured behavior."""
        self.calls = []
        self._side_effects = []
        self._effect_index = 0
        self._default_return = None

Usage:

def test_payment_retry_on_timeout():
    mock_payment = MockPaymentGateway()
    mock_payment.with_side_effect(
        TimeoutError("Network timeout"),
        "txn_success"
    )

    processor = PaymentProcessor(payment_gateway=mock_payment)
    result = processor.charge_with_retry("user_123", 99.99)

    assert mock_payment.call_count == 2
    assert result == "txn_success"
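The test above relies on a PaymentProcessor that this article does not show. The sketch below is one plausible shape for it, with an assumed retry limit and an assumed argument order, and it scripts the gateway with unittest.mock.Mock, whose side_effect accepts a list in which exception instances are raised and other values are returned.

```python
from unittest.mock import Mock


class PaymentProcessor:
    """Hypothetical processor; the real class is not shown in this article."""

    def __init__(self, payment_gateway, max_attempts: int = 3):
        self.payment_gateway = payment_gateway
        self.max_attempts = max_attempts

    def charge_with_retry(self, card_token: str, amount: float) -> str:
        last_error = None
        for _ in range(self.max_attempts):
            try:
                return self.payment_gateway.charge_card(amount, card_token)
            except TimeoutError as exc:
                last_error = exc  # retry only on timeouts
        raise last_error


# First call times out, second call succeeds.
gateway = Mock()
gateway.charge_card.side_effect = [TimeoutError("Network timeout"), "txn_success"]

processor = PaymentProcessor(payment_gateway=gateway)
result = processor.charge_with_retry("tok_test", 99.99)
```

The built-in side_effect list plays the same role as with_side_effect on the custom mock; the custom class mainly adds domain-specific helpers such as was_called_with.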

Technique 3: Async Mock Generation

For async code, AI can generate async mocks with correct async def syntax and await patterns.

from typing import Protocol

class AsyncDatabaseInterface(Protocol):
    async def query(self, sql: str) -> list[dict]:
        ...

    async def insert(self, table: str, data: dict) -> int:
        """Returns row ID."""
        ...

Prompt:

Generate MockAsyncDatabase implementing AsyncDatabaseInterface. It must: (1) use async def for all methods, (2) store rows in memory (dict of tables), (3) support configurable delays (to test timeout handling), (4) simulate SQL parsing errors.

import asyncio
import re

class MockAsyncDatabase:
    """In-memory test double for AsyncDatabaseInterface."""

    def __init__(self, delay_ms: int = 0):
        self.tables = {}
        self.delay_ms = delay_ms
        self.query_count = 0

    async def query(self, sql: str) -> list[dict]:
        # Optional artificial latency, useful for exercising timeout handling.
        await asyncio.sleep(self.delay_ms / 1000.0)
        self.query_count += 1

        # Simulated parse error: any statement containing "invalid" is rejected.
        if "invalid" in sql.lower():
            raise ValueError(f"Invalid SQL: {sql}")

        # Simple mock: return every row of the table named after FROM (case-insensitive).
        match = re.search(r"\bfrom\s+(\w+)", sql, re.IGNORECASE)
        if match is None:
            raise ValueError(f"Invalid SQL: {sql}")
        return self.tables.get(match.group(1), [])

    async def insert(self, table: str, data: dict) -> int:
        await asyncio.sleep(self.delay_ms / 1000.0)

        if table not in self.tables:
            self.tables[table] = []

        # Row IDs are 1-based and sequential per table.
        row_id = len(self.tables[table]) + 1
        self.tables[table].append({**data, "id": row_id})
        return row_id

    async def setup_table(self, table: str, rows: list[dict]):
        """Helper to pre-populate tables for tests."""
        # Copy the list so later inserts do not mutate the caller's data.
        self.tables[table] = list(rows)

Test:

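import pytest

# Assumes the pytest-asyncio plugin, which provides this marker for async tests.
@pytest.mark.asyncio
async def test_user_registration_creates_database_record():
    mock_db = MockAsyncDatabase()
    await mock_db.setup_table("users", [])

    # RegistrationService is the code under test; the address is a placeholder.
    service = RegistrationService(db=mock_db)
    user_id = await service.register("user@example.com", "password123")

    assert user_id == 1
    assert len(mock_db.tables["users"]) == 1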
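The configurable delay is what makes timeout handling testable. The following sketch assumes a hypothetical query_with_timeout caller and a cut-down SlowDatabase stand-in so that it runs on its own; it uses asyncio.wait_for from the standard library.

```python
import asyncio


class SlowDatabase:
    """Minimal stand-in with a configurable delay, in the spirit of MockAsyncDatabase."""

    def __init__(self, delay_ms: int = 0):
        self.delay_ms = delay_ms

    async def query(self, sql: str) -> list[dict]:
        await asyncio.sleep(self.delay_ms / 1000.0)
        return [{"id": 1}]


async def query_with_timeout(db, sql: str, timeout_s: float) -> list[dict]:
    """Hypothetical caller: fall back to an empty result if the query is too slow."""
    try:
        return await asyncio.wait_for(db.query(sql), timeout=timeout_s)
    except asyncio.TimeoutError:
        return []


# No delay: the query finishes well inside the timeout.
fast = asyncio.run(query_with_timeout(SlowDatabase(delay_ms=0), "SELECT * FROM users", 0.5))
# 200 ms delay against a 10 ms timeout: the fallback path runs.
slow = asyncio.run(query_with_timeout(SlowDatabase(delay_ms=200), "SELECT * FROM users", 0.01))
```

Keeping the simulated delay small but clearly larger than the timeout keeps the test fast without making it timing-sensitive.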

Common Mocking Pitfalls and Fixes

| Pitfall | Symptom | Solution |
| --- | --- | --- |
| Mock too permissive | Test passes when it shouldn't (mock always succeeds) | Constrain mock behavior; verify interactions, not just success |
| Mock too strict | Test fails on harmless refactors | Use loose matchers; avoid checking exact strings or call order unless critical |
| Forgetting to reset | Test pollution: mock state from test A affects test B | Add setup() or teardown() / use @pytest.fixture with autouse=True |
| Over-mocking | Mocking code that shouldn't be mocked (your own functions) | Mock only external dependencies; test real code when possible |
| Mock doesn't match reality | Test passes but prod fails because mock is wrong | Periodically verify mock behavior against real service; use contract testing |
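The "forgetting to reset" row is the easiest to demonstrate. This sketch uses the standard unittest module and a hypothetical CountingMock shared at module level; a pytest fixture with autouse=True would serve the same purpose.

```python
import io
import unittest


class CountingMock:
    """Tiny stateful mock used only for this illustration."""

    def __init__(self):
        self.calls = []

    def send(self, payload: str) -> None:
        self.calls.append(payload)

    def reset(self) -> None:
        self.calls = []


shared_mock = CountingMock()  # shared state is where pollution usually starts


class ResetBetweenTests(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test starts from a clean mock.
        shared_mock.reset()

    def test_first(self):
        shared_mock.send("a")
        self.assertEqual(len(shared_mock.calls), 1)

    def test_second(self):
        shared_mock.send("b")
        self.assertEqual(shared_mock.calls, ["b"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ResetBetweenTests)
outcome = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Without the reset in setUp, test_second would see the call recorded by test_first and fail, and the failure would depend on test order.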

Key Takeaways

  • Mocks isolate tests from external dependencies, enabling offline testing and error path verification.
  • Use AI to generate mock classes from interface definitions, saving boilerplate time.
  • Add verification hooks (call tracking, configurable returns) so tests can assert interactions.
  • Async mocks require async def and await patterns; have AI generate these explicitly.
  • Always reset mock state between tests to prevent pollution.
  • Combine mocks with integration tests to catch mock-reality gaps.

Frequently Asked Questions

Should I mock or use a real test database?

Mock for unit tests (fast, offline, error testing). Use a real test database or test containers for integration tests. In the test pyramid, unit tests (mocks) form the broad base, integration tests (real dependencies) sit in the middle, and end-to-end tests (user flows) form the narrow top.

How do I test that my code calls a dependency correctly?

Track calls on the mock: assert mock.was_called_with(...). Record call arguments and verify order if sequence matters. Avoid over-asserting on implementation details.

Can I use unittest.mock instead of writing custom mocks?

Yes, for simple cases. MagicMock and patch are built-in and powerful. Use custom mocks when you need complex stateful behavior, verification hooks, or when you want self-documenting test code that mirrors the real interface.
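For the simple cases, create_autospec from unittest.mock is worth knowing because it builds a mock that enforces the real signatures. The EmailService class below is a hypothetical stand-in for a real client, included only so the sketch runs on its own.

```python
from unittest.mock import create_autospec


class EmailService:
    """Stand-in for a real client; only the signature matters here."""

    def send_email(self, to: str, subject: str, body: str) -> dict:
        raise NotImplementedError("would hit the network")


# instance=True makes the mock behave like an instance rather than the class.
mock_email = create_autospec(EmailService, instance=True)
mock_email.send_email.return_value = {"message_id": "msg_1", "status": "sent"}

response = mock_email.send_email("a@example.com", "Hi", "Welcome")
mock_email.send_email.assert_called_once_with("a@example.com", "Hi", "Welcome")

# Calls that do not fit the real signature are rejected with TypeError.
try:
    mock_email.send_email("a@example.com")  # missing subject and body
    signature_enforced = False
except TypeError:
    signature_enforced = True
```

A plain MagicMock would accept the malformed call silently, which is one way the "mock too permissive" pitfall creeps in.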

What's the difference between a stub, mock, and fake?

A stub returns hardcoded values. A mock tracks calls and allows assertions. A fake is a simplified implementation (like an in-memory database). Use all three depending on your test needs.
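The three kinds can be put side by side for one dependency. This is a sketch using the definitions given here; the rate service, its get_rate method, and the convert function are all hypothetical.

```python
class StubRateService:
    """Stub: returns a hardcoded value and nothing else."""

    def get_rate(self, currency: str) -> float:
        return 1.25


class MockRateService:
    """Mock (in this article's sense): records calls so tests can assert on them."""

    def __init__(self):
        self.calls = []

    def get_rate(self, currency: str) -> float:
        self.calls.append(currency)
        return 1.25


class FakeRateService:
    """Fake: a small working implementation backed by an in-memory table."""

    def __init__(self, rates: dict[str, float]):
        self._rates = dict(rates)

    def get_rate(self, currency: str) -> float:
        if currency not in self._rates:
            raise KeyError(f"Unknown currency: {currency}")
        return self._rates[currency]


def convert(amount: float, currency: str, rates) -> float:
    """Hypothetical code under test."""
    return round(amount * rates.get_rate(currency), 2)


stub_result = convert(10, "EUR", StubRateService())

mock_rates = MockRateService()
convert(10, "EUR", mock_rates)

fake_result = convert(10, "GBP", FakeRateService({"GBP": 1.5}))
```

The stub is enough when only the return value matters, the mock when the call itself is the behavior under test, and the fake when the test needs realistic lookups and errors.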

How do I handle mocks for private or hard-to-reach dependencies?

Refactor to inject dependencies (dependency injection pattern). If you can't inject, use patch or patch.object. Prefer injection; it makes code testable and mocks natural.
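When injection is not an option, patch.object can replace an attribute for the duration of a with block. The sketch below assumes a hypothetical TokenCache that calls time.time() directly, so the clock is patched on the time module itself.

```python
import time
from unittest.mock import patch


class TokenCache:
    """Hypothetical class with a hard-wired dependency on time.time()."""

    def __init__(self, ttl_s: float):
        self.ttl_s = ttl_s
        self.created_at = time.time()

    def is_expired(self) -> bool:
        return time.time() - self.created_at > self.ttl_s


# The patch is undone automatically when each with block exits.
with patch.object(time, "time", return_value=1000.0):
    cache = TokenCache(ttl_s=60)
    fresh = cache.is_expired()  # 0 seconds elapsed

with patch.object(time, "time", return_value=1061.0):
    stale = cache.is_expired()  # 61 seconds elapsed
```

If TokenCache accepted a clock callable in its constructor, the test could pass a plain function instead and no patching would be needed, which is the reason to prefer injection.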
