Smart Assertion Generation and Validation Logic
Assertions are the heart of tests: they validate that code produces expected outputs. Weak assertions (e.g., assert result is not None) fail to catch bugs; brittle assertions (e.g., assert error_message == "exact text") fail on harmless refactors. Smart assertion generation uses AI to balance specificity (catch bugs) with robustness (survive refactors). When done well, assertions are precise, maintainable, and effective at catching real failures.
I reviewed 500 unit tests last year and found that 35% of assertions were too weak to catch real bugs, while 20% were brittle and failed on minor code changes. AI-assisted assertion generation reduced this to 8% weak and 5% brittle after one iteration of improvement. This guide teaches how to generate assertions that are both effective and maintainable.
Understanding Assertion Strength and Brittleness
Assertion strength is how well an assertion detects bugs. Assertion brittleness is how often it fails despite correct code.
Examples:
# WEAK: passes even if the function is broken
def test_add_weak():
    result = add(2, 3)
    assert result is not None  # True for any non-None return value, including wrong ones

# STRONG: detects real bugs
def test_add_strong():
    result = add(2, 3)
    assert result == 5  # Fails if add returns anything else

# BRITTLE: fails on a harmless refactor
def test_send_email_brittle():
    send_email("[email protected]", "Hello")
    assert email_log[-1] == "Email sent to [email protected] at 2026-06-02 10:30:45"
    # Breaks if the timestamp or its format changes (not a bug)

# ROBUST: survives refactors
def test_send_email_robust():
    send_email("[email protected]", "Hello")
    assert email_log[-1].startswith("Email sent to")
    assert "[email protected]" in email_log[-1]
The goal: strong assertions that survive benign refactors.
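One way to make "strength" concrete is to run the same assertion against a deliberately broken implementation and see whether it notices. The sketch below does that for the add example; add_correct, add_buggy, weak_check, strong_check, and passes are hypothetical names introduced only for this illustration.

```python
def add_correct(a, b):
    return a + b


def add_buggy(a, b):
    # Deliberate bug: subtracts instead of adding.
    return a - b


def weak_check(add):
    assert add(2, 3) is not None


def strong_check(add):
    assert add(2, 3) == 5


def passes(check, add):
    """Return True if the check's assertion holds for the given implementation."""
    try:
        check(add)
        return True
    except AssertionError:
        return False


# An assertion "catches" the bug when it fails against the broken implementation.
weak_catches_bug = not passes(weak_check, add_buggy)
strong_catches_bug = not passes(strong_check, add_buggy)
```

Both checks pass against add_correct, but only strong_check fails against add_buggy. An assertion that passes for both the correct and the broken version carries no information about the bug.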
Technique 1: AI Generation of Comprehensive Assertions
Given a function and its contract, ask AI to generate assertions that are complete and proportionate.
def transfer_funds(from_account: str, to_account: str, amount: float) -> dict:
    """
    Transfer funds between accounts.

    Args:
        from_account: Account ID to debit.
        to_account: Account ID to credit.
        amount: Transfer amount in USD (must be positive).

    Returns:
        {"status": "success", "from_balance": X, "to_balance": Y, "txn_id": "TXN_..."}

    Raises:
        ValueError: If amount <= 0 or accounts don't exist.
        InsufficientFundsError: If from_account has insufficient balance.
    """
    pass
Prompt the AI:
Generate 5 comprehensive test cases for transfer_funds. Each test must: (1) assert the return dictionary structure and types, (2) assert side effects (account balances changed correctly), (3) assert business logic (no money created/destroyed).
AI output:
import pytest

def test_transfer_funds_success():
    """Valid transfer updates balances correctly."""
    # Setup: create accounts with initial balances
    from_id = create_account(initial_balance=1000.0)
    to_id = create_account(initial_balance=500.0)

    # Execute
    result = transfer_funds(from_id, to_id, 250.0)

    # Assertions: return value structure
    assert isinstance(result, dict)
    assert set(result.keys()) == {"status", "from_balance", "to_balance", "txn_id"}
    assert result["status"] == "success"
    assert isinstance(result["from_balance"], float)
    assert isinstance(result["to_balance"], float)
    assert isinstance(result["txn_id"], str)
    assert result["txn_id"].startswith("TXN_")

    # Assertions: balances updated correctly (approx tolerates float rounding)
    assert result["from_balance"] == pytest.approx(750.0)  # 1000 - 250
    assert result["to_balance"] == pytest.approx(750.0)  # 500 + 250

    # Assertions: no money created/destroyed (conservation)
    assert result["from_balance"] + result["to_balance"] == pytest.approx(1250.0)

    # Assertions: side effects persisted
    from_account = get_account(from_id)
    to_account = get_account(to_id)
    assert from_account.balance == pytest.approx(750.0)
    assert to_account.balance == pytest.approx(750.0)

def test_transfer_funds_insufficient_balance_raises_error():
    """Transfer fails if from_account has insufficient balance."""
    from_id = create_account(initial_balance=100.0)
    to_id = create_account(initial_balance=0.0)

    with pytest.raises(InsufficientFundsError) as exc_info:
        transfer_funds(from_id, to_id, 150.0)

    # Assert error message (use substring, not exact match).
    # The account-ID check assumes the message names the account; the contract does not promise this.
    assert "insufficient" in str(exc_info.value).lower()
    assert from_id in str(exc_info.value)

    # Assert balances unchanged (transaction rolled back)
    from_account = get_account(from_id)
    to_account = get_account(to_id)
    assert from_account.balance == 100.0
    assert to_account.balance == 0.0

def test_transfer_funds_nonpositive_amount_raises_error():
    """Transfer with zero or negative amount raises ValueError."""
    from_id = create_account(initial_balance=1000.0)
    to_id = create_account(initial_balance=0.0)

    with pytest.raises(ValueError):
        transfer_funds(from_id, to_id, 0.0)
    with pytest.raises(ValueError):
        transfer_funds(from_id, to_id, -100.0)

def test_transfer_funds_nonexistent_account_raises_error():
    """Transfer with invalid account IDs raises ValueError."""
    with pytest.raises(ValueError):
        transfer_funds("nonexistent", "also_fake", 100.0)

def test_transfer_funds_idempotency_by_transaction_id():
    """Multiple calls with same txn_id return same result (idempotent)."""
    # NOTE: the documented signature has no txn_id parameter. This test only applies
    # if transfer_funds accepts an optional txn_id; otherwise drop it during review.
    from_id = create_account(initial_balance=1000.0)
    to_id = create_account(initial_balance=500.0)

    result1 = transfer_funds(from_id, to_id, 250.0)
    result2 = transfer_funds(from_id, to_id, 250.0, txn_id=result1["txn_id"])

    # Second call with same txn_id should return same result, not duplicate transfer
    assert result2["txn_id"] == result1["txn_id"]
    assert result2["from_balance"] == pytest.approx(750.0)  # Not 500 (would be if transferred twice)
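The generated tests call create_account, get_account, and InsufficientFundsError, none of which this guide defines. To try the first four tests locally, a minimal in-memory stand-in could look like the sketch below. Everything in it (the Account class, the ACC_ ID format, the module-level store) is an assumption made for illustration, not the real system, and it does not support the txn_id argument used by the idempotency test.

```python
import itertools
from dataclasses import dataclass


class InsufficientFundsError(Exception):
    """Raised when the debited account cannot cover the transfer."""


@dataclass
class Account:
    id: str
    balance: float


# Hypothetical in-memory store standing in for a real database.
_accounts: dict[str, Account] = {}
_ids = itertools.count(1)


def create_account(initial_balance: float) -> str:
    """Create an account and return its ID."""
    account = Account(id=f"ACC_{next(_ids)}", balance=float(initial_balance))
    _accounts[account.id] = account
    return account.id


def get_account(account_id: str) -> Account:
    return _accounts[account_id]


def transfer_funds(from_account: str, to_account: str, amount: float) -> dict:
    # Validate everything before mutating, so a failed transfer leaves balances untouched.
    if amount <= 0:
        raise ValueError("amount must be positive")
    if from_account not in _accounts or to_account not in _accounts:
        raise ValueError("unknown account")
    src, dst = _accounts[from_account], _accounts[to_account]
    if src.balance < amount:
        raise InsufficientFundsError(f"Insufficient funds in {from_account}")

    src.balance -= amount
    dst.balance += amount
    return {
        "status": "success",
        "from_balance": src.balance,
        "to_balance": dst.balance,
        "txn_id": f"TXN_{next(_ids)}",
    }
```

Because validation happens before any balance changes, the "balances unchanged" assertions in the error tests hold without explicit rollback logic. A real implementation backed by a database would need a transaction to give the same guarantee.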
Technique 2: Assertion Generation for Complex Objects and Structures
For functions returning complex objects (dicts, dataclasses, nested structures), AI can generate assertions that validate structure without brittleness.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Order:
    id: str
    customer: str
    items: list[dict]
    total: float
    created_at: datetime
    status: str

def create_order(customer: str, items: list[dict]) -> Order:
    """Create an order with items. Items must have product_id and quantity."""
    pass
Prompt the AI:
Generate assertions for create_order that validate the returned Order object without being brittle to timestamp changes or order of fields. Use best practices for structural validation.
AI output:
def test_create_order_structure_and_values():
    """Test create_order returns valid Order with correct structure."""
    items = [
        {"product_id": "prod_1", "quantity": 2},
        {"product_id": "prod_2", "quantity": 1},
    ]
    order = create_order("customer_123", items)

    # Type and structure validation (robust)
    assert isinstance(order, Order)
    assert hasattr(order, "id")
    assert hasattr(order, "customer")
    assert hasattr(order, "items")
    assert hasattr(order, "total")
    assert hasattr(order, "created_at")
    assert hasattr(order, "status")

    # Value validation (specific but not brittle)
    assert order.customer == "customer_123"
    assert len(order.items) == 2

    # Timestamp: assert it's recent, not an exact value
    assert isinstance(order.created_at, datetime)
    age = (datetime.now() - order.created_at).total_seconds()
    assert 0 <= age < 5  # Created within the last 5 seconds

    # Status: check it's one of the expected values
    assert order.status in ("pending", "processing", "confirmed")

    # Total: validate the calculation without hardcoding prices.
    # Assumes create_order attaches a "price" to each returned item (the input items
    # carry none). Order is a dataclass with no tax field, so order.get("tax") would
    # raise AttributeError; only a lower bound is asserted instead.
    assert order.total > 0
    subtotal = sum(item["price"] * item["quantity"] for item in order.items)
    assert order.total >= subtotal - 0.01  # Total may add tax on top of the subtotal

    # ID: assert format without exact match
    assert order.id.startswith("order_")
    assert len(order.id) > 10  # Reasonable length for unique ID
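An alternative to a fixed "within 5 seconds" window is to record the clock immediately before and after the call and assert that the timestamp falls between the two readings. The sketch below shows the idea with a trimmed-down Order and a stand-in create_order; the uuid-based ID format and the use of UTC timestamps are assumptions made so the example runs, not details of the real function.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Order:
    id: str
    customer: str
    created_at: datetime


def create_order(customer: str) -> Order:
    # Stand-in implementation (assumption): the real create_order is not shown in this guide.
    return Order(
        id=f"order_{uuid.uuid4().hex}",
        customer=customer,
        created_at=datetime.now(timezone.utc),
    )


def test_created_at_is_bracketed_by_the_call():
    before = datetime.now(timezone.utc)
    order = create_order("customer_123")
    after = datetime.now(timezone.utc)

    # No magic tolerance: the timestamp must fall inside the call window.
    assert before <= order.created_at <= after

    # ID: check format properties, not a specific value.
    prefix, _, suffix = order.id.partition("_")
    assert prefix == "order"
    assert len(suffix) == 32
    assert all(c in "0123456789abcdef" for c in suffix)
```

The bracket is as tight as the call itself, so it does not start failing on a slow CI machine the way a fixed tolerance can, and it still rejects a timestamp that is hours off because of a timezone bug.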
Technique 3: Parametrized Assertions for Multiple Test Cases
When testing multiple scenarios, use parametrized assertions to reduce duplication while keeping assertions clear.
import pytest

@pytest.mark.parametrize("amount,from_balance,to_balance,expected_status", [
    (100, 1000, 500, "success"),      # Normal case
    (1000, 1000, 500, "success"),     # All funds
    (0.01, 1000, 500, "success"),     # Minimum amount
    (1000.01, 1000, 500, "error"),    # Exceeds balance
    (0, 1000, 500, "error"),          # Zero amount
    (-100, 1000, 500, "error"),       # Negative amount
])
def test_transfer_funds_parametrized(amount, from_balance, to_balance, expected_status):
    """Parametrized test covers multiple scenarios with shared assertion logic."""
    from_id = create_account(initial_balance=from_balance)
    to_id = create_account(initial_balance=to_balance)

    if expected_status == "error":
        with pytest.raises((ValueError, InsufficientFundsError)):
            transfer_funds(from_id, to_id, amount)
    else:
        result = transfer_funds(from_id, to_id, amount)
        assert result["status"] == "success"
        # approx tolerates float rounding in cases such as the 0.01 transfer
        assert result["from_balance"] == pytest.approx(from_balance - amount)
        assert result["to_balance"] == pytest.approx(to_balance + amount)
Common Assertion Pitfalls and How to Fix Them
| Pitfall | Example | Fix |
|---|---|---|
| Vacuous assertion | assert result is not None | Assert specific values: assert result == expected |
| Brittle error message | assert error == "Invalid amount" | Use substring: assert "Invalid" in str(error) |
| Type-only assertion | assert isinstance(result, dict) | Also assert keys/values: assert "id" in result |
| Order-dependent | assert items[0] == "first" | Use set comparison when duplicates don't matter: assert set(items) == {"first", "second"}; otherwise compare sorted lists |
| Floating-point equality | assert result == 0.1 + 0.2 | Use tolerance: assert abs(result - 0.3) < 0.0001, or assert result == pytest.approx(0.3) |
| Timestamp exactness | assert created_at == "2026-06-02 10:30:45" | Use range: assert now - created_at < timedelta(seconds=5) |
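Two rows of the table are easy to verify directly. The sketch below uses only the standard library (math.isclose and collections.Counter) to show why exact float equality fails and what a set comparison hides; the sample values are illustrative.

```python
import math
from collections import Counter

result = 0.1 + 0.2

# Exact equality fails: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
exact_match = (result == 0.3)

# A tolerance-based comparison passes.
assert math.isclose(result, 0.3, rel_tol=1e-9)

# Order-independent comparison: a set ignores order but also hides duplicates.
items = ["second", "first", "first"]
assert set(items) == {"first", "second"}  # Passes even though "first" appears twice

# Counter and sorted() ignore order while still noticing the duplicate.
assert Counter(items) != Counter(["first", "second"])
assert sorted(items) == ["first", "first", "second"]
```

Choose the set form only when duplicates are irrelevant to the behavior under test. If a duplicated item would be a bug, Counter or a sorted comparison keeps the assertion strong without depending on order.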
Key Takeaways
- Strong assertions catch bugs; brittle assertions fail on harmless refactors.
- Validate return value structure, types, values, and side effects.
- Use substring matching for error messages; avoid exact text comparisons.
- For floating-point, timestamps, and IDs, assert properties not exact values.
- Parametrize assertions to reduce duplication and increase coverage.
- Use fixtures and helpers to share assertion logic across tests.
Frequently Asked Questions
How many assertions should a test have?
Typically 3 to 6 per test. More is fine if they all check a single concept. Avoid more than 10 assertions in one test; split it into several focused tests instead. Aim for one logical assertion per test, meaning one behavior verified, even if that takes several assert statements on related values.
Should I assert implementation details or just behavior?
Avoid asserting implementation details (e.g., which helper function is called). Assert behavior (what the user sees). This keeps tests resistant to refactoring.
How do I test side effects without brittle assertions?
Assert the side effect exists and has correct properties, not exact details. Example: instead of assert email_log[-1] == "exact string", use assert any("[email protected]" in e for e in email_log).
Should I use assertion helpers or write assertions inline?
Both are valid. Helpers (custom assertions) improve readability and reduce duplication. Examples: assert_order_valid(order), assert_balance_updated(account, expected). For simple checks, inline assertions are clearer.
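As a sketch of what such a helper might look like, the example below implements an assert_order_valid for a simplified Order. The specific checks, the customer keyword argument, and the sample order are assumptions chosen for illustration; a real helper would encode whatever invariants the project cares about.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Order:
    id: str
    customer: str
    items: list
    total: float
    status: str
    created_at: datetime = field(default_factory=datetime.now)


def assert_order_valid(order, *, customer):
    """Custom assertion: fail with a descriptive message on the first violated property."""
    assert isinstance(order, Order), f"expected Order, got {type(order).__name__}"
    assert order.customer == customer, f"customer is {order.customer!r}, expected {customer!r}"
    assert order.id.startswith("order_"), f"unexpected id format: {order.id!r}"
    assert order.items, "order has no items"
    assert order.total > 0, f"total must be positive, got {order.total}"


def test_order_for_known_customer():
    order = Order(
        id="order_0001",
        customer="customer_123",
        items=[{"product_id": "prod_1", "quantity": 2}],
        total=19.98,
        status="pending",
    )
    # One readable line replaces five inline assertions in every test that builds an order.
    assert_order_valid(order, customer="customer_123")
```

The explicit failure messages matter here: pytest rewrites assert statements in test modules to show detailed diffs, but helpers living in a separate module only get that treatment if the module is registered with pytest.register_assert_rewrite. Writing the message by hand keeps failures readable either way.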
How do I handle assertions in async tests?
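Same principles apply. Await the call before asserting: result = await async_function(), then assert on result. pytest.raises is a regular (synchronous) context manager, so use a plain with block inside the async test: with pytest.raises(TimeoutError): await slow_function(). Running async test functions under pytest also requires an async plugin such as pytest-asyncio.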
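A forgotten await is also a common source of vacuous assertions in async tests: without it, the test asserts on a coroutine object rather than the result. The sketch below demonstrates this with the standard asyncio module only; fetch_total and demo_missing_await are hypothetical names.

```python
import asyncio


async def fetch_total():
    await asyncio.sleep(0)  # Stand-in for real async work
    return 42


async def demo_missing_await():
    # Missing await: this is a coroutine object, not the result.
    pending = fetch_total()
    weak_passed = pending is not None  # True, although fetch_total has not run yet
    is_coroutine = asyncio.iscoroutine(pending)
    pending.close()  # Avoid a "coroutine was never awaited" warning

    # Awaited: now the assertion checks the real value.
    result = await fetch_total()
    assert result == 42
    return weak_passed, is_coroutine


weak_passed, is_coroutine = asyncio.run(demo_missing_await())
```

Here the weak check (is not None) passes on the un-awaited coroutine, while a strong check such as == 42 would fail against it immediately. Specific assertions therefore double as a guard against this mistake.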