Managing Ambiguity: Handling Edge Cases in Specs

Specs are written with an implicit happy path in mind: the user provides valid input, the system processes it, and returns valid output. But real-world inputs are messy. Users send empty strings, null values, huge arrays, special characters, and concurrent requests. Edge cases are the boundary conditions where a spec is ambiguous or incomplete. When a spec doesn't explicitly handle an edge case, AI-generated code makes a guess, and the guess is often wrong. This article teaches how to identify, document, and test edge cases so that your specs leave less to guesswork and your generated code handles unusual inputs predictably.

Categories of Edge Cases

Edge cases fall into predictable categories:

Category 1: Boundary Value Cases

Values at the limits of acceptable ranges. A spec says "password: minLength 8," but never says what happens at exactly 8, or at 7, or at 1000.

Password field:
- minLength: 8
- maxLength: 256

Edge Cases:
- 7 characters: Should reject (below minimum)
- 8 characters: Should accept (at minimum)
- 256 characters: Should accept (at maximum)
- 257 characters: Should reject (above maximum)
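
The four boundary cases above can be derived mechanically from the two limits. The following is a minimal sketch, assuming inclusive minLength/maxLength semantics; the function names are illustrative and not part of any spec.

```python
def boundary_lengths(min_length: int, max_length: int) -> dict[int, bool]:
    """Map each boundary length to whether it should be accepted."""
    return {
        min_length - 1: False,  # just below minimum
        min_length: True,       # at minimum
        max_length: True,       # at maximum
        max_length + 1: False,  # just above maximum
    }


def is_valid_password_length(password: str, min_length: int = 8, max_length: int = 256) -> bool:
    """Inclusive length check, matching minLength/maxLength semantics."""
    return min_length <= len(password) <= max_length


# Check the validator against every derived boundary case.
for length, expected in boundary_lengths(8, 256).items():
    assert is_valid_password_length("x" * length) is expected
```

Deriving the cases from the limits means that a later change to minLength or maxLength updates the test vectors automatically.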

Category 2: Null, Empty, and Missing Values

Most specs assume required fields are present. But what if they're null, empty string, or missing?

User schema:
  properties:
    name: { type: "string", minLength: 1 }
    email: { type: "string", format: "email" }
  required: [name, email]

Edge Cases:
- name is null: Should reject (type is string, so null is not allowed)
- name is "": Should reject (minLength: 1)
- name is missing from JSON: Should reject (required)
- email is "": Should reject (not a valid email format)
- email is null: Should reject (type is string, so null is not allowed)
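
Missing, null, and empty are three distinct states in a parsed JSON payload, and a handler has to tell them apart to return the right error. A minimal sketch follows, assuming the payload has already been parsed into a dict. The function name check_name and the code "too_short" are hypothetical.

```python
import json

_MISSING = object()  # sentinel: distinguishes "key absent" from "key present with null"


def check_name(payload: dict) -> str:
    """Classify the name field; the returned codes are illustrative."""
    value = payload.get("name", _MISSING)
    if value is _MISSING:
        return "missing_required_field"  # key not present at all
    if not isinstance(value, str):
        return "invalid_type"            # covers null (None) and non-strings
    if len(value) < 1:
        return "too_short"               # empty string violates minLength: 1
    return "ok"


print(check_name(json.loads('{}')))
print(check_name(json.loads('{"name": null}')))
print(check_name(json.loads('{"name": ""}')))
```

A plain payload.get("name") would return None for both the missing and the null case, which is why the sketch uses a sentinel default.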

Category 3: Type Mismatches

A spec says "age: integer," but what if the client sends "age: '25'" (string)?

age: { type: "integer", minimum: 0, maximum: 150 }

Edge Cases:
- age: 25 (integer): Should accept
- age: "25" (string): Should reject (type error) OR coerce to integer?
- age: 25.5 (float): Should reject (not an integer) OR truncate?
- age: -1 (negative): Should reject (minimum: 0)
- age: 151 (too old): Should reject (maximum: 150)
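
The "reject OR coerce?" question has to be settled in the spec. As one possible resolution, here is a sketch that assumes the spec chooses strict rejection with no coercion or truncation. The function name validate_age is hypothetical.

```python
def validate_age(value: object) -> int:
    """Strict check: accept only real integers in the range 0..150."""
    # bool is a subclass of int in Python, so True/False must be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"age must be an integer, got {type(value).__name__}")
    if not 0 <= value <= 150:
        raise ValueError("age must be between 0 and 150")
    return value
```

If the spec instead chose coercion, the string "25" would be converted before the range check. Either choice is defensible, but the spec must name one.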

Category 4: Special Characters and Encoding

Strings with Unicode, emoji, SQL injection attempts, or unusual characters.

username: { type: "string", pattern: "^[a-zA-Z0-9_]{1,50}$" }

Edge Cases:
- username: "user_123": Should accept
- username: "user@domain": Should reject (@ not in pattern)
- username: "user 123": Should reject (space not allowed)
- username: "用户": Should reject (Chinese characters not in pattern)
- username: "'; DROP TABLE users;--": Should reject (SQL injection attempt)
- username: "a": Should accept (minimum length is 1)
- username: "a" * 51: Should reject (over 50 char limit)
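
These cases can be checked directly against the pattern. The sketch below uses Python's re module and the hypothetical name is_valid_username. One detail worth knowing: in Python, "$" also matches just before a trailing newline, so the sketch uses fullmatch instead of anchors.

```python
import re

USERNAME_RE = re.compile(r"[a-zA-Z0-9_]{1,50}")


def is_valid_username(username: str) -> bool:
    # fullmatch requires the whole string to match. With re.match and "$",
    # a value such as "user_123\n" would still be accepted.
    return USERNAME_RE.fullmatch(username) is not None


cases = ["user_123", "user@domain", "user 123", "用户",
         "'; DROP TABLE users;--", "a", "a" * 51]
for candidate in cases:
    print(repr(candidate[:25]), is_valid_username(candidate))
```

Pattern validation rejects the injection string here, but it does not replace parameterized queries at the database layer.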

Category 5: Timing and Concurrency

Simultaneous requests might create race conditions. A spec says "create unique email," but what if two requests arrive simultaneously with the same email?

POST /users with email: "[email protected]" (first request)
POST /users with email: "[email protected]" (second request, arrives 1ms later)

Edge Cases:
- Request 1 checks for the email, finds none, and starts its insert
- Request 2 checks for the email before Request 1 commits, so it also finds none
- Request 1 commits; Request 2 then tries to insert the same email (conflict!)
- How should Request 2 respond?
  - Option A: 409 Conflict (correct per spec)
  - Option B: Depends on database isolation level and timing

The spec should state: "Use a database unique constraint to prevent race conditions, and return 409 Conflict when the constraint is violated."
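
A minimal sketch of that approach, using Python's built-in sqlite3 module with an in-memory database. The table layout, the create_user function, and the example address are assumptions for illustration. The point is that the database, not an application-level "check then insert", decides who wins.

```python
import sqlite3


def create_user(conn: sqlite3.Connection, email: str) -> int:
    """Return an HTTP-style status: 201 on insert, 409 on duplicate email."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return 201
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected the row: map it to 409 Conflict.
        return 409


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

first = create_user(conn, "a@example.com")
second = create_user(conn, "a@example.com")
print(first, second)
```

Because the insert itself is the check, there is no window between reading and writing in which a second request can slip through.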

Category 6: Resource Exhaustion

What happens when the system runs out of resources (memory, disk, connections)?

GET /files/{fileId}

Edge Cases:
- File is 10 GB: Should stream, not load into memory
- File doesn't exist: Should return 404
- Disk is full: Should return 507 Insufficient Storage
- Database connection pool exhausted: Should return 503 Service Unavailable (or queue request)
- Request timeout (client disconnects): Should clean up resources
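
The first and last cases (stream large files, clean up on disconnect) can be sketched with a generator. This is an illustration under assumptions: stream_file is a hypothetical helper, and a real framework would wrap it in its own streaming response type.

```python
import os
import tempfile
from typing import Iterator


def stream_file(path: str, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the file in fixed-size chunks instead of loading it into memory."""
    # The with-block also closes the file if the consumer stops early
    # (for example, when the generator is closed after a client disconnects).
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk


# Demo with a small temporary file standing in for a large one.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200_000)
chunks = list(stream_file(tmp.name))
os.remove(tmp.name)
print(len(chunks), sum(len(c) for c in chunks))
```

Peak memory use is bounded by chunk_size rather than by file size, which is the property the 10 GB case needs.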

Documenting Edge Cases in Specs

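Add an edgeCases section to every spec. (This is a custom field; in a strict OpenAPI document, extension fields need an x- prefix, for example x-edgeCases.)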

POST /users:
  summary: Create a new user account

  requestBody:
    required: true
    content:
      application/json:
        schema:
          type: object
          properties:
            email:
              type: string
              format: email
            password:
              type: string
              minLength: 8
              maxLength: 256
          required: [email, password]

  responses:
    '201':
      description: User created successfully
    '400':
      description: Invalid input
    '409':
      description: Email already exists

  edgeCases:
    - id: "EC-001"
      category: "Boundary Value"
      description: "Password exactly 8 characters"
      input: { email: "[email protected]", password: "Pass1234" }
      expected_response: 201
      rationale: "Minimum length is 8; 8 should be accepted"

    - id: "EC-002"
      category: "Boundary Value"
      description: "Password 7 characters (below minimum)"
      input: { email: "[email protected]", password: "Pass123" }
      expected_response: 400
      error_code: "password_too_short"
      rationale: "Must reject below minimum length"

    - id: "EC-003"
      category: "Missing Required Field"
      description: "Email field missing"
      input: { password: "Pass1234" }
      expected_response: 400
      error_code: "missing_required_field"
      rationale: "Email is required per spec"

    - id: "EC-004"
      category: "Type Mismatch"
      description: "Email is integer instead of string"
      input: { email: 12345, password: "Pass1234" }
      expected_response: 400
      error_code: "invalid_type"
      rationale: "Email must be string format"

    - id: "EC-005"
      category: "Concurrency"
      description: "Two simultaneous requests with same email"
      requests:
        - { email: "[email protected]", password: "Pass1234" }
        - { email: "[email protected]", password: "Pass5678" }
      expected_responses: ["201", "409"]
      rationale: "First wins (201); second sees conflict (409)"

    - id: "EC-006"
      category: "Special Characters"
      description: "Email with plus sign (valid in RFC 5322)"
      input: { email: "[email protected]", password: "Pass1234" }
      expected_response: 201
      rationale: "Plus sign is valid in email local part"

These edge cases become test vectors. AI uses them to understand what code to generate.
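
To illustrate what "test vectors" means in practice, the sketch below runs a few edge cases as data against a handler and reports which IDs fail. Everything here is hypothetical: handle_create_user is a stand-in for generated code, run_edge_cases is an illustrative helper, and the email addresses are placeholders.

```python
EDGE_CASES = [
    {"id": "EC-001", "input": {"email": "user@example.com", "password": "Pass1234"}, "expected_response": 201},
    {"id": "EC-002", "input": {"email": "user@example.com", "password": "Pass123"}, "expected_response": 400},
    {"id": "EC-003", "input": {"password": "Pass1234"}, "expected_response": 400},
]


def handle_create_user(payload: dict) -> int:
    """Hypothetical stand-in for the generated handler; returns a status code."""
    password = payload.get("password")
    if "email" not in payload or not isinstance(password, str):
        return 400
    if not 8 <= len(password) <= 256:
        return 400
    return 201


def run_edge_cases(handler, cases) -> list[str]:
    """Return the IDs of edge cases whose actual status differs from the spec."""
    return [c["id"] for c in cases if handler(c["input"]) != c["expected_response"]]


failures = run_edge_cases(handle_create_user, EDGE_CASES)
print(failures or "all edge cases pass")
```

Keeping the cases as data means the same list can drive both the prompt given to the AI and the verification of what it produces.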

Test Cases Derived from Edge Cases

For each edge case, create a test:

import asyncio

import pytest
# create_user_async is assumed to be exported alongside create_user
from api import create_user, create_user_async


class TestCreateUserEdgeCases:
    """Test edge cases documented in spec"""

    # EC-001: Password exactly 8 characters
    def test_password_minimum_length_accepted(self):
        response = create_user(
            email="[email protected]",
            password="Pass1234",  # exactly 8 chars
        )
        assert response.status_code == 201

    # EC-002: Password 7 characters (below minimum)
    def test_password_below_minimum_rejected(self):
        response = create_user(
            email="[email protected]",
            password="Pass123",  # 7 chars, below 8
        )
        assert response.status_code == 400
        assert response.json()["error_code"] == "password_too_short"

    # EC-003: Email field missing
    def test_missing_email_rejected(self):
        response = create_user(
            email=None,  # stands in for "missing"; assumes create_user omits None fields
            password="Pass1234",
        )
        assert response.status_code == 400
        assert response.json()["error_code"] == "missing_required_field"

    # EC-004: Email is integer (wrong type)
    def test_email_wrong_type_rejected(self):
        response = create_user(
            email=12345,  # integer instead of string
            password="Pass1234",
        )
        assert response.status_code == 400
        assert response.json()["error_code"] == "invalid_type"

    # EC-005: Concurrent duplicate email (requires the pytest-asyncio plugin)
    @pytest.mark.asyncio
    async def test_concurrent_duplicate_email(self):
        email = "[email protected]"

        # Send two requests concurrently
        results = await asyncio.gather(
            create_user_async(email=email, password="Pass1234"),
            create_user_async(email=email, password="Pass5678"),
        )

        # One should succeed (201), one should conflict (409)
        status_codes = sorted(r.status_code for r in results)
        assert status_codes == [201, 409], "Expected one success, one conflict"

    # EC-006: Email with plus sign (valid in RFC 5322)
    def test_email_with_plus_sign_accepted(self):
        response = create_user(
            email="[email protected]",
            password="Pass1234",
        )
        assert response.status_code == 201

These tests directly verify the edge cases documented in the spec.

Common Ambiguities and How to Resolve Them

Ambiguity 1: What does "optional" mean?

Does an optional field mean:

  • Can be omitted entirely?
  • Can be null?
  • Can be empty string?
  • All of the above?

Resolution: Be explicit:

# Before (ambiguous)
profile:
  type: object
  properties:
    bio: { type: "string" } # Is this optional?

# After (clear)
profile:
  type: object
  properties:
    bio: { type: ["string", "null"], minLength: 0, maxLength: 500 }
  required: [] # bio is not required

  clarification: |
    - bio can be omitted from request (not in required list)
    - bio can be null (null is treated same as omitted)
    - bio can be empty string "" (length 0 is allowed)
    - Response always includes bio (even if null/empty)
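
The four clarification rules translate into a small normalization step. This is a sketch under the stated rules only; normalize_bio and to_response are hypothetical names.

```python
from typing import Optional


def normalize_bio(payload: dict) -> Optional[str]:
    """Apply the clarified rules for the optional bio field."""
    bio = payload.get("bio")  # omitted and null both yield None
    if bio is None:
        return None
    if not isinstance(bio, str):
        raise TypeError("bio must be a string or null")
    if len(bio) > 500:
        raise ValueError("bio must be at most 500 characters")
    return bio  # the empty string "" is allowed and preserved


def to_response(payload: dict) -> dict:
    # The bio key is always present in the response, even when null or empty.
    return {"bio": normalize_bio(payload)}


print(to_response({}), to_response({"bio": None}), to_response({"bio": ""}))
```

Note that the empty string and null stay distinct in the response. If the product treats them as equivalent, the spec should say so.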

Ambiguity 2: What does "should" vs "must" mean?

Does "The API should return cached results" mean:

  • Always cache?
  • Optionally cache (cache if possible, but fine if not)?
  • Cache by default?

Resolution: Use formal requirement keywords (MUST, SHOULD, MAY, as defined in RFC 2119):

# Before (vague)
caching:
  description: "Results should be cached for performance"

# After (precise)
caching:
  policy: "MUST cache GET /products for 5 minutes"
  control: "Cache-Control: max-age=300"
  override: "Client can bypass cache with ?cache=false"

  edge_case_1: "If cache is stale (>5 min), fetch fresh data from database"
  edge_case_2: "If database is unavailable, return stale cache with stale=true header"
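
The precise version is specific enough to implement directly. The following is a rough sketch of the two edge cases, not a production cache: ProductCache is a hypothetical class, the fetch callable stands in for the database query, and ConnectionError is assumed to signal that the database is unavailable.

```python
import time
from typing import Callable, Optional, Tuple

TTL_SECONDS = 300  # matches Cache-Control: max-age=300


class ProductCache:
    def __init__(self, fetch: Callable[[], list], clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._clock = clock  # injectable so tests can control time
        self._value: Optional[list] = None
        self._stored_at: float = 0.0

    def get(self) -> Tuple[list, bool]:
        """Return (products, is_stale)."""
        now = self._clock()
        if self._value is not None and now - self._stored_at <= TTL_SECONDS:
            return self._value, False  # fresh cache hit
        try:
            # edge_case_1: cache is empty or older than 5 minutes, so refetch
            self._value = self._fetch()
            self._stored_at = now
            return self._value, False
        except ConnectionError:
            if self._value is None:
                raise  # nothing cached to fall back on
            # edge_case_2: database unavailable, serve stale data and flag it
            return self._value, True
```

The boolean in the return value is what a handler would translate into the stale=true header. What to return when the database is down and nothing is cached is a further edge case the spec would still need to cover.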

Ambiguity 3: What defines "valid" vs "invalid"?

Is "password123" a valid password? It's 11 characters (meets minimum), but has no special characters.

Resolution: Define validation rules explicitly with tests:

password:
  validation:
    rules:
      - minLength: 8
      - maxLength: 256
      - allowedChars: "a-z, A-Z, 0-9, !@#$%^&*()"
      - requires: "at least one uppercase letter and one digit"

    examples:
      valid:
        - "Pass1234": "minimum length, mixed case and digits"
        - "MyP@ssw0rd!": "includes special character"
      invalid:
        - "Pass123": "only 7 chars, below minimum"
        - "password": "no uppercase or digits"
        - "Pass 1234": "space not in allowedChars"
        - "pass\nword": "newline not allowed"
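
The rules and examples can be turned into an executable check. One assumption is made loudly here: the invalid example "password" ("no uppercase or digits") implies a requirement of at least one uppercase letter and one digit, so the sketch enforces that. The name is_valid_password is hypothetical.

```python
import re

# Length limits and allowed characters from the rules above.
ALLOWED = re.compile(r"[a-zA-Z0-9!@#$%^&*()]{8,256}")


def is_valid_password(password: str) -> bool:
    if ALLOWED.fullmatch(password) is None:
        return False  # wrong length or a character outside allowedChars
    # Assumption inferred from the examples: one uppercase letter and one digit.
    has_upper = any(c.isupper() for c in password)
    has_digit = any(c.isdigit() for c in password)
    return has_upper and has_digit


for example in ["Pass1234", "MyP@ssw0rd!", "Pass123", "password", "Pass 1234", "pass\nword"]:
    print(repr(example), is_valid_password(example))
```

Running the spec's own examples through the validator is a quick way to catch a rule list and an example list that disagree.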

Building an Ambiguity Checklist

Before finalizing a spec, run through this checklist:

AMBIGUITY RESOLUTION CHECKLIST
===============================

For each field/endpoint:

[ ] Required vs Optional
    [ ] Is this field required? (in "required" list?)
    [ ] If optional, can it be null?
    [ ] If optional, can it be empty string/zero/empty array?
    [ ] If optional, what is the default value?

[ ] Valid Values
    [ ] What are all valid values? (enum examples)
    [ ] What are all invalid values? (counter-examples)
    [ ] Are boundary values tested? (min, max, just-below, just-above)
    [ ] Are special cases tested? (null, empty, zero, negative)

[ ] Error Handling
    [ ] What HTTP status code for each error type?
    [ ] What error message/code is returned?
    [ ] Can the error message reveal sensitive info?

[ ] Concurrency & Timing
    [ ] Are race conditions possible? (concurrent writes, deletes)
    [ ] What happens if two requests conflict?
    [ ] What is the isolation level?
    [ ] Is timeout specified? (how long before giving up)

[ ] Resource Constraints
    [ ] What is the maximum request size? (memory limit)
    [ ] What is the maximum response size? (streaming needed?)
    [ ] What happens when limits are exceeded?
    [ ] Rate limiting: how many requests per time window?

[ ] Security
    [ ] What validation prevents injection attacks?
    [ ] What sanitization is applied?
    [ ] Are there length limits (overflow prevention)?
    [ ] Are secrets (passwords, tokens) logged or exposed in errors?

Work through this checklist for every spec. Ambiguities will surface.
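
Some checklist items can be partly automated. The sketch below covers only three of them (length limits, numeric bounds, defaults for optional fields) and assumes a JSON-Schema-style dict. The function find_ambiguities and its messages are hypothetical, and it does not replace a human pass through the full list.

```python
def find_ambiguities(schema: dict) -> list[str]:
    """Flag a few mechanical gaps in an object schema's properties."""
    findings = []
    required = set(schema.get("required", []))
    for name, prop in schema.get("properties", {}).items():
        if prop.get("type") == "string" and "maxLength" not in prop:
            findings.append(f"{name}: string has no maxLength")
        if prop.get("type") in ("integer", "number") and not {"minimum", "maximum"} <= set(prop):
            findings.append(f"{name}: numeric range is unbounded")
        if name not in required and "default" not in prop:
            findings.append(f"{name}: optional but no default stated")
    return findings


schema = {
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "age": {"type": "integer", "minimum": 0, "maximum": 150},
        "bio": {"type": "string", "maxLength": 500},
    },
    "required": ["name"],
}
for finding in find_ambiguities(schema):
    print(finding)
```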

Comparison Table: Edge Case Handling

Approach                               | Coverage   | AI Clarity       | Test Count          | Effort
No explicit edge cases                 | Low (gaps) | Low (AI guesses) | Few (missing tests) | Low (upfront)
Edge cases documented in comments      | Medium     | Medium           | Medium              | Medium
Edge cases in formal edgeCases section | High       | High             | High                | High

Key Takeaways

  • Edge cases fall into predictable categories: boundary values, null/empty, type mismatches, special characters, timing, and resource limits.
  • Documenting edge cases in the spec (with IDs and test vectors) gives AI code generation concrete expected behavior to implement instead of gaps to guess at.
  • Each edge case should have a corresponding test case, ensuring comprehensive coverage.
  • Ambiguities in specs (optional vs required, "should" vs "must") must be resolved explicitly before code generation.
  • An ambiguity checklist catches gaps before development begins.

Frequently Asked Questions

How many edge cases should I document?

Enough to cover all ambiguities. A typical API endpoint might have 10–30 edge cases. More complex features might have 50+. Quality over quantity: focus on cases that could cause bugs.

What if I discover new edge cases after code is generated?

Add them to the spec, create tests, and regenerate code if needed. This is normal and expected. Specs evolve as understanding improves.

Can AI identify edge cases automatically?

Partially. AI can suggest boundary values, null cases, and timing issues. But AI cannot predict domain-specific edge cases (business logic, user expectations). Humans must review and add those.

Should edge cases be part of the API contract?

Yes, especially the error responses. If your API returns 409 for duplicate emails, that's part of the contract. Document it so clients can handle it.
