
Writing Code Specifications: Prompt Guide

Writing an effective code specification prompt is the foundation of reliable code generation. Rather than hoping the AI model guesses your intent, a well-structured specification reduces ambiguity and steers the model toward the code you actually need. This article teaches the three-part prompt formula: context, requirements, and examples.

The Three-Part Specification Formula

Part 1: Context and Language. State the programming language, version (if relevant), and the purpose of the code. This grounds the AI model and prevents language confusion.

Language: Python 3.11
Purpose: Parse command-line arguments for a data processing tool

Part 2: Requirements. List the specific behavior you need, numbered for clarity. Each requirement should be testable and unambiguous.

Requirements:
1. Accept input via argparse with --input, --output, --format (csv|json) flags
2. Validate file paths (input must exist, output directory must be writable)
3. If --format is invalid, raise argparse.ArgumentTypeError with a message from the type callback, so argparse prints the error and exits
4. Return a dictionary with parsed arguments (use vars() conversion)
5. Support default values: --format defaults to 'csv'

Part 3: Examples and Edge Cases. Provide sample function calls and expected outputs. Edge cases are critical: they show the model how to handle boundary conditions.

Example 1: Normal usage
args = parse_args(['--input', 'data.txt', '--format', 'json'])
# Expected: {'input': 'data.txt', 'output': None, 'format': 'json'}

Example 2: Invalid file path
parse_args(['--input', '/nonexistent/path'])
# Expected: SystemExit with error message "input file does not exist"

Example 3: Invalid format
parse_args(['--format', 'xml'])
# Expected: SystemExit with error message about format choices

Structuring Requirements for Clarity

Vague requirements lead to ambiguous code. Here's how to sharpen them:

| Vague Requirement | Specific Requirement |
|---|---|
| "Handle errors gracefully" | "Catch FileNotFoundError and return None; catch ValueError and log the error message and re-raise" |
| "Make it fast" | "Process 10,000 rows in under 2 seconds on a standard laptop (2.4 GHz CPU, 16 GB RAM)" |
| "Support multiple formats" | "Accept input formats: JSON (schema provided), CSV (with header row), and YAML (PyYAML 6.0+)" |
| "Clean code" | "Follow PEP 8; include docstring for every function; keep functions under 30 lines" |

Notice that specific requirements include constraints (error types, performance numbers, library versions) that you can verify the generated code against.

Template: The Five-Section Specification Prompt

Here is a reusable template for single-function code generation:

**Language & Context**
Language: [Python / JavaScript / Rust / etc.]
Version: [3.11 / 18.0 / 1.75 / etc.]
Purpose: [1-sentence description of what this code does]

**Function Signature**
function_name(param1: Type, param2: Type) -> ReturnType

**Requirements**
1. [Requirement 1]
2. [Requirement 2]
3. [Requirement 3]
...

**Examples & Edge Cases**
Example 1: [inputs] → [expected output]
Example 2: [inputs] → [expected output]
Edge case: [unusual input] → [expected behavior]

**Constraints & Notes**
- No external dependencies (only stdlib)
- Must be thread-safe
- Code style: [Google / Black / Prettier / etc.]
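If you reuse the template often, you can fill it in from structured data instead of editing text by hand. The sketch below is one way to do that. The FunctionSpec class and its field names are hypothetical, and the version is folded into the language field for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionSpec:
    """Holds the five sections of the template (hypothetical helper class)."""

    language: str
    purpose: str
    signature: str
    requirements: list[str]
    examples: list[str]
    constraints: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the five-section prompt as plain text."""
        numbered = [f"{i}. {req}" for i, req in enumerate(self.requirements, start=1)]
        bullets = [f"- {item}" for item in self.constraints]
        sections = [
            ("Language & Context", [f"Language: {self.language}", f"Purpose: {self.purpose}"]),
            ("Function Signature", [self.signature]),
            ("Requirements", numbered),
            ("Examples & Edge Cases", self.examples),
            ("Constraints & Notes", bullets),
        ]
        # One blank line between sections, a bold title on top of each.
        return "\n\n".join("\n".join([f"**{title}**", *lines]) for title, lines in sections)


spec = FunctionSpec(
    language="Python 3.11",
    purpose="Convert a temperature from Celsius to Fahrenheit with validation",
    signature="def celsius_to_fahrenheit(celsius: float | None) -> float:",
    requirements=["Accept a float or None", "If input is None, return 0.0"],
    examples=["Example 1: 0°C → 32.0°F"],
    constraints=["Use only standard library"],
)
print(spec.render())
```

Keeping requirements as a list makes the numbering automatic, so adding or reordering a requirement never leaves stale numbers in the prompt.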

Let's apply this to a real problem: a function that converts Celsius to Fahrenheit with validation.

**Language & Context**
Language: Python 3.11
Purpose: Convert a temperature from Celsius to Fahrenheit with validation

**Function Signature**
def celsius_to_fahrenheit(celsius: float | None) -> float:

**Requirements**
1. Accept a float or None
2. If input is None, return 0.0
3. If input is below -273.15 (absolute zero), raise ValueError with message "Temperature below absolute zero"
4. Apply formula: (celsius * 9/5) + 32
5. Return result rounded to 2 decimal places
6. Include docstring with example usage

**Examples & Edge Cases**
Example 1: 0°C → 32.0°F
Example 2: 100°C → 212.0°F
Example 3: -273.15°C → -459.67°F (absolute zero boundary)
Example 4: -274°C → ValueError ("Temperature below absolute zero")
Example 5: None input → 0.0

**Constraints & Notes**
- Use only standard library
- Type hints required

Avoiding Common Prompt Pitfalls

Pitfall 1: Implicit assumptions. Never assume the AI model knows your domain context. If your code must integrate with a specific API, database, or library, mention it explicitly and supply a sample response or schema.

Bad prompt: "Write a function that fetches user data."

Good prompt: "Write a function fetch_user(user_id: int) that calls the /api/users/{id} REST endpoint (using the requests library), parses the JSON response, and returns a User dataclass with fields id, name, email, created_at. Handle HTTP errors (404, 500) by logging and returning None."

Pitfall 2: Ambiguous behavior. Describe what happens in error cases, not just the happy path.

Bad prompt: "Write a CSV parser."

Good prompt: "Write a function parse_csv(filepath: str) that reads a CSV with headers, returns a list of dicts, and raises FileNotFoundError if the file doesn't exist. If a required column is missing, raise ValueError with the column name."
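The good prompt is concrete enough to sketch directly. The version below is one plausible result. The required_columns parameter is an assumption, because the prompt does not say how the caller names the required columns. That gap is itself worth closing in the specification.

```python
import csv


def parse_csv(filepath: str, required_columns: tuple[str, ...] = ()) -> list[dict[str, str]]:
    """Read a CSV file with a header row into a list of dicts."""
    # open() raises FileNotFoundError on its own when the file is missing.
    with open(filepath, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        for column in required_columns:
            if column not in header:
                # The message names the missing column, as the prompt asks.
                raise ValueError(f"missing required column: {column}")
        return list(reader)
```

For a file containing `id,name` followed by `1,Ada`, calling `parse_csv(path, required_columns=("id",))` returns one dict per data row with string values.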

Pitfall 3: Skipping examples. Without examples, the AI model might produce code that matches the wording of your description but fails your actual use case.

Always include at least two examples (happy path and error case) and one boundary condition.

Language-Specific Prompt Adjustments

Different languages have different idioms. Adjust your prompt to match:

Python: Emphasize style (PEP 8 vs. Black), type hints, and docstring format (Google vs. NumPy).

# Include in prompt:
- Follow PEP 8 style
- Use type hints for all parameters and return values
- Docstring format: Google-style with Args, Returns, Raises sections
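To show what those three lines ask for, here is a small function written to that standard. The function clamp is a hypothetical example chosen for brevity. The docstring layout follows the Google style with Args, Returns, and Raises sections.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Limit a value to the inclusive range [low, high].

    Args:
        value: The number to clamp.
        low: Lower bound of the range.
        high: Upper bound of the range.

    Returns:
        value if it lies within the range, otherwise the nearest bound.

    Raises:
        ValueError: If low is greater than high.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))
```

Pasting a short sample like this into the prompt is often clearer than describing the docstring format in words.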

JavaScript/TypeScript: Emphasize async/await, error handling (try/catch vs. .catch() chains), and module format (ESM vs. CommonJS).

// Include in prompt:
- Use async/await (not .then()/.catch() chains)
- Throw custom Error classes for domain-specific errors
- Export as ESM (import/export syntax)
- Assume Node.js 18+ (no polyfills needed)

Rust: Emphasize ownership, error handling (Result vs. panic), and library versions.

// Include in prompt:
- Use Result<T, E> for fallible operations (never unwrap in library code)
- Return custom error types (not String errors)
- Dependencies: tokio 1.35+, serde 1.0+
- Target: stable Rust (no nightly features)

Multi-Function Specifications

For larger specifications (a module with multiple functions), organize by structure:

**Module Overview**
This module provides utilities for parsing and validating user configuration files.

**Exports**
- parse_config(path: str) -> Config
- validate_config(config: Config) -> list[str] (returns list of validation errors)

**Data Structures**
class Config:
    name: str
    port: int (1-65535)
    debug: bool
    plugins: list[str]

**Function 1: parse_config**
Requirements:
1. Read YAML file from path
2. Deserialize to Config dataclass
3. If file missing, raise FileNotFoundError
4. If YAML invalid, raise ValueError with line number

**Function 2: validate_config**
Requirements:
1. Check port in range 1-65535; if not, add error "port out of range"
2. Check plugins is non-empty; if empty, add error "no plugins configured"
3. Return list of error strings (empty list if valid)
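As a rough check that the module specification is complete, the sketch below implements the data structure and validate_config from it. parse_config is left out because it depends on a YAML library outside the standard library. Declaring Config as a dataclass follows requirement 2 of parse_config.

```python
from dataclasses import dataclass


@dataclass
class Config:
    name: str
    port: int  # valid range: 1-65535
    debug: bool
    plugins: list[str]


def validate_config(config: Config) -> list[str]:
    """Return a list of validation errors (empty if the config is valid)."""
    errors: list[str] = []
    if not 1 <= config.port <= 65535:
        errors.append("port out of range")
    if not config.plugins:
        errors.append("no plugins configured")
    return errors


print(validate_config(Config(name="demo", port=0, debug=False, plugins=[])))
```

Because the specification fixes the exact error strings, the output can be compared literally in a test rather than matched loosely.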

Key Takeaways

  • Use the three-part formula: Context, Requirements, Examples.
  • Be specific: include error types, performance targets, library versions, and style guides.
  • Structure requirements as a numbered list with testable, unambiguous conditions.
  • Always include edge cases and error handling examples.
  • Adjust prompts for language idioms (Python docstrings, Rust error handling, etc.).

Frequently Asked Questions

How long should a specification prompt be?

For a single function: 100–300 words (10–15 requirement lines). For a module: 400–800 words. Longer is fine if it adds clarity; shorter works if you're very specific. The goal is clarity, not length. A tight 150-word spec beats a vague 500-word ramble.

Should I include comments in the specification prompt itself?

Yes, briefly. Use comments to clarify tricky requirements ("must be idempotent: calling twice with the same input yields the same result"). Keep these prompt comments distinct from the code comments you expect in the output; request those as a separate requirement.

What if I don't know the function signature in advance?

That's fine. Instead of a signature, describe the intended inputs and outputs: "This function should accept a list of integers and return a dict mapping each unique integer to its count." The AI model will infer a reasonable signature and show it to you; you can refine from there.
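For the description quoted above, a model would likely propose something close to the following sketch. The name count_occurrences is an assumption, since the description gives none.

```python
from collections import Counter


def count_occurrences(numbers: list[int]) -> dict[int, int]:
    """Map each unique integer to the number of times it appears."""
    return dict(Counter(numbers))


print(count_occurrences([1, 2, 2, 3, 3, 3]))  # {1: 1, 2: 2, 3: 3}
```

Once you see the inferred signature, you can copy it back into the Function Signature section of your prompt for the next iteration.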

How do I know if my specification is good enough?

Try it: write the prompt, generate the code, and review it. If the generated code matches your requirements without iterations, the spec was good. If you had to ask for three rounds of fixes, the spec was underspecified. Iterate on your template.

Should I specify performance targets in every prompt?

Only if performance matters. For CRUD operations and utility functions, it's rarely critical. For data processing pipelines, batch algorithms, or high-traffic APIs, include a performance target (e.g., "process 1 million rows in under 30 seconds"). Whenever you set a target, also state the test environment (e.g., "single-threaded, 2.4 GHz CPU, no database calls") so the number can be reproduced.
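A performance target is only useful if you can check it. The sketch below times a single-threaded run against a limit, using the "10,000 rows in under 2 seconds" figure from earlier in this article. The workload process_rows and the helper meets_target are hypothetical stand-ins for your own code.

```python
import time


def process_rows(rows: list[dict[str, int]]) -> int:
    """Hypothetical workload: sum the 'value' field of every row."""
    return sum(row["value"] for row in rows)


def meets_target(rows: list[dict[str, int]], max_seconds: float) -> bool:
    """Time one single-threaded run and compare it with the target."""
    start = time.perf_counter()
    process_rows(rows)
    elapsed = time.perf_counter() - start
    return elapsed <= max_seconds


rows = [{"value": i} for i in range(10_000)]
print(meets_target(rows, max_seconds=2.0))
```

A single timed run is noisy, so treat the result as a rough signal. For a stricter check, repeat the measurement several times and compare the best or median time.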
