Multi-File Planning: Agents Reason About Codebase Scale
Single-file edits are comparatively simple; multi-file changes are where agents tend to fail. When an agent refactors a function signature, it must update the function definition and every call site. When it moves code between modules, it must update the imports that reference that code. When it changes a database schema, it must update every query that touches the changed tables, and those queries may be spread across many files. An agent that can plan and execute such changes reliably can take on refactors that would otherwise need a human. This article covers how to make agents reason about codebase-scale changes.
The Challenge: Cascading Changes
Here's what goes wrong when agents edit naively:
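Agent edits function signature: def parse_json(data) → def parse_json(data, strict)
Agent updates the definition only.
Result: 15 other files still call parse_json(data) with 1 argument, and each call now raises TypeError.
Code breaks. Tests fail. The agent is left confused.
(Had the new parameter been given a default, e.g. strict=False, the old calls would keep working; the breakage comes from making it required.)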
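To make the failure concrete, here is a minimal sketch. It assumes the new parameter is required (no default value), since a parameter added with a default would leave one-argument calls working. The function body and the `load_config` caller are hypothetical stand-ins, not code from a real project.

```python
import json


# New definition after the agent's edit (hypothetical body).
# The added parameter has no default, so every caller must pass it.
def parse_json(data, strict):
    return json.loads(data, strict=strict)


# A call site in another file that the agent did not update.
def load_config(raw):
    return parse_json(raw)  # still passes one argument


def try_load(raw):
    """Return the error message produced by the stale call site, or None."""
    try:
        load_config(raw)
    except TypeError as exc:
        return str(exc)
    return None


# Prints a TypeError message naming the missing 'strict' argument.
print(try_load('{"debug": true}'))
```

The definition itself is valid; the error only surfaces when a stale call site runs, which is why the agent needs to find those call sites before editing.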
The agent must:
- Identify all call sites.
- Plan changes to each.
- Execute them together (atomically).
- Test the result.
Strategy 1: Build a Dependency Graph
Before editing, the agent must understand which files are affected by a change:
import ast
import os
from collections import defaultdict


class DependencyGraphBuilder:
    """Build a graph of which functions call which functions."""

    def __init__(self, repo_path: str):
        self.repo_path = repo_path
        self.functions = {}  # {file: {func_name: FunctionDef}}
        self.calls = defaultdict(list)  # {func_name: [files containing a call]}
        self.imports = defaultdict(list)  # {module: [imported_names]} (not populated here)
        self.skipped = []  # files that could not be read or parsed

    def build(self):
        """Index all Python files and build dependency graph."""
        for root, dirs, files in os.walk(self.repo_path):
            # Skip hidden directories such as .git
            dirs[:] = [d for d in dirs if not d.startswith('.')]
            for file in files:
                if not file.endswith('.py'):
                    continue
                filepath = os.path.join(root, file)
                try:
                    with open(filepath, encoding="utf-8") as f:
                        tree = ast.parse(f.read())
                except (OSError, SyntaxError, UnicodeDecodeError, ValueError):
                    # Record unparseable files instead of silently ignoring them
                    self.skipped.append(filepath)
                    continue
                self.functions[filepath] = {}
                for node in ast.walk(tree):
                    # Extract functions
                    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                        self.functions[filepath][node.name] = node
                    # Extract calls
                    elif isinstance(node, ast.Call):
                        if isinstance(node.func, ast.Name):  # parse_json(...)
                            self.calls[node.func.id].append(filepath)
                        elif isinstance(node.func, ast.Attribute):  # parser.parse_json(...)
                            self.calls[node.func.attr].append(filepath)

    def find_impacted_files(self, func_name: str) -> list[str]:
        """Find all files that call a given function.

        Matching is by name only, so unrelated functions that share
        a name are reported too.
        """
        return sorted(set(self.calls.get(func_name, [])))

    def get_function_signature(self, filepath: str, func_name: str):
        """Get function signature (args, defaults)."""
        if filepath not in self.functions or func_name not in self.functions[filepath]:
            return None
        func_def = self.functions[filepath][func_name]
        args = [arg.arg for arg in func_def.args.args]
        defaults = func_def.args.defaults  # AST nodes, not evaluated values
        return {"args": args, "defaults": defaults}


# Usage
graph = DependencyGraphBuilder("src/")
graph.build()
impacted = graph.find_impacted_files("parse_json")
print(f"Editing 'parse_json' will affect: {impacted}")
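The graph above records which files contain a call, but an agent planning targeted edits also needs the exact position and source text of each call. A minimal sketch of that step follows, using only the standard `ast` module. `find_call_sites` and the sample source are hypothetical names chosen for illustration, and `ast.get_source_segment` requires Python 3.8 or later.

```python
import ast


def find_call_sites(source: str, func_name: str) -> list:
    """Return (line, column, call_text) for each call to func_name in source."""
    tree = ast.parse(source)
    sites = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Plain calls expose .id; attribute calls (obj.f(...)) expose .attr.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name == func_name:
            text = ast.get_source_segment(source, node)
            sites.append((node.lineno, node.col_offset, text))
    return sorted(sites)


sample = (
    "from parser import parse_json\n"
    "cfg = parse_json(read(path))\n"
    "items = [parse_json(x) for x in rows]\n"
)
for line, col, text in find_call_sites(sample, "parse_json"):
    print(line, col, text)
```

Because the call text comes from the parsed tree, nested parentheses such as `parse_json(read(path))` are captured whole, which a simple regular expression would not manage.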
Strategy 2: Plan Multi-File Edits
Once the agent knows what's affected, it must plan the sequence of edits:
import json


class MultiFileChangeAgentPrompt:
    """Prompt the agent to plan multi-file changes."""

    @staticmethod
    def plan_refactor(task: str, impacted_files: list, graph) -> str:
        """
        Ask agent to plan a refactor across multiple files.
        (graph is accepted for context but not used in this version.)
        """
        file_info = []
        for filepath in impacted_files[:10]:  # Limit to 10 files to bound prompt size
            with open(filepath) as f:
                content = f.read()
            file_info.append({
                "path": filepath,
                "size_lines": len(content.splitlines()),
                "preview": content[:300]  # First 300 chars
            })
        prompt = f"""
Task: {task}
This change will affect {len(impacted_files)} files (up to 10 shown):
{json.dumps(file_info, indent=2)}
Please plan the edits:
1. What is changing and why?
2. For each file, what specific edits are needed?
3. What is the order of edits (to avoid conflicts)?
4. After edits, what tests should pass?
Then execute the plan step by step. Once all edits are applied, run the
tests to validate. If a test fails, analyze and adapt.
Important: Make targeted edits (old_text matches exactly).
Do not make unrelated changes.
"""
        return prompt
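Whatever plan the agent returns should be checked before anything is written to disk. A minimal sketch of such a check follows, assuming the plan uses the same `file` / `old_text` / `new_text` edit dictionaries as the rest of this article. `validate_plan` is a hypothetical helper, and requiring exactly one occurrence of `old_text` is one possible policy, not the only one.

```python
import os
import tempfile


def validate_plan(edits: list, workspace: str) -> list:
    """Return a list of problems; an empty list means the plan looks applicable."""
    problems = []
    for i, edit in enumerate(edits):
        missing = [k for k in ("file", "old_text", "new_text") if k not in edit]
        if missing:
            problems.append(f"edit {i}: missing keys {missing}")
            continue
        path = os.path.join(workspace, edit["file"])
        if not os.path.isfile(path):
            problems.append(f"edit {i}: {edit['file']} does not exist")
            continue
        with open(path, encoding="utf-8") as f:
            count = f.read().count(edit["old_text"])
        # Ambiguous (count > 1) and stale (count == 0) targets are both rejected.
        if count != 1:
            problems.append(f"edit {i}: old_text occurs {count} times in {edit['file']}")
    return problems


# Usage on a throwaway workspace
with tempfile.TemporaryDirectory() as ws:
    with open(os.path.join(ws, "app.py"), "w", encoding="utf-8") as f:
        f.write("cfg = parse_json(raw)\n")
    plan = [
        {"file": "app.py", "old_text": "parse_json(raw)",
         "new_text": "parse_json(raw, strict=False)"},
        {"file": "app.py", "old_text": "parse_xml(raw)",
         "new_text": "parse_xml(raw, strict=False)"},
    ]
    result = validate_plan(plan, ws)
    print(result)
```

Returning all problems at once, rather than stopping at the first, gives the agent enough feedback to repair the plan in a single retry.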
Strategy 3: Change Propagation Patterns
Common multi-file change patterns include:
Pattern 1: Function Signature Change
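import re


def _render_params(sig: dict) -> str:
    """Render {"args": [...], "defaults": [...]} as parameter-list text."""
    args, defaults = sig["args"], sig["defaults"]
    n_required = len(args) - len(defaults)
    parts = list(args[:n_required])
    parts += [f"{a}={d!r}" for a, d in zip(args[n_required:], defaults)]
    return ", ".join(parts)


def propagate_signature_change(func_name: str,
                               old_sig: dict,
                               new_sig: dict,
                               impacted_files: list) -> dict:
    """
    Plan a function signature change across files.
    old_sig: {"args": ["data"], "defaults": []}
    new_sig: {"args": ["data", "strict"], "defaults": [False]}
    """
    edits = []
    # Edit 1: Update the function definition
    edits.append({
        "action": "edit_file",
        "file": "src/parser.py",  # Original definition (location assumed)
        "old_text": f"def {func_name}({_render_params(old_sig)}):",
        "new_text": f"def {func_name}({_render_params(new_sig)}):",
        "description": f"Update signature of {func_name}"
    })
    # Keyword arguments to add at call sites: new parameters that have a default.
    # Passing them explicitly pins the old behaviour. New parameters without a
    # default need a value chosen per call site and are not handled here.
    n_required = len(new_sig["args"]) - len(new_sig["defaults"])
    new_defaults = dict(zip(new_sig["args"][n_required:], new_sig["defaults"]))
    extra = ", ".join(f"{a}={new_defaults[a]!r}" for a in new_sig["args"]
                      if a not in old_sig["args"] and a in new_defaults)
    # Match calls but not the "def" line. Arguments containing nested
    # parentheses are not matched by this simple pattern.
    call_re = re.compile(rf'(?<!def )\b{re.escape(func_name)}\(([^()]*)\)')
    # Edits 2–N: Update all call sites
    for filepath in (impacted_files if extra else []):
        with open(filepath) as f:
            content = f.read()
        seen = set()
        for match in call_re.finditer(content):
            old_call = match.group(0)
            if old_call in seen:  # identical call text needs only one edit
                continue
            seen.add(old_call)
            args = match.group(1).strip()
            # Preserve existing args, then append the new keyword arguments
            new_args = f"{args}, {extra}" if args else extra
            edits.append({
                "action": "edit_file",
                "file": filepath,
                "old_text": old_call,
                "new_text": f"{func_name}({new_args})",
                "description": f"Update call to {func_name} in {filepath}"
            })
    return {
        "total_edits": len(edits),
        "files_affected": len(set(e["file"] for e in edits)),
        "edits": edits
    }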
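Edits identified by their text can collide when the same call text appears more than once in a file but each occurrence needs a different replacement. One way around this, sketched below as an assumption rather than as part of the approach above, is to express edits as character spans and apply them from the end of the file backwards so that earlier offsets stay valid. `apply_span_edits` is a hypothetical helper.

```python
def apply_span_edits(source: str, edits: list) -> str:
    """Apply (start, end, replacement) edits to source; spans must not overlap."""
    result = source
    last_start = len(source)
    # Work from the last span to the first so earlier offsets are unaffected.
    for start, end, replacement in sorted(edits, reverse=True):
        if start > end or end > last_start:
            raise ValueError(f"overlapping or invalid span: {(start, end)}")
        result = result[:start] + replacement + result[end:]
        last_start = start
    return result


line = "a = parse_json(x); b = parse_json(x)"
n = len("parse_json(x)")
first = line.index("parse_json(x)")
second = line.index("parse_json(x)", first + 1)
edited = apply_span_edits(line, [
    (first, first + n, "parse_json(x, strict=False)"),
    (second, second + n, "parse_json(x, strict=True)"),
])
print(edited)  # a = parse_json(x, strict=False); b = parse_json(x, strict=True)
```

The overlap check turns a conflicting plan into an explicit error instead of silently corrupting the file.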
Pattern 2: Module Relocation
import os
import re


def plan_module_move(old_module: str, new_module: str, repo: str) -> dict:
    """
    Plan moving a module from old_module to new_module.
    Updates imports in all files that use the module.
    """
    # Match whole lines that start with "from old_module" or "import old_module".
    # The lookahead keeps similarly named modules (old_module_v2) from matching.
    line_re = re.compile(
        rf'^[ \t]*(?:from|import)[ \t]+{re.escape(old_module)}(?![\w.]).*$',
        re.MULTILINE)
    edits = [
        # Edit 1: Move the module file itself
        {
            "action": "move_file",
            "from": old_module.replace('.', '/') + ".py",
            "to": new_module.replace('.', '/') + ".py"
        }
    ]
    importing_files = []
    for root, dirs, files in os.walk(repo):
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for file in files:
            if not file.endswith('.py'):
                continue
            filepath = os.path.join(root, file)
            with open(filepath) as f:
                content = f.read()
            # Edits 2–N: one edit per distinct import line in this file
            old_lines = sorted({m.group(0) for m in line_re.finditer(content)})
            if not old_lines:
                continue
            importing_files.append(filepath)
            for old_line in old_lines:
                edits.append({
                    "action": "edit_file",
                    "file": filepath,
                    "old_text": old_line,
                    "new_text": old_line.replace(old_module, new_module, 1),
                    "description": f"Update import in {filepath}"
                })
    # Note: after a plain "import old_module", uses such as old_module.func()
    # in the file body also need updating; that is not planned here.
    return {"edits": edits, "files_affected": len(importing_files)}
Strategy 4: Atomic Transactions
Apply all edits as a unit: snapshot the affected files first, then roll everything back if any edit or the test run fails:
import os
import shutil
import tempfile


def _text_edits(plan: dict) -> list:
    """Edits that modify file contents; a missing "action" is treated as "edit_file"."""
    return [e for e in plan["edits"] if e.get("action", "edit_file") == "edit_file"]


def _rollback(edits: list, snapshot: str, workspace: str) -> None:
    """Restore every snapshotted file, then delete the snapshot."""
    for edit in edits:
        shutil.copy(os.path.join(snapshot, edit["file"]),
                    os.path.join(workspace, edit["file"]))
    shutil.rmtree(snapshot)


def execute_multi_file_plan(plan: dict, workspace: str) -> dict:
    """
    Execute multi-file change plan with rollback on failure.
    File paths in the plan are relative to workspace. Only "edit_file"
    edits are applied here; other actions (e.g. "move_file") are skipped.
    """
    edits = _text_edits(plan)

    # Phase 1: Validate all edits
    for edit in edits:
        filepath = os.path.join(workspace, edit["file"])
        with open(filepath) as f:
            content = f.read()
        if edit["old_text"] not in content:
            return {
                "success": False,
                "error": f"Validation failed: old_text not found in {edit['file']}",
                "failed_edit": edit
            }

    # Phase 2: Create snapshot
    snapshot = tempfile.mkdtemp()
    for edit in edits:
        src = os.path.join(workspace, edit["file"])
        dst = os.path.join(snapshot, edit["file"])
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy(src, dst)

    # Phase 3: Apply all edits
    try:
        for edit in edits:
            filepath = os.path.join(workspace, edit["file"])
            with open(filepath) as f:
                content = f.read()
            # Note: str.replace rewrites every occurrence of old_text in the file
            new_content = content.replace(edit["old_text"], edit["new_text"])
            with open(filepath, 'w') as f:
                f.write(new_content)
    except Exception as e:
        _rollback(edits, snapshot, workspace)
        return {
            "success": False,
            "error": f"Apply failed: {e}",
            "rolled_back": True
        }

    # Phase 4: Run tests
    # run_command_safe is not defined in this article; it is assumed to run a
    # command and return a dict with "success" and "stderr" keys.
    test_result = run_command_safe("pytest", timeout_seconds=30)
    if not test_result["success"]:
        # Rollback on test failure
        _rollback(edits, snapshot, workspace)
        return {
            "success": False,
            "error": "Tests failed after edits",
            "test_output": test_result["stderr"],
            "rolled_back": True
        }

    # Success: cleanup snapshot
    shutil.rmtree(snapshot)
    return {
        "success": True,
        "files_modified": len({e["file"] for e in edits}),
        "test_result": test_result
    }
Complexity Metrics
Know when to stop. Multi-file changes have a complexity ceiling:
| Metric | Safe | Warning | Danger |
|---|---|---|---|
| Files affected | < 5 | 5–15 | > 15 |
| Total edits | < 10 | 10–30 | > 30 |
| Edits per file | < 3 | 3–5 | > 5 |
| Test coverage | > 80% | 50–80% | < 50% |
If any metric goes beyond the "Warning" range into "Danger", escalate to a human or break the task into smaller steps.
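The table can be applied mechanically before a plan is executed. The sketch below takes its thresholds directly from the table. The function names are hypothetical, and treating the worst single metric as the overall level is an assumption, not something the table prescribes.

```python
def classify(value: float, warn_at: float, danger_above: float) -> str:
    """Classify a 'lower is better' metric against its thresholds."""
    if value > danger_above:
        return "danger"
    if value >= warn_at:
        return "warning"
    return "safe"


def assess_plan(files: int, total_edits: int,
                max_edits_per_file: int, coverage_pct: float) -> dict:
    """Rate each metric, then report the worst level as 'overall'."""
    levels = {
        "files_affected": classify(files, 5, 15),
        "total_edits": classify(total_edits, 10, 30),
        "edits_per_file": classify(max_edits_per_file, 3, 5),
        # Coverage is 'higher is better', so the comparison is inverted.
        "test_coverage": ("danger" if coverage_pct < 50
                          else "warning" if coverage_pct <= 80
                          else "safe"),
    }
    order = ["safe", "warning", "danger"]
    levels["overall"] = max(levels.values(), key=order.index)
    return levels


report = assess_plan(files=12, total_edits=8, max_edits_per_file=2, coverage_pct=90)
print(report["overall"])  # warning
```

An agent loop could call `assess_plan` right after planning and stop to ask for human review whenever `overall` is `"danger"`.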
Key Takeaways
- Build dependency graphs before refactoring: know what files are affected.
- Plan multi-file changes step-by-step: define all edits before executing.
- Use atomic transactions: apply all edits together, roll back on any failure.
- Run tests after all edits: catch breakage immediately.
- Know complexity limits: if affecting > 15 files, escalate to human.
Frequently Asked Questions
What if the agent makes a mistake in a multi-file edit?
The transaction restores every edited file from the snapshot, so the workspace is not left in a partially edited state. The agent can then analyze what went wrong and retry with a different strategy.
How do I handle circular dependencies?
Detect cycles in the dependency graph and flag them. Tell the agent to break cycles by introducing an intermediary module or refactoring. Agents can then plan the refactor with the cycle broken.
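Cycle detection can be sketched with a depth-first search over a module-level import graph. The graph format used here (module name mapped to the modules it imports) and the name `find_cycle` are assumptions for illustration; the dependency graph built earlier in this article tracks function calls and would need to be projected onto modules first.

```python
def find_cycle(graph: dict) -> list:
    """Return one import cycle as a list of modules (first == last), or [] if none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:  # back edge: dep is already on the current path
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return []

    for module in graph:
        if color.get(module, WHITE) == WHITE:
            cycle = visit(module)
            if cycle:
                return cycle
    return []


imports = {
    "app": ["models", "utils"],
    "models": ["db"],
    "db": ["models"],  # db imports models back, which closes a cycle
    "utils": [],
}
print(find_cycle(imports))  # ['models', 'db', 'models']
```

The returned path can be passed to the agent as part of the task description so it knows which modules to decouple. The search is recursive, so very deep graphs may need an iterative version.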
What if a change requires changes to tests?
Include test files in the impacted files. The agent can modify tests to match the new behavior (e.g., if a function signature changes, update the test calls too).
Can the agent parallelize edits?
Generally no: keep a fixed order to avoid conflicts. Edits to separate files with no dependencies between them can in principle be applied in any order, but strict ordering is simpler and safer.