Goal Decomposition Basics: Break Complex Objectives
Goal decomposition is the practice of breaking a large, fuzzy objective into smaller, concrete subtasks that an AI agent can execute reliably. Instead of asking an LLM to "write a complete marketing strategy," you have it first identify audience segments, then research each one, then draft positioning, then outline tactics, with every step standalone and given defined inputs and outputs. This is the foundation of agentic planning: without decomposition, agents are more likely to hallucinate, run past token limits, and fail on tasks that need more than 2–3 logical hops.
Why One-Step Prompting Fails
When you ask an LLM for a complex deliverable in a single prompt, several things break. First, reasoning depth is finite: models trained on next-token prediction handle one or two reasoning steps well, but errors compound rapidly beyond that. A 2024 study from DeepSeek showed that Claude and GPT-4 accuracy drops 30–45% for tasks requiring more than five sequential reasoning hops. Second, token limits force compression: to answer "build a project plan," the model must produce the entire plan in a single response, leaving no room to refine intermediate results. Third, there's no opportunity to verify or correct early steps before building on them.
Decomposition changes this. By splitting work into defined subtasks, you create checkpoints. Each subtask produces verifiable output, which becomes input to the next. If a subtask fails, you catch it early instead of discovering halfway through that the entire strategy is built on a wrong assumption.
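The checkpoint idea can be sketched as a small pipeline runner. This is an illustration under assumptions, not a prescribed implementation: `Step`, `run_pipeline`, and the lambda-based steps are hypothetical names, and the lambdas stand in for real LLM or tool calls.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Step:
    """One subtask: a function plus a check on its output (both hypothetical)."""
    name: str
    run: Callable[[Any], Any]      # stands in for an LLM or tool call
    check: Callable[[Any], bool]   # the checkpoint: is this output usable?


def run_pipeline(steps: List[Step], initial_input: Any) -> Any:
    """Run steps in order, stopping at the first failed checkpoint."""
    data = initial_input
    for step in steps:
        data = step.run(data)
        if not step.check(data):
            raise ValueError(f"checkpoint failed at step: {step.name}")
    return data


steps = [
    Step("identify_segments",
         run=lambda goal: ["startups", "enterprises"],
         check=lambda segments: len(segments) > 0),
    Step("draft_positioning",
         run=lambda segments: {s: f"positioning for {s}" for s in segments},
         check=lambda drafts: all(drafts.values())),
]
result = run_pipeline(steps, "write a marketing strategy")
```

Because each step's output is checked before the next step consumes it, a bad intermediate result raises immediately instead of propagating into the final deliverable.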
Levels of Decomposition
Goal decomposition typically works across three levels: the top-level goal (what the user wants), intermediate subgoals (major phases), and atomic tasks (single-action steps the agent can execute directly).
Consider a real example: "Help me move my product analytics to a modern data warehouse." The top-level goal is the migration itself. Intermediate subgoals might be: audit current tooling, design target schema, set up the warehouse, migrate historical data, validate consistency, rebuild dashboards. Each intermediate goal then decomposes into atomic tasks: "list all spreadsheets and databases currently in use," "document every metric definition," "create column mappings from legacy to new schema," etc.
Here's a Python scaffold for representing this three-level hierarchy:
```python
from dataclasses import dataclass
from typing import List, Optional
from enum import Enum


class TaskStatus(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class AtomicTask:
    """A single-action unit the agent can execute directly."""
    id: str
    name: str
    description: str
    required_tools: List[str]
    success_criteria: str
    status: TaskStatus = TaskStatus.PENDING
    result: Optional[str] = None

    def is_complete(self) -> bool:
        return self.status == TaskStatus.COMPLETED


@dataclass
class SubGoal:
    """An intermediate objective composed of atomic tasks."""
    id: str
    name: str
    description: str
    tasks: List[AtomicTask]
    dependencies: List[str]  # IDs of other SubGoals that must complete first

    def completion_percentage(self) -> float:
        if not self.tasks:
            return 0.0
        completed = sum(1 for t in self.tasks if t.is_complete())
        return (completed / len(self.tasks)) * 100


@dataclass
class Goal:
    """The top-level objective."""
    id: str
    name: str
    description: str
    success_criteria: str
    subgoals: List[SubGoal]

    def summary(self) -> dict:
        total_tasks = sum(len(sg.tasks) for sg in self.subgoals)
        completed_tasks = sum(
            len([t for t in sg.tasks if t.is_complete()])
            for sg in self.subgoals
        )
        return {
            "goal": self.name,
            "total_tasks": total_tasks,
            "completed_tasks": completed_tasks,
            "completion_pct": (completed_tasks / total_tasks * 100) if total_tasks > 0 else 0
        }
```
This structure lets you track not just what must be done, but also how subgoals depend on one another through each `SubGoal`'s `dependencies` list. An agent can query it to answer questions such as "Which subgoal can I start now?", "What does this subgoal depend on?", and "If subgoal X fails, which others are blocked?"
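Two of those queries can be sketched with plain dictionaries. The helper names `ready_subgoals` and `blocked_by` and the dependency map below are hypothetical; the map mirrors the shape of a subgoal ID paired with its list of prerequisite IDs, and the subgoal names are loosely taken from the migration example.

```python
from typing import Dict, List, Set


def ready_subgoals(dependencies: Dict[str, List[str]], completed: Set[str]) -> List[str]:
    """Return subgoals not yet done whose dependencies are all completed."""
    return [
        sg for sg, deps in dependencies.items()
        if sg not in completed and all(d in completed for d in deps)
    ]


def blocked_by(dependencies: Dict[str, List[str]], failed: str) -> Set[str]:
    """Return every subgoal that directly or transitively depends on `failed`."""
    blocked: Set[str] = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for sg, deps in dependencies.items():
            if current in deps and sg not in blocked:
                blocked.add(sg)
                frontier.append(sg)
    return blocked


# Hypothetical dependency map for the warehouse migration example.
deps = {
    "audit_tooling": [],
    "design_schema": ["audit_tooling"],
    "set_up_warehouse": [],
    "migrate_data": ["design_schema", "set_up_warehouse"],
    "validate_consistency": ["migrate_data"],
}

print(ready_subgoals(deps, completed={"audit_tooling"}))
# prints ['design_schema', 'set_up_warehouse']
print(sorted(blocked_by(deps, "design_schema")))
# prints ['migrate_data', 'validate_consistency']
```

Keeping the queries as pure functions over an ID map means the same logic works whether the plan lives in dataclasses, JSON, or a database.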
Defining Success Criteria at Every Level
The second pillar of decomposition is clarity on what success means. A goal without success criteria is a hallucination magnet. "Research competitors" is vague. "Identify the 5 closest competitors in feature parity within 3 categories, document pricing, and list 2 differentiators for each" is testable.
Each atomic task should have a success criterion that an automated checker can evaluate without human interpretation. Here are some patterns:
| Criterion Type | Example | Why It Matters |
|---|---|---|
| Structural | "Return a JSON object with fields: name, price, launch_date" | Lets downstream tasks parse output reliably |
| Quantitative | "Find at least 10 sources, each published within the last 2 years" | Prevents incomplete results |
| Semantic | "Identify risk as HIGH, MEDIUM, or LOW with 1–2 sentence justification" | Filters out non-answers and vague assertions |
| Verification | "Cross-check facts against at least 2 sources" | Reduces hallucination in final deliverable |
An agent evaluating its own work against these criteria can decide whether to rerun a task, ask for clarification, or escalate to human review.
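Three of the criterion types in the table lend themselves to simple automated checks. The functions below are a minimal sketch with hypothetical names and signatures; the verification type (cross-checking facts against sources) usually needs retrieval or a second model and is left out here.

```python
import json
from typing import Iterable


def check_structural(raw_output: str, required_fields: Iterable[str]) -> bool:
    """Structural: output parses as a JSON object containing every required field."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and all(f in data for f in required_fields)


def check_quantitative(sources: list, minimum: int = 10) -> bool:
    """Quantitative: at least `minimum` sources were returned."""
    return len(sources) >= minimum


def check_semantic(risk_label: str, justification: str) -> bool:
    """Semantic: label is an allowed value and a non-empty justification is present."""
    return risk_label in {"HIGH", "MEDIUM", "LOW"} and bool(justification.strip())


ok = check_structural(
    '{"name": "Acme", "price": 49, "launch_date": "2023-05-01"}',
    ["name", "price", "launch_date"],
)
```

Each checker returns a plain boolean, so an agent loop can use the result to decide between accepting the output, rerunning the task, or escalating.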
How to Structure a Decomposition Prompt
When you write a prompt that asks an LLM to decompose a goal, structure it clearly:
```python
decomposition_prompt = """
You are a planning expert. Your job is to break the following goal into concrete subtasks.

GOAL: {goal}

CONSTRAINTS:
- Each subtask must be executable by a single LLM call or tool use.
- Each subtask must have a measurable success criterion (not "looks good").
- Order subtasks so dependencies are respected: if subtask B uses output from A, B must come after A.
- For each subtask, specify: name, description, required inputs, success criterion, estimated token cost.

FORMAT YOUR RESPONSE AS A JSON ARRAY:
[
  {{
    "id": "task_1",
    "name": "Subtask Name",
    "description": "What this subtask does and why.",
    "inputs": ["input_1", "input_2"],
    "required_tools": ["web_search", "text_extraction"],
    "success_criterion": "Concrete, measurable definition of done.",
    "depends_on": ["task_0"],
    "estimated_tokens": 2000
  }},
  ...
]

GOAL: {goal}
"""
```
When the model returns this JSON, you can immediately validate: are there circular dependencies? Do inputs exist for every task? Is the token budget realistic? Are success criteria measurable?
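Some of these checks are mechanical enough to automate. The sketch below assumes the model's response has already been parsed into a list of dictionaries with the `id`, `depends_on`, and `estimated_tokens` fields from the template; `validate_plan` and its `token_budget` parameter are hypothetical names, and checking whether success criteria are measurable is not attempted.

```python
from typing import Dict, List


def validate_plan(tasks: List[dict], token_budget: int) -> List[str]:
    """Return a list of problems found in a decomposition; empty means it passed."""
    problems: List[str] = []
    ids = {t["id"] for t in tasks}
    graph: Dict[str, List[str]] = {t["id"]: t.get("depends_on", []) for t in tasks}

    # Every dependency must refer to a task that exists in the plan.
    for task_id, deps in graph.items():
        for dep in deps:
            if dep not in ids:
                problems.append(f"{task_id} depends on unknown task {dep}")

    # Depth-first search with visiting/done states to detect circular dependencies.
    state: Dict[str, str] = {}

    def has_cycle(node: str) -> bool:
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(has_cycle(d) for d in graph[node] if d in graph)
        state[node] = "done"
        return found

    if any(has_cycle(task_id) for task_id in graph):
        problems.append("circular dependency detected")

    # Compare the summed estimates against the caller's budget.
    total = sum(t.get("estimated_tokens", 0) for t in tasks)
    if total > token_budget:
        problems.append(f"estimated tokens {total} exceed budget {token_budget}")
    return problems


plan = [
    {"id": "task_1", "depends_on": [], "estimated_tokens": 2000},
    {"id": "task_2", "depends_on": ["task_1"], "estimated_tokens": 1500},
]
print(validate_plan(plan, token_budget=5000))  # prints []
```

Returning a list of problems rather than raising on the first one lets you feed the whole list back to the model and ask for a corrected decomposition in one round trip.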
Key Takeaways
- Complex goals tend to fail as single prompts because errors compound across long reasoning chains; splitting them into concrete subtasks keeps each chain short.
- Structure decomposition as Goal → SubGoals → AtomicTasks, with explicit dependencies.
- Every task must have a measurable success criterion, not a vague outcome.
- Decomposition creates checkpoints: catch errors early, refine outputs, stay within token budgets.
- Represent decomposition as structured data (JSON/dataclasses), not narrative prose.
Frequently Asked Questions
How deep should I decompose?
Decompose until each atomic task fits in a single LLM call; as a rough rule of thumb, that is about 300–1500 tokens of reasoning. If a task description runs longer than 3–4 sentences, it probably needs further decomposition.
Can I decompose the same goal multiple ways?
Yes. Different decompositions optimize for different constraints: latency (fewer sequential steps and more tasks that can run in parallel), token efficiency (less context repeated across steps), error recovery (clearer subgoals), or human oversight (intermediate outputs a person can verify). Choose based on your deployment constraints.
What if the goal changes mid-execution?
Decomposition should be re-run if the goal materially changes. However, if you've completed some subtasks, mark them in the new decomposition as already-done, and only plan the remainder. This is dynamic replanning (covered in later articles).
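Carrying finished work into a new plan can be sketched as a merge step. The function name `carry_over_completed`, the dictionary fields, and matching tasks by name are all assumptions made for this illustration; a real system would need a more robust way to decide that an old task still satisfies the changed goal.

```python
from typing import Dict, List


def carry_over_completed(old_plan: List[dict], new_plan: List[dict]) -> List[dict]:
    """Mark tasks in new_plan as completed when the old plan already finished them.

    Matching by task name is an assumption made for this sketch.
    """
    finished: Dict[str, dict] = {
        t["name"]: t for t in old_plan if t.get("status") == "completed"
    }
    merged = []
    for task in new_plan:
        task = dict(task)  # copy so the caller's new_plan is not mutated
        previous = finished.get(task["name"])
        if previous is not None:
            task["status"] = "completed"
            task["result"] = previous.get("result")
        else:
            task.setdefault("status", "pending")
        merged.append(task)
    return merged


old_plan = [
    {"name": "audit current tooling", "status": "completed", "result": "3 tools found"},
    {"name": "design target schema", "status": "pending"},
]
new_plan = [
    {"name": "audit current tooling"},
    {"name": "design target schema"},
    {"name": "add streaming ingestion"},
]
merged = carry_over_completed(old_plan, new_plan)
remaining = [t["name"] for t in merged if t["status"] != "completed"]
print(remaining)  # prints ['design target schema', 'add streaming ingestion']
```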
How do I verify decomposition quality?
Ask a separate LLM to audit the decomposition: "Does this decomposition cover the original goal entirely? Are there circular dependencies? Is each task executable?" Or trace through a manual execution: can a human follow the plan and deliver the goal?
Further Reading
- Anthropic's Constitutional AI paper: describes training models to critique and revise their own outputs against written principles, useful background for agents that evaluate their own work against explicit criteria.
- DeepSeek Planning Optimization: empirical study of reasoning depth limits and task decomposition effectiveness.
- MIT Cognitive Science: Goal Hierarchies: human-centered perspective on how people decompose complex objectives.