
Query Planning: Build Research Agent Foundations

Query planning is the foundation of any autonomous research agent. It transforms a user's question into a structured search strategy, breaking complex topics into searchable components and determining the sequence and priority of searches needed to answer the original question comprehensively.

When an agent receives a question like "What are the latest advances in quantum error correction in 2026?", a naive approach would fire a single search and hope for the best. A robust research agent instead decomposes the question into constituent parts: What is quantum error correction? Which techniques exist? What were the 2025–2026 breakthroughs? Which labs are leading research? How do the approaches compare? This planning phase determines the depth and breadth of research that will follow.

What Is Query Decomposition and Why Does It Matter?

Query decomposition is the process of splitting a high-level research question into logically independent sub-questions that an agent can search sequentially or in parallel. Decomposition matters because: (1) it helps ensure the agent doesn't miss key facets of the topic, (2) it can reduce hallucination by tying each part of the answer to a targeted search rather than to the model's recall, and (3) it creates a structured execution plan that persists across the agent's research loop.

For example, the question "Should companies adopt AI for HR workflows?" naturally decomposes into five sub-questions: (1) What are common AI applications in HR? (2) What measurable benefits have been reported? (3) What risks and failures have been documented? (4) What compliance challenges exist? (5) How do implementation costs compare to savings? An agent that searches for all five will produce a more balanced and complete report than one that searches for "HR AI benefits" alone.

Structuring Query Decomposition with LLM Prompts

A practical way to generate a decomposition is to prompt the LLM for a structured JSON plan that lists the sub-questions along with a brief rationale for the strategy. Here's a pattern you can adapt:

import json
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def decompose_query(user_question: str) -> dict:
    """Break a research question into sub-questions using Claude."""
    system_prompt = """You are a research analyst expert at decomposing complex questions.
Given a user question, identify ALL the key concepts, assumptions, and sub-topics needed to answer it fully.

Output a JSON object with:
- "main_concept": The central topic (2-3 words)
- "sub_questions": Array of 4-7 specific searchable questions
- "priority": "breadth" (explore many aspects) or "depth" (go deep on one topic)
- "entities_to_search": Array of key terms, names, or concepts to search
- "rationale": Brief explanation of your decomposition strategy"""

    response = client.messages.create(
        model="claude-opus-4-1",
        max_tokens=1024,
        system=system_prompt,
        messages=[
            {"role": "user", "content": f"Decompose this research question:\n\n{user_question}"}
        ],
    )

    # Extract JSON from response
    text = response.content[0].text
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback: the reply may wrap the JSON in prose or a markdown fence,
        # so parse the span from the first '{' to the last '}'.
        start = text.find("{")
        end = text.rfind("}") + 1
        if start == -1 or end == 0:
            raise ValueError("No JSON object found in model response")
        return json.loads(text[start:end])

# Example usage
question = "How have AI chip manufacturing techniques evolved between 2024 and 2026?"
plan = decompose_query(question)
print(json.dumps(plan, indent=2))

Example output (model responses vary between runs; the "rationale" field is not shown):

{
  "main_concept": "AI chip manufacturing evolution",
  "sub_questions": [
    "What are the dominant AI chip architectures (GPU, TPU, custom ASIC)?",
    "Which manufacturing advances improved yield or reduced die size?",
    "How did lithography node progression (5nm, 3nm, 2nm) affect AI chips?",
    "What role did packaging innovations like chiplets and 3D stacking play?",
    "Which companies (TSMC, Samsung, Intel) led manufacturing breakthroughs?"
  ],
  "priority": "breadth",
  "entities_to_search": ["TSMC", "Samsung", "AI chip", "GPU", "TPU", "3nm process", "chiplet"]
}
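
Because the model's reply is free-form text, it can help to check the parsed plan before acting on it. The following is a minimal sketch, assuming the field names requested in the prompt above; `validate_plan` and `REQUIRED_KEYS` are hypothetical names introduced here for illustration, not part of any library.

```python
# Hypothetical helper: field names mirror the decomposition prompt.
REQUIRED_KEYS = {"main_concept", "sub_questions", "priority", "entities_to_search", "rationale"}


def validate_plan(plan: dict) -> list[str]:
    """Return a list of problems found in a decomposition plan (empty if none)."""
    problems = []

    # Every field the prompt asks for should be present.
    missing = REQUIRED_KEYS - plan.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    # The prompt asks for 4-7 searchable questions, each a string.
    sub_questions = plan.get("sub_questions", [])
    if not isinstance(sub_questions, list) or not all(isinstance(q, str) for q in sub_questions):
        problems.append("sub_questions must be a list of strings")
    elif not 4 <= len(sub_questions) <= 7:
        problems.append(f"expected 4-7 sub_questions, got {len(sub_questions)}")

    # Only two priority values are defined in the prompt.
    if plan.get("priority") not in ("breadth", "depth"):
        problems.append("priority must be 'breadth' or 'depth'")

    return problems
```

A non-empty result could trigger a retry of `decompose_query` or a fallback to a simpler plan; which is appropriate depends on the agent. Run against the example output above, this check would report the missing "rationale" key.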

Keyword Prioritization and Search Sequencing

Once sub-questions exist, the agent must decide which searches to run first. This matters because early searches establish context that refines later searches. Use a three-tier priority model:

  • Tier 1 (Foundational): Searches that establish basic definitions or context. Example: "What is quantum error correction?" Run these first to build vocabulary.
  • Tier 2 (Factual): Searches for recent events, specific breakthroughs, or data. Example: "Quantum error correction breakthroughs 2026." Run after Tier 1 establishes terminology.
  • Tier 3 (Comparative/Critical): Searches for contrasting viewpoints, limitations, or challenges. Example: "Quantum error correction limitations challenges." Run after gathering positive evidence.

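Here's one simple way to implement this in code. It assigns tiers by position, treating the first three sub-questions as Tier 2 and the rest as Tier 3, which is a heuristic rather than a true classification of each question: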

def prioritize_searches(decomposition: dict) -> list[dict]:
    """Order sub-questions by priority tier."""
    queries = []

    # Tier 1: definitional / foundational
    main_concept = decomposition.get("main_concept", "")
    queries.append({
        "tier": 1,
        "query": f"What is {main_concept}?",
        "rationale": "Establish foundational terminology"
    })

    # Tier 2: factual / recent advances
    sub_questions = decomposition.get("sub_questions", [])
    for sq in sub_questions[:3]:  # Top 3 sub-questions
        queries.append({
            "tier": 2,
            "query": sq,
            "rationale": "Gather current facts and advances"
        })

    # Tier 3: critical / counterarguments
    for sq in sub_questions[3:]:  # Remaining sub-questions
        queries.append({
            "tier": 3,
            # Drop the trailing "?" so the appended keywords read as one query
            "query": sq.rstrip("?") + " limitations challenges",
            "rationale": "Identify risks, limitations, and critiques"
        })

    # sorted() is stable, so queries keep their original order within a tier
    return sorted(queries, key=lambda x: x["tier"])

# Example
# With only two sub-questions, both land in Tier 2 and no Tier 3 queries are produced.
plan = {
    "main_concept": "AI chip manufacturing",
    "sub_questions": [
        "What are dominant AI chip architectures?",
        "Which manufacturing advances improved yield?"
    ]
}
search_order = prioritize_searches(plan)
for q in search_order:
    print(f"[Tier {q['tier']}] {q['query']}")

Key Takeaways

  • Query decomposition splits complex questions into searchable sub-questions, reducing hallucination and ensuring comprehensive coverage.
  • Use LLM-guided decomposition prompts to extract main concepts, sub-questions, and key entities automatically.
  • Sequence searches by priority tier: foundational (Tier 1), factual (Tier 2), and critical/comparative (Tier 3) to build context progressively.
  • Document the decomposition plan so later agent steps can track why specific searches were executed and refine the plan if early results warrant it.

Frequently Asked Questions

How many sub-questions should an agent generate?

Aim for 4–7 sub-questions for most topics. Fewer than 4 risks superficiality; more than 7 often introduces redundant or overlapping searches. If the agent generates more, cluster related sub-questions and merge them.

Should the agent always follow the priority tiers strictly?

No. If the first Tier 1 search returns conflicting definitions, the agent should re-plan before continuing to Tier 2. Query planning is a starting point, not a rigid script. The agent should re-evaluate after each search round.
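
The re-evaluation step can be sketched as a loop that checks each result before moving on. This is a minimal sketch under stated assumptions: `search`, `needs_replan`, and `replan` are hypothetical callables the caller supplies (for example, `needs_replan` might detect conflicting definitions), and the queries are dicts with a "query" key as produced by `prioritize_searches`.

```python
from typing import Callable


def run_with_replanning(
    queries: list[dict],
    search: Callable[[str], list[str]],
    needs_replan: Callable[[dict, list[str]], bool],
    replan: Callable[[list[dict], dict, list[str]], list[dict]],
    max_replans: int = 2,
) -> list[tuple[dict, list[str]]]:
    """Run queries in order, letting the caller revise the remaining plan after any search."""
    pending = list(queries)  # copy so the caller's list is not mutated
    history = []
    replans_used = 0
    while pending:
        current = pending.pop(0)
        results = search(current["query"])
        history.append((current, results))
        # Cap the number of revisions so a bad signal cannot loop forever.
        if replans_used < max_replans and needs_replan(current, results):
            pending = replan(pending, current, results)
            replans_used += 1
    return history
```

The `max_replans` cap is a design choice to keep the loop bounded; the returned history also records which searches ran and in what order.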

Can query planning be parallelized?

Yes, if sub-questions are truly independent. For example, searches on "manufacturing process" and "cost trends" can run in parallel. However, foundational searches (Tier 1) should complete first to ensure later tiers use consistent terminology.
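
As a sketch of this ordering, the function below runs each tier to completion before starting the next and runs the searches within a tier concurrently using the standard library's `ThreadPoolExecutor`. It is slightly stricter than the text requires, since it also finishes Tier 2 before Tier 3. The `search` callable is a hypothetical stand-in for whatever search client the agent uses, and `run_by_tier` is a name introduced here for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from typing import Callable


def run_by_tier(
    queries: list[dict],
    search: Callable[[str], list[str]],
    max_workers: int = 4,
) -> dict[str, list[str]]:
    """Run tiers in order; within a tier, run the searches concurrently."""
    results: dict[str, list[str]] = {}
    ordered = sorted(queries, key=lambda q: q["tier"])
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for _tier, group in groupby(ordered, key=lambda q: q["tier"]):
            texts = [q["query"] for q in group]
            # Consuming pool.map blocks until every search in this tier has
            # finished, so a later tier never starts before an earlier one ends.
            for text, hits in zip(texts, pool.map(search, texts)):
                results[text] = hits
    return results
```

Threads suit this case because search calls are typically I/O-bound; an agent built on an async HTTP client might use `asyncio.gather` instead.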

How do I handle ambiguous questions?

Ask the LLM to flag ambiguities and clarify them in the decomposition step. For example, if "AI advances in 2026" is ambiguous, the agent should list possible interpretations and ask the user which matters most. This prevents wasted searches.
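
One way to wire this in, assuming the decomposition prompt is extended to return an "ambiguities" array, is to turn each flagged item into a question for the user. The "ambiguities", "term", and "interpretations" field names are hypothetical additions to the schema, not part of the prompt shown earlier, and `clarifying_questions` is an illustrative helper.

```python
def clarifying_questions(plan: dict) -> list[str]:
    """Turn flagged ambiguities into questions to put to the user before searching."""
    questions = []
    # "ambiguities" is an assumed extra field: a list of
    # {"term": str, "interpretations": [str, ...]} objects.
    for item in plan.get("ambiguities", []):
        options = item.get("interpretations", [])
        if len(options) < 2:
            continue  # nothing to choose between
        numbered = "; ".join(f"({i}) {opt}" for i, opt in enumerate(options, start=1))
        term = item.get("term", "This question")
        questions.append(f'"{term}" could mean: {numbered}. Which matters most?')
    return questions
```

If the list comes back empty, the agent can proceed straight to search; otherwise it can pause and fold the user's answer into a second decomposition call.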

Further Reading