
Web Search API: Query Research Agents Effectively

Web search APIs are the eyes of an autonomous research agent. They connect your planning logic to the live internet, returning ranked lists of potentially relevant sources. However, raw search results are often noisy, outdated, or biased toward high-traffic sites. This article teaches you how to construct effective search queries, interpret API results, and filter sources for quality before your agent fetches and reads them.

A web search API (like Google Custom Search, Bing Search, or Perplexity API) returns a list of URLs, titles, and snippets ranked by relevance. The agent's job is to (1) phrase queries that match what the API expects, (2) prune low-quality or irrelevant results, and (3) prioritize sources by authority, freshness, and stated relevance. Good search integration can reduce downstream fetching overhead by 40–60% by eliminating dead URLs and off-topic pages before they reach the reader.

How Do You Construct Effective Search Queries?

A good research agent search query is short, specific, and includes the right keywords without being overstuffed. The query should reflect what the user wants to know, not what the agent plans to do internally. For instance, instead of "latest advances in quantum error correction 2026", use a tighter query like "quantum error correction breakthrough 2026" that a search API can match against indexable text.

Here's a pattern for LLM-guided query construction:

from anthropic import Anthropic

client = Anthropic()  # Reads ANTHROPIC_API_KEY from the environment

def refine_search_query(original_question: str, sub_question: str, context: str = "") -> str:
    """
    Convert a research sub-question into a terse, search-engine-friendly query.
    """
    system_prompt = """You are a search query optimization expert.
Given a research sub-question and optional context from prior searches,
construct a SHORT, SPECIFIC search query (5-10 words) that a search engine can match.

Rules:
- Include the core concept and 1-2 modifiers (year, domain, author, event)
- Avoid stop words and articles
- Prefer exact keywords over synonyms
- Put the most specific term first
- Do NOT include 'how to', 'what is', or question words in the query itself

Return only the query string, no explanation."""

    # Build the user message from whichever pieces of context are available
    prompt = f"Sub-question: {sub_question}"
    if original_question:
        prompt = f"Overall question: {original_question}\n{prompt}"
    if context:
        prompt += f"\nPrior search context: {context}"

    response = client.messages.create(
        model="claude-opus-4-1",
        max_tokens=50,
        system=system_prompt,
        messages=[{"role": "user", "content": prompt}]
    )

    return response.content[0].text.strip()

# Example usage
sub_q = "Which manufacturing advances improved yield in AI chip production?"
context = "We learned that TSMC dominates AI chip manufacturing for GPUs and TPUs."
query = refine_search_query("", sub_q, context)
print(f"Search query: {query}")
# Example output (model responses vary): "TSMC AI chip yield improvements 2025 2026"

Ranking and Filtering Search Results

Not all search results are created equal. A research agent should score results by:

  1. Relevance Score (0–1): How well does the snippet match the query? Use BM25 or an LLM classifier.
  2. Source Authority (0–1): Is it from a known authoritative domain? Maintain a whitelist of domains (research papers, government, established publications).
  3. Freshness (0–1): How recent is the page? Prefer pages updated in the last 12 months for current topics.
  4. Read Feasibility (0–1): Can your fetcher likely read this? Avoid PDFs, videos, or paywalled sites.
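Item 1 names BM25 as one way to compute relevance. A minimal sketch of that idea follows, assuming the snippets returned for a single query are treated as the corpus and tokenized by whitespace. The function names `bm25_scores` and `normalize` and the parameter defaults `k1=1.5`, `b=0.75` are illustrative choices, not part of any search API.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace; a real system would also strip punctuation."""
    return text.lower().split()


def bm25_scores(query: str, snippets: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one BM25 score per snippet, treating the result set as the corpus."""
    docs = [tokenize(s) for s in snippets]
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0
    # Document frequency: number of snippets that contain each term
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            # Smoothed IDF that never goes negative
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def normalize(scores: list[float]) -> list[float]:
    """Scale scores into 0-1 by dividing by the maximum (all zeros stay zero)."""
    top = max(scores, default=0.0)
    return [s / top if top > 0 else 0.0 for s in scores]
```

Because raw BM25 scores are unbounded, dividing by the maximum is one simple way to map them onto the 0–1 scale used by the other factors. Note that this makes scores relative to the current result set rather than comparable across queries.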

Combine these into a composite score. Here's a practical implementation:

from datetime import datetime
from typing import TypedDict
from urllib.parse import urlparse

class SearchResult(TypedDict):
    url: str
    title: str
    snippet: str
    source: str
    publish_date: str

TRUSTED_DOMAINS = {
    "arxiv.org": 1.0,
    "github.com": 0.95,
    "openai.com": 1.0,
    "deepmind.google": 1.0,
    "nist.gov": 1.0,
    "wikipedia.org": 0.7,
    "medium.com": 0.6,
    "techcrunch.com": 0.7,
    "theverge.com": 0.6
}

def score_result(result: SearchResult, query: str, today: datetime) -> float:
    """
    Rank a search result on a 0–1 scale.
    """
    snippet = result["snippet"]

    # Authority: match the host or any parent domain (en.wikipedia.org -> wikipedia.org)
    host = (urlparse(result["url"]).hostname or "").lower()
    authority = 0.5  # Default for unknown domains
    for domain, weight in TRUSTED_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            authority = weight
            break

    # Relevance: fraction of query keywords that appear in the snippet
    query_terms = query.lower().split()
    snippet_lower = snippet.lower()
    if query_terms:
        relevance = sum(1 for term in query_terms if term in snippet_lower) / len(query_terms)
    else:
        relevance = 0.0  # Empty query: avoid dividing by zero

    # Freshness: prefer pages updated < 1 year ago
    freshness = 0.5  # Default: unknown date
    if result.get("publish_date"):
        try:
            pub_date = datetime.fromisoformat(result["publish_date"])
            days_old = (today - pub_date).days
            # Linear decay over 1 year, clamped to 0–1 (future dates count as fresh)
            freshness = min(1.0, max(0.0, 1 - (days_old / 365)))
        except (ValueError, TypeError):
            pass  # Unparseable date or naive/aware mismatch: keep the default

    # Composite score (weighted)
    composite = (authority * 0.4) + (relevance * 0.35) + (freshness * 0.25)
    return composite

# Example
result: SearchResult = {
    "url": "https://arxiv.org/abs/2406.14283",
    "title": "Query Planning for Autonomous Research",
    "snippet": "We present a query planning system for research agents...",
    "source": "arxiv.org",
    "publish_date": "2026-04-15"
}
score = score_result(result, "query planning research agent", datetime(2026, 6, 2))
print(f"Result score: {score:.3f}")  # High score expected
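The implementation above covers authority, relevance, and freshness but not the fourth factor from the list, read feasibility. One way to estimate it from the URL alone is sketched below. The extension list, the video hosts, the returned values, and the 0.2 blending weight are all assumptions for illustration; tune them to what your fetcher can handle.

```python
from urllib.parse import urlparse

# Hypothetical lists for illustration; adjust to your fetcher's capabilities
HARD_TO_READ_EXTENSIONS = (".pdf", ".mp4", ".mov", ".zip", ".pptx")
VIDEO_HOSTS = {"youtube.com", "vimeo.com"}


def read_feasibility(url: str) -> float:
    """Estimate (0-1) how likely a plain HTML fetcher is to extract text from this URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").removeprefix("www.")
    path = parsed.path.lower()

    if host in VIDEO_HOSTS:
        return 0.0  # Video pages rarely contain the content as text
    if path.endswith(HARD_TO_READ_EXTENSIONS):
        return 0.2  # Binary formats need a dedicated parser
    return 1.0


def composite_with_feasibility(base_score: float, url: str, weight: float = 0.2) -> float:
    """Blend an existing 0-1 score with read feasibility using the given weight."""
    return (1 - weight) * base_score + weight * read_feasibility(url)
```

Passing the output of `score_result` as `base_score` keeps the original three-factor weighting intact and only rescales it to make room for the fourth factor.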

Handling Rate Limits and API Errors Gracefully

Most web search APIs enforce rate limits (e.g., 100 requests/day for free tiers). An intelligent research agent should:

  1. Cache results by query string to avoid duplicate searches.
  2. Back off exponentially if the API returns 429 (rate limit) or a 5xx server error.
  3. Fail gracefully by returning fewer results rather than crashing.
  4. Rotate APIs if available (e.g., use Bing if Google quota is exhausted).

Here's a robust wrapper:

import time
import hashlib

class SearchClient:
    def __init__(self, api_key: str, max_retries: int = 3):
        self.api_key = api_key
        self.max_retries = max_retries
        self.cache = {}  # In production, use Redis or SQLite

    def search(self, query: str, num_results: int = 10) -> list[SearchResult]:
        """
        Execute a search with caching and retry logic.
        """
        cache_key = hashlib.md5(query.encode()).hexdigest()

        # Check cache first
        if cache_key in self.cache:
            return self.cache[cache_key][:num_results]

        # Retry logic with exponential backoff
        for attempt in range(self.max_retries):
            try:
                results = self._call_api(query, num_results)
                self.cache[cache_key] = results
                return results
            except NotImplementedError:
                raise  # A missing implementation is not transient; do not retry
            except Exception:
                # In production, catch only your client's rate-limit (429)
                # and server-error (5xx) exceptions here
                if attempt < self.max_retries - 1:
                    wait_time = 2 ** attempt  # 1s, then 2s with the default max_retries=3
                    time.sleep(wait_time)

        # All attempts failed and the cache missed above: fail gracefully
        return []

    def _call_api(self, query: str, num_results: int) -> list[SearchResult]:
        """
        Call the actual search API (implement with your provider).
        """
        # Placeholder: replace with real API call
        raise NotImplementedError("Implement with your search API")
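The wrapper above handles caching and retries but not item 4, rotating to another API when one is exhausted. A minimal sketch of that fallback follows. It assumes each provider is wrapped in a callable taking `(query, num_results)` and that the wrapper raises a hypothetical `QuotaExhausted` exception when the quota runs out; neither is part of any real provider's SDK.

```python
from typing import Callable

# A provider is any callable taking (query, num_results) and returning result dicts
Provider = Callable[[str, int], list[dict]]


class QuotaExhausted(Exception):
    """Hypothetical error a provider wrapper raises when its quota is used up."""


def search_with_fallback(providers: list[tuple[str, Provider]], query: str,
                         num_results: int = 10) -> tuple[str, list[dict]]:
    """Try each provider in order; return (provider_name, results) from the first that works."""
    for name, provider in providers:
        try:
            return name, provider(query, num_results)
        except QuotaExhausted:
            continue  # Move on to the next provider
    return "", []  # Every provider failed: degrade to an empty result list


# Stub providers standing in for real API wrappers
def primary(query: str, num_results: int) -> list[dict]:
    raise QuotaExhausted("daily quota reached")


def secondary(query: str, num_results: int) -> list[dict]:
    return [{"url": "https://example.org/a", "title": "A", "snippet": query}][:num_results]


name, results = search_with_fallback(
    [("primary", primary), ("secondary", secondary)], "test query"
)
print(name, len(results))  # prints: secondary 1
```

Returning the provider name alongside the results lets the agent log which source answered, which helps when result quality differs between providers.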

Key Takeaways

  • Construct search queries by extracting the core concept and 1–2 key modifiers; keep queries to 5–10 words for best API matches.
  • Score search results on authority, relevance, and freshness to prioritize high-quality sources and reduce downstream fetching overhead.
  • Cache search results and implement exponential backoff for API errors to avoid rate limits and handle transient failures gracefully.
  • Maintain a whitelist of trusted domains (research archives, government, established publications) to bias scoring toward authoritative sources.

Frequently Asked Questions

Should the agent perform multiple searches per sub-question?

Yes, often. For a sub-question like "recent breakthroughs in quantum computing," the agent should run 2–3 queries with slight variations (e.g., "quantum computing breakthrough 2026", "quantum computing advances 2026", "quantum computing research 2026") to catch different indexing patterns.
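The variation-and-merge step can be sketched as follows. This assumes a `search_fn` callable that takes a query and returns result dicts with a `url` key; the helper names `make_variations` and `merged_search` are hypothetical. Duplicates are detected by URL, ignoring a trailing slash.

```python
from typing import Callable


def make_variations(core: str, modifiers: list[str], year: str = "") -> list[str]:
    """Build one query per modifier, e.g. 'quantum computing breakthrough 2026'."""
    return [" ".join(part for part in (core, m, year) if part) for m in modifiers]


def merged_search(queries: list[str], search_fn: Callable[[str], list[dict]]) -> list[dict]:
    """Run every query and merge results, keeping the first occurrence of each URL."""
    seen: set[str] = set()
    merged: list[dict] = []
    for q in queries:
        for result in search_fn(q):
            key = result["url"].rstrip("/")
            if key not in seen:
                seen.add(key)
                merged.append(result)
    return merged


queries = make_variations("quantum computing", ["breakthrough", "advances", "research"], "2026")
print(queries[0])  # prints: quantum computing breakthrough 2026
```

Keeping the first occurrence preserves the rank from the earliest query that surfaced a URL, so put the most specific variation first.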

How long should I cache search results?

For fast-moving topics (AI, security), cache for 7 days maximum. For evergreen topics (fundamental concepts), cache for 30 days. Always allow manual refresh.
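A small in-memory sketch of these expiry rules is shown below. The `TTLCache` class and the topic labels `fast_moving` and `evergreen` are hypothetical names; how the agent decides which label applies to a query is left open. The clock is injectable so expiry can be tested without waiting.

```python
import time

SECONDS_PER_DAY = 86_400
# TTLs from the guidance above; the topic labels are hypothetical
TTL_DAYS = {"fast_moving": 7, "evergreen": 30}


class TTLCache:
    """In-memory cache whose entries expire after a per-entry time to live."""

    def __init__(self, clock=time.time):
        self._clock = clock  # Injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, list]] = {}

    def set(self, query: str, results: list, topic_type: str = "fast_moving") -> None:
        """Store results with an expiry based on the topic type."""
        expires_at = self._clock() + TTL_DAYS[topic_type] * SECONDS_PER_DAY
        self._entries[query] = (expires_at, results)

    def get(self, query: str, force_refresh: bool = False):
        """Return cached results, or None if missing, expired, or a refresh is forced."""
        entry = self._entries.get(query)
        if entry is None or force_refresh or self._clock() >= entry[0]:
            self._entries.pop(query, None)
            return None
        return entry[1]
```

The `force_refresh` flag covers the manual-refresh case: it drops the entry so the next search repopulates it.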

What if the search API returns fewer results than requested?

This is normal. Gracefully accept partial results (as few as 3–5) rather than retrying endlessly. The agent should adapt its confidence thresholds accordingly.

How do I filter paywalled articles?

Check for keywords like "paywall", "subscription", or "access restricted" in the snippet or page metadata. Some search APIs expose access or licensing filters, but support varies, so check your provider's documentation before relying on one. Alternatively, have your fetcher try to access the page and skip it if the response is 403 or the page prompts for login.
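Both checks can be sketched in a few lines. The keyword list comes from the answer above; treating 401 like 403 and detecting a login prompt by looking for `/login` in the final URL are assumptions made for illustration, and real sites will need more robust detection.

```python
# Keywords from the guidance above
PAYWALL_HINTS = ("paywall", "subscription", "access restricted")
# 403 is from the guidance above; 401 is an added assumption
BLOCKED_STATUS_CODES = {401, 403}


def looks_paywalled(snippet: str, metadata: str = "") -> bool:
    """Pre-fetch check: flag results whose snippet or metadata mentions a paywall hint."""
    text = f"{snippet} {metadata}".lower()
    return any(hint in text for hint in PAYWALL_HINTS)


def should_skip_after_fetch(status_code: int, final_url: str) -> bool:
    """Post-fetch check: skip pages that refuse access or redirect to a login page."""
    return status_code in BLOCKED_STATUS_CODES or "/login" in final_url.lower()
```

Keyword matching will produce false positives (an article about subscription pricing, for example), so it may be safer to lower a result's score than to drop it outright.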

Further Reading