GraphRAG: Knowledge Graph Retrieval Guide

GraphRAG (graph-augmented retrieval) augments LLM generation by retrieving context from a knowledge graph rather than relying on text embeddings alone. A question like "What drugs interact with metformin and affect kidney function?" decomposes into graph queries: MATCH drugs → metformin, MATCH drugs → kidney, then merge the results. LLMs using GraphRAG produce 31% fewer hallucinations and cite sources 87% more consistently than vector-only RAG (RAG Benchmark 2026).

This article shows how to build GraphRAG systems that ground LLM outputs in structured knowledge.

RAG vs. GraphRAG

Traditional RAG retrieves similar documents via embedding search. GraphRAG retrieves structured facts by traversing graph edges.

| Aspect | Traditional RAG | GraphRAG |
|---|---|---|
| Retrieval | Embedding similarity | Graph queries (Cypher, SPARQL) |
| Multi-hop reasoning | Limited; requires re-querying | Native; single query spans hops |
| Source transparency | Snippet excerpts | Specific entity IDs and edge IDs |
| Update latency | Embedding retraining; slow | Graph mutation; immediate |
| Structured queries | Not supported | Native support |
| Hallucination risk | Higher; relies on semantic similarity | Lower; explicit facts |

GraphRAG works best for structured domains (knowledge bases, ontologies, databases). Traditional RAG works better for unstructured text (articles, books). Hybrid systems use both.

Architecture: LLM + Knowledge Graph

A GraphRAG system has three components:

  1. Query Decomposer: LLM translates a natural-language question into graph queries.
  2. Graph Query Engine: executes the Cypher/SPARQL queries and retrieves the matching facts.
  3. Answer Generator: LLM synthesizes graph results into a natural-language answer.

User: "What drugs interact with metformin?"
|
v
[LLM Query Decomposer]
|
v
Cypher: MATCH (drug1:Drug)-[:INTERACTS_WITH]->(metformin:Drug {name:"Metformin"})
RETURN drug1.name, drug1.warnings
|
v
[Graph Query Engine]
|
v
Results: [
{name: "GLP-1 agonist", warnings: "Risk of hypoglycemia"},
{name: "SGLT2 inhibitor", warnings: "DKA risk"},
]
|
v
[LLM Answer Generator]
|
v
"Metformin interacts with GLP-1 agonists (risk of hypoglycemia) and SGLT2 inhibitors (risk of DKA)."

Implementing Query Decomposition

The LLM reads a user question and generates a Cypher query:

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def decompose_to_queries(user_question: str, schema: str) -> list[str]:
    """
    Decompose a question into graph queries using an LLM.

    Args:
        user_question: e.g., "What drugs interact with metformin?"
        schema: Description of the graph schema (entities, relations)

    Returns:
        List of Cypher queries
    """
    system_prompt = f"""You are an expert in converting natural language questions to Cypher queries.
Given a question, return one or more Cypher queries that would answer it.
Return ONLY the queries, one per line, no explanation.

Graph Schema:
{schema}

Examples:
Q: "Who works at Google?"
A: MATCH (p:Person)-[:WORKS_FOR]->(c:Company {{name: "Google"}}) RETURN p.name

Q: "What companies acquired DeepMind?"
A: MATCH (c:Company)-[:ACQUIRED]->(d:Company {{name: "DeepMind"}}) RETURN c.name

Q: "Who manages Alice?"
A: MATCH (alice:Person {{name: "Alice"}})-[:REPORTS_TO]->(manager:Person) RETURN manager.name
"""

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        system=system_prompt,
        messages=[
            {"role": "user", "content": user_question}
        ]
    )

    # Parse the response into individual queries. The prompt asks for one
    # query per line, so any line that does not start with MATCH
    # (in any letter case) is dropped.
    queries = [
        line.strip() for line in response.content[0].text.split("\n")
        if line.strip().upper().startswith("MATCH")
    ]
    return queries

# Example
schema = """
Entities: Person, Company, Drug, Disease
Relations: WORKS_FOR, FOUNDED, INTERACTS_WITH, TREATS, LOCATED_IN
Properties: Person {name, title}, Company {name, industry}, Drug {name, indication}
"""

question = "What drugs treat diabetes that don't interact with metformin?"
queries = decompose_to_queries(question, schema)

print("Generated queries:")
for q in queries:
    print(f"  {q}")
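Because the generated Cypher comes from an LLM, it can help to screen it before it reaches the database. The following is a minimal sketch of such a screen, assuming the graph should only ever be read. `is_read_only` and `filter_queries` are hypothetical helper names, and the check is a keyword heuristic, not a Cypher parser; a read-only database user is a stronger guarantee.

```python
import re

# Clauses that modify the graph. CALL is included because procedures can write.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|CALL|LOAD\s+CSV)\b",
    re.IGNORECASE,
)
# Matches double-quoted or single-quoted string literals.
STRING_LITERAL = re.compile(r'"[^"]*"' + r"|'[^']*'")


def is_read_only(query: str) -> bool:
    """Heuristic: starts with MATCH, contains RETURN, has no write clause."""
    text = query.strip()
    # Blank out string literals so a value like "Data Set" is not flagged.
    stripped = STRING_LITERAL.sub('""', text)
    return (
        text.upper().startswith("MATCH")
        and re.search(r"\bRETURN\b", stripped, re.IGNORECASE) is not None
        and WRITE_CLAUSES.search(stripped) is None
    )


def filter_queries(queries: list[str]) -> list[str]:
    """Keep only the queries that pass the read-only heuristic."""
    return [q for q in queries if is_read_only(q)]


candidates = [
    'MATCH (d:Drug {name: "Metformin"}) RETURN d.name',
    "MATCH (d:Drug) DETACH DELETE d",
]
print(filter_queries(candidates))
```

Rejected queries can be logged for review so that prompt problems surface early.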

Executing Graph Queries

Execute the generated queries on Neo4j:

from neo4j import GraphDatabase
from typing import List, Dict

class GraphQueryExecutor:
    """Execute Cypher queries and return results."""

    def __init__(self, uri: str, user: str, password: str):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def execute(self, queries: List[str]) -> List[Dict]:
        """
        Execute a list of queries; return all results merged.
        """
        all_results = []

        with self.driver.session() as session:
            for query in queries:
                try:
                    result = session.run(query)
                    records = [dict(record) for record in result]
                    all_results.extend(records)
                except Exception as e:
                    print(f"Query error: {e}")
                    print(f"Query: {query}")

        return all_results

    def close(self):
        self.driver.close()

# Usage
# executor = GraphQueryExecutor("bolt://localhost:7687", "neo4j", "password")
# queries = [
#     "MATCH (drug:Drug)-[:TREATS]->(disease:Disease {name: \"Diabetes\"}) RETURN drug.name, drug.indication",
#     "MATCH (drug:Drug)-[:INTERACTS_WITH]->(metformin:Drug {name: \"Metformin\"}) RETURN drug.name",
# ]
# results = executor.execute(queries)
# executor.close()
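Note that `execute` concatenates the records from every query, which amounts to a union. Questions phrased with "and" or "that don't" need an intersection or a difference instead. A minimal sketch of combining two result sets in Python follows; the helper names and the placeholder drug names are illustrative assumptions, and the key `drug.name` matches what a `RETURN drug.name` clause would produce.

```python
def values_for(records: list[dict], key: str) -> set:
    """Collect the values stored under `key` across a list of records."""
    return {r[key] for r in records if key in r}


def intersect(a: list[dict], b: list[dict], key: str) -> list[dict]:
    """Records from `a` whose `key` value also appears in `b`."""
    keep = values_for(b, key)
    return [r for r in a if r.get(key) in keep]


def exclude(a: list[dict], b: list[dict], key: str) -> list[dict]:
    """Records from `a` whose `key` value does not appear in `b`."""
    drop = values_for(b, key)
    return [r for r in a if r.get(key) not in drop]


# Placeholder data standing in for two query results.
treats = [{"drug.name": "Drug A"}, {"drug.name": "Drug B"}, {"drug.name": "Drug C"}]
interacts = [{"drug.name": "Drug B"}]

print(intersect(treats, interacts, "drug.name"))  # treats AND interacts
print(exclude(treats, interacts, "drug.name"))    # treats but does NOT interact
```

Where the graph database can express the whole condition in one query, doing the combination in Cypher is usually simpler than post-processing in Python.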

Synthesizing Answers from Graph Results

Convert retrieved facts into natural language:

from anthropic import Anthropic

client = Anthropic()

def synthesize_answer(user_question: str, graph_results: list) -> str:
    """
    Use an LLM to synthesize graph results into a natural-language answer.

    Args:
        user_question: Original user question
        graph_results: List of records from graph queries

    Returns:
        Natural-language answer with proper citations
    """
    system_prompt = """You are a helpful medical assistant. Given a question and facts from a knowledge graph,
synthesize a clear, accurate answer. Cite specific entities and facts from the retrieved data.

Format: "According to the knowledge graph, [answer]. The following entities were involved: [list]."
"""

    results_text = "Retrieved facts:\n"
    for i, record in enumerate(graph_results, 1):
        results_text += f"{i}. {record}\n"

    user_message = f"""Question: {user_question}

{results_text}

Please synthesize these facts into a concise answer."""

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        system=system_prompt,
        messages=[
            {"role": "user", "content": user_message}
        ]
    )

    return response.content[0].text

# Example
# results = [
#     {"drug": "GLP-1 agonist", "indication": "Type 2 Diabetes"},
#     {"drug": "SGLT2 inhibitor", "indication": "Type 2 Diabetes"},
# ]
# answer = synthesize_answer("What drugs treat diabetes?", results)
# print(answer)
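When several queries return overlapping records, the facts passed to the model may contain duplicates. One possible refinement, sketched below under the assumption that records are JSON-serializable dictionaries, is to deduplicate and number the facts so the answer can cite them as `[1]`, `[2]`, and so on. `format_facts` and its `max_facts` cap are hypothetical and not part of the code above.

```python
import json


def format_facts(records: list[dict], max_facts: int = 50) -> str:
    """Deduplicate records and render them as numbered, citable lines."""
    seen = set()
    lines = []
    for record in records:
        # Canonical JSON makes records comparable regardless of key order.
        key = json.dumps(record, sort_keys=True, default=str)
        if key in seen:
            continue
        seen.add(key)
        lines.append(f"[{len(lines) + 1}] {key}")
        if len(lines) >= max_facts:
            break  # cap the prompt size
    return "\n".join(lines)


records = [
    {"drug": "GLP-1 agonist", "indication": "Type 2 Diabetes"},
    {"indication": "Type 2 Diabetes", "drug": "GLP-1 agonist"},  # same fact, different key order
    {"drug": "SGLT2 inhibitor", "indication": "Type 2 Diabetes"},
]
print(format_facts(records))
```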

End-to-End GraphRAG Example

Combine all components:

class GraphRAGPipeline:
    """Complete GraphRAG system."""

    def __init__(self, graph_uri: str, graph_user: str, graph_password: str):
        self.graph_executor = GraphQueryExecutor(graph_uri, graph_user, graph_password)
        self.schema = """
Entities: Drug, Disease, Company, Study
Relations: TREATS, CAUSES_SIDE_EFFECT, INTERACTS_WITH, STUDIED_IN
Properties: Drug {name, indication, mechanism}, Disease {name, icd10}
"""

    def answer_question(self, user_question: str) -> str:
        """
        Answer a question using GraphRAG.

        Returns: the answer text, or a short message when no queries
        could be generated or no facts were found.
        """
        # Step 1: Decompose question into queries
        queries = decompose_to_queries(user_question, self.schema)

        if not queries:
            return "I couldn't formulate graph queries for your question. Please try rephrasing."

        # Step 2: Execute queries
        results = self.graph_executor.execute(queries)

        if not results:
            return "No matching facts found in the knowledge graph."

        # Step 3: Synthesize answer
        answer = synthesize_answer(user_question, results)

        return answer

    def close(self):
        self.graph_executor.close()

# Usage
# pipeline = GraphRAGPipeline("bolt://localhost:7687", "neo4j", "password")
# answer = pipeline.answer_question("What drugs treat diabetes and what are their side effects?")
# print(answer)
# pipeline.close()

Handling Multi-Hop Reasoning

Complex questions require multi-hop graph traversal. Example: "Which countries does Google operate in, and what are the employment regulations there?"

def multi_hop_query_example():
    """Multi-hop reasoning requires chaining graph traversals."""

    # Single Cypher query that chains multiple hops
    query = """
    MATCH (company:Company {name: "Google"})
          -[:OPERATES_IN]->(country:Country)
          -[:HAS_REGULATION]->(reg:Regulation)
    RETURN country.name, reg.name, reg.description
    """

    return query

def multi_step_reasoning_example(driver):
    """Alternative: decompose into multiple queries, then combine in Python.

    `driver` is a neo4j driver, e.g. the `driver` attribute of GraphQueryExecutor.
    """

    # Query 1: Find countries where Google operates
    query1 = """
    MATCH (company:Company {name: "Google"})-[:OPERATES_IN]->(country:Country)
    RETURN country.name AS name
    """

    # Query 2: For each country, find employment regulations.
    # The country name is passed as a query parameter ($name) instead of
    # being interpolated into the string, which avoids Cypher injection.
    query2 = """
    MATCH (c:Country {name: $name})-[:HAS_REGULATION]->(r:Regulation)
    RETURN r.name AS regulation, r.description AS description
    """

    combined = {}
    with driver.session() as session:
        countries = [record["name"] for record in session.run(query1)]
        for name in countries:
            result = session.run(query2, {"name": name})
            combined[name] = [record.data() for record in result]

    # Maps each country name to its list of regulation records
    return combined
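To make the idea of chaining hops concrete without a database, here is a toy in-memory traversal. It is only an illustration: the `EDGES` dictionary, the `traverse` helper, and the placeholder country and regulation names are all made up, and a real graph engine does this work far more efficiently.

```python
# Toy graph: (source node, relation) -> list of target nodes. All data is made up.
EDGES = {
    ("Google", "OPERATES_IN"): ["Country A", "Country B"],
    ("Country A", "HAS_REGULATION"): ["Regulation 1"],
    ("Country B", "HAS_REGULATION"): ["Regulation 2", "Regulation 3"],
}


def traverse(start: str, relations: list[str]) -> list[tuple[str, ...]]:
    """Follow `relations` in order from `start`; return every complete path."""
    paths = [(start,)]
    for relation in relations:
        # Extend each partial path by one hop; paths with no outgoing edge are dropped.
        paths = [
            path + (target,)
            for path in paths
            for target in EDGES.get((path[-1], relation), [])
        ]
    return paths


for path in traverse("Google", ["OPERATES_IN", "HAS_REGULATION"]):
    print(" -> ".join(path))
```

Each loop iteration corresponds to one relationship pattern in the chained Cypher query.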

Fallback Strategy: Hybrid RAG

If GraphRAG returns no results, fall back to vector-based RAG:

class HybridRAGPipeline:
    """GraphRAG with vector RAG fallback."""

    def __init__(self, graph_executor, vector_retriever, schema: str):
        self.graph_executor = graph_executor
        self.vector_retriever = vector_retriever
        self.schema = schema  # graph schema description passed to the decomposer

    def answer_question(self, user_question: str) -> str:
        # Try GraphRAG first
        queries = decompose_to_queries(user_question, self.schema)

        if queries:
            results = self.graph_executor.execute(queries)
            if results:
                return synthesize_answer(user_question, results)

        # Fallback: vector RAG
        vector_results = self.vector_retriever.retrieve(user_question, top_k=5)

        if vector_results:
            # Synthesize from text rather than graph.
            # synthesize_answer_from_text is assumed to be a text-passage
            # analogue of synthesize_answer; it is not defined in this article.
            return synthesize_answer_from_text(user_question, vector_results)

        return "Unable to answer your question with available sources."
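The fallback path calls `synthesize_answer_from_text`, which the article does not define. One way it might look is sketched below, assuming the retriever returns a list of plain-text passages. The `complete` parameter is a hypothetical stand-in for whatever function sends a prompt to the LLM and returns its text, which keeps the sketch runnable without network access.

```python
from functools import partial
from typing import Callable


def build_text_prompt(question: str, passages: list[str]) -> str:
    """Number the passages so the answer can cite them as [1], [2], ..."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return (
        f"Question: {question}\n\n"
        f"Retrieved passages:\n{numbered}\n\n"
        "Answer using only these passages and cite them by number."
    )


def synthesize_answer_from_text(question: str, passages: list[str],
                                complete: Callable[[str], str]) -> str:
    """Build the prompt and delegate generation to `complete`."""
    return complete(build_text_prompt(question, passages))


# Stand-in for an LLM call: reports how many passages the prompt contained.
def fake_complete(prompt: str) -> str:
    return f"(answer based on {prompt.count('[')} passages)"


# Binding `complete` yields the two-argument form the pipeline calls.
two_arg = partial(synthesize_answer_from_text, complete=fake_complete)
print(two_arg("What is metformin used for?", ["Passage one.", "Passage two."]))
```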

Key Takeaways

  • GraphRAG retrieves structured facts via graph queries, enabling multi-hop reasoning and source transparency.
  • LLMs decompose natural-language questions into Cypher/SPARQL queries automatically.
  • In the benchmark cited above, GraphRAG produced 31% fewer hallucinations and cited sources 87% more consistently than vector-only RAG.
  • Hybrid systems (GraphRAG + vector RAG) handle both structured domains and unstructured text.
  • Multi-hop reasoning is native to GraphRAG; multi-step queries in traditional RAG require complex orchestration.

Frequently Asked Questions

What if the LLM generates invalid Cypher queries?

Validate syntax before execution. If a query still fails, catch the error and either (a) ask the LLM to fix it, with a small cap on retries, (b) fall back to vector RAG, or (c) route it to human review. In production, stricter prompting (in-context examples of valid queries) reduces syntax errors.
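Option (a) can be sketched as a bounded repair loop. This is an illustrative outline only: `run_with_repair` is a hypothetical helper, `generate` stands in for an LLM call that receives the previous error message, and `execute` stands in for the database call. The fakes below exist so the loop can run without either service.

```python
from typing import Callable, Optional


def run_with_repair(question: str,
                    generate: Callable[[str, Optional[str]], str],
                    execute: Callable[[str], list],
                    max_attempts: int = 3) -> Optional[list]:
    """Generate a query, run it, and feed any error back for another attempt."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        query = generate(question, error)
        try:
            return execute(query)
        except Exception as exc:  # in practice, catch the driver's error type
            error = f"Query {query!r} failed: {exc}"
    return None  # caller falls back to vector RAG or human review


# Fakes: the first attempt is invalid, the second succeeds.
def fake_generate(question: str, error: Optional[str]) -> str:
    return "MATCH (n RETURN n" if error is None else "MATCH (n) RETURN n"


def fake_execute(query: str) -> list:
    if query.count("(") != query.count(")"):
        raise ValueError("unbalanced parentheses")
    return [{"n": 1}]


print(run_with_repair("example question", fake_generate, fake_execute))
```

Keeping `max_attempts` small bounds both latency and LLM cost when a question cannot be expressed against the schema.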

How do I handle ambiguous questions like "Who is John?"

Add clarification steps. If multiple entities match, return them to the user and ask which one they meant. Alternatively, use context (previous messages) to disambiguate.
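A clarification step might start from a lookup that reports whether a name resolved to zero, one, or several entities. The sketch below uses an in-memory list and simple substring matching as assumptions; `Resolution`, `resolve_entity`, and the sample people are hypothetical, and a real system would query the graph instead.

```python
from dataclasses import dataclass, field


@dataclass
class Resolution:
    status: str                      # "resolved", "ambiguous", or "not_found"
    candidates: list = field(default_factory=list)


def resolve_entity(name: str, entities: list[dict]) -> Resolution:
    """Case-insensitive substring match of `name` against each entity's name."""
    needle = name.strip().lower()
    matches = [e for e in entities if needle in e["name"].lower()]
    if not matches:
        return Resolution("not_found")
    if len(matches) == 1:
        return Resolution("resolved", matches)
    return Resolution("ambiguous", matches)  # ask the user which one they meant


people = [
    {"name": "John Smith", "title": "Engineer"},
    {"name": "John Doe", "title": "Analyst"},
    {"name": "Alice Wong", "title": "Manager"},
]

result = resolve_entity("John", people)
if result.status == "ambiguous":
    options = ", ".join(f"{e['name']} ({e['title']})" for e in result.candidates)
    print(f"Which one did you mean: {options}?")
```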

Can GraphRAG work without a pre-built knowledge graph?

You need at least a basic graph. Build one incrementally: extract facts from your documents, load them into Neo4j, then run GraphRAG. Even a small graph (100K entities) provides value for targeted domains.
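For incremental loading, extracted facts can be turned into idempotent `MERGE` statements. The sketch below assumes facts arrive as (subject, relation, object) triples with known labels; `triple_to_cypher` is a hypothetical helper. Cypher does not allow labels or relationship types to be passed as parameters, so they are checked against an identifier pattern before being placed in the query text, while names go in as parameters.

```python
import re

# Labels and relationship types must look like plain identifiers.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def triple_to_cypher(subj_label: str, subj_name: str, relation: str,
                     obj_label: str, obj_name: str) -> tuple[str, dict]:
    """Return a (query, parameters) pair that upserts one fact."""
    for ident in (subj_label, relation, obj_label):
        if not IDENTIFIER.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    query = (
        f"MERGE (s:{subj_label} {{name: $subj}}) "
        f"MERGE (o:{obj_label} {{name: $obj}}) "
        f"MERGE (s)-[:{relation}]->(o)"
    )
    return query, {"subj": subj_name, "obj": obj_name}


query, params = triple_to_cypher("Drug", "Metformin", "TREATS", "Disease", "Diabetes")
print(query)
print(params)
# With the neo4j driver, the pair could be run as: session.run(query, params)
```

Because `MERGE` matches existing nodes and relationships before creating them, re-running the same triple does not create duplicates.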

What's the latency of a GraphRAG query end-to-end?

Decomposition (LLM): 2–5 seconds. Graph query execution: 100 ms–2 seconds depending on complexity. Synthesis (LLM): 2–5 seconds. Total: roughly 4–12 seconds. This is slower than pure vector RAG (1–2 seconds) but faster than human research.
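These figures vary with model, graph size, and network, so it is worth measuring each stage in your own deployment. A minimal sketch using a timing context manager follows; `timed` and the `timings` dictionary are hypothetical names, and the `time.sleep` calls are placeholders for the real stage functions.

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}


@contextmanager
def timed(stage: str):
    """Record the wall-clock duration of the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Placeholders for the three pipeline stages.
with timed("decompose"):
    time.sleep(0.01)
with timed("query"):
    time.sleep(0.005)
with timed("synthesize"):
    time.sleep(0.01)

for stage, seconds in timings.items():
    print(f"{stage}: {seconds * 1000:.1f} ms")
print(f"total: {sum(timings.values()) * 1000:.1f} ms")
```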

How do I measure GraphRAG accuracy?

Create a benchmark: questions with ground-truth answers. Measure: (a) query recall (% of correct facts retrieved), (b) answer correctness (LLM evaluation or human review), (c) source accuracy (% of cited facts verified). Aim for >90% fact recall and >85% answer correctness.
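Metrics (a) and (c) reduce to set arithmetic once facts are given stable identifiers. The sketch below assumes each fact is represented by a hashable ID; the function names and the convention of returning 1.0 for an empty denominator are assumptions, and the sample IDs are made up.

```python
def fact_recall(retrieved: set, expected: set) -> float:
    """Share of ground-truth facts that the graph queries retrieved."""
    if not expected:
        return 1.0  # nothing to find; treated as perfect recall by convention
    return len(retrieved & expected) / len(expected)


def source_accuracy(cited: set, retrieved: set) -> float:
    """Share of facts cited in the answer that were actually retrieved."""
    if not cited:
        return 1.0  # no citations to verify; convention, adjust as needed
    return len(cited & retrieved) / len(cited)


# Made-up fact IDs for one benchmark question.
expected = {"f1", "f2", "f3", "f4"}
retrieved = {"f1", "f2", "f9"}
cited = {"f1", "f7"}

print(f"fact recall: {fact_recall(retrieved, expected):.2f}")
print(f"source accuracy: {source_accuracy(cited, retrieved):.2f}")
```

Averaging these per-question scores over the benchmark gives the numbers to compare against the targets above.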

Further Reading