GraphRAG: Knowledge Graph Retrieval Guide
GraphRAG (graph-based retrieval-augmented generation) augments LLM generation by retrieving context from a knowledge graph rather than from text embeddings alone. A question like "What drugs interact with metformin and affect kidney function?" decomposes into graph queries (MATCH drugs → metformin, MATCH drugs → kidney) whose results are then merged. In the RAG Benchmark 2026, LLMs using GraphRAG produced 31% fewer hallucinations and cited sources 87% more consistently than vector-only RAG.
This article shows how to build GraphRAG systems that ground LLM outputs in structured knowledge.
RAG vs. GraphRAG
Traditional RAG retrieves similar documents via embedding search. GraphRAG retrieves structured facts by traversing graph edges.
| Aspect | Traditional RAG | GraphRAG |
|---|---|---|
| Retrieval | Embedding similarity | Graph queries (Cypher, SPARQL) |
| Multi-hop reasoning | Limited; requires re-querying | Native; single query spans hops |
| Source transparency | Snippet excerpts | Specific entity IDs and edge IDs |
| Update latency | Re-embedding and re-indexing; slower | Graph mutation; immediate |
| Structured queries | Not supported | Native support |
| Hallucination risk | Higher; relies on semantic similarity | Lower; explicit facts |
GraphRAG works best for structured domains (knowledge bases, ontologies, databases). Traditional RAG works better for unstructured text (articles, books). Hybrid systems use both.
Architecture: LLM + Knowledge Graph
A GraphRAG system has three components:
- Query Decomposer: LLM translates a natural-language question into graph queries.
- Graph Query Engine: Executes the Cypher/SPARQL queries and retrieves facts.
- Answer Generator: LLM synthesizes graph results into a natural-language answer.
User: "What drugs interact with metformin?"
|
v
[LLM Query Decomposer]
|
v
Cypher: MATCH (drug1:Drug)-[:INTERACTS_WITH]->(metformin:Drug {name:"Metformin"})
RETURN drug1.name, drug1.warnings
|
v
[Graph Query Engine]
|
v
Results: [
{name: "GLP-1 agonist", warnings: "Risk of hypoglycemia"},
{name: "SGLT2 inhibitor", warnings: "DKA risk"},
]
|
v
[LLM Answer Generator]
|
v
"Metformin interacts with GLP-1 agonists (risk of hypoglycemia) and SGLT2 inhibitors (risk of DKA)."
Implementing Query Decomposition
The LLM reads a user question and generates one or more Cypher queries:
from anthropic import Anthropic

client = Anthropic()

def decompose_to_queries(user_question: str, schema: str) -> list:
    """
    Decompose a question into graph queries using an LLM.

    Args:
        user_question: e.g., "What drugs interact with metformin?"
        schema: Description of the graph schema (entities, relations)

    Returns:
        List of Cypher queries
    """
    system_prompt = f"""You are an expert in converting natural language questions to Cypher queries.
Given a question, return one or more Cypher queries that would answer it.
Return ONLY the queries, one per line, no explanation.
Graph Schema:
{schema}
Examples:
Q: "Who works at Google?"
A: MATCH (p:Person)-[:WORKS_FOR]->(c:Company {{name: "Google"}}) RETURN p.name
Q: "What companies acquired DeepMind?"
A: MATCH (c:Company)-[:ACQUIRED]->(d:Company {{name: "DeepMind"}}) RETURN c.name
Q: "Who manages Alice?"
A: MATCH (alice:Person {{name: "Alice"}})-[:REPORTS_TO]->(manager:Person) RETURN manager.name
"""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        system=system_prompt,
        messages=[
            {"role": "user", "content": user_question}
        ]
    )
    # Parse response into individual queries
    queries = [
        line.strip() for line in response.content[0].text.split("\n")
        if line.strip() and line.strip().startswith(("MATCH", "match"))
    ]
    return queries

# Example
schema = """
Entities: Person, Company, Drug, Disease
Relations: WORKS_FOR, FOUNDED, INTERACTS_WITH, TREATS, LOCATED_IN
Properties: Person {name, title}, Company {name, industry}, Drug {name, indication}
"""
question = "What drugs treat diabetes that don't interact with metformin?"
queries = decompose_to_queries(question, schema)
print("Generated queries:")
for q in queries:
    print(f"  {q}")
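Because the queries come from an LLM, it may be worth checking that they only read from the graph before they are executed. The following is a minimal sketch of such a guard, not part of the pipeline above: the names `is_read_only` and `filter_safe_queries` and the list of write clauses are assumptions chosen for illustration, and the keyword check is deliberately conservative (it can reject a harmless query that happens to use one of these words as an identifier).

```python
import re

# Quoted string literals, removed first so a value like "Set Point" is not flagged.
STRING_LITERAL = re.compile(r"'[^']*'" + "|" + r'"[^"]*"')

# Clauses that modify the graph. This list is an assumption for illustration;
# extend it to match the Cypher features your deployment allows.
WRITE_CLAUSE = re.compile(r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b", re.IGNORECASE)


def is_read_only(query: str) -> bool:
    """Return True if no write clause appears outside of string literals."""
    without_strings = STRING_LITERAL.sub("", query)
    return WRITE_CLAUSE.search(without_strings) is None


def filter_safe_queries(queries: list[str]) -> list[str]:
    """Keep only the queries that pass the read-only check."""
    return [q for q in queries if is_read_only(q)]
```

A guard like this could sit between `decompose_to_queries` and the executor; connecting with a database user that has read-only permissions is a stronger safeguard where available.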
Executing Graph Queries
Execute the generated queries on Neo4j:
from neo4j import GraphDatabase
from typing import List, Dict

class GraphQueryExecutor:
    """Execute Cypher queries and return results."""

    def __init__(self, uri: str, user: str, password: str):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def execute(self, queries: List[str]) -> List[Dict]:
        """
        Execute a list of queries; return all results merged.
        """
        all_results = []
        with self.driver.session() as session:
            for query in queries:
                try:
                    result = session.run(query)
                    records = [dict(record) for record in result]
                    all_results.extend(records)
                except Exception as e:
                    print(f"Query error: {e}")
                    print(f"Query: {query}")
        return all_results

    def close(self):
        self.driver.close()

# Usage
# executor = GraphQueryExecutor("bolt://localhost:7687", "neo4j", "password")
# queries = [
#     "MATCH (drug:Drug)-[:TREATS]->(disease:Disease {name: \"Diabetes\"}) RETURN drug.name, drug.indication",
#     "MATCH (drug:Drug)-[:INTERACTS_WITH]->(metformin:Drug {name: \"Metformin\"}) RETURN drug.name",
# ]
# results = executor.execute(queries)
# executor.close()
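The `execute` method above concatenates the records of all queries. For a question with a negation, such as drugs that treat diabetes but do not interact with metformin, concatenation alone does not answer the question; the two result sets have to be combined. The sketch below shows one way to do that in Python, assuming each query is run separately and that both return a `drug.name` column. The helper name `exclude_matches` and the placeholder records are illustrative, not real data.

```python
def exclude_matches(candidates: list[dict], excluded: list[dict], key: str) -> list[dict]:
    """Keep candidate records whose `key` value does not appear in `excluded`."""
    excluded_values = {row[key] for row in excluded if key in row}
    return [row for row in candidates if row.get(key) not in excluded_values]


# Placeholder records standing in for the results of the two usage queries above.
treats_diabetes = [
    {"drug.name": "Drug A", "drug.indication": "Type 2 Diabetes"},
    {"drug.name": "Drug B", "drug.indication": "Type 2 Diabetes"},
]
interacts_with_metformin = [{"drug.name": "Drug B"}]

non_interacting = exclude_matches(treats_diabetes, interacts_with_metformin, "drug.name")
print(non_interacting)
```

Where the graph supports it, the same filter can often be expressed in a single Cypher query; doing it in Python keeps each generated query simple at the cost of an extra round trip.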
Synthesizing Answers from Graph Results
Convert retrieved facts into natural language:
from anthropic import Anthropic

client = Anthropic()

def synthesize_answer(user_question: str, graph_results: list) -> str:
    """
    Use an LLM to synthesize graph results into a natural-language answer.

    Args:
        user_question: Original user question
        graph_results: List of records from graph queries

    Returns:
        Natural-language answer with proper citations
    """
    system_prompt = """You are a helpful medical assistant. Given a question and facts from a knowledge graph,
synthesize a clear, accurate answer. Cite specific entities and facts from the retrieved data.
Format: "According to the knowledge graph, [answer]. The following entities were involved: [list]."
"""
    results_text = "Retrieved facts:\n"
    for i, record in enumerate(graph_results, 1):
        results_text += f"{i}. {record}\n"
    user_message = f"""Question: {user_question}
{results_text}
Please synthesize these facts into a concise answer."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        system=system_prompt,
        messages=[
            {"role": "user", "content": user_message}
        ]
    )
    return response.content[0].text

# Example
# results = [
#     {"drug": "GLP-1 agonist", "indication": "Type 2 Diabetes"},
#     {"drug": "SGLT2 inhibitor", "indication": "Type 2 Diabetes"},
# ]
# answer = synthesize_answer("What drugs treat diabetes?", results)
# print(answer)
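The synthesis prompt numbers each retrieved fact, which makes it possible to check citations after the fact. As a sketch of that idea, the helpers below give each record a stable tag such as `[F1]` and then flag any tag in the answer that does not correspond to a retrieved fact. The tag format and the function names `format_facts` and `unknown_citations` are assumptions for illustration; the prompt above would need to ask the model to cite facts using these tags.

```python
import re


def format_facts(graph_results: list[dict]) -> str:
    """Render each record on one line with a citation tag like [F1]."""
    lines = []
    for i, record in enumerate(graph_results, 1):
        fields = ", ".join(f"{k}={v!r}" for k, v in sorted(record.items()))
        lines.append(f"[F{i}] {fields}")
    return "\n".join(lines)


def unknown_citations(answer: str, fact_count: int) -> list[str]:
    """Return citation tags in the answer that do not match any retrieved fact."""
    cited = re.findall(r"\[F(\d+)\]", answer)
    return [f"F{n}" for n in cited if not 1 <= int(n) <= fact_count]
```

An answer with a non-empty `unknown_citations` result cites something that was never retrieved, which is a cheap signal that it should be regenerated or reviewed.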
End-to-End GraphRAG Example
Combine all components:
class GraphRAGPipeline:
    """Complete GraphRAG system."""

    def __init__(self, graph_uri: str, graph_user: str, graph_password: str):
        self.graph_executor = GraphQueryExecutor(graph_uri, graph_user, graph_password)
        self.schema = """
        Entities: Drug, Disease, Company, Study
        Relations: TREATS, CAUSES_SIDE_EFFECT, INTERACTS_WITH, STUDIED_IN
        Properties: Drug {name, indication, mechanism}, Disease {name, icd10}
        """

    def answer_question(self, user_question: str) -> str:
        """
        Answer a question using GraphRAG.

        Returns: The answer text, or a short message if no queries or facts were found.
        """
        # Step 1: Decompose question into queries
        queries = decompose_to_queries(user_question, self.schema)
        if not queries:
            return "I couldn't formulate graph queries for your question. Please try rephrasing."
        # Step 2: Execute queries
        results = self.graph_executor.execute(queries)
        if not results:
            return "No matching facts found in the knowledge graph."
        # Step 3: Synthesize answer
        answer = synthesize_answer(user_question, results)
        return answer

    def close(self):
        self.graph_executor.close()

# Usage
# pipeline = GraphRAGPipeline("bolt://localhost:7687", "neo4j", "password")
# answer = pipeline.answer_question("What drugs treat diabetes and what are their side effects?")
# print(answer)
# pipeline.close()
Handling Multi-Hop Reasoning
Complex questions require multi-hop graph traversal. Example: "Which countries does Google operate in, and what are the employment regulations there?"
def multi_hop_query_example():
    """Multi-hop reasoning requires chaining graph traversals."""
    # Single Cypher query that chains multiple hops
    query = """
    MATCH (company:Company {name: "Google"})
          -[:OPERATES_IN]->(country:Country)
          -[:HAS_REGULATION]->(reg:Regulation)
    RETURN country.name, reg.name, reg.description
    """
    return query

def multi_step_reasoning_example():
    """Alternative: decompose into multiple queries, then combine in Python."""
    # Query 1: Find countries where Google operates
    query1 = """
    MATCH (company:Company {name: "Google"})-[:OPERATES_IN]->(country:Country)
    RETURN country.name AS name
    """
    # Query 2: For each country, find employment regulations.
    # $country is a query parameter, so values are not interpolated into the Cypher text.
    query2 = """
    MATCH (c:Country {name: $country})-[:HAS_REGULATION]->(r:Regulation)
    RETURN r
    """
    # Pseudocode for the Python loop that combines the two:
    # countries = execute(query1)
    # for country in countries:
    #     regulations = execute(query2, country=country["name"])
    #     combine(country, regulations)
    return query1, query2
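The single multi-hop query returns one flat row per (country, regulation) pair, so a country with several regulations appears several times. Before passing the rows to the answer generator, it can help to group them by country. The sketch below assumes the rows are dictionaries keyed by the `RETURN` expressions of that query (`country.name`, `reg.name`, `reg.description`); the function name `group_by_country` and the sample rows are placeholders, not real data.

```python
from collections import defaultdict


def group_by_country(rows: list[dict]) -> dict[str, list[dict]]:
    """Group flat (country, regulation) rows into one regulation list per country."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        grouped[row["country.name"]].append(
            {"name": row["reg.name"], "description": row["reg.description"]}
        )
    return dict(grouped)


# Placeholder rows in the shape the multi-hop query would return.
rows = [
    {"country.name": "Country A", "reg.name": "Regulation 1", "reg.description": "Working hours"},
    {"country.name": "Country A", "reg.name": "Regulation 2", "reg.description": "Paid leave"},
    {"country.name": "Country B", "reg.name": "Regulation 3", "reg.description": "Notice periods"},
]
by_country = group_by_country(rows)
print(by_country["Country A"])
```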
Fallback Strategy: Hybrid RAG
If GraphRAG returns no results, fall back to vector-based RAG:
class HybridRAGPipeline:
    """GraphRAG with vector RAG fallback."""

    def __init__(self, graph_executor, vector_retriever, schema: str):
        self.graph_executor = graph_executor
        self.vector_retriever = vector_retriever
        self.schema = schema  # graph schema description passed to the query decomposer

    def answer_question(self, user_question: str) -> str:
        # Try GraphRAG first
        queries = decompose_to_queries(user_question, self.schema)
        if queries:
            results = self.graph_executor.execute(queries)
            if results:
                return synthesize_answer(user_question, results)
        # Fallback: vector RAG
        vector_results = self.vector_retriever.retrieve(user_question, top_k=5)
        if vector_results:
            # Synthesize from text rather than graph.
            # synthesize_answer_from_text is not defined in this article; it stands for
            # a text-based counterpart of synthesize_answer.
            return synthesize_answer_from_text(user_question, vector_results)
        return "Unable to answer your question with available sources."
Key Takeaways
- GraphRAG retrieves structured facts via graph queries, enabling multi-hop reasoning and source transparency.
- LLMs decompose natural-language questions into Cypher/SPARQL queries automatically.
- In the cited RAG Benchmark 2026, graph-augmented retrieval produced 31% fewer hallucinations and more consistent source citation than vector-only RAG.
- Hybrid systems (GraphRAG + vector RAG) handle both structured domains and unstructured text.
- Multi-hop reasoning is native to GraphRAG; multi-step queries in traditional RAG require complex orchestration.
Frequently Asked Questions
What if the LLM generates invalid Cypher queries?
Validate syntax before execution. If a query fails, catch the error and either: (a) ask the LLM to fix it, with a small retry limit, (b) fall back to vector RAG, or (c) require human review. In production, use stricter prompting (in-context examples of valid queries) to minimize syntax errors.
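Option (a) can be sketched as a bounded retry loop. The code below is a minimal illustration under stated assumptions: `run_query` and `repair_query` are hypothetical callables supplied by the caller (for example, a wrapper around the graph executor that raises on failure, and an LLM call that receives the failed query plus the error message). Returning `None` signals that the caller should move on to option (b) or (c).

```python
from typing import Callable, Optional


def run_with_repair(
    query: str,
    run_query: Callable[[str], list],
    repair_query: Callable[[str, str], str],
    max_retries: int = 2,
) -> Optional[list]:
    """Run a query; on failure, ask for a repaired query up to max_retries times."""
    current = query
    for attempt in range(max_retries + 1):
        try:
            return run_query(current)
        except Exception as exc:
            if attempt == max_retries:
                return None  # retries exhausted; caller falls back or escalates
            # Pass the failed query and the error text to the repair step.
            current = repair_query(current, str(exc))
    return None
```

Note that the `execute` method shown earlier catches exceptions and prints them, so it would need to re-raise (or be wrapped) for a loop like this to see the failure. In Neo4j, prefixing a query with `EXPLAIN` asks the database to plan it without running it, which is one way to surface syntax errors before execution.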
How do I handle ambiguous questions like "Who is John?"
Add clarification steps. If multiple entities match, return them to the user and ask which one they meant. Alternatively, use context (previous messages) to disambiguate.
Can GraphRAG work without a pre-built knowledge graph?
You need at least a basic graph. Build one incrementally: extract facts from your documents, load them into Neo4j, then run GraphRAG. Even a small graph (100K entities) provides value for targeted domains.
What's the latency of a GraphRAG query end-to-end?
Decomposition (LLM): 2–5 seconds. Graph query execution: 100 ms–2 seconds depending on complexity. Synthesis (LLM): 2–5 seconds. Total: roughly 4–12 seconds. This is slower than pure vector RAG (1–2 seconds) but faster than human research.
How do I measure GraphRAG accuracy?
Create a benchmark: questions with ground-truth answers. Measure: (a) query recall (% of correct facts retrieved), (b) answer correctness (LLM evaluation or human review), (c) source accuracy (% of cited facts verified). Aim for >90% fact recall and >85% answer correctness.
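Measures (a) and (c) reduce to set comparisons once facts are represented in a comparable form. The sketch below assumes each fact is a `(subject, relation, object)` tuple; that representation and the function names `fact_recall` and `source_accuracy` are illustrative choices, and the sample facts are placeholders.

```python
Fact = tuple[str, str, str]  # (subject, relation, object)


def fact_recall(retrieved: set[Fact], expected: set[Fact]) -> float:
    """Fraction of ground-truth facts that the graph queries retrieved."""
    if not expected:
        return 1.0  # nothing to retrieve, so nothing was missed
    return len(retrieved & expected) / len(expected)


def source_accuracy(cited: set[Fact], verified: set[Fact]) -> float:
    """Fraction of cited facts that were verified as correct."""
    if not cited:
        return 1.0  # no citations, so none are wrong
    return len(cited & verified) / len(cited)


# Placeholder benchmark item: two expected facts, one of which was retrieved.
expected = {("Drug A", "TREATS", "Disease X"), ("Drug B", "TREATS", "Disease X")}
retrieved = {("Drug A", "TREATS", "Disease X"), ("Drug C", "TREATS", "Disease X")}
print(fact_recall(retrieved, expected))
```

Averaging these per-question scores over the benchmark gives the numbers to compare against the targets above; answer correctness (b) still needs an LLM judge or human review.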