Explainable LLM Outputs via Knowledge Graphs
Explainability is the ability to justify an LLM's output by tracing it to authoritative sources. A graph-grounded system answers "Why did the LLM say this?" with: "Because fact X from source Y was in the knowledge graph, and fact Z from source W supports it." Organizations using graph-grounded LLMs report a 46% improvement in user trust and a 38% reduction in compliance violations (Explainability Survey 2026).
This article shows how to build transparent, auditable LLM systems anchored to knowledge graphs.
Why Explainability Matters
LLMs generate fluent text, but without sources, users can't verify claims. A patient asking "Can I take metformin with insulin?" receives an answer from an LLM, but how do they know it's correct?
Graph-grounded systems answer with:
- Fact: "Metformin and insulin can be used together."
- Source: (Drug:Metformin)-[:CO_ADMINISTERED_WITH]->(Drug:Insulin), from clinical study ClinicalTrials.gov NCT04012126.
- Confidence: 0.92 (from 15 peer-reviewed studies).
- Audit trail: "Query retrieved 3 interactions, 2 contraindications; LLM synthesized the answer."
Without graph grounding, the answer is opaque. With it, a doctor can verify and audit every claim.
Architecture: Knowledge-Grounded Answer Generation
User Question: "What are the contraindications of metformin?"
|
v
[Query Decomposition]
Graph Query: "MATCH (drug:Drug {name: 'Metformin'})-[:HAS_CONTRAINDICATION]->(condition) RETURN condition"
|
v
[Graph Execution + Provenance Tracking]
Results with sources:
- Condition: "Severe kidney disease" (Source: KDIGO 2023, Evidence: 12 studies)
- Condition: "Type 1 diabetes" (Source: Endocrine Society 2022, Evidence: 9 studies)
|
v
[LLM Synthesis + Citation Generation]
Answer: "Metformin is contraindicated in severe kidney disease (KDIGO 2023)
and type 1 diabetes (Endocrine Society 2022)."
|
v
[Explainability Report]
{
  "answer": "...",
  "sources": ["KDIGO_2023", "Endocrine_Society_2022"],
  "graph_path": ["Metformin", "HAS_CONTRAINDICATION", "Kidney_Disease"],
  "confidence": 0.91,
  "num_supporting_studies": 21
}
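The last stage of the pipeline above can be sketched in a few lines. The helper below is hypothetical and not part of any library: it assumes each retrieved fact arrives as a dict with `condition`, `source`, and `num_studies` keys (names chosen for illustration), and that the overall confidence is computed elsewhere and passed in.

```python
from typing import Dict, List


def build_explainability_report(
    answer: str, drug: str, relation: str, facts: List[Dict], confidence: float
) -> Dict:
    """Assemble a report shaped like the one in the diagram (illustrative only)."""
    return {
        "answer": answer,
        # One source label per retrieved fact, in retrieval order.
        "sources": [fact["source"] for fact in facts],
        # One [start, relation, end] triple per retrieved fact.
        "graph_paths": [[drug, relation, fact["condition"]] for fact in facts],
        "confidence": confidence,
        "num_supporting_studies": sum(fact["num_studies"] for fact in facts),
    }


# Hypothetical input mirroring the two results shown in the diagram.
retrieved_facts = [
    {"condition": "Severe kidney disease", "source": "KDIGO 2023", "num_studies": 12},
    {"condition": "Type 1 diabetes", "source": "Endocrine Society 2022", "num_studies": 9},
]
pipeline_report = build_explainability_report(
    answer="Metformin is contraindicated in severe kidney disease (KDIGO 2023) "
    "and type 1 diabetes (Endocrine Society 2022).",
    drug="Metformin",
    relation="HAS_CONTRAINDICATION",
    facts=retrieved_facts,
    confidence=0.91,
)
print(pipeline_report["num_supporting_studies"])  # 21
```

Recording one path per fact, rather than a single path for the whole answer, keeps every cited claim individually traceable.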
Provenance Tracking in Graph Queries
Track the origin of every retrieved fact:
from typing import Dict, List

from neo4j import GraphDatabase


class ProvenanceAwareGraphQueryExecutor:
    """Execute queries and track provenance (source) of each result."""

    def __init__(self, uri: str, user: str, password: str):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def query_with_provenance(
        self, match_clause: str, fact_var: str, return_items: str
    ) -> List[Dict]:
        """
        Run a MATCH pattern and attach provenance metadata to each result.

        The query is passed in parts because Cypher allows only one final
        RETURN, so the source lookup has to sit between MATCH and RETURN.
        `fact_var` names the matched variable that carries SOURCED_FROM edges.

        Returns: [
            {
                "fact": {...result},
                "sources": ["Paper A", "Paper B"],
                "source_years": [2021, 2023],
                "num_sources": 2,
                "confidence": 0.7
            },
            ...
        ]
        """
        # Only interpolate trusted strings here; never raw user input.
        enhanced_query = f"""
        {match_clause}
        OPTIONAL MATCH ({fact_var})-[:SOURCED_FROM]->(source:Source)
        RETURN {return_items},
               collect(source.title) AS sources,
               collect(source.year) AS source_years
        """
        records = []
        with self.driver.session() as session:
            result = session.run(enhanced_query)
            for record in result:
                record_dict = dict(record)
                # Separate provenance columns from the fact itself
                sources = record_dict.pop("sources", [])
                source_years = record_dict.pop("source_years", [])
                records.append({
                    "fact": record_dict,
                    "sources": sources,
                    "source_years": source_years,
                    "num_sources": len(sources),
                    "confidence": self._compute_confidence(sources)
                })
        return records

    def _compute_confidence(self, sources: List[str]) -> float:
        """
        Compute confidence from the number of sources.
        More sources = higher confidence (recency is not considered here).
        """
        if not sources:
            return 0.5  # No sources: medium confidence
        # Simple heuristic: scale by number of sources, capped at 0.99
        return min(0.5 + (len(sources) * 0.1), 0.99)

    def close(self):
        self.driver.close()


# Example usage
# executor = ProvenanceAwareGraphQueryExecutor("bolt://localhost:7687", "neo4j", "password")
# results = executor.query_with_provenance(
#     match_clause='MATCH (drug:Drug {name: "Metformin"})-[:HAS_CONTRAINDICATION]->(condition)',
#     fact_var="condition",
#     return_items="drug.name, condition.name, condition.severity",
# )
# for result in results:
#     print(f"Fact: {result['fact']}")
#     print(f"Sources: {result['sources']}")
#     print(f"Confidence: {result['confidence']:.2f}")
# executor.close()
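The confidence heuristic above only counts sources. One possible refinement, sketched here under stated assumptions, also uses the `source_years` values that the query returns. The function name, the five-year half-life, and the half weight for undated sources are illustrative choices, not calibrated values.

```python
from typing import List, Optional


def recency_weighted_confidence(
    source_years: List[Optional[int]],
    current_year: int,
    half_life_years: float = 5.0,
) -> float:
    """Score sources so that each one's weight halves every `half_life_years`."""
    if not source_years:
        return 0.5  # Same no-source default as the count-based heuristic
    total_weight = 0.0
    for year in source_years:
        if year is None:
            total_weight += 0.5  # Undated source: assumed to count half
            continue
        age = max(current_year - year, 0)
        total_weight += 0.5 ** (age / half_life_years)
    # Each fully weighted source adds 0.1, capped at 0.99
    return min(0.5 + 0.1 * total_weight, 0.99)


# A current-year source counts fully; a ten-year-old one counts a quarter.
score = recency_weighted_confidence([2023, 2013], current_year=2023)
print(round(score, 3))  # 0.625
```

Passing `current_year` explicitly keeps the score reproducible, which matters if the same fact is audited later.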
Citation-Aware Answer Generation
Generate answers with inline citations:
from typing import Dict, List

from anthropic import Anthropic

client = Anthropic()  # Reads ANTHROPIC_API_KEY from the environment


def generate_cited_answer(question: str, graph_results: List[Dict]) -> str:
    """Use an LLM to synthesize graph results with citations."""
    system_prompt = """You are a medical assistant. Answer questions using only the provided facts.
For each fact, include an inline citation in the form [Source: Title (Year)].
Format: "Fact. [Source: Title (Year)]."
Be accurate and comprehensive. If facts contradict, mention both and note uncertainty."""

    # Format results for the LLM
    context = "Retrieved facts and sources:\n\n"
    for i, result in enumerate(graph_results, 1):
        fact = result["fact"]
        sources = ", ".join(result["sources"][:2])  # First two sources only
        context += f"{i}. {fact}\n   Sources: {sources} (Confidence: {result['confidence']:.2f})\n"

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        system=system_prompt,
        messages=[
            {"role": "user", "content": f"Question: {question}\n\n{context}\n\nAnswer with citations."}
        ]
    )
    return response.content[0].text


# Example
results = [
    {
        "fact": {"drug": "Metformin", "contraindication": "Severe kidney disease"},
        "sources": ["KDIGO 2023 Clinical Practice Guideline"],
        "confidence": 0.94
    },
    {
        "fact": {"drug": "Metformin", "contraindication": "Contrast-induced nephropathy risk"},
        "sources": ["FDA Label", "Nephrology Society 2021"],
        "confidence": 0.91
    }
]
answer = generate_cited_answer("What are metformin contraindications?", results)
print(answer)
Audit Trails and Transparency Reports
Create explainability reports for every answer:
from datetime import datetime
import json
from typing import List, Dict


class ExplainabilityReport:
    """Generate comprehensive audit trails for LLM answers."""

    def __init__(self, question: str, answer: str):
        self.question = question
        self.answer = answer
        self.timestamp = datetime.now().isoformat()
        self.graph_queries = []
        self.graph_results = []
        self.llm_model = "claude-3-5-sonnet-20241022"
        self.sources = []
        self.confidence_scores = []

    def add_graph_query(self, cypher: str, num_results: int):
        """Record a graph query executed."""
        self.graph_queries.append({
            "query": cypher,
            "num_results": num_results,
            "timestamp": datetime.now().isoformat()
        })

    def add_result(self, fact: Dict, sources: List[str], confidence: float):
        """Record a result and its provenance."""
        self.graph_results.append({
            "fact": fact,
            "sources": sources,
            "confidence": confidence
        })
        self.sources.extend(sources)
        self.confidence_scores.append(confidence)

    def to_dict(self) -> Dict:
        """Serialize the report."""
        return {
            "question": self.question,
            "answer": self.answer,
            "timestamp": self.timestamp,
            "llm_model": self.llm_model,
            "num_graph_queries": len(self.graph_queries),
            "num_facts_retrieved": len(self.graph_results),
            "num_unique_sources": len(set(self.sources)),
            "average_confidence": sum(self.confidence_scores) / max(len(self.confidence_scores), 1),
            "graph_queries": self.graph_queries,
            "facts_and_sources": self.graph_results
        }

    def to_json(self) -> str:
        """Serialize to JSON."""
        return json.dumps(self.to_dict(), indent=2, default=str)

    def save(self, filepath: str):
        """Save to file for audit trail."""
        with open(filepath, "w") as f:
            f.write(self.to_json())


# Example
report = ExplainabilityReport(
    "What are metformin contraindications?",
    "Metformin is contraindicated in severe kidney disease..."
)
report.add_graph_query(
    "MATCH (drug:Drug {name: 'Metformin'})-[:HAS_CONTRAINDICATION]->(c) RETURN c",
    num_results=3
)
report.add_result(
    fact={"condition": "Severe kidney disease", "eGFR_threshold": 30},
    sources=["KDIGO 2023"],
    confidence=0.94
)
report.add_result(
    fact={"condition": "Type 1 diabetes"},
    sources=["Endocrine Society 2022"],
    confidence=0.91
)
print(report.to_json())
Handling Conflicting or Uncertain Facts
When facts conflict, explicitly mention it:
from typing import Dict, List

from anthropic import Anthropic


def synthesize_with_uncertainty(question: str, conflicting_results: List[Dict]) -> str:
    """Handle conflicting facts: note disagreement, cite both sides."""
    client = Anthropic()
    system_prompt = """You are a medical assistant answering questions with conflicting evidence.
When sources disagree:
1. State both claims and their sources.
2. Note the date and credibility of each source.
3. Indicate which is more recent or authoritative.
4. Recommend consulting a healthcare provider for critical decisions.
Format: "Some sources say X [Source A], while others say Y [Source B].
The more recent evidence (Source A, 2024) suggests X."
"""

    # Format conflicting results
    context = "Retrieved facts (some conflicting):\n\n"
    for result in conflicting_results:
        context += f"- {result['fact']}\n  Sources: {', '.join(result['sources'])}\n"

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        system=system_prompt,
        messages=[
            {"role": "user", "content": f"Question: {question}\n\n{context}\n\nHow do you reconcile these conflicting facts?"}
        ]
    )
    return response.content[0].text


# Example: conflicting evidence on a treatment
conflicting = [
    {
        "fact": "Drug X effective for condition A",
        "sources": ["Study 2020 (n=100)"],
        "confidence": 0.72
    },
    {
        "fact": "Drug X not effective for condition A",
        "sources": ["Study 2023 (n=500, meta-analysis)"],
        "confidence": 0.89
    }
]
answer = synthesize_with_uncertainty("Is Drug X effective for condition A?", conflicting)
print(answer)
Interactive Transparency Interface
Enable users to explore the reasoning:
class InteractiveExplainabilityInterface:
    """Allow users to drill down into answer justifications."""

    def __init__(self, report: ExplainabilityReport):
        self.report = report

    def display_summary(self) -> str:
        """Show high-level answer with confidence."""
        data = self.report.to_dict()
        return (
            f"Answer: {self.report.answer}\n"
            f"Confidence: {data['average_confidence']:.1%}\n"
            f"Based on: {data['num_facts_retrieved']} facts "
            f"from {data['num_unique_sources']} sources\n"
            "[Show details]"
        )

    def display_sources(self) -> str:
        """List all sources cited."""
        sources = set()
        for fact in self.report.graph_results:
            sources.update(fact["sources"])
        return f"Sources ({len(sources)}):\n" + "\n".join(
            f"- {source}" for source in sorted(sources)
        )

    def display_graph_path(self, fact_index: int) -> str:
        """Show the provenance details recorded for a specific fact."""
        results = self.report.graph_results
        # Reject negative indexes too, so -1 does not silently wrap around
        if 0 <= fact_index < len(results):
            fact = results[fact_index]
            return (
                f"Fact: {fact['fact']}\n"
                "Retrieved via: Graph query traversal\n"
                f"Confidence: {fact['confidence']:.1%}\n"
                f"Supporting sources: {', '.join(fact['sources'])}"
            )
        return "Fact not found."

    def display_reasoning_chain(self) -> str:
        """Show the full reasoning chain."""
        return (
            f"1. Question: {self.report.question}\n"
            f"2. Generated {len(self.report.graph_queries)} graph queries\n"
            f"3. Retrieved {len(self.report.graph_results)} facts\n"
            f"4. LLM synthesized answer using {self.report.llm_model}\n"
            f"5. Output: {self.report.answer}"
        )


# Example (uses the `report` object built in the previous section)
interface = InteractiveExplainabilityInterface(report)
print(interface.display_summary())
print("\n" + interface.display_sources())
print("\n" + interface.display_reasoning_chain())
Key Takeaways
- Explainability traces LLM outputs to knowledge graph sources, enabling auditing and verification.
- Provenance tracking records sources and confidence for every retrieved fact.
- Citation-aware answer generation includes inline references with metadata.
- Conflicts are handled explicitly: state the disagreement, cite both sources, and recommend expert review.
- Interactive transparency interfaces let users drill down into the reasoning behind answers.
Frequently Asked Questions
How do I convince users that cited facts are truly from the sources?
Provide clickable links or document IDs. For high-stakes domains (medicine, law), include DOIs, PubMed IDs, or case law citations. Allow users to verify by pulling the original document themselves.
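A minimal sketch of turning stored identifiers into links a user can follow. It assumes each source is a dict with optional `doi` and `pmid` keys; those key names and the `source_link` helper are hypothetical, and the identifiers in the example are placeholders rather than real references.

```python
from typing import Dict, Optional


def source_link(source: Dict[str, str]) -> Optional[str]:
    """Return a resolvable URL for a source, or None if it has no known ID."""
    if source.get("doi"):
        return f"https://doi.org/{source['doi']}"
    if source.get("pmid"):
        return f"https://pubmed.ncbi.nlm.nih.gov/{source['pmid']}/"
    return None  # Nothing to link to: show the title only


# Placeholder identifiers, for illustration only.
print(source_link({"title": "Example guideline", "doi": "10.1000/example"}))
print(source_link({"title": "Example study", "pmid": "12345678"}))
print(source_link({"title": "Unidentified source"}))
```

Returning `None` rather than a guessed URL lets the interface show which sources cannot yet be verified.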
What if the LLM makes up a citation (hallucination)?
Validate citations automatically: after the LLM generates an answer, parse the citations and cross-check them against the knowledge graph. If a citation doesn't match, flag it and regenerate.
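The cross-check can be sketched with a regular expression, assuming the answer uses the `[Source: ...]` citation format and that a citation is valid only when its label exactly matches a source title from the graph, ignoring case. `find_unverified_citations` is a hypothetical helper; real systems may need fuzzier matching.

```python
import re
from typing import List, Set

# Captures the label inside "[Source: ...]"
CITATION_PATTERN = re.compile(r"\[Source:\s*([^\]]+)\]")


def find_unverified_citations(answer: str, known_sources: Set[str]) -> List[str]:
    """Return cited source labels that do not match any known graph source."""
    known = {source.strip().lower() for source in known_sources}
    cited = [label.strip() for label in CITATION_PATTERN.findall(answer)]
    return [label for label in cited if label.lower() not in known]


llm_answer = (
    "Metformin is contraindicated in severe kidney disease. [Source: KDIGO 2023] "
    "A second, made-up claim. [Source: Imaginary Journal 2019]"
)
unverified = find_unverified_citations(
    llm_answer, {"KDIGO 2023", "Endocrine Society 2022"}
)
print(unverified)  # ['Imaginary Journal 2019']
```

A non-empty result is the signal to flag the answer and regenerate it.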
How do I handle sources with conflicting information?
Explicitly surface conflicts in the answer. Include both perspectives with dates, sample sizes, and methodologies. Recommend consulting domain experts for tie-breaking.
What's the computational overhead of tracking provenance?
Usually small. Store source metadata alongside facts in the graph so that provenance is returned by the same query that retrieves the fact, at the cost of one extra relationship hop rather than a separate lookup. At scale, expect <5% latency overhead.
Can I use provenance for automated fact-checking?
Yes. Post-generation, validate each claim in the answer by re-querying the graph for supporting evidence. Flag any claim with <60% confidence or lacking sources. This is called "answer grounding verification."
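The flagging step might look like the sketch below. It assumes that claims have already been extracted from the answer and re-queried against the graph, so each entry carries `fact`, `sources`, and `confidence` keys as in the earlier examples. `flag_weak_facts` is a hypothetical name.

```python
from typing import Dict, List


def flag_weak_facts(graph_results: List[Dict], min_confidence: float = 0.6) -> List[Dict]:
    """Return facts that lack sources or fall below the confidence threshold."""
    flagged = []
    for result in graph_results:
        reasons = []
        if not result.get("sources"):
            reasons.append("no sources")
        if result.get("confidence", 0.0) < min_confidence:
            reasons.append(f"confidence below {min_confidence}")
        if reasons:
            flagged.append({"fact": result["fact"], "reasons": reasons})
    return flagged


# Hypothetical re-query results for two claims in an answer.
checked_claims = [
    {"fact": "claim A", "sources": ["KDIGO 2023"], "confidence": 0.94},
    {"fact": "claim B", "sources": [], "confidence": 0.5},
]
weak = flag_weak_facts(checked_claims)
print(weak)
```

Here only "claim B" is flagged, for both reasons; "claim A" passes because it has a source and clears the threshold.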