
Semantic Memory: Building Agent Knowledge Bases

Semantic memory consolidates patterns across episodic records into abstract, reusable facts: rather than storing "user preferred PDF on May 15, May 22, May 29," semantic memory abstracts "user_format_preference=pdf; confidence=0.94." This compression is essential for long-running agents: without it, the episodic log grows linearly with the number of interactions. Semantic memory also enables faster reasoning: with a keyed store, the agent looks up a fact in O(1) on average instead of scanning episodic history in O(n).

What Semantic Memory Is and Why It Differs from Episodic

Episodic memory answers "what happened?"; semantic memory answers "what is true?" Episodic records are specific and timestamped; semantic facts are general and timeless. An episodic record: "On May 15, Bob asked for a PDF export of his invoice." The abstracted semantic fact: "Bob's export format preference is PDF."

Semantic memory serves two purposes: (1) compression, reducing total memory footprint; and (2) generalization, allowing the agent to infer decisions for new situations based on learned patterns. An agent that has logged 5,000 customer interactions may compress them into 50 learned facts about user preferences, each annotated with confidence and recency.

| Aspect | Episodic | Semantic |
| --- | --- | --- |
| Scope | Specific event | General pattern across events |
| Temporal | Timestamped | Timeless (or decayed over time) |
| Query | "What happened on May 15?" | "What is the user's preference?" |
| Storage | Unbounded growth | Compressed, bounded |
| Inference | Direct recall | Abstraction and generalization |
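
The difference between the two query styles can be sketched in a few lines. This is a minimal illustration, not a prescribed schema: the `episodes` list, the `semantic` dictionary, and both function names are assumptions made for the example.

```python
from datetime import date

# Hypothetical episodic log: one record per observed event.
episodes = [
    {"date": date(2026, 5, 15), "user": "bob", "action": "export", "format": "pdf"},
    {"date": date(2026, 5, 22), "user": "bob", "action": "export", "format": "pdf"},
    {"date": date(2026, 5, 29), "user": "bob", "action": "export", "format": "pdf"},
]

# Hypothetical semantic store: one entry per (subject, predicate).
semantic = {
    ("bob", "export_format_preference"): {"object": "pdf", "confidence": 0.94},
}


def what_happened_on(day: date) -> list:
    """Episodic query: scan the whole log for events on a given day (O(n))."""
    return [e for e in episodes if e["date"] == day]


def what_is_true(subject: str, predicate: str):
    """Semantic query: a single dictionary lookup (O(1) on average)."""
    fact = semantic.get((subject, predicate))
    return fact["object"] if fact else None
```

The episodic function returns the matching events themselves, while the semantic function returns only the abstracted value, which is why the semantic store stays small as the log grows.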

Fact Abstraction: From Episodes to Semantic Knowledge

The core process is induction: given a set of episodic records, identify patterns and distill them into semantic facts. For example:

# Episodic records (from database)
episodes = [
    {"date": "2026-05-15", "user": "alice", "action": "export", "format": "csv"},
    {"date": "2026-05-22", "user": "alice", "action": "export", "format": "csv"},
    {"date": "2026-06-01", "user": "alice", "action": "export", "format": "csv"},
]

# Semantic fact (abstracted from episodes)
semantic_fact = {
    "subject": "alice",
    "predicate": "export_format_preference",
    "object": "csv",
    "confidence": 0.95,
    "evidence_count": 3,
    "last_updated": "2026-06-01",
}

Abstraction can be rule-based (e.g., "if user chose X in 3+ episodes, infer preference=X") or learned (e.g., using an LLM to extract facts). Rule-based is fast and interpretable; learned approaches are more nuanced but require additional computation.

# Example: Rule-based fact abstraction
from collections import Counter, defaultdict

def extract_semantic_facts(episodic_records: list) -> dict:
    """
    Extract semantic facts from episodic records using simple voting.

    Each record is expected to carry a "user_id" and an "input_data" dict;
    only records whose input_data contains "format" contribute evidence.
    """
    semantic_facts = {}

    # Group observed values by user, then by attribute
    patterns = defaultdict(lambda: defaultdict(list))
    for record in episodic_records:
        user_id = record["user_id"]
        # Extract relevant attributes (format, priority, etc.)
        if "format" in record.get("input_data", {}):
            fmt = record["input_data"]["format"]
            patterns[user_id]["format_preference"].append(fmt)

    # Vote: if a value appears in >= 70% of episodes, it's a fact
    for user_id, attrs in patterns.items():
        for attr_name, values in attrs.items():
            value_counts = Counter(values)
            total = len(values)
            for value, count in value_counts.items():
                confidence = count / total
                if confidence >= 0.70:  # Threshold; at most one value can pass
                    fact_key = f"{user_id}_{attr_name}"
                    semantic_facts[fact_key] = {
                        "subject": user_id,
                        "predicate": attr_name,
                        "object": value,
                        "confidence": confidence,
                        "evidence_count": count,
                    }

    return semantic_facts

Confidence Scoring and Decay

Not all semantic facts are equally reliable. An agent should prefer high-confidence facts over low-confidence ones. Confidence can be computed as: confidence = evidence_count / total_observations. Additionally, confidence should decay over time: a preference learned 6 months ago is less reliable than one learned last week.

A practical decay function is exponential: confidence_today = confidence_original * exp(-decay_rate * days_since_learned). The half-life is ln(2) / decay_rate, so a decay_rate of 0.01 per day, a common choice for user preferences, gives a half-life of about 69 days: a learned preference loses 50% of its confidence roughly every 2.3 months without reinforcement.
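
The arithmetic behind the half-life figure can be checked directly. The sketch below assumes decay_rate is expressed per day; the function names are illustrative and not part of any library.

```python
import math


def half_life_days(decay_rate: float) -> float:
    """Days until confidence halves under exponential decay."""
    return math.log(2) / decay_rate


def decayed_confidence(original: float, decay_rate: float, days: float) -> float:
    """confidence_today = confidence_original * exp(-decay_rate * days)."""
    return original * math.exp(-decay_rate * days)


print(round(half_life_days(0.01), 1))  # 69.3

# How a 0.90 confidence fades without reinforcement
for days in (7, 30, 70, 180):
    print(days, round(decayed_confidence(0.90, 0.01, days), 3))
```

Doubling the decay rate halves the half-life, so the rate is best chosen by deciding first how long a preference should stay trustworthy and then solving for decay_rate = ln(2) / half_life.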

# Example: Confidence decay over time
import math
from datetime import datetime
from typing import Optional

class SemanticFact:
    def __init__(self, subject, predicate, obj, confidence, created_at):
        self.subject = subject
        self.predicate = predicate
        self.object = obj
        self.initial_confidence = confidence
        self.created_at = created_at
        self.last_reinforced = created_at  # Updated each time we see supporting evidence
        self.decay_rate = 0.01  # Per day; adjust per use case

    def current_confidence(self, as_of: Optional[datetime] = None) -> float:
        """Compute confidence with decay applied."""
        as_of = as_of or datetime.now()
        # Fractional days since reinforcement, clamped at zero so that an
        # as_of earlier than last_reinforced cannot exceed the original value
        elapsed_days = (as_of - self.last_reinforced).total_seconds() / 86400
        return self.initial_confidence * math.exp(-self.decay_rate * max(elapsed_days, 0.0))

    def reinforce(self, as_of: Optional[datetime] = None):
        """Reset the decay clock when we see supporting evidence again."""
        # Decay is measured from last_reinforced, so this restores confidence
        # to initial_confidence; it never raises it above the original value.
        self.last_reinforced = as_of or datetime.now()

Knowledge Graphs: Structured Semantic Storage

For systems with many facts, a knowledge graph provides efficient lookup and inference. A knowledge graph is a set of triples: (subject, predicate, object). For example: (bob, email, <bob's email address>), (alice, manager, carol). With a hash index on (subject, predicate), single-fact lookup is O(1) on average; a multi-hop query costs O(k) such lookups for a path of length k, assuming each hop yields a single object.
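
To make the multi-hop case concrete, here is a minimal sketch of a triple store that follows a chain of predicates. The `TripleStore` class, the `follow` method, and the `dave` and `timezone` triples are assumptions introduced for illustration; only the (alice, manager, carol) triple comes from the text above.

```python
from collections import defaultdict


class TripleStore:
    """Minimal triple store indexed by (subject, predicate)."""

    def __init__(self):
        self._index = defaultdict(set)  # (subject, predicate) -> set of objects

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._index[(subject, predicate)].add(obj)

    def objects(self, subject: str, predicate: str) -> set:
        """Single-hop lookup: one hash-table access."""
        return set(self._index.get((subject, predicate), ()))

    def follow(self, start: str, path: list) -> set:
        """Multi-hop query: follow a chain of predicates starting at `start`."""
        frontier = {start}
        for predicate in path:
            frontier = {o for s in frontier for o in self.objects(s, predicate)}
        return frontier


kg = TripleStore()
kg.add("alice", "manager", "carol")
kg.add("carol", "manager", "dave")
kg.add("dave", "timezone", "UTC+1")

# "What timezone is Alice's manager's manager in?"
print(kg.follow("alice", ["manager", "manager", "timezone"]))  # {'UTC+1'}
```

When a predicate can have several objects, the frontier grows at each hop, so the O(k) estimate only holds for single-valued predicates.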

Common knowledge graph backends include:

  • Neo4j: Graph database, native Cypher query language, good for reasoning tasks.
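  • Weaviate/Qdrant: Vector databases that store embeddings alongside structured metadata; suited to similarity search over facts rather than native graph traversal (hybrid approach).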
  • RDF triple stores: SPARQL query language, semantic web standard.

A simpler in-memory approach for smaller systems:

# Example: In-memory knowledge graph
from datetime import datetime

class SemanticMemory:
    def __init__(self):
        self.facts = {}  # (subject, predicate) -> {object, confidence, ...}
        self.reverse_index = {}  # object -> list of (subject, predicate) tuples

    def add_fact(self, subject: str, predicate: str, obj: str, confidence: float = 1.0):
        """Add a semantic fact to the graph, replacing any previous value."""
        key = (subject, predicate)

        # Drop the stale reverse-index entry if this key already had a value
        previous = self.facts.get(key)
        if previous is not None:
            self.reverse_index[previous["object"]].remove(key)

        self.facts[key] = {
            "object": obj,
            "confidence": confidence,
            "created_at": datetime.now(),
        }

        # Maintain reverse index for inference
        self.reverse_index.setdefault(obj, []).append(key)

    def get_fact(self, subject: str, predicate: str):
        """Look up a fact."""
        key = (subject, predicate)
        return self.facts.get(key)

    def find_by_object(self, obj: str):
        """Find all facts with this object (reverse lookup)."""
        return self.reverse_index.get(obj, [])

    def query(self, subject: str, predicate: str = None):
        """Query facts for a subject, optionally filtered by predicate."""
        # Linear scan over all facts; acceptable for small stores
        results = []
        for (s, p), fact in self.facts.items():
            if s == subject and (predicate is None or p == predicate):
                results.append((p, fact))
        return results

Learning and Updating Semantic Memory

Semantic facts should be updated when new episodic evidence arrives. There are two strategies: (1) batch update, which periodically re-abstracts all facts from the episodic records (costly but thorough); and (2) incremental update, which checks each new episode against existing facts and adjusts confidence when the episode supports or contradicts one.

Incremental update is more practical for real-time agents:

def update_semantic_on_episode(
    semantic_memory: SemanticMemory,
    episode: dict,
    threshold: float = 0.70
):
    """
    Called each time a new episodic event is logged.
    Update semantic facts if the episode provides supporting or contradictory evidence.

    A contradicted fact keeps its value while its confidence stays at or above
    `threshold`; once it falls below, the newly observed value replaces it.
    """
    user_id = episode.get("user_id")
    event_type = episode.get("event_type")
    input_data = episode.get("input_data", {})

    # Example: only format choices carry evidence here
    if event_type != "user_request" or "format" not in input_data:
        return
    chosen_format = input_data["format"]

    # Check if we have an existing preference fact
    existing_fact = semantic_memory.get_fact(user_id, "format_preference")

    if existing_fact is None:
        # No existing fact; start with low-to-medium confidence
        semantic_memory.add_fact(user_id, "format_preference", chosen_format, 0.40)
    elif existing_fact["object"] == chosen_format:
        # Reinforce: raise confidence, capped at 1.0
        new_confidence = min(existing_fact["confidence"] + 0.05, 1.0)
        semantic_memory.add_fact(user_id, "format_preference", chosen_format, new_confidence)
    else:
        # Contradict: lower confidence, floored at 0.0
        new_confidence = max(existing_fact["confidence"] - 0.10, 0.0)
        if new_confidence >= threshold:
            # Still trusted: keep the old value with reduced confidence
            semantic_memory.add_fact(
                user_id, "format_preference", existing_fact["object"], new_confidence
            )
        else:
            # No longer trusted: switch to the new value as a fresh fact
            semantic_memory.add_fact(user_id, "format_preference", chosen_format, 0.40)
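
As an alternative to fixed +0.05 and -0.10 steps, an incremental updater can keep running counts so that confidence always equals evidence_count / total_observations, matching what a batch re-abstraction would produce. The sketch below is one possible shape for this; the `CountingPreference` class and its `min_evidence` parameter are hypothetical, with the default of 3 borrowed from the heuristic in the FAQ.

```python
from collections import Counter, defaultdict


class CountingPreference:
    """Tracks value counts per (subject, predicate) so confidence stays
    equal to evidence_count / total_observations after every episode."""

    def __init__(self, min_evidence: int = 3, threshold: float = 0.70):
        self.counts = defaultdict(Counter)  # (subject, predicate) -> Counter of values
        self.min_evidence = min_evidence
        self.threshold = threshold

    def observe(self, subject: str, predicate: str, value: str) -> None:
        """Record one episode's worth of evidence."""
        self.counts[(subject, predicate)][value] += 1

    def fact(self, subject: str, predicate: str):
        """Return the current fact, or None if evidence is too thin or too mixed."""
        counter = self.counts.get((subject, predicate))
        if not counter:
            return None
        value, count = counter.most_common(1)[0]
        confidence = count / sum(counter.values())
        if count < self.min_evidence or confidence < self.threshold:
            return None
        return {"object": value, "confidence": confidence, "evidence_count": count}


prefs = CountingPreference()
for fmt in ["csv", "csv", "pdf", "csv", "csv"]:
    prefs.observe("alice", "format_preference", fmt)

print(prefs.fact("alice", "format_preference"))
# {'object': 'csv', 'confidence': 0.8, 'evidence_count': 4}
```

The trade-off is that counts never forget: an old habit keeps outvoting a recent change until enough new episodes accumulate, so this approach is usually paired with time decay or a sliding window.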

Privacy and Generalization Risks

Semantic memory abstracts personal data (preferences, behavior patterns), which raises privacy concerns. Care must be taken: (1) don't share semantic facts across unrelated agents/users without explicit consent, (2) disclose to users that preferences are inferred and offer opt-out, (3) encrypt sensitive facts, (4) regularly audit inferred facts for bias (e.g., inferring incorrect gender based on name patterns).

Key Takeaways

  • Semantic memory distills patterns from episodic records into abstract, reusable facts, reducing memory footprint and speeding inference.
  • Use confidence scoring and decay: facts learned long ago or supported by few observations are less reliable than recent, heavily supported facts.
  • Implement knowledge graphs for efficient storage and inference; even simple in-memory graphs work for small systems.
  • Update semantic facts incrementally: each new episodic event is checked for supporting or contradictory evidence.
  • Protect privacy: encrypt sensitive facts, disclose inference, and audit for bias.

Frequently Asked Questions

How do I decide what facts to abstract?

Focus on high-impact, recurrent patterns: user preferences, capability strengths/weaknesses, common error patterns. Skip one-off events. A practical heuristic: abstract a fact if you've seen >= 3 supporting episodes.

What confidence threshold should I use for facts?

For low-risk decisions (e.g., UI format), use 0.50 (more exploratory). For high-risk (financial, medical), use 0.90+ (conservative). Start at 0.70 as a default and adjust based on feedback.
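
One way to apply these thresholds is a small gate that decides whether the agent may act on a fact without asking. This is a sketch only: the tier names and the `should_act_on` function are assumptions, while the numeric thresholds are the ones suggested above.

```python
# Thresholds from the guidance above; the tier names are hypothetical.
THRESHOLDS = {"low": 0.50, "default": 0.70, "high": 0.90}


def should_act_on(confidence: float, risk: str = "default") -> bool:
    """Return True if a fact is trusted enough to act on without confirmation.

    Unknown risk labels fall back to the strictest threshold.
    """
    required = THRESHOLDS.get(risk, max(THRESHOLDS.values()))
    return confidence >= required
```

A fact that fails the gate need not be discarded; the agent can still use it to phrase a confirmation question.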

Should I disclose inferred facts to users?

Yes, especially if they affect the agent's behavior. "I've noticed you prefer PDFs. Shall I send it that way?" is transparent and allows users to correct the agent if the inference is wrong.

Can I merge semantic facts across users (e.g., "all users prefer email over SMS")?

Carefully. Cross-user patterns can be useful for default behavior, but they hide individual variation. Store both: per-user facts (personalization) and aggregate facts (defaults), and prefer per-user when available.
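
The "prefer per-user when available" rule can be sketched as a two-level lookup. The store contents, the `contact_channel` predicate, and the `resolve` function are illustrative assumptions rather than a fixed design.

```python
from typing import Optional

# Hypothetical stores: per-user facts keyed by (user, predicate),
# aggregate defaults keyed by predicate alone.
per_user = {("alice", "contact_channel"): {"object": "sms", "confidence": 0.85}}
aggregate = {"contact_channel": {"object": "email", "confidence": 0.75}}


def resolve(user_id: str, predicate: str, threshold: float = 0.70) -> Optional[dict]:
    """Prefer a sufficiently confident per-user fact; otherwise use the aggregate default."""
    fact = per_user.get((user_id, predicate))
    if fact and fact["confidence"] >= threshold:
        return {**fact, "source": "per_user"}
    default = aggregate.get(predicate)
    if default:
        return {**default, "source": "aggregate"}
    return None
```

Tagging each result with its source lets the agent disclose whether a choice reflects the individual user's history or a population-wide default.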

How do I detect when a semantic fact is outdated?

Use decay and recency. Periodically compute confidence; if confidence drops below a threshold (e.g., 0.30), flag for review or deletion. For user preferences, spot-check against recent behavior quarterly.
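
A periodic sweep along these lines might look like the following. It is a minimal sketch that assumes each fact stores a `confidence` and a `last_reinforced` timestamp; the `flag_stale` function and its parameter names are hypothetical.

```python
import math
from datetime import datetime, timedelta


def flag_stale(facts: dict, as_of: datetime,
               decay_rate: float = 0.01, review_below: float = 0.30) -> list:
    """Return keys of facts whose decayed confidence has fallen below review_below."""
    stale = []
    for key, fact in facts.items():
        age_days = (as_of - fact["last_reinforced"]).total_seconds() / 86400
        current = fact["confidence"] * math.exp(-decay_rate * max(age_days, 0.0))
        if current < review_below:
            stale.append(key)
    return stale


now = datetime(2026, 6, 1)
facts = {
    ("alice", "format_preference"): {
        "confidence": 0.9, "last_reinforced": now - timedelta(days=14)},
    ("bob", "format_preference"): {
        "confidence": 0.6, "last_reinforced": now - timedelta(days=120)},
}

print(flag_stale(facts, as_of=now))  # [('bob', 'format_preference')]
```

Flagged facts can be queued for review or deleted outright, depending on how costly a wrong inference would be.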
