Spreading Activation Memory
Biologically inspired memory architectures for LLM agents go beyond flat vector stores by modeling the dynamic, associative nature of human memory. SYNAPSE introduces a unified episodic-semantic graph with spreading activation, while E-mem uses multi-agent episodic context reconstruction to preserve reasoning integrity over long horizons.
Beyond Vector Retrieval
Standard RAG-based agent memory uses embedding similarity to retrieve relevant context. This approach suffers from:
Static retrieval: Cannot capture dynamic relevance that emerges from the query's relationship to the full memory graph
Weak-signal blindness: Misses distantly related but important memories that have low direct similarity
Context fragmentation: Retrieved chunks lose their temporal and causal relationships
Flat structure: Cannot distinguish between raw episodic memories and abstract semantic knowledge
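For contrast, here is a minimal sketch of the flat vector-store baseline these limitations describe: memories are ranked purely by cosine similarity to the query, with no graph structure or propagation. The function name and toy 2-D embeddings are ours, for illustration only.

```python
import numpy as np

def flat_vector_retrieve(query_emb, memory_embs, memory_texts, top_k=3):
    """Baseline flat retrieval: rank memories by cosine similarity
    to the query alone -- no edges, no spreading, static relevance."""
    q = query_emb / np.linalg.norm(query_emb)
    m = memory_embs / np.linalg.norm(memory_embs, axis=1, keepdims=True)
    sims = m @ q                       # cosine similarity per memory
    order = np.argsort(-sims)[:top_k]  # highest similarity first
    return [(memory_texts[i], float(sims[i])) for i in order]

texts = ["dog barked", "vet appointment", "tax invoice"]
embs = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
print(flat_vector_retrieve(np.array([1.0, 0.1]), embs, texts))
```

Note how "vet appointment" only ranks as high as its direct similarity allows; nothing about its relationship to "dog barked" can influence the result.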
Biologically inspired approaches address these limitations by modeling memory the way cognitive science suggests human memory works: as interconnected networks where activation spreads along associative pathways.
SYNAPSE: Unified Episodic-Semantic Memory
SYNAPSE (arXiv:2601.02744) introduces a brain-inspired memory architecture that unifies episodic and semantic memories in a directed graph with spreading activation dynamics.
Graph Architecture
The memory graph contains two types of nodes:
Episodic nodes: Raw interaction turns with text content, embeddings, and timestamps – the “what happened” memory
Semantic nodes: Abstract concepts, entities, and preferences extracted periodically by the LLM – the “what I know” memory
Three types of edges connect the graph:
Temporal edges: Chronological links between episodic nodes (recency)
Abstraction edges: Bidirectional links between episodic and semantic nodes (generalization/instantiation)
Association edges: Correlations between semantic nodes (conceptual relatedness)
Spreading Activation Mechanism
When a query arrives, the retrieval process works as follows:
Embedding injection: The query embedding activates the most similar nodes in the graph
Energy propagation: Activation “energy” spreads outward through edges:
Temporal edges propagate based on recency
Abstraction edges propagate between specific memories and general concepts
Association edges propagate between related concepts
Convergence: After multiple propagation steps, the activation pattern stabilizes
Context assembly: The highest-activated nodes (both episodic and semantic) are assembled into context for the LLM
This emulates the cognitive science model of spreading activation in human memory: when you think of “dog,” activation spreads to “pet,” “bark,” “walk,” and eventually reaches more distant concepts like “veterinarian” or “loyalty.”
Why Spreading Activation Matters
Weak-signal retrieval: Discovers relevant memories that have low direct similarity but strong associative connections
Dynamic relevance: The same memory graph produces different retrieval results for different queries, based on the activation pattern
Noise filtering: Low-relevance nodes receive insufficient activation to cross the retrieval threshold
Emergent context: Combines episodic and semantic information naturally through the propagation dynamics
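The weak-signal point can be shown with a tiny numeric example. The similarity scores, node names, and the single association edge below are made up; the decay factor mirrors the one used in the code example later in this page.

```python
# Weak-signal retrieval: a node with low direct query similarity can
# still rank highly via activation spread over a strong associative edge.
decay = 0.85

# Hypothetical direct query similarities (e.g. cosine scores).
direct = {"dog": 0.95, "vet": 0.10, "invoice": 0.12}

# Hypothetical association edges: (source, target, weight).
edges = [("dog", "vet", 0.9)]

activation = dict(direct)
for src, dst, w in edges:
    spread = activation[src] * decay * w
    activation[dst] = max(activation[dst], spread)

# "vet" jumps from 0.10 to ~0.73 through its association with "dog",
# overtaking "invoice", which keeps only its weak direct score.
print(sorted(activation.items(), key=lambda kv: -kv[1]))
```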
Key Results
Outperforms 10 baselines across system-level, graph-based, retrieval, and agentic categories
Hybrid episodic-semantic design sits on the Pareto frontier for efficiency and consistency
Removing spreading activation drops performance to static graph levels (average score 30.5 vs. full system)
Pure vector retrieval scores only 25.2 on average – activation is essential for long-horizon reasoning
E-mem: Multi-Agent Episodic Context Reconstruction
E-mem (Wang et al., 2026, arXiv:2601.21714) shifts from memory preprocessing to episodic context reconstruction, addressing the problem of “destructive de-contextualization” in traditional memory systems.
The De-Contextualization Problem
Traditional memory preprocessing methods (embeddings, graphs, summaries) compress complex sequential dependencies into pre-defined structures. This severs the contextual integrity essential for System 2 (deliberative) reasoning:
Embeddings lose sequential structure
Graphs lose nuance and narrative flow
Summaries lose details needed for precise reasoning
Hierarchical Multi-Agent Architecture
Inspired by biological engrams (the physical traces of memories in neural tissue):
Master agent: Orchestrates global planning and coordinates retrieval across the full memory
Assistant agents: Each maintains an uncompressed segment of the memory context – no lossy preprocessing
Active reasoning: Unlike passive retrieval, assistants locally reason within their activated memory segments, extracting context-aware evidence before aggregation
Retrieval Process
Master agent receives a query and formulates a retrieval plan
Relevant assistant agents are activated (analogous to engram activation in neuroscience)
Each activated assistant reasons over its uncompressed memory segment
Assistants extract context-aware evidence (not just raw text chunks)
Master agent aggregates evidence from all assistants into a coherent response
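The five steps above can be sketched as follows. The assistant-side "reasoning" is stubbed with simple keyword matching as a stand-in for the LLM calls E-mem actually makes, and the class names, threshold, and scoring heuristic are all illustrative.

```python
from dataclasses import dataclass

@dataclass
class AssistantAgent:
    """Holds one uncompressed memory segment and reasons over it locally.
    'Reasoning' is stubbed as keyword matching; E-mem uses an LLM."""
    segment: list  # raw interaction turns, no lossy preprocessing

    def relevance(self, query):
        # Cheap activation signal: fraction of query words in the segment.
        words = set(query.lower().split())
        text = " ".join(self.segment).lower()
        return sum(w in text for w in words) / max(len(words), 1)

    def extract_evidence(self, query):
        # Return matching turns with their position, so sequential
        # context survives aggregation.
        words = set(query.lower().split())
        return [(i, t) for i, t in enumerate(self.segment)
                if any(w in t.lower() for w in words)]

class MasterAgent:
    def __init__(self, assistants, activation_threshold=0.3):
        self.assistants = assistants
        self.activation_threshold = activation_threshold

    def answer(self, query):
        # Activate only assistants whose segments look relevant
        # (analogous to engram activation).
        active = [a for a in self.assistants
                  if a.relevance(query) >= self.activation_threshold]
        # Each active assistant reasons locally over its raw segment;
        # the master aggregates the evidence (a real master agent would
        # then synthesize a response with an LLM).
        evidence = []
        for a in active:
            evidence.extend(a.extract_evidence(query))
        return evidence

a1 = AssistantAgent(["We booked the vet for Friday", "The dog needs shots"])
a2 = AssistantAgent(["Invoice 42 is overdue"])
master = MasterAgent([a1, a2])
print(master.answer("when is the vet visit"))
```

The key design point survives even in this stub: segments are stored verbatim, and selection happens by activating whole agents rather than slicing the memory into pre-compressed chunks.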
Results
LoCoMo benchmark: Over 54% F1 score
Surpasses state-of-the-art GAM by 7.75%
Reduces token cost by over 70% compared to full-context approaches
Preserves reasoning integrity that preprocessing-based methods destroy
Code Example
```python
import numpy as np
from collections import defaultdict


class SpreadingActivationMemory:
    """SYNAPSE-style episodic-semantic memory with spreading activation.

    Assumes embeddings are L2-normalized, so a dot product equals
    cosine similarity.
    """

    def __init__(self, decay=0.85, threshold=0.1, max_steps=5):
        self.nodes = {}                 # id -> {type, content, embedding, ...}
        self.edges = defaultdict(list)  # id -> [(target_id, edge_type, weight)]
        self.decay = decay              # energy lost per propagation hop
        self.threshold = threshold      # minimum activation worth spreading
        self.max_steps = max_steps      # propagation steps before stopping

    def add_episodic(self, node_id, content, embedding, timestamp):
        self.nodes[node_id] = {
            "type": "episodic", "content": content,
            "embedding": embedding, "timestamp": timestamp,
        }

    def add_semantic(self, node_id, concept, embedding):
        self.nodes[node_id] = {
            "type": "semantic", "content": concept,
            "embedding": embedding,
        }

    def add_edge(self, source, target, edge_type, weight=1.0):
        self.edges[source].append((target, edge_type, weight))

    def retrieve(self, query_embedding, top_k=10):
        """Spreading-activation retrieval."""
        # Step 1: initial activation from query similarity
        activation = {}
        for nid, node in self.nodes.items():
            sim = float(np.dot(query_embedding, node["embedding"]))
            if sim > self.threshold:
                activation[nid] = sim

        # Step 2: spread activation through edges
        for _ in range(self.max_steps):
            new_activation = dict(activation)
            for nid, energy in activation.items():
                if energy < self.threshold:
                    continue
                for target, _edge_type, weight in self.edges.get(nid, []):
                    spread = energy * self.decay * weight
                    new_activation[target] = max(
                        new_activation.get(target, 0.0), spread
                    )
            activation = new_activation

        # Step 3: return the top-k activated nodes
        ranked = sorted(activation.items(), key=lambda x: -x[1])
        return [(self.nodes[nid], score) for nid, score in ranked[:top_k]]
```
Comparison of Memory Architectures
| Architecture | Memory Type | Retrieval | Preserves Context | Multi-Agent |
| --- | --- | --- | --- | --- |
| Vector RAG | Flat embeddings | Similarity search | No | No |
| Knowledge Graph | Structured triples | Graph traversal | Partial | No |
| SYNAPSE | Episodic + Semantic graph | Spreading activation | Yes (via edges) | No |
| E-mem | Uncompressed segments | Agent-based reasoning | Yes (uncompressed) | Yes |
Biological Inspiration
Both SYNAPSE and E-mem draw on established models from cognitive science and neuroscience:
Spreading activation (Collins & Loftus, 1975): Retrieval from human semantic memory works by activation spreading through associative networks, not by direct lookup
Episodic-semantic distinction (Tulving, 1972): Human memory maintains separate but interconnected systems for specific experiences and general knowledge
Engrams (Tonegawa et al., 2015): Physical memory traces in neural tissue are reactivated during recall – E-mem's assistant agents model this reactivation process
Consolidation: Over time, episodic memories are abstracted into semantic knowledge – SYNAPSE's periodic LLM extraction of semantic nodes mirrors this process
References
See Also