Effective memory management is essential for Large Language Model (LLM) agents to maintain context, recall past interactions, and enhance performance over time. As of 2025, the field has evolved from simple conversation buffers to sophisticated multi-tier memory systems inspired by human cognitive architecture. This article examines the memory landscape for LLM agents, covering memory types, dedicated memory frameworks, and general-purpose agent libraries with memory capabilities.
Agent memory systems draw from cognitive science, organizing information into complementary types:
Sensory Memory is the initial processing of raw multimodal input (vision, text, audio) through encoder modules like Vision Transformers, CLIP, and Whisper. It acts as a high-bandwidth buffer where attention mechanisms filter what gets promoted to working memory.
Short-Term/Working Memory corresponds to the LLM's context window (128K-1M+ tokens in 2025). It holds the current conversation, retrieved facts, and reasoning traces. KV caches, chain-of-thought scratchpads, and in-context learning all operate within this tier.
Long-Term Memory uses external storage (vector databases, knowledge graphs, structured stores) to persist information across sessions. This tier has effectively unlimited capacity but requires retrieval mechanisms to access.
Explicit/Declarative Memory stores facts, events, and concepts that can be directly queried: user preferences, domain knowledge, interaction history. Implemented via vector stores and knowledge graphs.
Implicit/Procedural Memory encodes learned skills and behaviors in model weights through pretraining and fine-tuning. This includes tool-use patterns, reasoning procedures, and response formatting habits.
These types are organized in hierarchical architectures where information flows between tiers through consolidation, eviction, and retrieval operations. See memory augmentation strategies for techniques that enhance these systems.
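The flow between tiers can be sketched in a few lines. The class and method names below are illustrative, not drawn from any particular framework: a bounded working-memory buffer (standing in for the context window) evicts its oldest items into an unbounded long-term store, from which items can later be recalled.

```python
from collections import deque

class TieredMemory:
    """Hypothetical two-tier memory: bounded working tier, unbounded long-term tier."""

    def __init__(self, working_capacity: int = 4):
        self.working: deque[str] = deque()   # short-term / working tier
        self.working_capacity = working_capacity
        self.long_term: list[str] = []       # persistent tier

    def observe(self, item: str) -> None:
        """Add an item to working memory, consolidating on overflow."""
        self.working.append(item)
        while len(self.working) > self.working_capacity:
            evicted = self.working.popleft()
            self.long_term.append(evicted)   # eviction doubles as consolidation

    def recall(self, keyword: str) -> list[str]:
        """Retrieve long-term items back into view by keyword match."""
        return [m for m in self.long_term if keyword.lower() in m.lower()]

mem = TieredMemory(working_capacity=2)
for fact in ["User likes Python", "Project uses FastAPI",
             "User asked about HNSW", "Deadline is Friday"]:
    mem.observe(fact)

print(list(mem.working))     # the two most recent facts
print(mem.recall("python"))  # ['User likes Python']
```

Real systems replace the keyword match with embedding-based retrieval and the FIFO eviction with relevance- or recency-weighted policies, but the promote-on-overflow structure is the same.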
```python
import numpy as np
from dataclasses import dataclass, field
from datetime import datetime

from openai import OpenAI

client = OpenAI()


@dataclass
class MemoryEntry:
    text: str
    embedding: np.ndarray
    timestamp: datetime = field(default_factory=datetime.now)
    metadata: dict = field(default_factory=dict)


class AgentMemory:
    def __init__(self, model: str = "text-embedding-3-small"):
        self.model = model
        self.entries: list[MemoryEntry] = []

    def _embed(self, text: str) -> np.ndarray:
        resp = client.embeddings.create(input=text, model=self.model)
        return np.array(resp.data[0].embedding, dtype="float32")

    def store(self, text: str, metadata: dict | None = None) -> None:
        embedding = self._embed(text)
        self.entries.append(
            MemoryEntry(text=text, embedding=embedding, metadata=metadata or {})
        )

    def retrieve(self, query: str, top_k: int = 3) -> list[str]:
        query_emb = self._embed(query)
        scores = []
        for entry in self.entries:
            # Cosine similarity between the query and each stored memory.
            sim = np.dot(query_emb, entry.embedding) / (
                np.linalg.norm(query_emb) * np.linalg.norm(entry.embedding)
            )
            scores.append((sim, entry))
        scores.sort(key=lambda x: x[0], reverse=True)
        return [entry.text for _, entry in scores[:top_k]]


memory = AgentMemory()
memory.store("User prefers Python over JavaScript for backend work.")
memory.store("Last project used FastAPI with PostgreSQL.")
memory.store("User is interested in vector databases and HNSW.")

relevant = memory.retrieve("What tech stack does the user like?")
print("Retrieved memories:", relevant)
```
Beyond passive storage, advanced agent memory systems employ active consolidation processes inspired by biological sleep and dreaming to optimize memory organization and defragmentation.
Agent Dreaming is a memory management feature that uses cyclical phases to consolidate and reorganize agent memory. Implemented in systems such as OpenClaw (version 2026.4.51), the dreaming process operates in multiple phases.
A key feature of agent dreaming systems is the generation of a Dream Diary, a human-readable record that documents the agent's internal memory consolidation process. This allows users to understand and audit the agent's evolving internal state, memory priorities, and knowledge structure without directly inspecting embeddings or raw memory stores.
This approach mirrors human memory consolidation during sleep and provides transparency into how agents organize long-term knowledge while reducing memory fragmentation and improving retrieval efficiency over extended operational periods.
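A consolidation pass of this kind can be sketched with plain string similarity. Everything below is a hypothetical illustration, not the OpenClaw API: near-duplicate memories are merged during an offline "dream" cycle, and each merge is logged to a human-readable diary.

```python
import difflib

def dream_cycle(memories: list[str], threshold: float = 0.8) -> tuple[list[str], list[str]]:
    """Merge near-duplicate memories; log each merge to a dream diary."""
    consolidated: list[str] = []
    diary: list[str] = []
    for mem in memories:
        # Find an already-kept memory that this one closely duplicates.
        match = next(
            (kept for kept in consolidated
             if difflib.SequenceMatcher(None, mem.lower(), kept.lower()).ratio() >= threshold),
            None,
        )
        if match is None:
            consolidated.append(mem)
        else:
            diary.append(f"merged duplicate: {mem!r} -> {match!r}")
    return consolidated, diary

mems = [
    "User prefers Python for backend work.",
    "User prefers Python for back-end work.",  # near-duplicate
    "Last project used FastAPI.",
]
kept, diary = dream_cycle(mems)
print(kept)   # two distinct memories survive
print(diary)  # one diary entry documenting the merge
```

A production system would use embedding similarity rather than character-level matching, and would summarize merged clusters instead of discarding the duplicate outright, but the shape of the pass (cluster, merge, log) is the same.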
A new category of tools has emerged focused specifically on providing persistent memory for agents:
These frameworks include memory as part of broader agent capabilities:
The retrieval layer underlying agent memory relies on efficient similarity search:
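At its simplest, this is an exact nearest-neighbor scan; the sketch below (toy 3-d vectors standing in for real embeddings) makes the baseline concrete. Production systems replace the linear scan with an approximate index such as HNSW, trading a little recall for sublinear query time.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query: list[float], corpus: dict[str, list[float]], top_k: int = 2) -> list[str]:
    """Exact (brute-force) top-k retrieval: score every stored vector."""
    ranked = sorted(corpus, key=lambda key: cosine(query, corpus[key]), reverse=True)
    return ranked[:top_k]

# Toy "embeddings"; a real system would use a model like text-embedding-3-small.
corpus = {
    "python tips":   [0.9, 0.1, 0.0],
    "fastapi guide": [0.8, 0.2, 0.1],
    "cooking pasta": [0.0, 0.1, 0.9],
}
print(nearest([1.0, 0.0, 0.0], corpus))  # ['python tips', 'fastapi guide']
```

The brute-force scan is O(n) per query in the number of stored memories, which is why graph-based indexes like HNSW dominate once memory stores grow beyond a few tens of thousands of entries.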