
Agent Memory Persistence

Persisting agent state across sessions using database backends, file storage, vector store persistence, and conversation serialization. Comparison of approaches with implementation patterns.

Overview

Every time you start a new session with an LLM agent, it wakes up blank. It does not remember preferences, project context, decisions from last week, or anything from prior sessions. This “goldfish memory” problem is the single biggest practical limitation of LLM agents for real-world use.1)2)

Memory persistence solves this by storing agent state durably and retrieving it at session start. Durable storage of agent context, history, and learned behavior is a prerequisite for continual learning and adaptive agent behavior in open-world tasks.3) The right approach depends on your scale, query patterns, and whether you need semantic retrieval or structured lookups. In multi-tenant or collaborative environments, synchronization mechanisms for shared state become critical, enabling agents to access and update context collectively.4)

A specialized approach to memory persistence involves background agents that capture screen context and environment state to build a continuous memory of user activities and system state, maintaining awareness across sessions without explicit user intervention.5)

Enterprise platforms like the Gemini Enterprise Agent Platform offer integrated persistent memory banks—storage and retrieval systems that enable agents to maintain context and learning across sessions and interactions, supporting stateful agent operations at scale.6)

Types of Agent Memory

| Memory Type | Purpose | Retention | Example |
| --- | --- | --- | --- |
| Short-Term | Current conversation context | Session | Recent messages in chat |
| Long-Term | User preferences, learned facts | Indefinite | "User prefers Python over JS" |
| Episodic | Past interactions and task history | Time-decayed | "Fixed bug X on March 15" |
| Semantic | Domain knowledge and relationships | Indefinite | Entity relationships, concepts |

Memory Architecture

graph TD
    A[Agent Session Start] --> B[Load Memory]
    B --> C[Short-Term: Conversation Buffer]
    B --> D[Long-Term: Database/File]
    B --> E[Semantic: Vector Store]
    F[Agent Processing] --> G{Memory Write?}
    G -->|New Fact| D
    G -->|Embedding| E
    G -->|Context| C
    H[Agent Session End] --> I[Persist Short-Term Summary]
    I --> D
    I --> E
    subgraph Retrieval at Query Time
        J[User Query] --> K[Embed Query]
        K --> L[Vector Similarity Search]
        L --> M[Ranked Memories]
        J --> N[Key-Value Lookup]
        N --> O[Structured Facts]
        M --> P[Inject into Prompt]
        O --> P
    end
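The write path in the diagram can be sketched as a dispatcher that routes each memory event to the matching backend. This is a minimal illustration of the branching, not a framework API; the three sink callables and the event schema are assumptions.

```python
from typing import Callable

def make_memory_router(
    save_fact: Callable[[str], None],          # long-term store (database/file)
    index_embedding: Callable[[str, list[float]], None],  # vector store
    buffer_context: Callable[[str], None],     # short-term conversation buffer
):
    """Return a write function implementing the diagram's Memory Write branch."""
    def write(event: dict) -> str:
        if event.get("kind") == "fact":
            save_fact(event["content"])
            return "long_term"
        if event.get("kind") == "embedding":
            index_embedding(event["content"], event["vector"])
            return "semantic"
        # Anything else stays in the session buffer.
        buffer_context(event["content"])
        return "short_term"
    return write
```

At session end, the same sinks can be reused to persist a summary of the short-term buffer into the long-term and semantic stores.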

Approach Comparison

| Criterion | Database (Redis/PG/SQLite) | File Storage | Vector Store | Serialization |
| --- | --- | --- | --- | --- |
| Durability | High (ACID for PG/SQLite) | Medium (filesystem) | Medium (backend-dependent) | Low (app-level) |
| Scalability | High (sharding/clustering) | Low (file limits) | High (distributed) | Medium |
| Query Speed | 1-50 ms (indexed) | 1-100 ms (file ops) | 10-50 ms (similarity) | N/A (full load) |
| Semantic Search | No (without extension) | No | Yes | No |
| Complexity | Medium | Low | Medium | Low |
| Best For | Structured state, multi-agent | Episodic logs, prototypes | Semantic recall | Quick prototypes |

Database Backends

PostgreSQL with pgvector

Best for production systems needing both structured queries and semantic search.

Redis

Best for high-speed session state and short-term memory caching.7)
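A short-term session buffer on Redis can be sketched as a list per session with a TTL refreshed on every write, so active sessions stay warm and stale ones expire on their own. This assumes the redis-py client; the key scheme, TTL, and helper names are illustrative, not a standard.

```python
import json

SESSION_TTL_SECONDS = 3600  # illustrative: drop sessions idle for an hour

def session_key(agent_id: str, session_id: str) -> str:
    """Namespace keys per agent so multi-agent deployments don't collide."""
    return f"agent:{agent_id}:session:{session_id}"

class RedisShortTermMemory:
    def __init__(self, client):
        self.client = client  # a redis.Redis instance

    def append_message(self, agent_id: str, session_id: str, role: str, content: str):
        key = session_key(agent_id, session_id)
        entry = json.dumps({"role": role, "content": content})
        # RPUSH keeps messages in arrival order; EXPIRE refreshes the TTL
        # on every write so an active session never expires mid-conversation.
        pipe = self.client.pipeline()
        pipe.rpush(key, entry)
        pipe.expire(key, SESSION_TTL_SECONDS)
        pipe.execute()

    def recent(self, agent_id: str, session_id: str, n: int = 20) -> list[dict]:
        key = session_key(agent_id, session_id)
        raw = self.client.lrange(key, -n, -1)  # last n entries
        return [json.loads(m) for m in raw]
```

The pipeline makes the push and the TTL refresh a single round trip; without it, a crash between the two commands could leave a list with no expiry.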

SQLite

Best for single-agent systems, prototypes, and embedded deployments.
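For the single-agent case, long-term memory can be as small as one upserted key/value table in the standard-library sqlite3 module. This is a sketch, not a full store; the schema and class name are illustrative.

```python
import json
import sqlite3

class SQLiteMemory:
    """Minimal long-term key/value memory backed by SQLite."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS memories (
                   agent_id TEXT NOT NULL,
                   key TEXT NOT NULL,
                   value TEXT NOT NULL,  -- JSON-encoded
                   updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
                   PRIMARY KEY (agent_id, key)
               )"""
        )

    def remember(self, agent_id: str, key: str, value) -> None:
        # Upsert so a repeated write overwrites the old fact.
        self.conn.execute(
            """INSERT INTO memories (agent_id, key, value) VALUES (?, ?, ?)
               ON CONFLICT(agent_id, key) DO UPDATE SET
                   value = excluded.value, updated_at = CURRENT_TIMESTAMP""",
            (agent_id, key, json.dumps(value)),
        )
        self.conn.commit()

    def recall(self, agent_id: str, key: str, default=None):
        row = self.conn.execute(
            "SELECT value FROM memories WHERE agent_id = ? AND key = ?",
            (agent_id, key),
        ).fetchone()
        return json.loads(row[0]) if row else default
```

Passing a file path instead of `:memory:` makes the store durable across processes; `ON CONFLICT ... DO UPDATE` requires SQLite 3.24+.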

Implementation: Database-Backed Memory

import json
from typing import Optional

import asyncpg
from pgvector.asyncpg import register_vector  # maps Python lists <-> vector


class AgentMemoryStore:
    """Persistent agent memory using PostgreSQL with pgvector."""

    def __init__(self, dsn: str):
        self.dsn = dsn
        self.pool: Optional[asyncpg.Pool] = None

    async def initialize(self):
        # Create the schema on a plain connection first, then build the
        # pool with the vector codec registered on every connection so
        # Python lists can be bound to vector parameters.
        conn = await asyncpg.connect(self.dsn)
        try:
            await conn.execute("""
                CREATE EXTENSION IF NOT EXISTS vector;
                CREATE TABLE IF NOT EXISTS agent_memories (
                    id SERIAL PRIMARY KEY,
                    agent_id TEXT NOT NULL,
                    memory_type TEXT NOT NULL,
                    content TEXT NOT NULL,
                    embedding vector(1536),
                    metadata JSONB DEFAULT '{}',
                    created_at TIMESTAMPTZ DEFAULT NOW(),
                    accessed_at TIMESTAMPTZ DEFAULT NOW(),
                    relevance_score FLOAT DEFAULT 1.0
                );
                CREATE INDEX IF NOT EXISTS idx_memories_agent
                    ON agent_memories(agent_id, memory_type);
                CREATE INDEX IF NOT EXISTS idx_memories_embedding
                    ON agent_memories USING ivfflat (embedding vector_cosine_ops);
            """)
        finally:
            await conn.close()
        self.pool = await asyncpg.create_pool(self.dsn, init=register_vector)

    async def store(
        self,
        agent_id: str,
        content: str,
        memory_type: str = "long_term",
        embedding: Optional[list[float]] = None,
        metadata: Optional[dict] = None,
    ):
        async with self.pool.acquire() as conn:
            await conn.execute(
                """INSERT INTO agent_memories
                   (agent_id, memory_type, content, embedding, metadata)
                   VALUES ($1, $2, $3, $4, $5)""",
                agent_id, memory_type, content, embedding,
                json.dumps(metadata or {}),
            )

    async def recall_semantic(
        self,
        agent_id: str,
        query_embedding: list[float],
        limit: int = 10,
        score_threshold: float = 0.7,
    ) -> list[dict]:
        # <=> is pgvector's cosine-distance operator; 1 - distance yields
        # a similarity score for the threshold and ordering below.
        async with self.pool.acquire() as conn:
            rows = await conn.fetch(
                """SELECT content, metadata,
                       1 - (embedding <=> $2) AS similarity
                   FROM agent_memories
                   WHERE agent_id = $1
                     AND 1 - (embedding <=> $2) > $3
                   ORDER BY similarity DESC
                   LIMIT $4""",
                agent_id, query_embedding, score_threshold, limit,
            )
            return [dict(r) for r in rows]

    async def recall_recent(
        self, agent_id: str, memory_type: str, limit: int = 20
    ) -> list[dict]:
        async with self.pool.acquire() as conn:
            rows = await conn.fetch(
                """SELECT content, metadata, created_at
                   FROM agent_memories
                   WHERE agent_id = $1 AND memory_type = $2
                   ORDER BY created_at DESC LIMIT $3""",
                agent_id, memory_type, limit,
            )
            return [dict(r) for r in rows]

File-Based Memory

The simplest approach: store memory as human-readable markdown files. Used by soul.py and Claude's MEMORY.md pattern.8)9)

Structure: SOUL.md (identity/persona) + MEMORY.md (curated long-term facts) + memory/YYYY-MM-DD.md (daily session logs).

Vector Store Persistence

Persist embeddings for semantic retrieval using Qdrant, Chroma, pgvector, or similar.
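What these stores provide can be illustrated with a deliberately tiny stdlib version: embeddings written to a JSON file on every add, reloaded on startup, and queried by cosine similarity. This is not a real vector database, only a sketch of the persistence-and-recall loop; production systems should use Qdrant, Chroma, or pgvector, which add indexing, filtering, and concurrency.

```python
import json
import math
from pathlib import Path

class TinyVectorStore:
    """Illustrative persisted vector store: JSON file + brute-force cosine."""

    def __init__(self, path: str):
        self.path = Path(path)
        # Reload previously persisted memories, if any.
        self.records = json.loads(self.path.read_text()) if self.path.exists() else []

    def add(self, text: str, embedding: list[float]) -> None:
        self.records.append({"text": text, "embedding": embedding})
        self.path.write_text(json.dumps(self.records))  # persist on every write

    def search(self, query: list[float], k: int = 5) -> list[tuple[float, str]]:
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0
        scored = [(cosine(query, r["embedding"]), r["text"]) for r in self.records]
        return sorted(scored, reverse=True)[:k]
```

Brute-force search is O(n) per query, which is exactly the cost that approximate-nearest-neighbor indexes (HNSW, IVF) in real vector stores avoid.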

Conversation Serialization

Dump full chat histories or agent states to JSON/YAML for reload.
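A minimal snapshot/restore round trip can be sketched with a dataclass and the json module. The state fields here are illustrative, not a specific framework's schema; real agent state may also need tool-call records and versioning for schema changes.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class AgentState:
    """Illustrative serializable agent state."""
    agent_id: str
    messages: list[dict] = field(default_factory=list)  # chat history
    facts: dict = field(default_factory=dict)           # learned key/value facts

def save_state(state: AgentState, path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(state), f, indent=2)

def load_state(path: str) -> AgentState:
    with open(path, encoding="utf-8") as f:
        return AgentState(**json.load(f))
```

Because everything ends up as plain JSON, snapshots diff cleanly in version control, but a full load on every query is why the comparison table rates this approach N/A for query speed.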

Memory Lifecycle Management

Frameworks

See Also

References

8) Themenon Lab. "soul.py: Persistent Memory for LLM Agents." themenonlab.blog
10) Vectorize. "Best AI Agent Memory Systems." vectorize.io