====== Cheshire Cat AI ======

**Cheshire Cat AI** is an open-source framework for building custom AI agents as microservices(([[https://github.com/cheshire-cat-ai/core|Cheshire Cat AI on GitHub]])). It provides a ready-made architecture with a plugin system, long-term memory (episodic, declarative, procedural), built-in RAG, multi-LLM support, and an API-first design -- all running in a single Docker container.

{{tag>ai_agent framework microservice plugin rag memory docker open_source}}

| **Repository** | [[https://github.com/cheshire-cat-ai/core]] |
| **Website** | [[https://cheshirecat.ai]] |
| **Language** | Python |
| **License** | GPL-3.0 |
| **Creator** | Piero Savastano (pieroit) |
| **Port** | 1865 (default) |

===== Overview =====

Cheshire Cat AI lets developers build custom AI agents in minutes rather than months. Instead of wiring together LLMs, vector databases, memory systems, and APIs from scratch, the framework provides all of these as a cohesive microservice(([[https://cheshire-cat-ai.github.io/docs/|Cheshire Cat Documentation]])). Developers extend functionality through a simple plugin system using hooks, tools, and forms -- no complex OOP required. The framework is fully Dockerized, API-first, and language-model agnostic.
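Because the framework is API-first, any application can chat with the Cat over its WebSocket endpoint. The sketch below shows the message framing only, assuming the default ''ws://localhost:1865/ws'' endpoint and the JSON shapes used by the chat protocol (the helper names ''build_chat_message'' and ''extract_reply'' are illustrative, not part of the framework):

```python
import json


def build_chat_message(text: str) -> str:
    """Serialize a user utterance into the JSON payload sent over /ws."""
    return json.dumps({"text": text})


def extract_reply(raw: str) -> str:
    """Return the assistant text from a WebSocket frame, or "" for
    non-chat frames (e.g. token-streaming or notification messages)."""
    msg = json.loads(raw)
    return msg.get("content", "") if msg.get("type") == "chat" else ""


# With a WebSocket client library (e.g. `websockets`), usage would look like:
#   await ws.send(build_chat_message("What is in my documents?"))
#   reply = extract_reply(await ws.recv())
```

Filtering on the frame ''type'' matters because the server also streams partial tokens and notifications on the same socket.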
===== Key Features =====

  * **Plugin Architecture** -- Extend with hooks (event handlers), tools (custom functions), and forms; one-click install from the community registry
  * **Long-Term Memory** -- Episodic (chat history), declarative (uploaded documents), and procedural (tools and plugins) memory
  * **Working Memory** -- Temporary session storage for state machines and cross-plugin data sharing
  * **Built-in RAG** -- Retrieval-augmented generation using the Qdrant vector database(([[https://qdrant.tech/documentation/frameworks/cheshire-cat/|Qdrant Integration Guide]]))
  * **Multi-Modal RAG** -- Build knowledge bases from PDFs, text files, Markdown, URLs, and more
  * **LLM Agnostic** -- Supports OpenAI, Anthropic, Cohere, Hugging Face, Ollama, vLLM, and custom providers
  * **API-First** -- REST and WebSocket endpoints with token streaming; community clients in multiple languages
  * **Multi-Tenancy** -- Manage multiple chatbots with separate settings, plugins, and LLMs
  * **Admin Panel** -- Web-based configuration for LLMs, embedders, plugins, and memory management
  * **Live Reload** -- Development-friendly hot reloading of plugins

===== Architecture =====

<code>
graph TD
    A[Client Application] --> B{API Layer}
    B --> C[REST Endpoints]
    B --> D[WebSocket Chat]
    C --> E[Cheshire Cat Core]
    D --> E
    E --> F[Agent System]
    F --> G{Memory System}
    G --> H[Episodic Memory]
    G --> I[Declarative Memory]
    G --> J[Procedural Memory]
    G --> K[Working Memory]
    H --> L[Qdrant Vector DB]
    I --> L
    J --> L
    F --> M{LLM Provider}
    M --> N[OpenAI]
    M --> O[Anthropic]
    M --> P[Ollama / vLLM]
    M --> Q[Cohere / HuggingFace]
    F --> R[Plugin System]
    R --> S[Hooks]
    R --> T[Tools]
    R --> U[Forms]
    E --> V[RAG Pipeline]
    V --> W[Document Ingestion]
    W --> X[PDF / TXT / MD / URL]
    V --> L
    E --> Y[Admin Panel :1865/admin]
</code>

===== Plugin System =====

Plugins are simple Python folders placed in ''cat/plugins/''.
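Each plugin folder carries a small metadata file. A minimal sketch of a ''plugin.json'' (the field values here are illustrative):

```json
{
  "name": "Weather Plugin",
  "description": "Adds a get_weather tool to the Cat.",
  "version": "0.1.0"
}
```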
A minimal plugin requires:

  * ''plugin.json'' -- Metadata (name, description, version)
  * A Python file with hooks, tools, or forms

<code python>
# Example: Custom hook to short-circuit the agent when recall finds nothing
from cat.mad_hatter.decorators import hook

@hook
def agent_fast_reply(fast_reply, cat):
    # If no declarative memories were recalled, answer without calling the LLM
    if len(cat.working_memory.declarative_memories) == 0:
        fast_reply["output"] = "Sorry, I don't know the answer."
    return fast_reply
</code>

<code python>
# Example: Custom tool the agent can invoke
from cat.mad_hatter.decorators import tool

@tool
def get_weather(location: str, cat) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: Sunny, 22 C"
</code>

===== Memory System =====

| **Memory Type** | **Purpose** | **Storage** |
| Episodic | Chat history and conversation context | Qdrant vectors |
| Declarative | Uploaded documents and external knowledge | Qdrant vectors |
| Procedural | Tools, plugins, and learned procedures | Qdrant vectors |
| Working | Temporary session data, state machines | In-memory (per session) |

All long-term memories support export/import for backup and migration.

===== Installation =====

<code bash>
# Quick start with Docker
docker run --rm -it -p 1865:80 ghcr.io/cheshire-cat-ai/core:latest

# Access points:
#   Admin Panel: http://localhost:1865/admin
#   API Docs:    http://localhost:1865/docs
#   WebSocket:   ws://localhost:1865/ws
</code>

===== Integration =====

Cheshire Cat integrates with the broader infrastructure as a microservice:((Cheshire Cat AI. Official Website. [[https://cheshirecat.ai|cheshirecat.ai]]))

  * **Reverse Proxies** -- Caddy, Nginx, Traefik
  * **Vector Databases** -- Qdrant (default), extensible
  * **LLM Runners** -- Ollama, vLLM for self-hosted models
  * **Applications** -- Django, WordPress, custom apps via REST/WebSocket API

===== See Also =====

  * [[mobile_agent]] -- GUI agent for mobile and desktop automation
  * [[plandex]] -- AI coding agent with plan/apply workflow
  * [[gptme]] -- Terminal agent with local tools and RAG

===== References =====