MRKL Systems

MRKL (Modular Reasoning, Knowledge and Language) systems are a neuro-symbolic architecture, proposed by Karpas et al. (2022) at AI21 Labs, that combines large language models with a set of discrete expert modules to solve complex tasks.1) The LLM serves as a router: it decomposes user queries, dispatches sub-tasks to specialized modules, and then synthesizes their outputs into a coherent response. MRKL systems are an early and influential formalization of tool-augmented AI agents.

Architecture

MRKL systems consist of three types of modules coordinated by a central router:

Neural Modules: Large language models (e.g., Jurassic-1, GPT-3) that handle natural language understanding and generation, and that serve as the routing backbone.

Symbolic Modules: Discrete, specialized tools for tasks requiring precise computation or external knowledge, such as a calculator for arithmetic, calendar/date utilities, database and knowledge-base lookups, and calls to external APIs.

Router (LLM-based): The central component that:

  1. Analyzes user input to identify required sub-tasks
  2. Selects appropriate expert modules for each sub-task
  3. Generates prompts for each module
  4. Collects and synthesizes results into a final response
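The four steps above can be sketched in miniature. In a real MRKL system the LLM itself performs the decomposition and module selection; here simple string rules stand in for it, and all function names are illustrative rather than taken from the paper:

```python
# Minimal sketch of the four router steps. Simple string rules stand in
# for the LLM; all names here are illustrative, not from the paper.

def decompose(query: str) -> list[str]:
    # Step 1: split a compound query into sub-tasks.
    return [part.strip() for part in query.split(" and ")]

def select_module(subtask: str) -> str:
    # Step 2: choose an expert module for each sub-task.
    if any(op in subtask for op in "+-*/"):
        return "calculator"
    return "knowledge"

def run_module(module: str, subtask: str) -> str:
    # Step 3: build a module-specific input and execute the module.
    if module == "calculator":
        return str(eval(subtask.split()[-1]))  # toy: last token is the expression
    return f"[knowledge lookup: {subtask}]"

def route(query: str) -> str:
    # Step 4: collect module outputs and synthesize a response.
    results = [run_module(select_module(s), s) for s in decompose(query)]
    return "; ".join(results)

print(route("explain MRKL and compute 2*4"))
# → [knowledge lookup: explain MRKL]; 8
```

The division of labor is the point: the router never computes anything itself; it only decides which module should.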

Key Properties

The original paper highlights several advantages of this design:

Safe fallback: if no expert module matches a query, the LLM can answer directly.

Robust extensibility: expert modules can be added, removed, or upgraded independently, without retraining the LLM.

Interpretability: the routing decision records which module produced each part of the answer.

Current and proprietary knowledge: symbolic modules can query up-to-date or private data sources the LLM was never trained on.

Modern Implementations

Jurassic-X (AI21 Labs): The first production implementation of MRKL, augmenting the Jurassic language model with symbolic expert modules and external knowledge sources.

The MRKL architecture directly influenced subsequent tool-augmented systems, including ReAct (interleaving reasoning steps with tool calls), Toolformer (fine-tuning LLMs to invoke tools), LangChain (whose early agent type was explicitly named after MRKL), and the function-calling interfaces of modern LLM APIs.

Code Example: Expert Routing Pattern

import math
import json
import datetime
 
 
def calculator_expert(query: str) -> str:
    allowed = set("0123456789+-*/.() ")
    expr = query.strip()
    if all(c in allowed for c in expr):
        return str(eval(expr))  # Restricted charset only; a toy, not a hardened sandbox (e.g. '9**9' still passes)
    return "Error: invalid expression"
 
 
def datetime_expert(query: str) -> str:
    now = datetime.datetime.now()
    if "date" in query.lower():
        return now.strftime("%Y-%m-%d")
    if "time" in query.lower():
        return now.strftime("%H:%M:%S")
    if "day" in query.lower():
        return now.strftime("%A")
    return f"Current datetime: {now.isoformat()}"
 
 
def knowledge_expert(query: str) -> str:
    kb = {
        "python": "Python is a high-level programming language created by Guido van Rossum in 1991.",
        "transformer": "The Transformer architecture was introduced in 'Attention Is All You Need' (2017).",
        "mrkl": "MRKL systems combine LLMs with specialized expert modules for modular reasoning.",
    }
    for key, value in kb.items():
        if key in query.lower():
            return value
    return "No relevant information found."
 
 
EXPERTS = {
    "calculator": {"fn": calculator_expert, "keywords": ["calculate", "compute", "math", "+", "-", "*", "/"]},
    "datetime": {"fn": datetime_expert, "keywords": ["date", "time", "day", "today", "now"]},
    "knowledge": {"fn": knowledge_expert, "keywords": ["what is", "tell me", "explain", "who"]},
}
 
 
def route_to_expert(query: str) -> tuple[str, str]:
    best_expert, best_score = "knowledge", 0
    for name, config in EXPERTS.items():
        score = sum(1 for kw in config["keywords"] if kw in query.lower())
        if score > best_score:
            best_expert, best_score = name, score
    result = EXPERTS[best_expert]["fn"](query)
    return best_expert, result
 
 
queries = [
    "Calculate 155 * 23 + 17",
    "What day is today?",
    "What is a transformer?",
    "Compute 3.14 * 100",
]
 
for q in queries:
    expert, answer = route_to_expert(q)
    print(f"Query: {q}\n  Routed to: {expert}\n  Answer: {answer}\n")

MRKL vs Modern Agent Architectures

Aspect        | MRKL (2022)                | Modern Agents (2025)
------------- | -------------------------- | ---------------------------------------
Routing       | LLM-based module selection | Function calling, MCP, semantic search
Modules       | Fixed set of expert tools  | Dynamic tool discovery, plugin systems
Communication | Custom prompts per module  | Standardized JSON schemas, protocols
Scale         | Handful of modules         | Hundreds of tools via registries
Learning      | No tool-use training       | Fine-tuned for tool use (Toolformer)
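The Communication row can be made concrete. Where MRKL's router composed a custom prompt per module, modern agents describe each tool with a machine-readable JSON schema that the model fills in. A minimal sketch for the calculator expert above (the exact field layout varies by provider; this follows the common function-calling convention):

```python
import json

# Illustrative tool description in the JSON-schema style used by modern
# function-calling APIs; the exact shape varies between providers.
calculator_tool = {
    "name": "calculator",
    "description": "Evaluate a basic arithmetic expression.",
    "parameters": {
        "type": "object",
        "properties": {
            "expression": {
                "type": "string",
                "description": "Expression to evaluate, e.g. '155 * 23 + 17'",
            }
        },
        "required": ["expression"],
    },
}

print(json.dumps(calculator_tool, indent=2))
```

Instead of keyword matching, the model itself selects a tool and emits arguments that validate against this schema.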

While the specific MRKL implementation has been superseded, its core insight, that LLMs should orchestrate specialized tools rather than trying to do everything themselves, became the foundational principle of modern agentic AI.

References

1) Karpas, E., et al. (2022). "MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning." AI21 Labs. arXiv:2205.00445.