Voyager is an LLM-powered embodied agent by Wang et al. (2023) that achieves lifelong learning in Minecraft through three interconnected components: an automatic curriculum, an ever-growing skill library, and an iterative code generation loop. Without any model fine-tuning or gradient updates, Voyager queries GPT-4 as a black box to explore, acquire skills, and compose increasingly complex behaviors. It obtains 3.3x more unique items, travels 2.3x longer distances, and unlocks technology tree milestones up to 15.3x faster than prior approaches.
Voyager's architecture integrates three modules that together enable open-ended exploration:
The curriculum module proposes exploration goals by analyzing what the agent has not yet encountered:
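As a rough illustration of "propose what has not yet been encountered", the sketch below uses a rule-based stand-in. It is not Voyager's method (Voyager prompts GPT-4 for the next task). The `PREREQUISITES` table, the item names, and `propose_next_goal` are all hypothetical and exist only for this example.

```python
from typing import Dict, List, Optional, Set

# Hypothetical prerequisite table; item names and dependencies are illustrative only.
PREREQUISITES: Dict[str, List[str]] = {
    "log": [],
    "planks": ["log"],
    "crafting_table": ["planks"],
    "wooden_pickaxe": ["planks", "crafting_table"],
    "cobblestone": ["wooden_pickaxe"],
    "stone_pickaxe": ["cobblestone", "crafting_table"],
}


def propose_next_goal(obtained: Set[str]) -> Optional[str]:
    """Return a goal for the first unobtained item whose prerequisites are met."""
    for item, needs in PREREQUISITES.items():
        if item not in obtained and all(n in obtained for n in needs):
            return f"obtain {item}"
    return None  # nothing new is reachable from the current state


print(propose_next_goal(set()))              # obtain log
print(propose_next_goal({"log", "planks"}))  # obtain crafting_table
```

An LLM-driven curriculum replaces the fixed table with a prompt describing the agent's inventory and surroundings, but the contract is the same: current state in, next goal out.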
A persistent, ever-growing repository of executable code skills:
A feedback-driven loop for synthesizing new skills:
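```python
# Simplified Voyager-style skill generation and retrieval loop.
# get_embedding, format_context, gpt4_generate, and execute_in_minecraft are
# placeholders for an embedding model, a prompt builder, an LLM call, and the
# game environment; they are not defined here and must be supplied separately.
from typing import Dict, List, Optional

import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


class SkillLibrary:
    def __init__(self):
        self.skills: Dict[str, dict] = {}
        self.embeddings: Dict[str, np.ndarray] = {}

    def add_skill(self, name: str, code: str, description: str):
        # Store the code and index it by an embedding of its description.
        self.skills[name] = {"code": code, "description": description}
        self.embeddings[name] = get_embedding(description)

    def retrieve(self, query: str, top_k: int = 5) -> List[dict]:
        # Rank stored skills by similarity to the query and return the top k.
        query_emb = get_embedding(query)
        scores = {
            name: cosine_similarity(query_emb, emb)
            for name, emb in self.embeddings.items()
        }
        top_names = sorted(scores, key=scores.get, reverse=True)[:top_k]
        return [self.skills[n] for n in top_names]


def iterative_code_generation(goal: str, library: SkillLibrary,
                              env_state: dict,
                              max_retries: int = 5) -> Optional[str]:
    similar_skills = library.retrieve(goal)
    context = format_context(env_state, similar_skills)
    for attempt in range(max_retries):
        code = gpt4_generate(goal, context)
        success, feedback = execute_in_minecraft(code)
        if success:
            # Only verified code is added to the library.
            library.add_skill(goal, code, description=goal)
            return code
        # Feed the failure back so the next attempt can correct it.
        context += f"\nAttempt {attempt+1} failed: {feedback}"
    return None
```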
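To see the retrieve-then-retry behaviour of the listing above without an LLM or a game server, the following toy version swaps in stand-ins. Everything in it is an assumption made for illustration: `embed` is a bag-of-words counter rather than a real embedding model, `fake_generate` imitates an LLM that only corrects itself after seeing feedback, and `fake_execute` imitates the environment.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(skills: Dict[str, str], query: str, top_k: int = 2) -> List[str]:
    # Rank skill names by similarity to the query.
    q = embed(query)
    return sorted(skills, key=lambda name: cosine(q, embed(name)),
                  reverse=True)[:top_k]


def fake_generate(goal: str, context: str) -> str:
    # Stand-in for the LLM: only "fixes" the code once feedback is in context.
    return "mine(3)" if "failed" in context else "mine(0)"


def fake_execute(code: str) -> Tuple[bool, str]:
    # Stand-in for the environment: mining zero blocks counts as failure.
    return (code != "mine(0)", "no blocks mined")


def generate_with_retries(goal: str, skills: Dict[str, str],
                          max_retries: int = 3) -> Optional[str]:
    context = "Known skills: " + ", ".join(retrieve(skills, goal))
    for attempt in range(max_retries):
        code = fake_generate(goal, context)
        ok, feedback = fake_execute(code)
        if ok:
            skills[goal] = code  # store only verified code
            return code
        context += f"\nAttempt {attempt + 1} failed: {feedback}"
    return None


skills = {"mine wood log": "mine(1)", "craft planks": "craft('planks')"}
print(retrieve(skills, "mine iron ore", top_k=1))     # ['mine wood log']
print(generate_with_retries("mine iron ore", skills))  # mine(3)
print(sorted(skills))
```

The first attempt fails, its feedback is appended to the context, and the second attempt succeeds. The new skill is stored only after it passes execution.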
Voyager was evaluated in the MineDojo framework against ReAct, Reflexion, and AutoGPT baselines:
| Metric | Voyager | Best Baseline | Improvement |
|---|---|---|---|
| Unique items obtained | 63 | 19 (AutoGPT) | 3.3x |
| Travel distance | 2300+ blocks | 1000 blocks | 2.3x |
| Wooden tools (time) | 2 min | 30.6 min (ReAct) | 15.3x faster |
| Stone tools (time) | 5 min | 42.5 min | 8.5x faster |
| Iron tools (time) | 15 min | 96 min | 6.4x faster |
| Diamond tools | Achieved | Not achieved | Unique to Voyager |
Among the agents compared, Voyager is the only one to unlock the Minecraft technology tree all the way to diamond-level tools.
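The improvement column can be reproduced from the other two columns. The short check below copies the figures from the table and assumes the factor is Voyager divided by baseline for "more is better" metrics, and baseline divided by Voyager for times. The travel distance is listed as "2300+", so its factor is a lower bound.

```python
# Figures copied from the table above as (Voyager, best baseline).
more_is_better = {
    "Unique items obtained": (63, 19),
    "Travel distance (blocks)": (2300, 1000),
}
less_is_better = {  # minutes to reach each tool tier
    "Wooden tools": (2, 30.6),
    "Stone tools": (5, 42.5),
    "Iron tools": (15, 96),
}


def improvement_factors() -> dict:
    factors = {}
    for name, (voyager, baseline) in more_is_better.items():
        factors[name] = round(voyager / baseline, 1)
    for name, (voyager, baseline) in less_is_better.items():
        factors[name] = round(baseline / voyager, 1)
    return factors


for name, factor in improvement_factors().items():
    print(f"{name}: {factor}x")
```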
The lifelong learning paradigm enables continuous improvement:
<latex>\mathcal{S}_{t+1} = \mathcal{S}_t \cup \{s_{\text{new}}\} \text{ where } s_{\text{new}} = \text{verify}(\text{generate}(g_t, \mathcal{S}_t, o_t))</latex>
where <latex>\mathcal{S}_t</latex> is the skill library at time <latex>t</latex>, <latex>g_t</latex> is the curriculum-proposed goal, <latex>o_t</latex> is the environment observation, and <latex>s_{\text{new}}</latex> is the generated skill after it passes verification. The library grows monotonically: skills are only added, never removed. Because new skills can call existing ones, skills compound, enabling behaviors that no single generation step could produce.
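The update rule can be read as a single function over sets. The sketch below makes one assumption the formula leaves open: when verification fails, `verify` returns `None` and the library is left unchanged. The `generate` and `verify` callables and the toy versions passed in are hypothetical placeholders.

```python
from typing import Callable, Optional, Set


def update_library(library: Set[str], goal: str, observation: str,
                   generate: Callable[[str, Set[str], str], str],
                   verify: Callable[[str], Optional[str]]) -> Set[str]:
    """One step of S_{t+1} = S_t ∪ {s_new}; a failed verify adds nothing."""
    candidate = generate(goal, library, observation)
    skill = verify(candidate)
    return library | {skill} if skill is not None else library


def toy_generate(goal: str, library: Set[str], observation: str) -> str:
    # Placeholder for LLM code generation.
    return f"def {goal}(): pass"


def toy_verify(candidate: str) -> Optional[str]:
    # Placeholder for in-environment verification: accept only crafting skills.
    return candidate if "craft" in candidate else None


lib: Set[str] = set()
for g in ["craft_planks", "fly_to_moon", "craft_table"]:
    new_lib = update_library(lib, g, "forest biome", toy_generate, toy_verify)
    assert lib <= new_lib  # monotonic: nothing is ever removed
    lib = new_lib
print(len(lib))  # 2
```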