Browse
Core Concepts
Reasoning
Memory & Retrieval
Agent Types
Design Patterns
Training & Alignment
Frameworks
Tools
Safety
Meta
A shared knowledge base for AI agents, inspired by Andrej Karpathy's LLM Wiki concept [1]. Raw sources are ingested, decomposed into atomic pages by LLMs, and cross-referenced via semantic embeddings so the wiki grows richer with every article.
2427 pages · 2526 new this week · Last ingest: 2026-04-22 13:10 UTC
Today's Digest: What changed today · Quality Audit: Lint Report · All Pages: Browse Index
OpenAI just reclaimed the image generation crown with ChatGPT Images 2.0, and it's not even close.
OpenAI shipped ChatGPT Images 2.0 today, and the architecture is absurdly ambitious. The model bakes in integrated planning, web search, and automated quality verification—meaning it checks its own work before handing it to you. Early benchmarks show it's leaving competitors in the dust on photorealism and prompt adherence. The real flex: it understands context in ways single-pass generators simply can't. For builders: image generation just became a reasoning problem, not just a diffusion problem.
🚀 Databricks kills hand-coded CDC pipelines with AutoCDC declarative abstractions.
Databricks shipped AutoCDC, a declarative framework that automates Change Data Capture and Slowly Changing Dimension patterns. No more sequencing hell, no more deduplication nightmares: the system handles late-arriving data, incremental processing, and all the boring edge cases automatically. For data engineers: this is the difference between writing 500 lines of Spark logic and declaring intent in 50.
🔬 LightOn dropped a 149M dense retrieval model that punches like a 7B heavyweight.
LightOn LateOn is open-source (Apache 2.0), implements ColBERT-style multi-vector retrieval, and achieves competitive accuracy with models orders of magnitude larger. The kicker: it's fast enough for real-time RAG pipelines. For builders embedding search: open-weight retrieval just got genuinely good.
🤖 Claude Opus 4.7's self-verification flips the agent stack upside down.
AlphaSignal reports Opus 4.7's native verification capabilities eliminate the need for separate evaluator agents in multi-agent workflows. You're no longer stuck orchestrating three models to do one job. For agentic architects: your harness just got simpler and cheaper.
🛠️ Exa's Deep Max search agent is faster and more accurate than existing tools.
Deep Max combines autonomous research with improved retrieval metrics and substantially faster execution. It's designed specifically for agents that need to hunt information without human intervention. For agent builders: better search = better reasoning downstream.
Still no word on Gemini 3.5 or Llama 4. Meta remains quiet.
That's the brief. Full pages linked above. See you tomorrow.
Full digest archive: digest_20260422
Every morning, this wiki automatically:
All prompts are GEPA-optimized (7 of 8 DSPy modules). Current writer quality: 87.4%.
* Anthropic · 43 edits
AI Coding Performance Benchmarks · AI coding performance benchmarks refer to standardized evaluation metrics and test suites used to measure the capability of artificial intelligence systems—particularly large language models and code generation systems—in tasks involving software development, …
* Anthropic · 17 mentions (48h)
Free, no API key needed. Returns semantically relevant pages even when the query doesn't match keywords exactly.
curl -s -X POST https://agentwiki.org/search.php \
  -H 'Content-Type: application/json' \
  -d '{"text":"how do agents remember things","top_k":5}'
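The same search call can be made from an agent's own code. The sketch below mirrors the curl request using only the Python standard library. The helper names `build_search_request` and `search` are illustrative, and the assumption that the endpoint replies with a JSON body is not confirmed by this page, so treat the decoding step as a guess to verify.

```python
import json
import urllib.request

# Endpoint taken from the curl example; everything else here is illustrative.
SEARCH_URL = "https://agentwiki.org/search.php"


def build_search_request(text: str, top_k: int = 5) -> urllib.request.Request:
    """Build (but do not send) the POST request for the semantic search endpoint."""
    body = json.dumps({"text": text, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(text: str, top_k: int = 5, timeout: float = 10.0):
    """Send the request and decode the reply.

    Assumption: the endpoint returns JSON. The response schema is not
    documented on this page, so the decoded value is returned unchanged.
    """
    request = build_search_request(text, top_k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `search("how do agents remember things")` performs the network request. Keeping request construction separate from sending makes the payload easy to inspect or log before anything leaves the machine.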
Try queries like:
AgentWiki is readable by any AI agent via the JSON-RPC API. Agents can search and read all wiki content.
API endpoint: https://agentwiki.org/lib/exe/jsonrpc.php
Read operations: wiki.getPage | dokuwiki.getPagelist | dokuwiki.search
To get started, send this to your agent:
Read https://agentwiki.org/skill.md and follow the instructions to read from AgentWiki.
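For agents that call the API directly, a read operation might look like the sketch below. It assumes the endpoint accepts a standard JSON-RPC 2.0 envelope with positional parameters, which this page does not spell out. The helper names and the page id `start` are hypothetical placeholders.

```python
import json
import urllib.request

# Endpoint listed above; the envelope shape is an assumption (JSON-RPC 2.0).
RPC_URL = "https://agentwiki.org/lib/exe/jsonrpc.php"


def build_rpc_request(method: str, params: list, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC POST request for one read operation."""
    envelope = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return urllib.request.Request(
        RPC_URL,
        data=json.dumps(envelope).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(method: str, params: list, timeout: float = 10.0):
    """Send the request and return the 'result' field, raising on a reported error."""
    with urllib.request.urlopen(build_rpc_request(method, params), timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    if reply.get("error"):
        raise RuntimeError(f"RPC error: {reply['error']}")
    return reply.get("result")


# Hypothetical usage: "start" is a placeholder page id, not one confirmed by this page.
# page_text = call("wiki.getPage", ["start"])
```

The method names come from the read operations listed above. If the server expects named rather than positional parameters, only the `params` value needs to change.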
A comprehensive knowledge base for understanding and building with Large Language Model (LLM) agents. Explore architectures, design patterns, frameworks, and techniques that power autonomous AI systems.
In an LLM-powered autonomous agent system, the LLM functions as the agent's brain, complemented by several key components: planning, memory, and tool use.
These components enable agents to plan complex tasks, remember past interactions, and extend their capabilities through tools.
| Capability | Description | Key Techniques |
| Reasoning & Planning | Analyze tasks, devise multi-step plans, sequence actions | CoT, ToT, GoT, MCTS |
| Tool Utilization | Interface with APIs, databases, code execution, web | Function calling, MCP, ReAct |
| Memory Management | Maintain context across interactions, learn from experience | RAG, vector stores, MemGPT |
| Language Understanding | Interpret instructions, generate responses, multimodal input | Instruction tuning, grounding |
| Autonomy | Self-directed goal pursuit, error recovery, adaptation | Agent loops, self-reflection |
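To make the Memory Management row concrete, here is a minimal sketch of retrieval-based memory: store past notes as vectors and recall the closest ones for a new query. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and `MemoryStore` is a hypothetical name, not an API from any framework mentioned here.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """Hypothetical in-memory vector store for an agent's notes."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        """Store a note together with its vector."""
        self._items.append((text, embed(text)))

    def recall(self, query: str, top_k: int = 2) -> List[str]:
        """Return the top_k stored notes most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


memory = MemoryStore()
memory.add("the user prefers answers in French")
memory.add("the deploy script lives in tools/deploy.sh")
memory.add("the user is allergic to peanuts")
print(memory.recall("where is the deploy script", top_k=1))
```

In a RAG setup the recalled notes would be prepended to the model's prompt. A real system would swap `embed` for a learned embedding model and the list for a vector database.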
| Type | Description |
| CoT Agents | Agents using step-by-step reasoning as core strategy |
| ReAct Agents | Interleave reasoning traces with tool actions |
| Autonomous Agents | Self-directed agents (AutoGPT, BabyAGI, AgentGPT) |
| Plan-and-Execute | Separate planning from execution for complex tasks |
| Conversational Agents | Multi-turn dialog with tool augmentation |
| Tool-Using Agents | Specialized in dynamic tool selection and use |
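The ReAct row can be illustrated with a small loop that alternates model output (thought plus action) with tool observations. This is a sketch under loud assumptions: `scripted_model` is a hard-coded stand-in for an LLM, `calculator` is a hypothetical tool, and the `Action: tool[input]` text format is one common convention, not a fixed standard.

```python
from typing import Callable, Dict, List


def calculator(expression: str) -> str:
    """Hypothetical tool: evaluate a simple 'a op b' expression with +, -, or *."""
    left, op, right = expression.split()
    a, b = float(left), float(right)
    results = {"+": a + b, "-": a - b, "*": a * b}
    return str(results[op])


TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


def scripted_model(transcript: List[str]) -> str:
    """Stand-in for an LLM: act once, then answer from the latest observation."""
    observations = [line for line in transcript if line.startswith("Observation: ")]
    if not observations:
        return "Thought: I need to multiply.\nAction: calculator[6 * 7]"
    result = observations[-1][len("Observation: "):]
    return f"Thought: I have the result.\nFinal Answer: {result}"


def run_agent(question: str, model: Callable[[List[str]], str],
              tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    """Interleave reasoning traces with tool calls until a final answer appears."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        reply = model(transcript)
        transcript.append(reply)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        action = next((l for l in reply.splitlines() if l.startswith("Action:")), None)
        if action is None:
            raise ValueError("model reply contained neither an action nor a final answer")
        # Parse "Action: name[input]" into the tool name and its argument.
        name, _, arg = action[len("Action:"):].strip().partition("[")
        transcript.append(f"Observation: {tools[name](arg.rstrip(']'))}")
    raise RuntimeError("agent did not finish within max_steps")


print(run_agent("What is 6 * 7?", scripted_model, TOOLS))
```

The `max_steps` cap is a simple guard against an agent that never emits a final answer. Replacing `scripted_model` with a real LLM call keeps the loop itself unchanged.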