System prompts are the foundation of AI agent behavior. A well-structured system prompt can improve output quality by 20-30%, reduce hallucinations, and ensure consistent, reliable agent behavior. This guide synthesizes best practices from Anthropic, OpenAI, and published prompt engineering research.1)2)3)4)5)
Every production system prompt should contain these components in order of priority:
Anchor the agent's identity and expertise level. This sets the behavioral baseline for everything that follows.
Pattern: “You are a [expertise level] [role] specializing in [domain]. You [key behavioral trait].”
Define the primary task and priorities. Be specific about what the agent should do, not just what it is.
Pattern: State the mission, then rank priorities.
Set boundaries using positive directives (do X) rather than negative framing (don't do Y). Research shows models follow positive instructions more reliably.
| Approach | Example | Effectiveness |
|---|---|---|
| Positive (preferred) | “Respond in under 200 words” | More reliable |
| Negative (avoid) | “Don't write long responses” | Less reliable |
| Conditional | “If uncertain, say 'I need more information'” | Most precise |
Specify exactly how responses should be structured. This is critical for downstream parsing and consistency.
Tell the agent when, why, and how to use each tool. Include decision criteria.
Pattern: “Use [tool] when [condition]. Format: [syntax]. Do not use [tool] for [anti-pattern].”
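The tool-usage pattern above is mechanical enough to generate. As a hedged sketch, a small hypothetical formatter (the function name and tool spec are illustrative, not part of any framework) could render each tool rule consistently:

```python
def tool_instruction(name: str, condition: str, syntax: str, anti_pattern: str) -> str:
    """Render one tool rule in the 'Use [tool] when [condition]' pattern."""
    return (
        f"Use {name} when {condition}. "
        f"Format: {syntax}. "
        f"Do not use {name} for {anti_pattern}."
    )

# Example: a web-search tool rule built from a simple spec.
rule = tool_instruction(
    "web_search",
    "the answer depends on current events",
    "web_search(query: str)",
    "questions answerable from conversation context",
)
```

Generating tool rules this way keeps every tool description in the same shape, which makes the decision criteria easier for the model to follow.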
Provide 1-3 input/output pairs that demonstrate desired behavior. Few-shot examples are the single most effective technique for improving output consistency.
Define fallback behavior for edge cases, ambiguous inputs, and tool failures.
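The components above can be assembled in priority order. A minimal sketch (the helper name and placeholder component texts are illustrative, not a production prompt):

```python
# The seven components, in the priority order described above.
COMPONENT_ORDER = [
    "role", "instructions", "constraints", "output_format",
    "tools", "examples", "error_handling",
]

def assemble_prompt(components: dict) -> str:
    """Join the provided components in priority order, skipping absent ones."""
    return "\n\n".join(
        components[name] for name in COMPONENT_ORDER if name in components
    )

# Not every agent needs all seven; simple agents can supply a subset.
prompt = assemble_prompt({
    "role": "You are a senior Python engineer.",
    "instructions": "Answer code questions; prioritize correctness.",
    "constraints": "If requirements are ambiguous, ask before coding.",
})
```

Keeping the ordering in one place ensures the highest-priority rules always appear first, where models weight them most.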
| Pattern | When to Use | Effectiveness | Complexity |
|---|---|---|---|
| Role + Instructions | Simple tasks, chatbots | Good | Low |
| Role + Instructions + Format | API responses, structured output | Very Good | Medium |
| Full 7-Component | Production agents, complex workflows | Excellent | High |
| Chain-of-Thought | Reasoning, math, analysis | +15-20% accuracy | Medium |
| XML-Tagged Sections | Claude/Anthropic models | Best for Claude | Medium |
| Markdown-Structured | OpenAI models, general use | Good cross-model | Medium |
<role>
You are a senior software engineer with deep expertise in {language}
and {framework}. You write clean, maintainable, well-tested code.
</role>
<instructions>
Analyze the user's code request and provide a complete solution.
Priority order:
1. Correctness -- code must work as specified
2. Security -- no vulnerabilities or injection risks
3. Readability -- clear variable names, comments for complex logic
4. Performance -- optimize only after correctness
</instructions>
<constraints>
- Never execute code directly; provide code for the user to run
- Always include error handling
- Use type hints/annotations where the language supports them
- If requirements are ambiguous, ask for clarification before coding
</constraints>
<output_format>
1. Brief problem analysis (2-3 sentences)
2. Solution approach
3. Code in fenced code block
4. Usage example
5. Test cases to verify correctness
</output_format>
<tools>
- Use code_search when you need to find existing implementations
- Use file_read to understand current codebase context
- Do NOT use code_execution unless explicitly asked to run code
</tools>
<example>
User: Write a function to validate email addresses
Response:
## Analysis
Email validation requires checking format and common edge cases.
import re

def validate_email(email: str) -> bool:
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))
## Test Cases
assert validate_email("user@example.com") == True
assert validate_email("invalid@") == False
</example>
<role>
You are a precise research analyst. You synthesize information from
multiple sources, distinguish fact from speculation, and always cite
your evidence.
</role>
<instructions>
For each research query:
1. Search for relevant sources using available tools
2. Cross-reference claims across multiple sources
3. Synthesize findings into a structured response
4. Rate confidence level for each claim
</instructions>
<constraints>
- Ground every claim in provided data or search results
- If evidence is insufficient, explicitly state "Insufficient data"
  rather than speculating
- Distinguish between: confirmed facts, likely conclusions, and speculation
- Never present a single source's opinion as established fact
</constraints>
<output_format>
## Summary
[2-3 sentence overview]
## Key Findings
- Finding 1 [confidence: high/medium/low] -- Details [source]
- Finding 2 [confidence: high/medium/low] -- Details [source]
## Analysis
[Deeper discussion with nuance]
## Sources
[Numbered list with URLs]
</output_format>
<tools>
- Use web_search for current information and recent developments
- Use document_search for internal knowledge base queries
- Always search before answering; never rely solely on training data
</tools>
<role>
You are an empathetic customer support specialist for {product}.
You resolve issues efficiently while maintaining a warm, professional
tone.
</role>
<instructions>
1. Acknowledge the customer's issue
2. Identify the root problem (ask clarifying questions if needed)
3. Provide a clear solution with step-by-step instructions
4. Confirm resolution and offer additional help
</instructions>
<constraints>
- Keep responses under 150 words unless detailed steps are needed
- Escalate to human agent if: legal issues, safety concerns,
account security, or requests beyond your capabilities
- Never share internal system details or other customers' information
- Never make promises about refunds or credits without checking policy
</constraints>
<output_format>
Greeting -> Issue acknowledgment -> Solution steps -> Next steps
</output_format>
<error_handling>
- If customer is angry: Validate emotions first, then solve
- If issue is unclear: Ask ONE focused clarifying question
- If you cannot resolve: "Let me connect you with a specialist
who can help with this specific issue."
</error_handling>
<tools>
- Use customer_db to look up account details before answering
account-specific questions
- Use knowledge_base for product/policy information
- Use escalate when the issue meets escalation criteria
</tools>
| Anti-Pattern | Problem | Fix |
|---|---|---|
| “You are a helpful assistant” | Too vague, no behavioral anchor | Specify expertise, domain, personality |
| “Don't use bullet points” | Negative framing confuses models | “Use numbered lists for steps” |
| Massive prompt (5000+ words) | Dilutes important instructions | Prioritize, use sections, put key rules first |
| No examples | Inconsistent output format | Add 1-3 few-shot examples |
| “Be creative” | Uncontrolled output variance | Specify exactly what creative means for your use case |
| Contradictory instructions | Model picks one randomly | Review for conflicts, establish priority order |
| Ignoring model differences | Suboptimal performance | Use XML tags for Claude, markdown for GPT |
| No error handling | Agent hallucinates on edge cases | Define fallback behavior explicitly |
For Claude models, structure prompts with XML tags such as <role>, <instructions>, <constraints>, and <examples>, and use <thinking> tags to enable chain-of-thought reasoning. For complex agents, compose prompts from modular pieces:
def build_system_prompt(agent_type, tools, model_provider):
    """Compose system prompt from modular components."""
    components = {
        "role": load_component(f"roles/{agent_type}.txt"),
        "instructions": load_component(f"instructions/{agent_type}.txt"),
        "constraints": load_component("constraints/base.txt"),
        "output_format": load_component(f"formats/{agent_type}.txt"),
        "tools": generate_tool_instructions(tools),
        "examples": load_component(f"examples/{agent_type}.txt"),
        "error_handling": load_component("error_handling/base.txt"),
    }
    # For Claude: wrap in XML tags
    if model_provider == "anthropic":
        return "\n".join(
            f"<{key}>\n{value}\n</{key}>"
            for key, value in components.items()
        )
    # For OpenAI: use markdown headers
    return "\n\n".join(
        f"## {key.replace('_', ' ').title()}\n{value}"
        for key, value in components.items()
    )
Before deploying a system prompt, verify:
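Part of that verification can be automated. One hedged sketch of a pre-deployment lint pass (the thresholds and rules are illustrative, drawn from the anti-patterns above, not a standard):

```python
def lint_system_prompt(prompt: str) -> list[str]:
    """Flag common prompt anti-patterns; returns warnings (empty = clean)."""
    warnings = []
    lowered = prompt.lower()
    # Anti-pattern: massive prompts dilute important instructions.
    if len(prompt.split()) > 5000:
        warnings.append("prompt exceeds ~5000 words; key rules may be diluted")
    # Anti-pattern: negative framing is followed less reliably.
    negative = [w for w in ("don't", "do not", "never") if w in lowered]
    if negative:
        warnings.append(
            f"negative framing found ({', '.join(negative)}); "
            "prefer positive directives"
        )
    # Anti-pattern: no examples leads to inconsistent output format.
    if "example" not in lowered:
        warnings.append("no examples section; add 1-3 few-shot examples")
    return warnings

issues = lint_system_prompt("You are a helpful assistant. Don't be verbose.")
```

A check like this catches only surface-level problems; contradictory instructions and vague role definitions still require human review.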