An AI agent is only as reliable as its system prompt. The instructions you give an agent upfront determine whether it plans before acting, recovers gracefully from failures, stays within its guardrails, and produces output you can actually use.

After building agents for a range of tasks — research, automation, data extraction, customer-facing workflows — these 10 patterns consistently separate reliable agents from flaky ones. They're not abstract principles: each one is a concrete prompt structure you can copy directly into your system prompt.

Why System Prompts Determine Agent Reliability

LLMs don't have stable default behaviors. Without explicit instructions, the same model will handle edge cases inconsistently across runs. One run it asks for clarification before deleting a file; the next run it just does it.

The system prompt is your contract with the model. It sets the agent's identity, decision-making rules, output format, and error behavior. A well-crafted system prompt turns a probabilistic model into a predictable system.

These patterns are optimized for Claude but apply to any frontier model. Let's go through them one by one.

Pattern 1: Role + Context Declaration

Problem: Without a clear identity, agents hedge, break character, and produce inconsistent output quality.

Pattern: Open your system prompt with a precise role statement and context. Don't say "You are a helpful assistant." Say exactly what the agent does, who it serves, and what it has access to.

# ✗ Vague
"""You are a helpful AI assistant."""

# ✓ Precise
"""You are a research agent for a software development team.
Your job is to answer technical questions by searching documentation,
summarizing findings, and citing sources. You have access to:
- web_search: search the web for current information
- read_url: fetch and read the content of a URL
- create_summary: save a structured summary to the team's knowledge base

You do NOT write code. You do NOT make changes to files or systems."""

The explicit list of what the agent can and can't do is critical. It prevents scope creep where the agent tries to "help" by doing things outside its intended role.
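One way to keep that capability list accurate is to generate the role section from the same tool registry your agent harness uses, so the prompt never drifts out of sync with the actual tools. A minimal sketch — the `build_role_prompt` helper and the registry shape are illustrative, not part of any SDK:

```python
# Sketch: build the role + capability section of a system prompt from
# the harness's own tool registry. All names here are illustrative.

TOOLS = {
    "web_search": "search the web for current information",
    "read_url": "fetch and read the content of a URL",
    "create_summary": "save a structured summary to the team's knowledge base",
}

def build_role_prompt(role: str, job: str, tools: dict[str, str],
                      exclusions: list[str]) -> str:
    tool_lines = "\n".join(f"- {name}: {desc}" for name, desc in tools.items())
    exclusion_lines = " ".join(f"You do NOT {e}." for e in exclusions)
    return (
        f"You are {role}.\n"
        f"Your job is to {job}. You have access to:\n"
        f"{tool_lines}\n\n"
        f"{exclusion_lines}"
    )

prompt = build_role_prompt(
    role="a research agent for a software development team",
    job="answer technical questions by searching documentation, "
        "summarizing findings, and citing sources",
    tools=TOOLS,
    exclusions=["write code", "make changes to files or systems"],
)
```

If you add or remove a tool in the registry, the prompt updates with it — removing one common source of "the prompt says I can't do that" confusion.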

Pattern 2: Reason Before Acting

Problem: Agents jump straight to tool calls without a plan, leading to redundant calls, wrong tool choices, and hard-to-debug errors.

Pattern: Instruct the agent to reason through its approach before calling any tool. This is chain-of-thought applied to agentic behavior.

"""Before calling any tool, briefly state:
1. What you understand the user is asking for
2. What information you already have
3. What tool calls you plan to make and in what order
4. What the expected output looks like

Then execute the plan. If something unexpected happens, revise the
plan explicitly before continuing."""

This pattern produces a visible "thinking out loud" step in the agent's response. It slows agents down slightly but dramatically reduces wrong-path tool calls. The explicit plan also makes debugging much easier — you can see exactly where the agent's reasoning diverged from the intended behavior.

Why this works

Models perform better at complex tasks when they generate reasoning tokens before committing to an answer. The plan forces the model to "slow down" before acting — the same mechanism behind chain-of-thought prompting.

Pattern 3: Explicit Output Format

Problem: Unstructured agent output is hard to parse downstream. Agents describe what they found instead of returning data you can use.

Pattern: Specify the exact output format in the system prompt. For structured data, include a JSON schema example. For prose, define sections.

"""When you complete a research task, your final response MUST be
structured as follows:

{
  "summary": "2-3 sentence answer to the original question",
  "findings": [
    {
      "point": "Key finding",
      "source": "URL or document name",
      "confidence": "high | medium | low"
    }
  ],
  "limitations": "What you couldn't find or verify",
  "next_steps": "Suggested follow-up actions if applicable"
}

Return only the JSON. Do not add explanatory text before or after it."""

The "Do not add explanatory text" clause is important. Without it, models often wrap JSON output in markdown fences or add prefixes like "Here is the structured output:", which breaks automated JSON parsing.
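Even with that clause in place, a defensive parser on the consuming side is cheap insurance. Here's a sketch that tolerates the two most common wrappers — markdown fences and an explanatory prefix — before parsing (purely illustrative, not a library API):

```python
import json
import re

def parse_agent_json(text: str) -> dict:
    """Parse JSON from an agent response, tolerating common failure
    modes: markdown code fences and explanatory prefix text."""
    # Strip markdown fences like ```json ... ``` if present
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # Fall back to the outermost {...} span if there is leading prose
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in response")
    return json.loads(text[start:end + 1])

result = parse_agent_json(
    'Here is the structured output:\n'
    '```json\n{"summary": "ok", "findings": []}\n```'
)
```

The prompt clause reduces how often this code path triggers; the parser handles the runs where the model slips anyway.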

Pattern 4: Constraint Ladder

Problem: A single list of "don'ts" is fragile. Agents find creative ways around constraints they don't fully understand.

Pattern: Define constraints in three tiers: hard stops (never), soft limits (only with explicit instruction), and defaults (behavior in ambiguous cases).

"""Constraints:

NEVER (hard stops — no exceptions):
- Delete or modify files outside /workspace/
- Execute shell commands that modify system state
- Send emails or make API calls to external services
- Continue past 15 tool calls in a single session

ONLY WITH EXPLICIT INSTRUCTION (require clear user direction):
- Overwrite an existing file (ask: "Are you sure you want to overwrite X?")
- Take an action that affects more than 10 records at once
- Use a third-party API that may incur costs

BY DEFAULT (behavior when ambiguous):
- If a task is unclear, ask one clarifying question before proceeding
- If a tool call fails twice, stop and report the error to the user
- If you're unsure whether an action is in scope, err on the side of asking"""

The ladder structure helps the model understand why constraints exist (hard stops exist for safety, soft limits exist for user intent). This leads to better generalization to cases not explicitly covered.
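Hard stops also deserve enforcement outside the prompt — the model can forget an instruction, but a counter in the harness cannot. A minimal sketch of the 15-tool-call budget from the example above (class and names are illustrative):

```python
class ToolCallBudget:
    """Enforce the 'never continue past N tool calls' hard stop in
    the harness, independent of whether the model honors the prompt."""

    def __init__(self, limit: int = 15):
        self.limit = limit
        self.calls = 0

    def check(self) -> None:
        """Call before each tool execution; raises once exhausted."""
        if self.calls >= self.limit:
            raise RuntimeError(
                f"tool-call budget of {self.limit} exhausted; stopping session"
            )
        self.calls += 1

budget = ToolCallBudget(limit=3)
for _ in range(3):
    budget.check()      # first three calls pass
try:
    budget.check()      # fourth call trips the hard stop
    tripped = False
except RuntimeError:
    tripped = True
```

Prompt-level constraints shape behavior; harness-level constraints guarantee it. Use both for anything in the NEVER tier.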

Pattern 5: Structured Error Recovery

Problem: When a tool fails, agents either give up immediately, loop trying the same call, or silently skip the failed step.

Pattern: Define explicit recovery behavior for tool failures in the system prompt.

"""When a tool call fails or returns an error:

1. Read the error carefully. Is it:
   a) A transient error (network timeout, rate limit)?
      → Wait briefly and retry once
   b) A logical error (wrong arguments, resource not found)?
      → Revise your approach. Don't retry the exact same call.
   c) A permissions or scope error?
      → Stop. Report to the user with: "I can't complete this because [reason]."

2. After a second failure on the same step:
   → Stop trying that approach
   → Report: "I tried X and it failed with Y. I can try Z instead, or you can
      investigate the issue. What would you prefer?"

3. Never silently skip a failed step and continue.
   A task completed with unacknowledged failures is worse than a task stopped early."""

The key insight in step 3: silent failure is the most dangerous failure mode. An agent that skips a data-fetching step and continues with partial data is more likely to produce subtly wrong output than one that stops and reports.
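The same recovery ladder can be mirrored in harness code so the framework and the prompt agree on what happens after a failure. A sketch under the assumption that your tool layer tags errors with a kind label (the labels below are illustrative, not a real SDK taxonomy):

```python
TRANSIENT = {"timeout", "rate_limit", "connection_reset"}
LOGICAL = {"bad_arguments", "not_found"}
SCOPE = {"permission_denied", "forbidden"}

def recover(error_kind: str, attempt: int) -> str:
    """Return the recovery action for a failed tool call, mirroring
    the prompt's ladder: retry transient errors once, revise logical
    errors, and stop immediately on scope errors or repeat failures."""
    if attempt >= 2:
        return "stop_and_report"      # second failure on the same step
    if error_kind in TRANSIENT:
        return "retry_once"           # wait briefly, retry once
    if error_kind in LOGICAL:
        return "revise_approach"      # never repeat the exact same call
    if error_kind in SCOPE:
        return "stop_and_report"      # never attempt workarounds
    return "stop_and_report"          # unknown errors: fail loudly
```

Note the default branch: an unclassified error falls through to "stop and report," which keeps the silent-skip failure mode impossible at the harness level too.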

Pattern 6: Task Decomposition First

Problem: Agents tackle complex multi-step tasks one step at a time with no upfront plan, so they discover problems mid-execution that could have been caught before the first tool call.

Pattern: For complex tasks, require the agent to decompose the task into steps before executing anything.

"""For any task that requires more than 2 tool calls to complete:

1. PLAN first. Output a numbered list of the steps you'll take:
   Example:
   "Plan:
   1. Search for current Python packaging best practices
   2. Fetch the top 3 results
   3. Extract the key recommendations from each
   4. Synthesize into a summary document"

2. Show the plan to the user.
3. Wait for confirmation: "Does this plan look right, or should I adjust?"
4. Execute only after confirmation.

For simple, single-step tasks, skip the planning step and act directly."""

The user confirmation step is optional — for fully automated pipelines you'd remove it. But requiring the plan output still gives you a visible audit trail of what the agent intended to do, even in automated contexts.
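For automated pipelines, that audit trail is easy to capture mechanically. A sketch that extracts the numbered plan from an agent response for logging, assuming the `Plan:` + numbered-list format the prompt above requests (the helper name is illustrative):

```python
import re

def extract_plan(response: str) -> list[str]:
    """Pull the numbered plan out of an agent response so it can be
    logged as an audit trail in fully automated runs."""
    match = re.search(r"Plan:\s*\n((?:\s*\d+\..*\n?)+)", response)
    if not match:
        return []
    return [re.sub(r"^\s*\d+\.\s*", "", line).strip()
            for line in match.group(1).strip().splitlines()]

steps = extract_plan(
    "Plan:\n"
    "1. Search for current Python packaging best practices\n"
    "2. Fetch the top 3 results\n"
    "3. Synthesize into a summary"
)
```

An empty result from this extractor is itself a useful signal: the agent skipped planning on a task that required it.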

Pattern 7: Confirm Before Irreversible Actions

Problem: Agents execute destructive actions (deletes, sends, publishes) without verifying intent, especially when the user gave high-level instructions.

Pattern: Define categories of irreversible actions and require explicit confirmation before executing them.

"""Before executing any irreversible action, you MUST confirm with the user.
Irreversible actions include:
- Deleting any file or record
- Sending any external communication (email, webhook, message)
- Publishing or making content publicly visible
- Modifying data that affects more than one record

For each irreversible action, state:
"I'm about to [specific action]. This cannot be undone. Proceed?"

Wait for explicit confirmation ("yes", "proceed", "do it") before continuing.
If the user says anything ambiguous, ask again."""

This pattern is especially important for agents that operate with elevated permissions. The cost of one extra confirmation turn is trivial compared to the cost of an unintended bulk delete.
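As with the constraint ladder, a harness-side backstop makes the guarantee real even if the model skips the confirmation step. A minimal sketch — the tool names and `gate_tool_call` helper are illustrative:

```python
# Tools whose effects cannot be undone; maintained by the harness,
# not the model. Names are illustrative.
IRREVERSIBLE_TOOLS = {"delete_file", "send_email", "publish_post"}

def gate_tool_call(tool_name: str, user_confirmed: bool) -> bool:
    """Refuse to execute an irreversible tool unless the user has
    explicitly confirmed this turn. Returns True if the call may run."""
    if tool_name in IRREVERSIBLE_TOOLS and not user_confirmed:
        return False   # block; surface a confirmation request instead
    return True
```

The prompt makes the agent *ask*; the gate makes sure that asking actually happened before anything irreversible runs.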

Pattern 8: State Summary at Turn End

Problem: In long multi-turn agent sessions, context about what has been done, what's pending, and what was decided gets lost.

Pattern: Require the agent to maintain and update a state summary at the end of each turn.

"""At the end of every response, include a brief status block:

---
STATUS:
- Completed: [list what was done this turn]
- In progress: [list what's still running or pending]
- Blocked: [list anything that couldn't proceed and why]
- Next: [what the agent will do next, if the conversation continues]
---

Keep each item to one line. This helps users track agent state in
long sessions without re-reading the full conversation."""

This pattern doubles as a progress indicator in user-facing agents. Users find it much easier to trust an agent that clearly shows its state than one that produces opaque responses.
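Because the block has a fixed shape, a UI or log pipeline can parse it instead of re-reading the turn. A sketch that extracts the four fields, assuming the exact field names requested in the prompt above:

```python
import re

def parse_status(response: str) -> dict[str, str]:
    """Extract the STATUS block fields from an agent response so a
    dashboard or log can display agent state per turn."""
    fields = {}
    block = re.search(r"STATUS:\n(.*?)(?:\n---|\Z)", response, re.DOTALL)
    if block:
        for line in block.group(1).splitlines():
            m = re.match(r"-\s*(Completed|In progress|Blocked|Next):\s*(.*)",
                         line)
            if m:
                fields[m.group(1)] = m.group(2)
    return fields

status = parse_status(
    "Moved the files.\n\n---\nSTATUS:\n"
    "- Completed: moved 3 files\n"
    "- In progress: none\n"
    "- Blocked: none\n"
    "- Next: await review\n---"
)
```

A missing or malformed STATUS block is worth logging too — it usually means the agent drifted from the prompt's format partway through a long session.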

Pattern 9: Source Attribution

Problem: Agents confidently present information without indicating where it came from, making it impossible to verify or trust.

Pattern: Require source attribution whenever the agent uses retrieved or external information.

"""For every factual claim based on retrieved information:
- Include the source inline: "According to [source], ..."
- If you used a tool to get the information, say which tool and what query
- If you're drawing on training data rather than a retrieved source, say so:
  "Based on my training data (not verified from a live source): ..."

Never present retrieved information as if it were your own knowledge.
Never assert facts without attribution when those facts could be wrong."""

The last rule — distinguishing retrieved facts from training data — is particularly valuable. It forces the model to signal when it might be hallucinating instead of presenting all output with equal confidence.
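If the agent also emits the structured output from Pattern 3, attribution becomes machine-checkable rather than a matter of trust. A sketch that flags unattributed findings, assuming the field names from the Pattern 3 schema:

```python
def unattributed_findings(output: dict) -> list[str]:
    """Return the 'point' text of any finding that lacks a source,
    so a pipeline can reject or flag unverifiable claims."""
    return [
        f.get("point", "<unnamed finding>")
        for f in output.get("findings", [])
        if not f.get("source")
    ]

report = {
    "findings": [
        {"point": "X is deprecated", "source": "https://example.com/docs"},
        {"point": "Y replaces X"},   # no source: should be flagged
    ]
}
missing = unattributed_findings(report)
```

This is one reason to pair Pattern 9 with Pattern 3: attribution rules you can validate in code are far more durable than attribution rules you can only spot-check by hand.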

Pattern 10: Graceful Uncertainty Handling

Problem: Agents guess when they should ask, and ask when they should guess. The balance is hard to get right without explicit guidance.

Pattern: Define a decision rule for when to act vs. when to ask, tied to the cost of being wrong.

"""When you're uncertain about user intent, apply this rule:

HIGH COST OF BEING WRONG (ask first):
- The action modifies or deletes data
- The action sends a message or notification
- The action affects more than one item
→ Always ask before proceeding

LOW COST OF BEING WRONG (make a reasonable assumption, state it):
- The action is read-only (search, fetch, analyze)
- The action only affects a temporary file or draft
- The action can be easily undone
→ Proceed with the most reasonable interpretation, but state your assumption:
  "I'm assuming you meant X. Let me know if that's wrong."

COMPLETELY UNCLEAR (can't infer reasonable intent):
→ Ask a single, specific question. Not: "What do you mean?"
   Instead: "Do you want me to [option A] or [option B]?"
   Give the user a multiple-choice, not an open text field."""

The multiple-choice formulation for asking clarifying questions is underrated. "What do you mean?" requires the user to think from scratch. "Do you want A or B?" gives them something to react to, which is faster and produces clearer answers.

Composing Patterns: A Production System Prompt

These patterns are most powerful when combined. Here's a minimal production system prompt for a file-management agent that uses Patterns 1, 2, 4, 5, and 7:

"""You are a file organization agent. You help users organize and manage
files in their /workspace/ directory. You have access to:
- list_files(path): list contents of a directory
- read_file(path): read the contents of a file
- move_file(from_path, to_path): move or rename a file
- delete_file(path): permanently delete a file

NEVER access or modify files outside /workspace/.

Before any action:
1. State what you understand the user wants
2. List the tool calls you plan to make
3. Check: does any step delete, move, or rename a file?
   If yes, confirm with the user before executing.

When a tool fails:
- If network/timeout: retry once
- If file not found: stop and ask the user to verify the path
- If permission denied: stop and report immediately, do not try workarounds

At the end of each response, summarize:
- What was done
- What's still pending
- Any issues encountered"""

This is about 20 lines. It's not exhaustive, but it handles the cases that cause most agent failures in practice: unclear intent, unconfirmed destructive actions, and silent error skipping.

Try it yourself on the Claude API

These patterns work best when you can experiment quickly. The Anthropic Console lets you test system prompts interactively with the full Claude model — no code required. Spin up a test agent, paste the prompt, and iterate until the behavior is right before building the full integration.

Getting Started

You don't need to implement all 10 patterns at once. Start with the three that address your biggest pain points:

  • Agent does unexpected things? Start with Pattern 4 (Constraint Ladder) and Pattern 7 (Confirm Before Irreversible Actions).
  • Hard to debug agent behavior? Add Pattern 2 (Reason Before Acting) and Pattern 8 (State Summary at Turn End).
  • Unreliable output format? Pattern 3 (Explicit Output Format) eliminates most downstream parsing issues.
  • Agent hallucinates facts? Pattern 9 (Source Attribution) forces the model to distinguish retrieved data from training data.

Build up your system prompt incrementally. Test each pattern against real user inputs before adding the next one. A focused system prompt that handles your actual edge cases beats a comprehensive one that tries to handle everything hypothetically.

These patterns are building blocks. The production system prompt for a complex agent might combine 6–8 of them — but the fundamentals remain the same: tell the agent who it is, how to plan, when to stop, and how to fail gracefully.