How to Build Production AI Agents with the Claude Agent SDK: Custom Tools, Hooks, and Subagents

Learn to build production AI agents with the Claude Agent SDK in Python. Step-by-step guide covering custom tools, safety hooks, subagents, MCP integration, structured outputs, and deployment patterns with working code.

Why the Claude Agent SDK Changes How You Build AI Agents

If you've built AI agents before, you know the drill. Call the model, check if it wants a tool, run the tool, feed the result back, repeat until done. It's tedious, error-prone, and honestly — it gets old fast. The Claude Agent SDK takes all of that boilerplate off your plate. It gives you the same agent runtime that powers Claude Code, packaged as a Python library you can embed in your own apps.

Released by Anthropic and actively developed through early 2026, the SDK takes a tool-use-first approach. Agents are Claude models equipped with tools — including the ability to spawn other agents as tools. Unlike heavier frameworks like LangGraph or CrewAI, it keeps the agent loop minimal and leans on Claude's native reasoning for planning and coordination.

So, let's walk through building production-ready agents — from basic queries to custom tools, safety hooks, MCP integration, and parallel subagents.

Prerequisites and Installation

You'll need Python 3.10+ and an Anthropic API key from the Anthropic Console. The Claude Code CLI comes bundled with the Python package, so no separate install is needed.

pip install claude-agent-sdk

Set your API key:

export ANTHROPIC_API_KEY="sk-ant-your-key-here"

Quick sanity check:

python -c "import claude_agent_sdk; print('Claude Agent SDK installed successfully')"

Understanding the Two Core Interfaces

The SDK gives you two ways to talk to Claude agents. Picking the right one matters.

The query() Function — Simple, Stateless Queries

Use query() for one-shot tasks. You give Claude a job, it does the thing, and you get a result back. It's an async generator that yields messages as Claude reasons through the problem.

import anyio
from claude_agent_sdk import query

async def main():
    async for message in query(
        prompt="Read the file config.yaml and summarize its contents",
        options={
            "permission_mode": "default",
            "max_turns": 10,
        }
    ):
        if message.type == "assistant":
            print(message.content)

anyio.run(main)

The ClaudeSDKClient — Stateful, Multi-Turn Sessions

This is what you want when things get serious. ClaudeSDKClient handles persistent sessions, custom tools, hooks, and multi-turn conversations. It manages the full agent lifecycle — session state, MCP connections, file checkpointing, all of it.

import anyio
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions

async def main():
    options = ClaudeAgentOptions(
        permission_mode="acceptEdits",
        max_turns=25,
        allowed_tools=["Read", "Write", "Edit", "Bash", "Glob", "Grep"],
    )

    async with ClaudeSDKClient(options=options) as client:
        await client.query("Refactor the utils module to use dataclasses")
        async for msg in client.receive_response():
            if msg.type == "assistant":
                print(msg.content)

anyio.run(main)

The key difference is straightforward: query() is fire-and-forget, while ClaudeSDKClient maintains state across multiple interactions within a session.

Building Custom Tools

This is where things get really interesting. Custom tools let you extend Claude with your own domain-specific functions. They run as in-process MCP servers — zero startup overhead, direct access to your app's state.

Defining a Custom Tool

The @tool decorator is your friend here. Give it a name, description, and input schema:

from claude_agent_sdk import tool, create_sdk_mcp_server

@tool(
    name="query_database",
    description="Execute a read-only SQL query against the application database",
    input_schema={
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "SQL SELECT query to execute"
            }
        },
        "required": ["query"]
    }
)
async def query_database(args):
    sql = args["query"]
    # Validate read-only
    if not sql.strip().upper().startswith("SELECT"):
        return {
            "content": [{"type": "text", "text": "Error: Only SELECT queries are allowed"}]
        }
    # Execute against your real database connection
    results = await db.execute(sql)
    return {
        "content": [{"type": "text", "text": str(results)}]
    }

@tool(
    name="send_slack_notification",
    description="Send a notification message to a Slack channel",
    input_schema={
        "type": "object",
        "properties": {
            "channel": {"type": "string", "description": "Slack channel name"},
            "message": {"type": "string", "description": "Message text to send"}
        },
        "required": ["channel", "message"]
    },
    annotations={
        "readOnlyHint": False,
        "destructiveHint": False,
        "idempotentHint": False,
        "openWorldHint": True,
    }
)
async def send_slack_notification(args):
    await slack_client.post_message(args["channel"], args["message"])
    return {
        "content": [{"type": "text", "text": f"Sent to #{args['channel']}"}]
    }

Registering Tools with the Agent

Wrap your tools in an MCP server and hand it to the client:

async def main():
    server = create_sdk_mcp_server(
        name="app-tools",
        version="1.0.0",
        tools=[query_database, send_slack_notification]
    )

    options = ClaudeAgentOptions(
        mcp_servers=[server],
        allowed_tools=[
            "mcp__app-tools__query_database",
            "mcp__app-tools__send_slack_notification",
            "Read", "Glob", "Grep",
        ],
    )

    async with ClaudeSDKClient(options=options) as client:
        await client.query(
            "Find any users whose subscription expired this week "
            "and send a summary to the #billing channel on Slack"
        )
        async for msg in client.receive_response():
            print(msg.content)

One thing to note: when Claude references a custom tool, it uses the naming convention mcp__<server-name>__<tool-name>. You'll need that pattern in allowed_tools.

MCP Tool Annotations

The annotations parameter on @tool gives the SDK metadata hints about what a tool actually does. These aren't just decorative — they help Claude make smarter decisions about when and how to use tools:

  • readOnlyHint — Tool doesn't modify any state
  • destructiveHint — Tool performs irreversible operations (handle with care)
  • idempotentHint — Same input always produces the same result
  • openWorldHint — Tool talks to external systems beyond the local environment

Implementing Safety Hooks

Hooks are callback functions that fire at specific points in the agent loop. They let you validate, block, modify, or log actions before and after they happen. If you're running agents in production, hooks are basically non-negotiable.

Available Hook Events

Here's what you can hook into:

  • PreToolUse — Fires before a tool executes. Can allow, deny, or modify the input.
  • PostToolUse — Fires after a tool returns. Great for logging or injecting context.
  • PostToolUseFailure — Fires when a tool execution fails.
  • UserPromptSubmit — Fires when the user submits a prompt.
  • SubagentStart / SubagentStop — Fires when subagents are spawned or complete.
  • PermissionRequest — Fires when a permission decision is needed.
  • Stop — Fires when the agent finishes execution.

PreToolUse: Blocking Dangerous Commands

This is the hook you'll implement first (and honestly, you should). A PreToolUse hook receives the tool name and input before execution, and you return a permission decision:

from claude_agent_sdk import ClaudeAgentOptions, HookMatcher, HookContext
from typing import Any

BLOCKED_PATTERNS = [
    "rm -rf /", "DROP TABLE", "DELETE FROM", "TRUNCATE",
    "curl | bash", "wget | sh", "chmod 777",
]

async def validate_bash_command(
    input_data: dict[str, Any],
    tool_use_id: str | None,
    context: HookContext,
) -> dict[str, Any]:
    """Block dangerous bash commands before they execute."""
    command = input_data["tool_input"].get("command", "")

    for pattern in BLOCKED_PATTERNS:
        if pattern.lower() in command.lower():
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": (
                        f"Blocked: command matches dangerous pattern '{pattern}'"
                    ),
                }
            }

    # Allow safe commands to proceed
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "allow",
        }
    }

PostToolUse: Audit Logging

For production, you'll want a paper trail. Here's how to log every tool execution:

import json
from datetime import datetime, timezone

async def audit_log_tool_use(
    input_data: dict[str, Any],
    tool_use_id: str | None,
    context: HookContext,
) -> dict[str, Any]:
    """Log all tool executions with timestamps for audit trails."""
    log_entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "session_id": input_data.get("session_id"),
        "tool_name": input_data.get("tool_name"),
        "tool_input": input_data.get("tool_input"),
        "tool_use_id": tool_use_id,
    }
    # Write to your logging infrastructure
    logger.info(f"Agent tool execution: {json.dumps(log_entry)}")
    return {}

Wiring Hooks into the Agent

Use HookMatcher to target specific tools or apply hooks globally:

options = ClaudeAgentOptions(
    hooks={
        "PreToolUse": [
            # Only match Bash tool calls
            HookMatcher(
                matcher="Bash",
                hooks=[validate_bash_command],
                timeout=120,
            ),
        ],
        "PostToolUse": [
            # Match all tool calls (no matcher = global)
            HookMatcher(hooks=[audit_log_tool_use]),
        ],
    },
    allowed_tools=["Bash", "Read", "Write", "Edit", "Glob", "Grep"],
)

The matcher field supports pipe-separated patterns like "Write|Edit" if you need to match multiple tools. No matcher means it fires for every event of that type.

Spawning Subagents for Parallel Work

Subagents are separate agent instances your main agent can spin up for focused subtasks. Each one gets its own fresh context window, which keeps things isolated and prevents the parent conversation from getting bloated.

Defining Subagents

You define them programmatically with AgentDefinition:

from claude_agent_sdk import ClaudeAgentOptions, AgentDefinition

options = ClaudeAgentOptions(
    subagents=[
        AgentDefinition(
            name="security-scanner",
            description="Scan code for security vulnerabilities, "
                        "OWASP top 10, and dependency issues",
            prompt=(
                "You are a security auditor. Analyze the provided code "
                "for vulnerabilities. Report severity, location, and "
                "recommended fixes. Focus on injection, auth flaws, "
                "and sensitive data exposure."
            ),
            tools=["Read", "Glob", "Grep", "Bash"],
            model="sonnet",
        ),
        AgentDefinition(
            name="test-writer",
            description="Generate pytest test cases for Python modules",
            prompt=(
                "You are a test engineer. Write comprehensive pytest "
                "test cases for the specified module. Include edge cases, "
                "error paths, and integration tests. Use fixtures and "
                "parametrize where appropriate."
            ),
            tools=["Read", "Write", "Edit", "Glob", "Grep", "Bash"],
            model="sonnet",
        ),
    ],
    allowed_tools=["Read", "Write", "Edit", "Bash", "Glob", "Grep", "Agent"],
)

How Subagents Execute

When Claude decides a subtask fits a subagent's description, it spawns the subagent automatically via the built-in Agent tool. There are a few things you need to know:

  • Fresh context — Subagents start with zero parent history. The prompt string is the only channel from parent to child, so pack everything they need (file paths, error messages, prior decisions) right into the prompt.
  • Parallel execution — Multiple subagents can run concurrently, which is great for collecting information from different sources simultaneously.
  • No permission inheritance — Subagents don't automatically get the parent's permissions. Use PreToolUse hooks to auto-approve specific tools, or set up permission rules that cover subagent sessions.
  • Result passing — The parent receives the subagent's final message as the Agent tool result. Simple and clean.

Connecting External Services via MCP

The Model Context Protocol (MCP) is an open standard for wiring AI agents to external tools and data sources. The Claude Agent SDK has first-class MCP support — connect to servers running as local processes, over HTTP, or in-process within your app.

Connecting to an External MCP Server

from claude_agent_sdk import ClaudeAgentOptions

options = ClaudeAgentOptions(
    mcp_servers=[
        {
            "name": "github",
            "type": "stdio",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_TOKEN": os.environ["GITHUB_TOKEN"]},
        },
        {
            "name": "postgres",
            "type": "stdio",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-postgres"],
            "env": {"DATABASE_URL": os.environ["DATABASE_URL"]},
        },
    ],
    allowed_tools=[
        "mcp__github__*",
        "mcp__postgres__*",
        "Read", "Grep",
    ],
)

Tool Search for Large Tool Sets

Here's a neat detail: when you have tons of MCP tools configured, all those tool definitions can eat up your context window fast. The SDK handles this with tool search (enabled by default), which holds back tool definitions and only loads the ones Claude actually needs on each turn. It keeps token usage efficient even with dozens of MCP servers connected.

Configuring Extended Thinking

Extended thinking gives Claude a visible chain-of-thought, which genuinely helps on complex tasks. I've found it makes a noticeable difference for multi-step debugging and architecture decisions. The SDK gives you fine-grained control:

from claude_agent_sdk import ClaudeAgentOptions

options = ClaudeAgentOptions(
    thinking={
        "type": "enabled",
        "effort": "high",  # "low", "medium", "high", or "max"
    },
    max_turns=30,
)

Use "high" or "max" for tasks that need real reasoning — code architecture, complex debugging, multi-step analysis. Drop to "low" when speed matters more than depth.

File Checkpointing and Rollback

Agents that modify files need a safety net. Period. The SDK's file checkpointing automatically saves file state before changes, so you can roll back instantly if something goes sideways:

options = ClaudeAgentOptions(
    enable_file_checkpointing=True,
    allowed_tools=["Read", "Write", "Edit", "Bash"],
)

async with ClaudeSDKClient(options=options) as client:
    await client.query("Refactor the payment module to use the new API")
    last_safe_message_id = None

    async for msg in client.receive_response():
        if msg.type == "assistant":
            last_safe_message_id = msg.id
            print(msg.content)

    # If something went wrong, rewind to the last known good state
    if needs_rollback:
        await client.rewind_files(last_safe_message_id)

Production Deployment Patterns

Setting max_turns to Prevent Infinite Loops

This one bites people. Agent sessions won't timeout on their own — if Claude gets stuck in a reasoning loop, it'll just keep going. Always set max_turns:

options = ClaudeAgentOptions(
    max_turns=25,  # Reasonable limit for most tasks
    permission_mode="acceptEdits",
)

Structured Outputs for Predictable Results

When you need machine-readable results (not just text), structured outputs give you validated JSON from the agent:

from claude_agent_sdk import query

schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "severity": {"type": "string", "enum": ["low", "medium", "high", "critical"]},
        "files_affected": {"type": "array", "items": {"type": "string"}},
        "recommended_action": {"type": "string"},
    },
    "required": ["summary", "severity", "files_affected", "recommended_action"],
}

async for message in query(
    prompt="Analyze the latest git diff for security issues",
    options={
        "output_schema": schema,
        "max_turns": 15,
    },
):
    if message.type == "result":
        result = message.parsed  # Validated against your schema
        if result["severity"] in ("high", "critical"):
            await alert_security_team(result)

Containerized Hosting

For container deployments, the SDK runs internally while your app exposes HTTP or WebSocket endpoints:

FROM python:3.12-slim
RUN pip install claude-agent-sdk fastapi uvicorn
COPY . /app
WORKDIR /app
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Claude Agent SDK vs. LangGraph vs. PydanticAI

This comes up a lot, and the honest answer is: it depends on what you're building.

  • Claude Agent SDK — Best if you're already committed to Claude models and want the tightest integration possible. You get built-in safety via Constitutional AI, native MCP support (Anthropic created the protocol), and computer-use capabilities. The catch? You're locked to Claude models.
  • LangGraph — Go here for complex, stateful multi-agent workflows with graph-based orchestration, time-travel debugging, and durable execution. It's model-agnostic but has the steepest learning curve of the three.
  • PydanticAI — Great for type-safe Python shops where model flexibility matters. Lean, minimal overhead, native MCP support. Works with OpenAI, Anthropic, Google, Groq, and Mistral.

Worth noting: these frameworks aren't mutually exclusive. PydanticAI agents can drop into LangGraph nodes with minimal changes, and both can connect to MCP servers built with tools like FastMCP.

Common Pitfalls and How to Avoid Them

AskUserQuestion Fails Silently in Headless Environments

The AskUserQuestion tool renders a terminal UI. When the SDK runs as a subprocess with no TTY (pretty common in production), the tool fails silently — no error, just nothing happens. For automated setups, either configure ask_user_question behavior explicitly or build user interaction through custom MCP tools instead.

Settings.json Permissions Are Not Inherited

This trips people up: the SDK's permissionMode doesn't automatically pick up permissions from settings.json. Always configure permissions explicitly through allowed_tools and permission_mode in your ClaudeAgentOptions.

Subagent Context Is Empty by Default

Subagents start with a completely blank context window. If your subagent needs to know about specific files, errors, or decisions from the parent, you have to pass all of that directly in the prompt string. It's the only channel between parent and child — there's no shared memory or implicit context.

Frequently Asked Questions

What is the difference between the Claude Agent SDK and the Anthropic Messages API?

The Messages API is stateless — send a prompt, get a completion, manage the tool loop yourself. The Claude Agent SDK is a full agent runtime. It handles the loop internally: Claude reasons, calls tools, observes results, and iterates on its own. You also get built-in tools (file ops, shell, web search) and session management out of the box.

Can I use the Claude Agent SDK with models other than Claude?

No. It's tightly coupled to Anthropic's Claude model family (Opus, Sonnet, Haiku). If model flexibility is a priority, look at PydanticAI or LangGraph instead. The trade-off for this lock-in is deeper integration with Claude-specific features like extended thinking, computer use, and Constitutional AI safety.

How do I handle user interaction in production agents that run without a terminal?

Don't rely on the built-in AskUserQuestion — it needs a TTY and fails silently without one. Instead, build a custom MCP tool that routes questions through your own UI (web dashboard, Slack bot, API endpoint). Use PermissionRequest hooks to send approval requests to external notification systems.

Is the Claude Agent SDK ready for production use?

It's in active development with Alpha status on PyPI as of March 2026. The core features — agent loop, custom tools, hooks, MCP integration, subagents — are stable and already running in production at multiple organizations. That said, expect API surface changes between versions. Pin your dependency and test upgrades carefully.

How does the Claude Agent SDK compare to the OpenAI Agents SDK?

Both are vendor-specific agent runtimes. The Claude Agent SDK leans into tool-use-first architecture with built-in file and shell operations, native MCP support, and Constitutional AI safety. The OpenAI Agents SDK (which evolved from Swarm) focuses more on handoff patterns between specialized agents. Your choice mostly comes down to preferred model provider and whether you need Claude-specific features like extended thinking and computer use.

About the Author Editorial Team

Our team of expert writers and editors.