
Claude Agent SDK TypeScript - Building Production AI Agents With the Full API

April 3, 2026 | 9 min read | Michael Ridland

The Claude Agent SDK TypeScript reference is one of those documents where reading it top to bottom changes how you think about building AI agents. It's not a tutorial - it's a complete API surface covering everything from basic queries to sandbox configuration, hook systems, and session management.

I've been building with this SDK for several months now, and the depth of what's available keeps surprising me. The full TypeScript SDK reference runs thousands of lines. Here's what matters most if you're building agents that need to work in production, not just in demos.

The query() Function - Where Everything Starts

Every interaction with the SDK starts with query(). It takes a prompt and an options object, and returns an async generator that streams messages as they arrive.

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Analyse the current codebase for security issues",
  options: {
    cwd: "/path/to/project",
    model: "claude-opus-4-6",
    permissionMode: "default"
  }
})) {
  if (message.type === "assistant") {
    // Process assistant response
  }
}

The options object is where the real configuration happens, and it's worth understanding the key properties early.

cwd sets the working directory. Your agent operates relative to this path. Get this wrong and file operations fail silently or target the wrong files.

model picks the Claude model. For agent workloads, claude-opus-4-6 gives you the best reasoning but costs more. claude-sonnet-4-5-20250929 is the sweet spot for most production use cases - good reasoning at lower cost.

permissionMode controls how tool permissions work. "default" prompts the user, "acceptEdits" auto-accepts file changes, and "bypassPermissions" skips all checks. That last one needs allowDangerouslySkipPermissions: true as well - the naming is intentionally scary because it should be.

maxTurns caps the number of tool-use round trips. Without this, an agent can loop indefinitely if it gets confused. We set this in every production deployment. A reasonable starting point is 20-30 turns for most tasks.

maxBudgetUsd caps spending in US dollars. If an agent can make many API calls, this stops a runaway session from blowing through your budget.
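Here's a minimal sketch of how the two caps might be wired up and reported on. Nothing is imported from the SDK: ResultLike is a hand-written subset of the final result message, and the field names (subtype, num_turns, total_cost_usd) and the error_max_turns / error_max_budget_usd subtypes are my reading of the reference, so check them against the version you have installed. guardedOptions and describeResult are hypothetical helpers, and the default numbers are placeholders.

```typescript
// Hand-written subset of the SDK's final result message (an assumption, not imported).
type ResultLike = {
  type: "result";
  subtype: string; // e.g. "success", "error_max_turns", "error_max_budget_usd"
  num_turns: number;
  total_cost_usd: number;
};

// Hypothetical helper: an options object with both safety caps applied.
function guardedOptions(cwd: string, maxTurns = 25, maxBudgetUsd = 2.0) {
  return { cwd, permissionMode: "default" as const, maxTurns, maxBudgetUsd };
}

// Hypothetical helper: one log line per finished session.
function describeResult(msg: ResultLike): string {
  const cost = msg.total_cost_usd.toFixed(2);
  switch (msg.subtype) {
    case "success":
      return `ok after ${msg.num_turns} turns ($${cost})`;
    case "error_max_turns":
      return `stopped: hit maxTurns at ${msg.num_turns} turns ($${cost})`;
    case "error_max_budget_usd":
      return `stopped: hit maxBudgetUsd ($${cost})`;
    default:
      return `failed: ${msg.subtype} ($${cost})`;
  }
}
```

In the streaming loop, you would call describeResult when message.type === "result" and ship the line to your logs, which makes it obvious when a cap, rather than the task itself, ended the session.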

The Query Object - More Than a Generator

The return value from query() isn't just an async generator. It has methods for interacting with the session while it's running.

interrupt() stops the current operation - useful when you detect the agent going down the wrong path. setModel() lets you switch models mid-session, which we use for cost optimisation: start with Opus for complex reasoning, then switch to Sonnet for execution.

initializationResult() gives you the full session setup data - available commands, models, account info. We use this to validate that the session started correctly before sending the real prompt.

mcpServerStatus() tells you which MCP servers connected successfully. If a server failed to connect, you want to know before the agent tries to use a tool that doesn't exist.

The newer methods are interesting too. setMcpServers() lets you dynamically add or remove MCP servers during a session. stopTask() cancels a background task by ID. These are the kind of things you need when building multi-step agentic workflows where the available tools change based on what stage you're in.
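A rough sketch of the kind of preflight check this enables. QueryControls is a hand-written slice of the Query object using the method names above, and McpStatusLike (a name plus a status string such as "connected" or "failed") is my assumption about what mcpServerStatus() returns. missingServers and preflight are hypothetical helpers, not SDK functions.

```typescript
// Assumed shape of each entry returned by mcpServerStatus(); not imported from the SDK.
type McpStatusLike = { name: string; status: string }; // "connected", "failed", "pending", ...

// Minimal slice of the Query object used in this sketch.
interface QueryControls {
  mcpServerStatus(): Promise<McpStatusLike[]>;
  interrupt(): Promise<void>;
}

// Hypothetical helper: names of required servers that are not currently connected.
function missingServers(statuses: McpStatusLike[], required: string[]): string[] {
  return required.filter(
    (name) => !statuses.some((s) => s.name === name && s.status === "connected")
  );
}

// Hypothetical preflight: interrupt the session if a required server is down,
// and report which servers were missing so the caller can log or retry.
async function preflight(q: QueryControls, required: string[]): Promise<string[]> {
  const missing = missingServers(await q.mcpServerStatus(), required);
  if (missing.length > 0) await q.interrupt();
  return missing;
}
```

Keeping the comparison in a pure function makes it easy to unit test without a live session.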

Tools and MCP Servers

I've written about custom tools in detail before, but the TypeScript-specific parts are worth highlighting.

The tool() function creates type-safe tool definitions using Zod schemas. This matters because Zod gives you automatic type inference on the handler's args parameter:

import { tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

// orderService is your application's own data layer, not part of the SDK.
const lookupOrder = tool(
  "lookup_order",
  "Look up an order by ID and return status, items, and delivery estimate",
  { orderId: z.string().describe("The order number e.g. ORD-12345") },
  // orderId is inferred as string from the Zod schema above
  async ({ orderId }) => {
    const order = await orderService.getOrder(orderId);
    return {
      content: [{ type: "text", text: JSON.stringify(order, null, 2) }]
    };
  },
  // Marks the tool as read-only
  { annotations: { readOnlyHint: true } }
);

The createSdkMcpServer() function wraps tools into an in-process MCP server. No separate process, no network setup. You define tools, create the server, and pass it to query() via the mcpServers option.

// updateOrderStatus and cancelOrder are defined with tool() in the same way as lookupOrder.
const orderServer = createSdkMcpServer({
  name: "orders",
  version: "1.0.0",
  tools: [lookupOrder, updateOrderStatus, cancelOrder]
});

Both Zod 3 and Zod 4 are supported, which is nice if you haven't migrated yet.
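To connect a server like the one above to a session, something along these lines should work. Tools on an MCP server are addressed as mcp__<server>__<tool>, so that is the name to use in allowedTools. withServer and mcpToolName are hypothetical helpers of mine, and the placeholder object stands in for the real orderServer so the sketch runs on its own.

```typescript
// Naming convention for MCP tools as the agent sees them: mcp__<server>__<tool>.
function mcpToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

// Hypothetical helper: an options fragment that registers a server under a
// name and pre-approves a subset of its tools.
function withServer<S>(name: string, server: S, approved: string[]) {
  return {
    mcpServers: { [name]: server },
    allowedTools: approved.map((tool) => mcpToolName(name, tool)),
  };
}

// Placeholder standing in for the orderServer built with createSdkMcpServer().
const orderServerPlaceholder = { name: "orders" };

// Only the read-only lookup is pre-approved; the other tools on the server
// would still go through the normal permission flow.
const orderOptions = withServer("orders", orderServerPlaceholder, ["lookup_order"]);
```

You would then spread orderOptions into the options you pass to query().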

Hooks - The Event System

The hooks system in the SDK deserves more attention than most people give it. Hooks let you respond to events during the agent lifecycle - before and after tool calls, on session start and end, on permission requests, and more.

The event list is extensive: PreToolUse, PostToolUse, PostToolUseFailure, Notification, UserPromptSubmit, SessionStart, SessionEnd, Stop, SubagentStart, SubagentStop, PreCompact, PermissionRequest, and several more.

For production deployments, the ones that matter most are:

PreToolUse lets you inspect and modify tool inputs before they execute. We use this for data sanitisation - stripping PII from tool inputs before they reach external APIs.

PostToolUse lets you inspect tool results. Useful for logging, metrics, and catching unexpected outputs before the agent processes them.

PermissionRequest gives you programmatic control over permissions. Instead of prompting a user, your code decides whether to allow or deny a tool call. This is how you build automated agent pipelines that don't need human intervention for expected operations.

Each hook callback receives the event input, a tool use ID, and an options object carrying an AbortSignal. The return value can continue or block execution, inject system messages, or modify the tool input.
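Here is a small sketch of the sanitisation idea as a PreToolUse hook. The types are hand-written subsets rather than SDK imports, and the return shape (hookSpecificOutput with permissionDecision and updatedInput) is my understanding of the reference, so treat it as an assumption to verify. redactEmails and sanitiseToolInput are hypothetical names, and the email pattern is deliberately simple.

```typescript
// Assumed subset of the PreToolUse hook input; the SDK's real type has more fields.
type PreToolUseInputLike = {
  hook_event_name: "PreToolUse";
  tool_name: string;
  tool_input: Record<string, unknown>;
};

// Deliberately simple email pattern, good enough for a sketch.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace email addresses in every top-level string field of a tool input.
// Non-string and nested values are passed through untouched.
function redactEmails(input: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(input)) {
    const value = input[key];
    out[key] = typeof value === "string" ? value.replace(EMAIL, "[REDACTED_EMAIL]") : value;
  }
  return out;
}

// Hook callback taking the event input, the tool use ID, and an options object.
// The return shape is an assumption about the SDK's hook output type.
async function sanitiseToolInput(
  input: PreToolUseInputLike,
  _toolUseId: string | undefined,
  _options: { signal: AbortSignal }
) {
  return {
    hookSpecificOutput: {
      hookEventName: "PreToolUse" as const,
      permissionDecision: "allow" as const,
      updatedInput: redactEmails(input.tool_input),
    },
  };
}
```

As far as I can tell from the reference, registration looks like hooks: { PreToolUse: [{ matcher: "WebFetch", hooks: [sanitiseToolInput] }] }, where the matcher narrows the hook to specific tools.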

Session Management

The SDK includes full session management - listing past sessions, reading transcripts, renaming, and tagging. This is underrated for production systems.

import { listSessions, getSessionMessages } from "@anthropic-ai/claude-agent-sdk";

const sessions = await listSessions({ dir: "/path/to/project", limit: 10 });

Each session has metadata: ID, summary, last modified time, working directory, git branch, and optional custom titles and tags. You can search sessions by project directory or across all projects.

getSessionMessages() reads the actual conversation transcript. We use this for building dashboards that show what agents have been doing across the organisation - which tasks succeed, which fail, and where agents get stuck.

The resume option in query() picks up where a previous session left off. Combined with forkSession, you can branch a session to try different approaches without losing the original context. This is particularly useful for debugging - fork the session at the point where things went wrong and try a different approach.
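A sketch of the fork-for-debugging pattern. SessionInfoLike is a hand-written subset of the session metadata, and the sessionId and lastModified field names are assumptions. latestSession and forkFrom are hypothetical helpers that only build the options fragment; the resume and forkSession option names are the ones mentioned above.

```typescript
// Assumed subset of the session metadata returned by listSessions().
type SessionInfoLike = { sessionId: string; lastModified: number; summary: string };

// Pick the most recently modified session, or undefined for an empty list.
function latestSession(sessions: SessionInfoLike[]): SessionInfoLike | undefined {
  return sessions.reduce<SessionInfoLike | undefined>(
    (best, s) => (best === undefined || s.lastModified > best.lastModified ? s : best),
    undefined
  );
}

// Options fragment that branches from a session instead of appending to it,
// so the original transcript stays untouched.
function forkFrom(session: SessionInfoLike) {
  return { resume: session.sessionId, forkSession: true };
}
```

Spread the result of forkFrom into the options for a new query() call with a different prompt, and compare the two branches afterwards.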

Subagents

The agents option lets you define subagents programmatically. Each agent gets its own description, prompt, tool set, and optional model override.

const result = query({
  prompt: "Review this pull request",
  options: {
    agents: {
      "code-reviewer": {
        description: "Reviews code for quality, bugs, and security issues",
        prompt: "You are a senior code reviewer. Focus on correctness, security, and maintainability.",
        tools: ["Read", "Grep", "Glob"],
        model: "opus"
      },
      "test-checker": {
        description: "Verifies test coverage and test quality",
        prompt: "You are a testing specialist. Check test coverage and quality.",
        tools: ["Read", "Grep", "Glob", "Bash"],
        model: "sonnet"
      }
    }
  }
});

The main agent can delegate to subagents based on what the task requires. Each subagent runs with its own tools and model, which means you can give different agents different capabilities and different cost profiles.

The maxTurns property on agents prevents runaway subagents. The disallowedTools property lets you explicitly block certain tools for certain agents - useful when you want a read-only reviewer that can't modify files.
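One way to make the read-only reviewer idea reusable is a small helper that strips mutating tools from any agent definition and caps its turns. AgentLike is a hand-written subset using the property names above. The MUTATING_TOOLS list is my own guess at which built-in tools count as mutating, and asReadOnly is a hypothetical helper.

```typescript
// Assumed subset of an agent definition, using the property names from the text.
type AgentLike = {
  description: string;
  prompt: string;
  tools?: string[];
  disallowedTools?: string[];
  model?: string;
  maxTurns?: number;
};

// Tools that can change files or run commands (my own list, adjust to taste).
const MUTATING_TOOLS = ["Write", "Edit", "Bash", "NotebookEdit"];

// Hypothetical helper: turn any agent definition into a capped, read-only one.
function asReadOnly(agent: AgentLike, maxTurns = 10): AgentLike {
  const isMutating = (t: string) => MUTATING_TOOLS.indexOf(t) !== -1;
  const alreadyBlocked = agent.disallowedTools ?? [];
  return {
    ...agent,
    // Drop mutating tools from the allow-list if one was given.
    tools: agent.tools?.filter((t) => !isMutating(t)),
    // Keep existing blocks and add the mutating tools without duplicates.
    disallowedTools: alreadyBlocked.filter((t) => !isMutating(t)).concat(MUTATING_TOOLS),
    maxTurns,
  };
}
```

Blocking the tools explicitly as well as removing them from the allow-list is belt and braces, but it means the definition stays read-only even if someone later widens tools.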

Permissions and Sandbox

The permission system is more flexible than it looks at first glance.

allowedTools auto-approves specific tools. disallowedTools blocks them entirely, overriding everything else including bypassPermissions. canUseTool is a custom function that gives you full programmatic control over every tool call.

The canUseTool callback receives the tool name, input, and context including an AbortSignal and the agent ID (for subagent calls). You return either { behavior: "allow" } or { behavior: "deny", message: "reason" }. This is where you implement custom authorisation logic for production deployments.
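A minimal sketch of what such a policy might look like. The decision type mirrors the two shapes described above. The policy itself (READ_ONLY, SAFE_BASH_PREFIXES, the metacharacter check) is entirely hypothetical, and I'm assuming the Bash tool's input carries the command in a command field. Some SDK versions may also expect an updatedInput field on the allow result, so check the types you have installed.

```typescript
// The two result shapes described above.
type PermissionDecision =
  | { behavior: "allow" }
  | { behavior: "deny"; message: string };

// Hypothetical policy: read-only tools pass, Bash passes only for an
// allow-list of command prefixes, and everything else is denied.
const READ_ONLY = ["Read", "Grep", "Glob"];
const SAFE_BASH_PREFIXES = ["git status", "git diff", "npm test"];

function decide(toolName: string, input: Record<string, unknown>): PermissionDecision {
  if (READ_ONLY.indexOf(toolName) !== -1) return { behavior: "allow" };
  if (toolName === "Bash") {
    const raw = input["command"];
    const command = typeof raw === "string" ? raw.trim() : "";
    // Reject shell metacharacters so "git status; rm -rf /" cannot ride on a safe prefix.
    if (/[;&|`$<>]/.test(command)) {
      return { behavior: "deny", message: "Shell metacharacters are not permitted" };
    }
    const ok = SAFE_BASH_PREFIXES.some((p) => command === p || command.indexOf(p + " ") === 0);
    return ok
      ? { behavior: "allow" }
      : { behavior: "deny", message: `Bash command not on allow-list: ${command}` };
  }
  return { behavior: "deny", message: `Tool ${toolName} is not permitted in this pipeline` };
}

// Async wrapper in the callback shape; the context argument is ignored here.
async function canUseTool(toolName: string, input: Record<string, unknown>): Promise<PermissionDecision> {
  return decide(toolName, input);
}
```

Keeping the decision logic synchronous and separate from the callback means the policy can be unit tested without starting a session.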

The sandbox system adds another layer. When enabled, bash commands run in an isolated environment with configurable network and filesystem restrictions. You can allow specific domains, block write access to certain paths, and control Unix socket access.

The interaction between allowUnsandboxedCommands and canUseTool is worth understanding. When a sandboxed command needs to break out (say, to access Docker), the model sets dangerouslyDisableSandbox: true on the tool input, and that request flows through your canUseTool handler. You decide whether to allow it. This gives you a nice security model where the sandbox is the default, with controlled exceptions.
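The controlled-exception idea can be sketched as a handler that treats the escape flag as the thing to authorise. The dangerouslyDisableSandbox flag is the one named above. The command field on Bash input, the UNSANDBOXED_PROGRAMS allow-list, and decideBash are my assumptions for illustration.

```typescript
type Decision = { behavior: "allow" } | { behavior: "deny"; message: string };

// Hypothetical allow-list of programs that may run outside the sandbox.
const UNSANDBOXED_PROGRAMS = ["docker"];

// Decide on a Bash tool call, paying attention to the sandbox escape flag.
function decideBash(input: Record<string, unknown>): Decision {
  // No escape requested: the command stays inside the sandbox, so let it run.
  if (input["dangerouslyDisableSandbox"] !== true) return { behavior: "allow" };

  const raw = input["command"];
  const command = typeof raw === "string" ? raw.trim() : "";
  // First whitespace-separated token is treated as the program name.
  const program = command.split(/\s+/)[0] ?? "";

  return UNSANDBOXED_PROGRAMS.indexOf(program) !== -1
    ? { behavior: "allow" }
    : { behavior: "deny", message: `"${program}" may not run outside the sandbox` };
}
```

A real policy would probably look at the arguments too (docker run with a host mount is very different from docker ps), but the shape is the same: sandboxed by default, with a short and explicit list of exceptions.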

Settings and Configuration

The settingSources option controls which configuration files the SDK loads. By default, the SDK loads nothing - full isolation. You opt in to user settings, project settings, or local settings explicitly.

This matters for production. You don't want a CI/CD pipeline picking up developer-specific settings from ~/.claude/settings.json. Set settingSources: ["project"] and you get only the team-shared configuration.

To load CLAUDE.md project instructions, you need both settingSources: ["project"] and the preset system prompt. This is a common gotcha - people wonder why their CLAUDE.md isn't being read, and it's because they didn't configure both options.
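For reference, this is roughly what the two options look like together. The { type: "preset", preset: "claude_code" } shape for systemPrompt is my reading of the reference rather than something I can guarantee for every SDK version, and ciOptions is a hypothetical helper.

```typescript
type SettingSourceLike = "user" | "project" | "local";

// Hypothetical CI options: team-shared project settings only, plus the preset
// system prompt so that CLAUDE.md project instructions are picked up.
function ciOptions(cwd: string) {
  const settingSources: SettingSourceLike[] = ["project"];
  return {
    cwd,
    settingSources,
    // Assumed shape of the preset system prompt option.
    systemPrompt: { type: "preset" as const, preset: "claude_code" as const },
  };
}
```

Leaving "user" and "local" out of settingSources is what keeps developer-specific settings out of the pipeline.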

V2 Preview

There's a new V2 interface in preview with send() and stream() patterns that simplify multi-turn conversations. I haven't used it in production yet, but the API looks cleaner for chat-style interactions where you're sending multiple messages back and forth.

For now, the V1 query() interface is stable, well-documented, and proven. The V2 preview is worth watching but not worth betting on until it stabilises.

What This Means for Building AI Agents

The TypeScript SDK is mature enough for production workloads. The permission system, sandbox, hooks, and session management all point to a platform designed for real deployment, not just prototyping.

If you're building AI agents on Claude, the TypeScript SDK is the right foundation for anything that runs on Node.js, Bun, or Deno. The Python SDK exists too, but for web applications and server-side TypeScript projects, the type safety and Zod integration make the TypeScript version the stronger choice.

We've been using this SDK across multiple AI agent development projects for Australian enterprises. The pattern we see working best is starting with a narrow use case - a single agent with two or three tools - and expanding from there. The SDK's architecture supports that growth well. You can add MCP servers, subagents, and hooks incrementally without rearchitecting.

If you're evaluating the Claude Agent SDK for your organisation, or if you're building agents and want to get the architecture right from the start, our AI agent builders team has been through this process multiple times. We can help you set up the right permission model, tool architecture, and deployment pattern for your specific use case.

The full TypeScript SDK reference is essential reading. It's dense, but everything in there exists for a reason, and you'll find yourself coming back to it regularly as you build more sophisticated agents.