Claude Agent SDK Tool Reference - Server Tools Client Tools and Versioning
When you start building serious agents on Claude, the tool reference page is something you'll come back to constantly. It looks straightforward on first read. There's a list of tools, each with a type string, a version date, and a status. What it doesn't tell you is the things that actually matter once you're in production. Which tools are interchangeable? Which ones you can swap for free? What happens when Anthropic releases a new version and your old version still works?
We've been building agent systems on the Anthropic SDK for our clients since early 2025, and the tool reference has shaped a lot of those decisions. Let me walk through what the current state of the page actually means, what the categories actually do, and where the gotchas live.
Server Tools and Client Tools - The Distinction That Actually Matters
The single most important concept on the page is the difference between server tools and client tools. Anthropic groups their provided tools into these two categories, and the difference is structural, not cosmetic.
A server tool runs on Anthropic's infrastructure. When Claude decides to call it, Anthropic executes it and returns the result. Your application doesn't have to do anything. The Web Search tool, the Web Fetch tool, the Code Execution tool, and the Tool Search tool all work this way. You pay for the execution as part of your normal Claude usage. Your code stays simple. The downside is that you have less visibility into what's happening, and you can't intercept or modify the tool's behaviour.
A client tool is different. Anthropic defines the schema but your application handles execution. The Bash tool, the Text Editor tool, the Computer Use tool, and the Memory tool all work this way. When Claude wants to use one, the API returns a tool use block and waits for you to send back the result. This means you control exactly what happens. You can sandbox the execution, redirect file operations, log every command, or substitute behaviour entirely.
The practical implication is that server tools are easier to start with and harder to control. Client tools are harder to start with and easier to control. For most production agents we build at Team 400, we use a mix of both, but the choice for any specific tool is driven by how much oversight we need. A customer service agent that needs to search the web? Server tool is fine. An ops automation agent that runs commands on your infrastructure? Definitely client tool, with our own sandboxing layer in front of it.
What the Date Suffixes Actually Mean
Every Anthropic-provided tool has a type string that ends in a date. web_search_20260209. code_execution_20260120. text_editor_20250728. The natural assumption is that the latest date is the best one and you should always use that. This is wrong about half the time, and getting it wrong costs you either capability or compatibility.
The reference page lists four different ways the date suffix can work. This is worth understanding properly because the docs are dry about it.
Capability-keyed versioning. Some tools have multiple current versions, and the newer one adds capability the older one doesn't have. web_search_20260209 and web_fetch_20260209 both add dynamic content filtering. code_execution_20260120 adds programmatic tool calling. The older versions still work. Which one you use depends on whether you need the new capability or whether you'd rather not deal with the cognitive overhead of the new feature.
Model-keyed versioning. Some tools have different versions for different Claude models. text_editor_20250728 works with Claude 4 series. text_editor_20250124 works with earlier models. Pick the one that matches the model you're using. Get this wrong and you get errors that look like the tool is broken when it's actually just incompatible.
Variants, not versions. Two tools released on the same date can be alternatives rather than one superseding the other. tool_search_tool_regex_20251119 and tool_search_tool_bm25_20251119 are two search algorithms for the same problem. Pick whichever fits your use case. Neither is the upgrade of the other.
Legacy versions. code_execution_20250522 only supports Python. code_execution_20250825 adds Bash and file operations. Unless you have a specific reason to stay on the Python-only version, move to the newer one. This is the closest the page comes to "use the latest" being correct advice.
One quirk worth mentioning. The Tool Search tool has undated aliases. You can use tool_search_tool_regex or tool_search_tool_bm25 and they resolve to the latest dated version automatically. None of the other tools work this way. The MCP toolset is also different because its versioning lives in the anthropic-beta header, not in the type string.
For our Claude consulting work, we typically lock client agents to specific dated versions for stability, and only move to newer versions after testing them in a staging environment. This is the boring discipline that prevents nasty surprises when Anthropic ships a tool update that subtly changes behaviour. The undated aliases are tempting but for production agents they're a foot gun waiting to happen.
Tool Definition Properties That Actually Matter
Beyond picking the right version, every tool definition supports optional properties that change how it behaves. Most of these compose, meaning you can set several on the same tool. The reference page lists them but doesn't really explain when to bother.
cache_control. Sets a prompt-cache breakpoint at the tool definition. If your agent makes lots of similar calls with the same tool definitions, this matters a lot for cost and latency. We typically cache the tool definitions for any agent that runs more than a few times per minute. The savings show up immediately in your billing.
strict. Guarantees schema validation on tool names and inputs. Sounds harmless. The catch is that strict mode trades off latency for correctness. You get faster more reliable tool calls at the cost of a slightly longer initial response. For our agents handling regulated workloads (financial advisers, insurance underwriting, healthcare workflows) we always turn this on. For exploratory or creative agents, we usually leave it off.
defer_loading. Excludes the tool from the initial system prompt and only loads it on demand when tool search returns a reference for it. This is the property nobody understands until they've built an agent with 80 tools and watched the system prompt balloon. Deferred loading combined with the Tool Search tool lets you build agents with hundreds of available tools without paying the token cost upfront. Genuinely useful for large agent systems, though most teams don't need it because their agents have ten tools, not a hundred.
allowed_callers. Restricts which callers can call the tool. Mostly relevant for programmatic tool calling scenarios where you want to enforce that certain tools can only be invoked by the orchestrating code, not by Claude directly. We use this defensively for tools that touch sensitive systems.
input_examples. Provides example input objects to help Claude understand how to call the tool. This is one of the most underrated properties on the list. The difference between a tool with no examples and a tool with two well-chosen examples can be the difference between Claude calling it correctly first try and Claude getting confused about the schema. If your tool has any non-obvious parameters, add examples. It costs you nothing and improves reliability significantly.
What This Means for How You Should Build Agents
A few patterns that have come out of building real systems with these tools.
Start with server tools when you can. They reduce the surface area of code you need to maintain. Web search, web fetch, and code execution are all production-ready and saved us months of work that would otherwise have gone into reimplementing those capabilities.
Move to client tools when control matters. The minute you need observability over what tools are doing, or you need to enforce your own security model, client tools become non-negotiable. The bash tool and the text editor tool are designed to be client-executed. Use the schema Anthropic provides, but execute the calls inside your own sandbox. Our forward deployed engineering teams build these sandboxes routinely for clients running Claude agents in production environments.
Pay attention to versioning early. Locking versions costs you nothing and saves you from surprise behaviour changes. The alternative is that one day Anthropic releases a new tool version, your agent picks it up automatically, and a subtle behavioural change breaks something in production that nobody catches until a customer complains.
Use the tool definition properties. Most teams ignore them and miss easy wins on caching, validation, and reliability. Half an hour spent setting up cache control on the right boundaries pays for itself within the first week of production traffic.
The Bottom Line
The Claude tool reference is one of those pages where the surface presentation hides the depth of what's actually going on underneath. The categories of tools, the versioning model, and the optional properties together give you a lot of control over how your agents behave. Most teams use about a tenth of what's available.
For Australian businesses building agent systems on Claude, the practical advice is straightforward. Read the reference carefully before you start. Pick versions deliberately. Use the optional properties that match your reliability and cost requirements. And test changes in staging before pushing them to production agents, because the small details on this page have outsized effects on how your agents actually run.
Reference: Claude tool reference documentation