Claude API Tool Use: Connecting the Model to the Outside World
In the previous chapter we learned Prompt Caching to cut costs. But conversation alone isn't enough — powerful AI applications need Claude to call external tools: query databases, hit APIs, execute code, search the web. That's Tool Use.
This chapter builds a complete tool-calling Agent from scratch, covers the full request-response loop, and shows how to combine tool definitions with Prompt Caching for optimal cost.
The Tool Use Model
Tool Use follows a loop:
Claude doesn't execute tool code directly. It returns a structured call request (function name + arguments), your application handles the actual execution, then you pass the result back.
Two categories: Client Tools (you define + you execute) and Server Tools (Anthropic provides and executes, like web_search). This chapter focuses on Client Tools — the tools you define yourself.
Defining Tools
Tools are passed via the tools parameter. Each needs a name, description, and JSON Schema for parameters:
Tool description quality directly impacts Claude's calling accuracy. Be clear about: what the tool does, when it should be used, and parameter meaning/format. Good descriptions > short descriptions.
Handling the Tool Call Loop
When Claude decides to call a tool, the response has stop_reason: "tool_use" and content contains a tool_use block:
Execute the tool and send the result back:
Complete Agent Loop
In real applications, Claude may call multiple tools sequentially or use one tool's result to inform the next call. You need a loop:
Always set a maximum iteration count (e.g., 10) to prevent infinite loops. In production, add timeout and error handling for each tool execution.
Controlling Tool Selection: tool_choice
The tool_choice parameter controls when and how Claude uses tools:
| tool_choice | Behavior | Use Case |
|---|---|---|
auto (default) | Claude decides | General conversation + tools |
tool + name | Must call the specified tool | Structured data extraction |
any | Must call some tool | Force agent to take action |
none | Disable tools | When you need pure text response |
Strict Mode: Guaranteed Schema Compliance
Add strict: true to ensure Claude's tool calls strictly follow your JSON Schema:
Strict mode is strongly recommended in production. Without it, Claude occasionally returns parameters that don't match your schema (e.g., missing required fields), causing downstream code to break.
Combining with Prompt Caching
Tool definitions are first in the cache hierarchy — tools → system → messages. This means:
- If tool definitions don't change (true for most applications), they get cached and subsequent requests read them for free
- If you frequently add/remove tools, the entire cache chain invalidates
Best practice:
Cost breakdown:
| Component | First Request | Subsequent Requests |
|---|---|---|
| tools (~500 tokens) | 1.25x (write) | 0.1x (read) |
| system prompt (~200 tokens) | 1.25x (write) | 0.1x (read) |
| conversation history (growing) | 1.25x (write) | 0.1x (read) |
| new user message | 1x (normal) | 1x (normal) |
In a 10-turn conversation, tools + system prompt are cache-read 9 times, saving 90% each time.
Streaming with Tool Calls
When streaming, tool calls arrive incrementally:
Parallel Tool Calls
Claude can return multiple tool_use blocks in a single response, indicating parallel tool calls are needed:
Error Handling
When tool execution fails, send a result with is_error: true:
When Claude receives an error, it typically: informs the user something went wrong, retries with different parameters, or answers using alternative methods.
Practical Example: Customer Service Agent with Caching
Combining Prompt Caching from the previous chapter with Tool Use from this chapter:
Summary
| Concept | Key Points |
|---|---|
| Tool definition | name + description + input_schema (JSON Schema) |
| Call loop | Check stop_reason == "tool_use" → execute → send tool_result → repeat |
| tool_choice | auto (default) / tool (force specific) / any (force any) / none (disable) |
| strict mode | Recommended for production — guarantees schema compliance |
| Parallel calls | Single response may contain multiple tool_use blocks — execute in parallel |
| Error handling | is_error: true tells Claude execution failed |
| Caching combo | tools are first in cache hierarchy — keep stable for maximum cache benefit |
Next step: combine Tool Use with Extended Thinking to let Claude reason deeply before calling tools in complex decision scenarios, further improving Agent accuracy.