In the previous chapter we learned Prompt Caching to cut costs. But conversation alone isn't enough — powerful AI applications need Claude to call external tools: query databases, hit APIs, execute code, search the web. That's Tool Use.

This chapter builds a complete tool-calling Agent from scratch, covers the full request-response loop, and shows how to combine tool definitions with Prompt Caching for optimal cost.

The Tool Use Model

Tool Use follows a loop:

Claude doesn't execute tool code directly. It returns a structured call request (function name + arguments), your application handles the actual execution, then you pass the result back.

Two categories: Client Tools (you define + you execute) and Server Tools (Anthropic provides and executes, like web_search). This chapter focuses on Client Tools — the tools you define yourself.

Defining Tools

Tools are passed via the tools parameter. Each needs a name, description, and JSON Schema for parameters:

Tool description quality directly impacts Claude's calling accuracy. Be clear about: what the tool does, when it should be used, and parameter meaning/format. Good descriptions > short descriptions.

Handling the Tool Call Loop

When Claude decides to call a tool, the response has stop_reason: "tool_use" and content contains a tool_use block:

Execute the tool and send the result back:

Complete Agent Loop

In real applications, Claude may call multiple tools sequentially or use one tool's result to inform the next call. You need a loop:

Always set a maximum iteration count (e.g., 10) to prevent infinite loops. In production, add timeout and error handling for each tool execution.

Controlling Tool Selection: tool_choice

The tool_choice parameter controls when and how Claude uses tools:

tool_choice	Behavior	Use Case
`auto` (default)	Claude decides	General conversation + tools
`tool` + name	Must call the specified tool	Structured data extraction
`any`	Must call some tool	Force agent to take action
`none`	Disable tools	When you need pure text response

Strict Mode: Guaranteed Schema Compliance

Add strict: true to ensure Claude's tool calls strictly follow your JSON Schema:

Strict mode is strongly recommended in production. Without it, Claude occasionally returns parameters that don't match your schema (e.g., missing required fields), causing downstream code to break.

Combining with Prompt Caching

Tool definitions are first in the cache hierarchy — tools → system → messages. This means:

If tool definitions don't change (true for most applications), they get cached and subsequent requests read them for free
If you frequently add/remove tools, the entire cache chain invalidates

Best practice:

Cost breakdown:

Component	First Request	Subsequent Requests
tools (~500 tokens)	1.25x (write)	0.1x (read)
system prompt (~200 tokens)	1.25x (write)	0.1x (read)
conversation history (growing)	1.25x (write)	0.1x (read)
new user message	1x (normal)	1x (normal)

In a 10-turn conversation, tools + system prompt are cache-read 9 times, saving 90% each time.

Streaming with Tool Calls

When streaming, tool calls arrive incrementally:

Parallel Tool Calls

Claude can return multiple tool_use blocks in a single response, indicating parallel tool calls are needed:

Error Handling

When tool execution fails, send a result with is_error: true:

When Claude receives an error, it typically: informs the user something went wrong, retries with different parameters, or answers using alternative methods.

Practical Example: Customer Service Agent with Caching

Combining Prompt Caching from the previous chapter with Tool Use from this chapter:

Summary

Concept	Key Points
Tool definition	name + description + input_schema (JSON Schema)
Call loop	Check `stop_reason == "tool_use"` → execute → send `tool_result` → repeat
tool_choice	`auto` (default) / `tool` (force specific) / `any` (force any) / `none` (disable)
strict mode	Recommended for production — guarantees schema compliance
Parallel calls	Single response may contain multiple `tool_use` blocks — execute in parallel
Error handling	`is_error: true` tells Claude execution failed
Caching combo	tools are first in cache hierarchy — keep stable for maximum cache benefit

Next step: combine Tool Use with Extended Thinking to let Claude reason deeply before calling tools in complex decision scenarios, further improving Agent accuracy.

This chapter builds a complete tool-calling Agent from scratch, covers the full request-response loop, and shows how to combine tool definitions with Prompt Caching for optimal cost.

The Tool Use Model

Tool Use follows a loop:

Claude doesn't execute tool code directly. It returns a structured call request (function name + arguments), your application handles the actual execution, then you pass the result back.

Defining Tools

Tools are passed via the tools parameter. Each needs a name, description, and JSON Schema for parameters:

Tool description quality directly impacts Claude's calling accuracy. Be clear about: what the tool does, when it should be used, and parameter meaning/format. Good descriptions > short descriptions.

Handling the Tool Call Loop

When Claude decides to call a tool, the response has stop_reason: "tool_use" and content contains a tool_use block:

Execute the tool and send the result back:

Complete Agent Loop

In real applications, Claude may call multiple tools sequentially or use one tool's result to inform the next call. You need a loop:

Always set a maximum iteration count (e.g., 10) to prevent infinite loops. In production, add timeout and error handling for each tool execution.

Controlling Tool Selection: tool_choice

The tool_choice parameter controls when and how Claude uses tools:

tool_choice	Behavior	Use Case
`auto` (default)	Claude decides	General conversation + tools
`tool` + name	Must call the specified tool	Structured data extraction
`any`	Must call some tool	Force agent to take action
`none`	Disable tools	When you need pure text response

Strict Mode: Guaranteed Schema Compliance

Add strict: true to ensure Claude's tool calls strictly follow your JSON Schema:

Strict mode is strongly recommended in production. Without it, Claude occasionally returns parameters that don't match your schema (e.g., missing required fields), causing downstream code to break.

Combining with Prompt Caching

Tool definitions are first in the cache hierarchy — tools → system → messages. This means:

If tool definitions don't change (true for most applications), they get cached and subsequent requests read them for free
If you frequently add/remove tools, the entire cache chain invalidates

Best practice:

Cost breakdown:

Component	First Request	Subsequent Requests
tools (~500 tokens)	1.25x (write)	0.1x (read)
system prompt (~200 tokens)	1.25x (write)	0.1x (read)
conversation history (growing)	1.25x (write)	0.1x (read)
new user message	1x (normal)	1x (normal)

In a 10-turn conversation, tools + system prompt are cache-read 9 times, saving 90% each time.

Streaming with Tool Calls

When streaming, tool calls arrive incrementally:

Parallel Tool Calls

Claude can return multiple tool_use blocks in a single response, indicating parallel tool calls are needed:

Error Handling

When tool execution fails, send a result with is_error: true:

When Claude receives an error, it typically: informs the user something went wrong, retries with different parameters, or answers using alternative methods.

Practical Example: Customer Service Agent with Caching

Combining Prompt Caching from the previous chapter with Tool Use from this chapter:

Summary

Concept	Key Points
Tool definition	name + description + input_schema (JSON Schema)
Call loop	Check `stop_reason == "tool_use"` → execute → send `tool_result` → repeat
tool_choice	`auto` (default) / `tool` (force specific) / `any` (force any) / `none` (disable)
strict mode	Recommended for production — guarantees schema compliance
Parallel calls	Single response may contain multiple `tool_use` blocks — execute in parallel
Error handling	`is_error: true` tells Claude execution failed
Caching combo	tools are first in cache hierarchy — keep stable for maximum cache benefit

Next step: combine Tool Use with Extended Thinking to let Claude reason deeply before calling tools in complex decision scenarios, further improving Agent accuracy.

Claude API Tool Use: Connecting the Model to the Outside World

The Tool Use Model

Defining Tools

Handling the Tool Call Loop

Complete Agent Loop

Controlling Tool Selection: tool_choice

Strict Mode: Guaranteed Schema Compliance

Combining with Prompt Caching

Streaming with Tool Calls

Parallel Tool Calls

Error Handling

Practical Example: Customer Service Agent with Caching

Summary

Claude API Tool Use: Connecting the Model to the Outside World

The Tool Use Model

Defining Tools

Handling the Tool Call Loop

Complete Agent Loop

Controlling Tool Selection: tool_choice

Strict Mode: Guaranteed Schema Compliance

Combining with Prompt Caching

Streaming with Tool Calls

Parallel Tool Calls

Error Handling

Practical Example: Customer Service Agent with Caching

Summary