Teacher’s Edition
Module 13
The API and Beyond
How organizations embed Claude at scale. The restaurant analogy makes it concrete for a mixed audience. The anatomy of an API call, tool use that connects Claude to real systems, and the economics behind the investment.
Charter Oak Strategic Partners · Claude Mastery Program · Version 1.0 · Confidential · Not for distribution to participants
The Claude API is a programmatic interface that lets software send requests to Claude and receive responses without a human typing into a chat window. Every feature participants have used — Chat, Projects, Cowork, Skills — is built on top of this API. When a developer integrates Claude into a customer support system, an email processor, or an internal tool, they are writing code that sends messages to the API and handles the responses.
This module is not teaching participants to write API code. It is teaching them to understand what the API makes possible, how much it costs, and how to build a business case for integrating Claude into their organization’s systems. The people in this room will commission API integrations, not build them. They need to know enough to define requirements, evaluate proposals, and calculate ROI.
Use this analogy throughout the module. It makes abstract concepts concrete:
The API endpoint is the restaurant counter where you place your order. There is one main address: https://api.anthropic.com/v1/messages. All requests go here.
The API request is your order ticket. It contains: which model you want (the menu item), what you want Claude to do (the order), and how much output you want (the portion size). In technical terms: the model name, the messages array, and max_tokens.
The API response is the meal delivered to your table. It contains Claude’s text response plus metadata about how many tokens were used (the receipt).
The system prompt is standing instructions for the kitchen. “Always serve gluten-free.” In API terms: persistent instructions sent with every request that define Claude’s behavior for this use case. “You are a customer support agent for Acme Corp. Be helpful, concise, and always offer to escalate complex issues.”
Tokens are the ingredients. Claude processes text in chunks called tokens — roughly 4 characters or 0.75 words per token. You pay per token, both for what you send (input tokens) and what Claude generates (output tokens). The phrase “Hello, how are you today?” is approximately 7 tokens.
Tool use is the waiter checking the back for ingredient availability. Claude can call external functions during a conversation — look up an order in a database, check inventory, calculate a shipping estimate — and use the results in its response. The customer does not see the kitchen. They see the answer.
Opening: The Conceptual Foundation — 15 minutes
Demo file: demo-data/module-13/api-demo-walkthrough.md — annotated API request/response examples, pricing tables, and tool use diagram.

“Every time you type a message in Claude Chat and hit enter, your computer sends an API request. Claude’s server processes it and sends back a response. The chat interface is a wrapper around the API. What we are going to learn today is what happens when you remove the wrapper and work with the API directly.”
“Why would you want to? Three reasons. Scale: the chat interface handles one conversation at a time. The API handles thousands simultaneously. Integration: the API plugs into your existing systems — your helpdesk, your CRM, your email. Customization: the API gives you control over every parameter — which model, how long the response should be, what tools Claude can use, what instructions it follows.”
Show an annotated API request on screen. Walk through each part:
“The model field: which version of Claude to use. Like choosing between a quick lunch counter (Haiku — fastest, cheapest), a good restaurant (Sonnet — balanced), or a Michelin-starred chef (Opus — most capable, most expensive). Most production use cases run on Sonnet because it offers the best balance of quality and cost.”
“The messages array: the conversation history. Every message has a role (user or assistant) and content (the text). You send the full conversation every time — Claude does not remember previous requests. This is a fundamental difference from Chat, where the conversation persists.”
“The max_tokens field: how long Claude’s response can be. Set it low (50 tokens) for classification tasks. Set it high (4,000 tokens) for long-form generation. You pay for what Claude actually generates, not the maximum.”
“The system field: persistent instructions. Every request can include a system prompt that defines Claude’s behavior. For a support bot: ‘You are a helpful support agent for Acme Corp. Be concise. If you do not know the answer, say so and offer to escalate.’ This is identical to the system prompts you learned in Module 05, applied at scale.”
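The four fields from the walkthrough can be shown assembled into a single request body. A minimal sketch — the JSON shape matches the Messages API described above, but the model name and prompt text are illustrative placeholders, not a production configuration:

```python
import json

# The four fields from the walkthrough, assembled into one request body.
request_body = {
    "model": "claude-sonnet-4-5",           # the menu item
    "max_tokens": 1024,                     # the portion size (an upper bound)
    "system": (                             # standing instructions for the kitchen
        "You are a helpful support agent for Acme Corp. Be concise. "
        "If you do not know the answer, say so and offer to escalate."
    ),
    "messages": [                           # the full conversation, sent every time
        {"role": "user", "content": "Where is my order #12345?"},
    ],
}

print(json.dumps(request_body, indent=2))
```

In a real integration, this body is POSTed to https://api.anthropic.com/v1/messages with authentication headers; here it is only printed so the structure can be shown on screen without an API key.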
The Models: Choosing the Right One — 10 minutes
Anthropic offers three model tiers. Pricing is per million tokens (MTok). Display this table on screen:
Claude Opus 4.6 — The most capable model. Best for complex reasoning, nuanced analysis, and tasks requiring the highest quality. Input: $5/MTok. Output: $25/MTok. Context window: up to 1 million tokens (with long-context pricing above 200K).
Claude Sonnet 4.6 / 4.5 / 4 — The workhorse. Best for most production use cases: support, classification, generation, analysis. Input: $3/MTok. Output: $15/MTok. Same context window. This is where most organizations start.
Claude Haiku 4.5 — The speed model. Best for high-volume, low-complexity tasks: classification, routing, extraction, moderation. Input: $1/MTok. Output: $5/MTok. Fastest response times.
Long-context pricing applies when total input tokens exceed 200,000. Above that threshold, input and output costs increase by approximately 50%.
Practical guidance for the room: “Start with Sonnet. If the output quality is not good enough for your use case, try Opus. If cost matters more than nuance, try Haiku. Most organizations land on Sonnet for 80% of their API use.”
Prompt caching. If you send the same system prompt and context with every request (common in production), Anthropic caches the repeated portion. Reading cached tokens costs a fraction of the fresh input rate (writing to the cache costs slightly more than a normal input token). For high-volume use cases with consistent system prompts, caching can reduce costs by 50-90% on the cached portion.
Batch API. For tasks that do not need real-time responses (overnight data processing, bulk classification, weekly report generation), the Batch API offers a 50% discount. You submit a batch of requests, and Anthropic processes them asynchronously, typically within 24 hours.
Model selection. Using Haiku instead of Sonnet for classification tasks reduces costs by 67%. Using Sonnet instead of Opus for generation tasks reduces costs by 40%. Right-sizing the model to the task is the single biggest cost lever.
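The percentages above follow directly from the per-MTok rates in the table and can be demonstrated in a few lines. The 10 MTok in / 2 MTok out monthly volume is a made-up illustration:

```python
# Per-MTok rates from the pricing table above: (input, output) in dollars.
RATES = {"opus": (5, 25), "sonnet": (3, 15), "haiku": (1, 5)}

def monthly_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Monthly cost in dollars for a volume given in millions of tokens."""
    in_rate, out_rate = RATES[model]
    return input_mtok * in_rate + output_mtok * out_rate

# Hypothetical workload: 10 MTok in, 2 MTok out per month.
opus = monthly_cost("opus", 10, 2)      # $100
sonnet = monthly_cost("sonnet", 10, 2)  # $60
haiku = monthly_cost("haiku", 10, 2)    # $20

print(f"Haiku saves {1 - haiku / sonnet:.0%} vs Sonnet")   # the 67% figure
print(f"Sonnet saves {1 - sonnet / opus:.0%} vs Opus")     # the 40% figure
```

Because all three tiers price input and output in the same 1:5 ratio, the savings percentages hold regardless of the workload's input/output mix.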
Tool Use: Claude Calls Functions — 15 minutes
Tool use (also called function calling) allows Claude to call external functions during a conversation and use the results in its response. This is what makes Claude useful in real systems — without tools, Claude can only generate text. With tools, Claude can look up data, check inventory, calculate prices, send emails, update records, and take real actions.
The flow works in four steps:
1. You define the tools. When sending an API request, you include a list of tools Claude can use. Each tool has a name, a description, and a schema defining what inputs it expects. Example: a tool called get_order_status that takes an order ID and returns the current status.
2. Claude decides to use a tool. Based on the user’s question and the available tools, Claude determines which tool to call and with what inputs. If the user asks “Where is my order #12345?” Claude calls get_order_status with order_id “12345.”
3. Your system executes the function. The API response includes the tool call. Your code runs the actual function — queries the database, calls the shipping API, whatever the tool does — and sends the result back to Claude.
4. Claude generates the response. Claude reads the tool result and incorporates it into a natural language response. “Your order #12345 shipped yesterday via FedEx and is expected to arrive Thursday.”
The user never sees steps 2 and 3. They see a question and an accurate answer. The tools are invisible infrastructure.
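For facilitators who want to show the mechanics, the four steps can be sketched with a stubbed tool. The tool definition follows the shape described above (a name, a description, and an input schema); the model's "decision" in step 2 is hard-coded here so the sketch runs without an API key — in production it arrives as a content block in the API response:

```python
# Step 1: define the tools sent along with the request.
tools = [{
    "name": "get_order_status",
    "description": "Look up the shipping status of an order by its ID.",
    "input_schema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]

# Step 2: in a real response, Claude returns a tool-use block like this one.
tool_call = {"type": "tool_use", "name": "get_order_status",
             "input": {"order_id": "12345"}}

# Step 3: your code -- not Claude -- executes the function.
def get_order_status(order_id: str) -> dict:
    # Stand-in for a real database query or shipping-API call.
    return {"order_id": order_id, "status": "shipped",
            "carrier": "FedEx", "eta": "Thursday"}

result = get_order_status(**tool_call["input"])

# Step 4: the result goes back to Claude, which writes the final answer.
print(f"Your order #{result['order_id']} shipped via {result['carrier']} "
      f"and should arrive {result['eta']}.")
```

The point to land with the room: step 3 runs on your systems, under your access controls. Claude asks for the lookup; your code decides whether and how to perform it.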
Show the demo file on screen. Walk through a complete tool use flow:
“A customer asks: ‘Where is my order?’ Claude sees that a tool called get_order_status is available. It extracts the order number from the customer’s message. It calls the tool. The tool returns: shipped, FedEx, arriving Thursday. Claude responds: ‘Your order shipped yesterday via FedEx and should arrive Thursday. Is there anything else I can help with?’”
“The customer got a helpful, specific answer. Claude did not hallucinate a tracking number. It looked up the real data. This is the difference between a chatbot that guesses and a system that knows.”
Anthropic also provides a built-in code execution tool. When enabled, Claude can write and run code (Python, bash, file operations) within a sandboxed environment as part of an API response. This is the same capability that powers Cowork’s code execution, made available programmatically.
Use cases: data analysis (Claude writes a script to process a CSV and returns the results), document generation (Claude writes code to create a formatted report), and complex calculations (Claude writes the math rather than doing mental arithmetic). The code runs in a sandboxed container — it cannot access external systems unless tools are provided.
Pricing Exercise: Real-World Cost Calculation — 15 minutes
“Let us calculate what it would cost to run a real use case. I want you to follow along and do the math with me.”
Scenario: Customer support ticket classification.
“You receive 200 support tickets per day. Each ticket averages 150 words (approximately 200 input tokens, including the system prompt). Claude classifies each one into a category and returns a one-word label (approximately 5 output tokens). You use Haiku because speed and cost matter more than nuance for classification.”
“Daily input tokens: 200 tickets × 200 tokens = 40,000 tokens. Daily output tokens: 200 tickets × 5 tokens = 1,000 tokens. Monthly input: 1.2 million tokens. Monthly output: 30,000 tokens.”
“Cost: 1.2 MTok × $1 = $1.20 input. 0.03 MTok × $5 = $0.15 output. Monthly total: $1.35. For the year: $16.20.”
“Sixteen dollars a year to classify every support ticket automatically. That classification step takes your support team about two hours per day. With the API handling it, those two hours go back to your team — for solving complex cases, building customer relationships, improving documentation. The routine sorting is handled. Your people do the work that requires their expertise.”
Let the number land. Then: “This is why the API changes what your team spends their time on.”
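To keep the arithmetic honest on screen, the classification scenario can be reproduced in a few lines, using the Haiku rates from the pricing table and a 30-day month:

```python
# Classification scenario: 200 tickets/day, 200 tokens in / 5 tokens out
# per ticket, Haiku rates of $1/MTok input and $5/MTok output.
tickets_per_day, days = 200, 30
in_tokens, out_tokens = 200, 5

monthly_in_mtok = tickets_per_day * in_tokens * days / 1_000_000    # 1.2 MTok
monthly_out_mtok = tickets_per_day * out_tokens * days / 1_000_000  # 0.03 MTok

monthly_cost = monthly_in_mtok * 1 + monthly_out_mtok * 5
print(f"Monthly: ${monthly_cost:.2f}  Annual: ${monthly_cost * 12:.2f}")
```

This reproduces the $1.35 monthly and $16.20 annual figures from the script.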
“Now a more expensive use case. Personalized product descriptions.”
“An e-commerce site with 50 new products per day. Each needs a 200-word description. System prompt + product specs: approximately 500 input tokens. Generated description: approximately 250 output tokens. Using Sonnet for quality.”
“Daily input: 50 × 500 = 25,000 tokens. Daily output: 50 × 250 = 12,500 tokens. Monthly input: 750,000 tokens. Monthly output: 375,000 tokens.”
“Cost: 0.75 MTok × $3 = $2.25 input. 0.375 MTok × $15 = $5.63 output. Monthly total: $7.88. Annual: $94.50.”
“Under a hundred dollars a year for 15,000 first-draft product descriptions. Your content team reviews and refines them instead of writing each one from scratch. They focus on brand voice, creative campaigns, and strategic content — the work that requires human creativity. The routine descriptions are drafted automatically.”
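The same arithmetic generalizes to any scenario, which is useful for the exercise that follows. A small helper function, shown here reproducing the product-description numbers with the Sonnet rates from the pricing table:

```python
def monthly_api_cost(requests_per_day: int, in_tokens: int, out_tokens: int,
                     in_rate: float, out_rate: float, days: int = 30) -> float:
    """Monthly cost in dollars; in_rate/out_rate are per million tokens (MTok)."""
    in_mtok = requests_per_day * in_tokens * days / 1_000_000
    out_mtok = requests_per_day * out_tokens * days / 1_000_000
    return in_mtok * in_rate + out_mtok * out_rate

# Product descriptions: 50/day, 500 tokens in / 250 out, Sonnet ($3 / $15).
cost = monthly_api_cost(50, 500, 250, in_rate=3, out_rate=15)
print(f"Monthly: ${cost:.2f}  Annual: ${cost * 12:.2f}")
```

Participants can plug their own volumes and the rates for whichever model tier fits their task into the same function during the table exercise.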
“Now your turn. Pick a task from your organization. Estimate: how many requests per day? How many input tokens per request? How many output tokens? Which model? Calculate the monthly cost. Then calculate what it costs with a human. Share your numbers with your table.”
Give five minutes for individual calculation, five minutes for table discussion.
Participants consistently overestimate tokens. A 100-word email is about 135 tokens, not 500. A one-paragraph classification response is 20-30 tokens, not 200. Provide the rough conversion: 1 token is approximately 4 characters or 0.75 words. Have participants count the words in a sample text and multiply by 1.33 for a reasonable token estimate.
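The conversion can be packaged as a tiny estimator for the exercise. The 1.33 multiplier is the heuristic above, not a real tokenizer — actual counts come back in the API response's usage metadata:

```python
def estimate_tokens(text: str) -> int:
    """Back-of-envelope token estimate: ~1.33 tokens per word.

    A planning heuristic only; real counts appear in the API response's
    usage metadata.
    """
    return round(len(text.split()) * 1.33)

# A 100-word email lands near the ~135-token figure quoted above.
print(estimate_tokens("word " * 100))

# The sample phrase from earlier in the module (~7 tokens).
print(estimate_tokens("Hello, how are you today?"))
```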
When API Makes Sense — 5 minutes
The API makes sense when four conditions are met:
Volume: The task occurs often enough that manual processing is a bottleneck. Below 10 requests per day, Chat or Cowork is fine. Above 50 per day, the API becomes compelling. Above 500, it is the only practical option.
Consistency: The task needs the same treatment every time. Classification, extraction, routing, and formatting tasks are ideal because the instructions are stable and the output format is predictable.
Integration: The task connects to existing systems. If the output needs to go into a database, trigger a workflow, update a CRM record, or send a notification, the API is the delivery mechanism.
Speed: The response is needed in seconds, not minutes. Real-time applications (chatbots, live classification, dynamic content) require the API’s programmatic speed.
If fewer than two conditions are met, stick with Chat, Projects, or Cowork.
Debrief — 10 minutes
“In Tier I, you learned Claude as a tool you use. In Tier II, you learned Claude as a system you design. In this module, you learned Claude as infrastructure your business runs on. The API is where Claude stops being something you open when you need help and starts being something that processes work automatically, at scale, 24/7.”
“You do not need to write the code. You need to define the use case, calculate the economics, and commission the integration. That is what this module gave you: the vocabulary to have that conversation with your technical team, and the math to justify it to your leadership.”
For participants who want to explore further after the session:
Anthropic Console (console.anthropic.com): Create an account, get an API key, and experiment with the API Playground. The Playground lets you send requests and see responses without writing code.
Documentation (docs.anthropic.com): The full API reference with examples in Python, TypeScript, and cURL. Start with the “Getting Started” guide.
Anthropic Cookbook (github.com/anthropics/anthropic-cookbook): Practical code examples for common use cases: classification, extraction, tool use, RAG, content moderation. These can be shared with engineering teams as starting points.
Pricing calculator: The interactive exercise from this module can be recreated with real numbers. Have participants bring their actual task volumes and work with their engineering teams to build accurate cost models.
| Segment | Activity | Time |
|---|---|---|
| Conceptual Foundation | Restaurant analogy, API anatomy | 15 min |
| Models | Opus/Sonnet/Haiku, pricing table, selection framework | 10 min |
| Tool Use | Four-step flow, order status demo, code execution | 15 min |
| Pricing Exercise | Two worked examples, individual calculation | 15 min |
| Decision Framework | When API makes sense (volume, consistency, integration, speed) | 5 min |
| Debrief | From tool to infrastructure, next steps | 10 min |