An AI agent is a system that uses a language model to decide what actions to take, executes those actions, observes the results, and repeats until the task is done. Unlike a simple prompt-response interaction, agents maintain state across multiple steps and can use tools to interact with the outside world.
The Claude Agent SDK gives you the building blocks to create these agents in TypeScript. This post walks through the architecture, core concepts, and a working example you can adapt for your own projects.
What Makes an Agent Different from a Chatbot
A chatbot takes input, generates output, done. An agent runs in a loop:
1. Receive a task or observation
2. Decide what to do next (reason)
3. Execute an action using a tool
4. Observe the result
5. Repeat from step 2 until the task is complete
The key difference is autonomy. You give an agent a goal, not step-by-step instructions. It figures out the path.
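The loop above can be sketched without any SDK at all. This is a minimal illustration, not the SDK's implementation: `Decision`, `ToolFn`, `Decide`, and `runAgentLoop` are hypothetical names, and the "model" is replaced by a scripted function so the control flow is easy to follow.

```typescript
// One decision from the model: either call a tool or finish with an answer.
type Decision =
  | { kind: "tool"; name: string; input: string }
  | { kind: "done"; answer: string };

// Hypothetical tool shape: takes a string input, returns an observation.
type ToolFn = (input: string) => string;

// Stand-in for the model call: it sees the task and every observation so far.
type Decide = (task: string, observations: string[]) => Decision;

function runAgentLoop(
  task: string,
  decide: Decide,
  tools: Map<string, ToolFn>,
  maxSteps = 10
): { answer: string; steps: number } {
  const observations: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    const decision = decide(task, observations); // reason
    if (decision.kind === "done") {
      return { answer: decision.answer, steps: step };
    }
    const tool = tools.get(decision.name);
    // An unknown tool becomes an observation too, so the caller can recover.
    const observation = tool
      ? tool(decision.input) // act
      : `error: unknown tool ${decision.name}`;
    observations.push(observation); // observe
  }
  // The step budget stops a loop that never reaches "done".
  return { answer: "Stopped: step budget exhausted", steps: maxSteps };
}

// A scripted "model": look up the task's length once, then answer.
const scriptedDecide: Decide = (task, observations) =>
  observations.length === 0
    ? { kind: "tool", name: "length", input: task }
    : { kind: "done", answer: `Length is ${observations.join("")}` };

const demoTools = new Map<string, ToolFn>([
  ["length", (input) => String(input.length)],
]);

const loopResult = runAgentLoop("hello", scriptedDecide, demoTools);
console.log(loopResult);
```

With the scripted decision function, the run finishes in two steps: one tool call, then the final answer. In a real agent, `decide` would be a model call, which is where the autonomy comes from.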
Claude Agent SDK Overview
The SDK provides:
- Agent class — The core runtime that manages the reasoning loop
- Tool definitions — Typed functions the agent can call
- Message history — Automatic conversation management
- Streaming — Real-time output as the agent works
- Guardrails — Input and output validation
Install it:
npm install @anthropic-ai/agent-sdk
Building a Code Review Agent
Let's build something practical: an agent that reviews pull requests. Given a diff, it analyzes code quality, identifies potential bugs, checks for security issues, and provides actionable feedback.
Step 1: Define the Tools
Tools are functions the agent can call. Each tool has a name, description, input schema, and an execute function.
import { Tool } from "@anthropic-ai/agent-sdk";
import { z } from "zod";
import { execFileSync } from "child_process";
import { readFileSync } from "fs";
// Turn an unknown thrown value into a readable message for the agent.
const errorMessage = (error: unknown): string =>
  error instanceof Error ? error.message : String(error);
const readFileTool: Tool = {
  name: "read_file",
  description: "Read the contents of a file at the given path",
  inputSchema: z.object({
    path: z.string().describe("Absolute file path to read"),
  }),
  async execute({ path }) {
    try {
      const content = readFileSync(path, "utf-8");
      return { content };
    } catch (error) {
      return { error: `Failed to read ${path}: ${errorMessage(error)}` };
    }
  },
};
const gitDiffTool: Tool = {
  name: "git_diff",
  description: "Get the git diff for a branch compared to main",
  inputSchema: z.object({
    branch: z.string().describe("Branch name to diff against main"),
  }),
  async execute({ branch }) {
    try {
      // execFileSync passes arguments straight to git without a shell, so a
      // model-supplied branch name cannot inject extra commands.
      const diff = execFileSync("git", ["diff", `main...${branch}`], {
        encoding: "utf-8",
        maxBuffer: 1024 * 1024 * 10,
      });
      return { diff };
    } catch (error) {
      return { error: `Failed to get diff: ${errorMessage(error)}` };
    }
  },
};
const listChangedFilesTool: Tool = {
  name: "list_changed_files",
  description: "List files changed in a branch compared to main",
  inputSchema: z.object({
    branch: z.string().describe("Branch name"),
  }),
  async execute({ branch }) {
    try {
      const files = execFileSync(
        "git",
        ["diff", "--name-only", `main...${branch}`],
        { encoding: "utf-8" }
      );
      // filter(Boolean) drops the empty entry produced when nothing changed.
      return { files: files.trim().split("\n").filter(Boolean) };
    } catch (error) {
      return { error: `Failed to list files: ${errorMessage(error)}` };
    }
  },
};
const searchCodeTool: Tool = {
  name: "search_code",
  description: "Search for a pattern across the codebase using grep",
  inputSchema: z.object({
    pattern: z.string().describe("Regex pattern to search for"),
    fileGlob: z.string().optional().describe("File glob to filter (e.g. *.ts)"),
  }),
  async execute({ pattern, fileGlob }) {
    // Use the caller's glob if given; otherwise search common JS/TS sources.
    // grep's --include does not expand braces, so each extension is listed.
    const globs = fileGlob ? [fileGlob] : ["*.ts", "*.tsx", "*.js", "*.jsx"];
    const includes = globs.map((glob) => `--include=${glob}`);
    try {
      const results = execFileSync(
        "grep",
        ["-rn", ...includes, "-e", pattern, "."],
        { encoding: "utf-8", maxBuffer: 1024 * 1024 }
      );
      return { results };
    } catch (error) {
      // grep exits with status 1 when nothing matches; anything else is a real failure.
      if ((error as { status?: number }).status === 1) {
        return { results: "No matches found" };
      }
      return { error: `Search failed: ${errorMessage(error)}` };
    }
  },
};
Step 2: Create the Agent
import { Agent } from "@anthropic-ai/agent-sdk";
const codeReviewAgent = new Agent({
  name: "Code Review Agent",
  model: "claude-sonnet-4-20250514",
  instructions: `You are a senior code reviewer. Given a branch name, you will:
1. List all changed files
2. Read the diff to understand the changes
3. Read full files when you need more context
4. Search the codebase for related patterns when needed
5. Provide a structured review with:
   - Summary of changes
   - Potential bugs or logic errors
   - Security concerns
   - Performance issues
   - Style/convention violations
   - Specific suggestions with code examples
Be direct and actionable. Don't praise obvious things. Focus on what could break or be improved.`,
  tools: [readFileTool, gitDiffTool, listChangedFilesTool, searchCodeTool],
});
Step 3: Run the Agent
import { Runner } from "@anthropic-ai/agent-sdk";
async function reviewPR(branch: string) {
  const runner = new Runner();
  const result = await runner.run(codeReviewAgent, {
    messages: [
      {
        role: "user",
        content: `Review the code changes on branch "${branch}" compared to main. Be thorough but concise.`,
      },
    ],
  });
  console.log(result.finalOutput);
}
reviewPR("feature/add-user-search");
When you run this, the agent will:
- Call list_changed_files to see what's been modified
- Call git_diff to read the actual changes
- Call read_file on specific files for full context
- Call search_code to check for related patterns or potential impacts
- Synthesize everything into a structured review
Each tool call is a decision the agent makes based on what it's learned so far.
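To make a single tool call concrete, here is a rough sketch of what happens between the model naming a tool and the tool running: look the tool up, check its arguments, execute it, and hand back the result. The names `ToolCall`, `RegisteredTool`, and `dispatch` are hypothetical, and the list of required string fields is a simplified stand-in for a real input schema; the SDK's own dispatch may work differently.

```typescript
// A tool call as a model might emit it: a tool name plus loosely typed arguments.
type ToolCall = { name: string; args: Record<string, unknown> };

// Hypothetical registry entry; requiredStrings stands in for an input schema.
type RegisteredTool = {
  requiredStrings: string[];
  execute: (args: Record<string, string>) => unknown;
};

function dispatch(call: ToolCall, registry: Map<string, RegisteredTool>): unknown {
  const tool = registry.get(call.name);
  if (!tool) {
    return { error: `Unknown tool: ${call.name}` };
  }
  // Validate before executing: the model's arguments are untrusted input.
  const args: Record<string, string> = {};
  for (const field of tool.requiredStrings) {
    const value = call.args[field];
    if (typeof value !== "string") {
      return { error: `${call.name}: "${field}" must be a string` };
    }
    args[field] = value;
  }
  return tool.execute(args);
}

// A fake list_changed_files that returns canned data instead of calling git.
const registry = new Map<string, RegisteredTool>([
  [
    "list_changed_files",
    {
      requiredStrings: ["branch"],
      execute: ({ branch }) => ({ branch, files: ["src/search.ts"] }),
    },
  ],
]);

console.log(dispatch({ name: "list_changed_files", args: { branch: "feature/x" } }, registry));
console.log(dispatch({ name: "list_changed_files", args: {} }, registry));
```

Both the unknown-tool case and the bad-argument case come back as ordinary results, so the model sees them as observations and can correct its next call.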
Multi-Step Reasoning
The agent loop is what makes this work. After each tool call, Claude sees the result and decides what to do next, so the agent adapts its strategy to what it finds rather than following a fixed script.
For example, if the diff shows a change to a database query, the agent might:
- Read the full model file for context
- Search for other queries that use the same table
- Check if there's a migration that needs updating
- Look for tests that cover this code path
You didn't program this sequence. The agent figured it out because its instructions say to be thorough and it has the tools to investigate.
Error Handling
Agents will encounter errors — files that don't exist, commands that fail, unexpected data formats. Build resilience into your tools:
const robustTool: Tool = {
  name: "read_file",
  description: "Read a file's contents",
  inputSchema: z.object({ path: z.string() }),
  async execute({ path }) {
    try {
      const content = readFileSync(path, "utf-8");
      // Truncate very large files to avoid context limits
      if (content.length > 50000) {
        return {
          content: content.slice(0, 50000),
          truncated: true,
          totalLength: content.length,
          note: "File was truncated; only the first 50000 characters are shown.",
        };
      }
      return { content };
    } catch (error) {
      // Node's fs errors carry a string code such as ENOENT or EACCES.
      const { code, message } = error as NodeJS.ErrnoException;
      if (code === "ENOENT") {
        return { error: `File not found: ${path}` };
      }
      if (code === "EACCES") {
        return { error: `Permission denied: ${path}` };
      }
      return { error: `Unexpected error reading ${path}: ${message}` };
    }
  },
};
Return errors as data, not exceptions. The agent can read error messages and adjust its approach: it might try a different file path or ask for clarification.
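One way to apply the errors-as-data rule consistently is a small wrapper that converts any thrown value into a result object. The sketch below is an illustration under assumptions: `ToolResult`, `describeError`, and `safely` are hypothetical helpers, not part of the SDK, and the wrapped function here is synchronous for brevity.

```typescript
// Result shape handed back to the agent: either data or a readable error.
type ToolResult<T> = { ok: true; data: T } | { ok: false; error: string };

// Narrow an unknown thrown value into a message, prefixed by its code if any.
function describeError(error: unknown): string {
  if (error instanceof Error) {
    const code = (error as { code?: unknown }).code;
    return typeof code === "string" ? `${code}: ${error.message}` : error.message;
  }
  return String(error);
}

// Wrap a function so it never throws and always returns a ToolResult.
function safely<I, O>(fn: (input: I) => O): (input: I) => ToolResult<O> {
  return (input) => {
    try {
      return { ok: true, data: fn(input) };
    } catch (error) {
      return { ok: false, error: describeError(error) };
    }
  };
}

// JSON.parse throws on malformed input; the wrapped version reports it as data.
const parseJson = safely((text: string) => JSON.parse(text) as unknown);

const good = parseJson('{"a": 1}');
const bad = parseJson("{not json");
console.log(good.ok, bad.ok); // prints: true false
```

The discriminated `ok` field lets calling code branch on success or failure with full type narrowing, while the agent only ever sees a plain object it can read.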
Guardrails
The SDK supports input and output guardrails to keep agents on track:
const agent = new Agent({
  name: "Code Review Agent",
  model: "claude-sonnet-4-20250514",
  instructions: "...",
  tools: [...],
  inputGuardrails: [
    {
      name: "branch_validation",
      async execute({ input }) {
        // Don't allow reviewing main directly. Extract the quoted branch name
        // rather than substring-matching "main", which would let a request
        // for branch "main" compared to main slip through.
        const match = /branch "([^"]+)"/.exec(input);
        if (match?.[1] === "main") {
          return { blocked: true, reason: "Cannot review the main branch itself." };
        }
        return { blocked: false };
      },
    },
  ],
  outputGuardrails: [
    {
      name: "no_secrets",
      async execute({ output }) {
        const secretPatterns = [/API_KEY\s*=\s*\S+/, /password\s*[:=]\s*\S+/i];
        for (const pattern of secretPatterns) {
          if (pattern.test(output)) {
            return { blocked: true, reason: "Output contains potential secrets." };
          }
        }
        return { blocked: false };
      },
    },
  ],
});
Handoffs Between Agents
For complex workflows, you can chain specialized agents. A triage agent decides what kind of review is needed, then hands off to specialized agents:
const securityReviewer = new Agent({
  name: "Security Reviewer",
  model: "claude-sonnet-4-20250514",
  instructions: "Focus exclusively on security vulnerabilities...",
  tools: [...],
});
const performanceReviewer = new Agent({
  name: "Performance Reviewer",
  model: "claude-sonnet-4-20250514",
  instructions: "Focus exclusively on performance issues...",
  tools: [...],
});
const triageAgent = new Agent({
  name: "Review Triage",
  model: "claude-3-5-haiku-20241022",
  instructions: "Analyze the changed files and delegate to the appropriate reviewer.",
  tools: [...],
  handoffs: [securityReviewer, performanceReviewer],
});
The triage agent uses a cheaper, faster model to decide routing, then hands off to specialized agents that do deep analysis. Routing stays cheap, and each reviewer's instructions can stay narrow and focused.
Deployment Tips
Keep tools focused. Each tool should do one thing well. An agent with 5 focused tools outperforms one with 20 kitchen-sink tools because the model makes better decisions with clear options.
Log everything. Record every tool call, input, output, and the agent's reasoning. You'll need this for debugging when the agent does something unexpected.
Set token limits. Agents can get into loops. Set a maximum token budget or step count to prevent runaway sessions.
Test with diverse inputs. Agents are non-deterministic. The same input might produce different tool call sequences. Test with varied scenarios and edge cases.
Start with Claude Sonnet. It's the best balance of capability and cost for agent workloads. Use Opus for tasks requiring deep reasoning. Use Haiku for triage and simple routing.
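The logging and budget tips can be combined in one wrapper around each tool. This is a sketch under assumptions: `SimpleTool`, `CallRecord`, and `withLogging` are hypothetical names, the tool shape is a simplified stand-in for the SDK's `Tool` type, and the budget counts tool calls rather than tokens. It also assumes tools return errors as data, so a thrown exception would not be logged.

```typescript
// Simplified tool shape for this sketch (not the SDK's Tool type).
type SimpleTool = {
  name: string;
  execute: (input: Record<string, unknown>) => Promise<unknown>;
};

type CallRecord = {
  tool: string;
  input: Record<string, unknown>;
  output: unknown;
  durationMs: number;
};

// Wrap a tool so every call is appended to a shared log, and calls beyond
// the shared budget are refused instead of executed.
function withLogging(tool: SimpleTool, log: CallRecord[], maxCalls: number): SimpleTool {
  return {
    name: tool.name,
    async execute(input) {
      if (log.length >= maxCalls) {
        // Returned as data so the agent can wrap up instead of crashing.
        return { error: `Call budget of ${maxCalls} exhausted` };
      }
      const started = Date.now();
      const output = await tool.execute(input);
      log.push({ tool: tool.name, input, output, durationMs: Date.now() - started });
      return output;
    },
  };
}

async function demo() {
  const log: CallRecord[] = [];
  const echo = withLogging(
    { name: "echo", execute: async (input) => ({ echoed: input }) },
    log,
    2
  );
  await echo.execute({ n: 1 });
  await echo.execute({ n: 2 });
  const third = await echo.execute({ n: 3 }); // over budget, not executed
  return { log, third };
}

demo().then(({ log, third }) => console.log(log.length, third));
```

Passing the same `log` array to every wrapped tool gives one chronological trace per session and makes the budget apply across all tools, not per tool.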
AI agents are among the most practical applications of language models today. They turn AI from a suggestion engine into a collaborator that takes action. With the Claude Agent SDK, building one comes down to three things: define your tools, write clear instructions, and let the reasoning loop handle the rest.
For a broader look at the agentic AI landscape — including Claude Code, MCP, and multi-agent systems — read Agentic AI: When Your Code Writes, Tests, and Deploys Itself.