Claude Code Subagents: How They Work and When to Use Them

Claude Code subagents run a side task in their own context window with their own tools. How to build one, when to use them, and when to skip.

Monday, June 22, 2026

Omid Saffari

Tools

Claude Code

Claude Code Subagents: How They Work and When to Use Them

Claude Code subagents are specialized assistants that do a side task in their own context window and hand back only the summary. The point is not "more agents." It is keeping your main session clean so a long build does not collapse under its own search results, logs, and half-read files.

A subagent is a function you delegate to: Claude hands it a job, it works in a separate window you never see, and it returns one clean answer. Used well, it is the single biggest lever for keeping Claude Code coherent across a multi-hour session. Used badly, it is a way to burn tokens and add latency for no gain. This guide covers both: the exact mechanics, the file that defines one, and the honest rule for when a subagent earns its keep.

The 20-second version

A subagent runs a self-contained side task in its own isolated context, then returns a short summary to your main conversation. That is the whole idea. It exists to solve one specific problem: long agentic sessions fill up with noise (grep results, build logs, files read once and never referenced) until the model is spending more attention managing clutter than doing your work.

Move that noisy work into a subagent and the clutter stays over there. Your main thread sees only the conclusion. So the right mental model is not "a team of AI engineers collaborating." It is context hygiene with a return value. Hold that and every other decision about subagents gets easier.

What a subagent actually is

A subagent runs in its own context window with a custom system prompt, specific tool access, and independent permissions. When Claude hits a task that matches a subagent's stated job, it delegates; the subagent works independently and returns its result. That is the official definition, and the parts that matter are own context window and returns its result.

Here is the piece most explanations get wrong. A subagent starts fresh. It does not see your conversation history, the files Claude already read, or the skills you already invoked. Claude composes a short delegation message that summarizes the task, and the subagent works from there with only its own system prompt and basic environment details like the working directory, not the full Claude Code system prompt. When it finishes, only its summary comes back.

That isolation cuts both ways, and understanding the trade is the whole game:

The win: all the verbose middle (the twelve files it read, the failed grep, the log it scanned) never touches your main context. You get the answer without the mess.
The cost: the subagent has to re-gather context you already had. If the task needs your conversation history to make sense, a from-scratch subagent will flounder.

The file that defines one

A custom subagent is a Markdown file with YAML frontmatter. The frontmatter is the configuration; the Markdown body becomes the subagent's system prompt. Only two fields are required: name and description.

Markdown

---
name: code-reviewer
description: Reviews code for quality and best practices. Use immediately after writing or modifying code.
tools: Read, Glob, Grep
model: sonnet
---

You are a senior code reviewer. When invoked, run git diff to see recent
changes, focus on modified files, and review for clarity, naming, error
handling, exposed secrets, input validation, and test coverage. Group your
feedback by priority: critical issues, warnings, then suggestions.

That is a complete, working subagent. The description is not decoration: Claude reads it to decide when to delegate, so "reviews code, use immediately after writing or modifying code" pulls the subagent in at the right moment, while a vague "code helper" leaves it sitting unused. Write the description like a job posting for a specialist.

You do not have to hand-write the file. Claude Code's /agents command opens a tabbed interface for managing subagents: a Running tab that lists live and recently finished subagents (where you can open or stop them) and a Library tab to create, edit, and organize them. One thing to know: subagents are loaded at session start, so if you edit a file directly on disk you need to restart your session to pick up the change. Subagents created or edited through the /agents interface take effect immediately.

Where subagents live, and who wins

The same subagent file behaves differently depending on where you put it, and the location decides who can use it and which definition wins when names collide.

Location	Scope	Check into git?
`.claude/agents/`	This project only	Yes, share it with the team
`~/.claude/agents/`	All your projects (personal)	No (it is your home dir)
Plugin `agents/`	Whoever installs the plugin	Via the plugin
Managed (org admin)	Everyone in the org	Org-controlled

When two subagents share a name, the higher-priority location wins: managed definitions beat project, project beats user. Project subagents are discovered by walking up from your working directory, and as of v2.1.178 the definition nearest the working directory wins on a name clash. There is also a session-only option: pass subagents as JSON with the --agents flag when launching Claude Code, and they exist for that session without ever being saved to disk, which is handy for automation scripts and quick tests.

The practical rule: a subagent that encodes how this codebase should be reviewed or tested belongs in .claude/agents/ and goes into version control so your whole team gets it. A subagent that reflects how you personally like to work goes in ~/.claude/agents/ and follows you across every project.

Model and tools: the two dials that matter

Two frontmatter fields do most of the real work, and getting them right is where subagents go from neat to genuinely cost-effective.

Model. The model field takes an alias (sonnet, opus, haiku, fable), a full model ID like claude-opus-4-8, or inherit. If you omit it, it defaults to inherit, meaning the subagent uses the same model as your main conversation. This is a budget lever hiding in plain sight. A subagent whose entire job is "grep the codebase and summarize what you find" does not need a frontier model. Set model: haiku and the grunt work runs on a cheaper, faster model while your main session stays on Opus for the thinking.

Tools. By default a subagent inherits every internal tool and MCP tool your main conversation has. You narrow that with one of two fields: tools is an allowlist (only these), disallowedTools is a denylist (everything except these). A research subagent that should never touch your files gets tools: Read, Grep, Glob, Bash and physically cannot edit anything. This is not just tidiness. It is a real safety boundary: a read-only reviewer cannot accidentally rewrite the file it is reviewing.

One more dial worth knowing: isolation: worktree runs the subagent in a temporary git worktree, an isolated copy of your repository, which is cleaned up automatically if the subagent makes no changes. Reach for it when you want a subagent to try something risky without it stepping on your working tree.

When to use a subagent, and when not to

This is the section the GitHub config-dumps skip, and it is the one that actually saves you time and money. A subagent is not free, so the question is never "could I use one here" but "does isolation pay for itself here."

Ask: is the output verbose and disposable?
The classic win is a task that generates a pile of intermediate output you will never reference again: scanning a large codebase, reading a dozen files to answer one question, triaging a noisy log. The subagent absorbs all of it and hands you the one-paragraph conclusion.
Ask: is the task self-contained?
A subagent starts fresh, so it shines when the job can be stated completely in the delegation message. "Find every place we call the billing API and list them" is self-contained. "Continue what we were just discussing" is not, because the subagent never saw the discussion.
Ask: do I need a hard tool boundary?
If you want to guarantee something stays read-only, a subagent with a tools allowlist enforces it at the system level rather than hoping the model behaves.

And the cases where a subagent is the wrong tool, straight from how the feature is meant to be used:

Iterative back-and-forth. If the task needs frequent refinement and your judgment in the loop, keep it in the main conversation. Delegating breaks the loop.
Phases that share context. A planning then implementation then testing flow where each phase builds on the last belongs in one thread, not three isolated subagents that each start blind.
Quick, targeted changes. Spinning up a subagent to fix one typo costs more than the fix.
Latency-sensitive work. Subagents start fresh and need time to gather context. When you want an answer now, inline is faster.

Subagent vs skill vs fork vs agent team

Claude Code now has four ways to structure work, and they are easy to confuse. The difference is entirely about context.

Mechanism	Context	Reach for it when
Skill	Runs in your main conversation	You want reusable instructions or a workflow that has your full context
Subagent	Fresh, isolated window; returns a summary	The work is verbose, self-contained, and you want it out of your main context
Fork	Inherits your entire conversation	You want a side task done with full context, without re-explaining the situation
Agent team	Each worker its own independent context	You need sustained parallelism that exceeds a single context window

The split that trips people up most is skill vs subagent, because both feel like "saved expertise." A skill is reusable instructions that run in your main conversation's context, so it sees everything you are doing. A subagent is the opposite: it is isolated by design and sees nothing but the task you hand it. If you want Claude to apply a house style while it keeps working with you, that is a skill. If you want a noisy investigation to happen out of sight and report back, that is a subagent.

The fork is the useful middle ground: it inherits the whole conversation so far (same system prompt, tools, model, and message history), so you do not re-explain anything, but its own tool calls still stay out of your main context and only its final result returns. Use a fork when a named subagent would need too much background to be useful. (Fork mode is gated behind the CLAUDE_CODE_FORK_SUBAGENT environment variable.)

The power features, and their limits

Two newer capabilities make subagents far more powerful, and each comes with a real edge to respect.

Nesting. As of Claude Code v2.1.172, a subagent can spawn its own subagents. A reviewer subagent can dispatch a verifier per finding, and all that intermediate output stays buried; only the top-level subagent's summary reaches you. The limit is fixed: nesting goes five levels deep, it is not configurable, and a subagent at depth five does not receive the Agent tool, so it cannot spawn further. That ceiling exists for a reason. Deeply nested fan-out is how you accidentally launch a swarm and watch your token spend spike, so treat depth as a budget, not a target.

Memory. Add memory: project (or user, or local) and a subagent gets a persistent memory directory it reads from and writes to across sessions. A code-reviewer with project memory builds up a record of your conventions and recurring issues over time, so it gets sharper the more you use it. project is the recommended default because it is shareable through version control. When memory is on, Claude Code injects the first 200 lines (or 25KB) of the subagent's MEMORY.md into its system prompt at startup, so keep that file curated.

The upside

What it does well

8 points

Keeps a long session coherent by quarantining verbose work
Hard tool boundaries you can actually trust (read-only means read-only)
Cheaper models on grunt work, frontier models on judgment
Memory turns a subagent into something that learns your codebase
Fresh-start isolation means a subagent re-gathers context, which costs latency and tokens
Summaries returning to the main thread still consume context; many at once defeats the purpose
It is delegation, not a collaborating team, so do not architect for cross-agent chatter
Nesting can fan out into real spend if you are not deliberate

A good first move: build one focused subagent (a read-only code-reviewer is the canonical example), check it into .claude/agents/, and let it run after each change. Once that proves its worth, add a research subagent on a cheaper model. The teams that get value here are not running thirty agents. They are running three that each do one job well. If you are choosing between coding tools in the first place, that is a different decision, covered in Codex vs Claude Code vs Cursor and the wider field in the best AI coding agents.

What is the difference between Claude Code subagents and skills?

A skill is reusable instructions or a workflow that runs inside your main conversation's context, so it sees everything you are doing. A subagent spins up a separate, isolated context window, does its task without your history, and returns only a summary. Use a skill to apply expertise in-line; use a subagent to push noisy, self-contained work out of your main context.

How deep can Claude Code subagents nest?

As of v2.1.172 a subagent can spawn its own subagents, up to a fixed depth of five levels. At depth five a subagent no longer receives the Agent tool and cannot spawn further. The limit is not configurable. Only the top-level subagent's summary returns to your main conversation.

Do Claude Code subagents cost extra?

There is no separate feature charge: subagents are built into Claude Code. But each one runs in its own context and consumes tokens, and its summary consumes context when it returns. Offloading a large investigation pays off; spinning up a subagent for a trivial change usually costs more than doing it inline.

Where do I store a subagent file?

Put project-specific subagents in .claude/agents/ and check them into version control so your team shares them. Put personal subagents in ~/.claude/agents/ to make them available across all your projects. On a name clash, the higher-priority location (managed, then project, then user) wins.

Can I make a subagent use a cheaper model?

Yes. Set the model field in the subagent's frontmatter to an alias like haiku, sonnet, opus, or fable, or a full model ID. It defaults to inherit (your main conversation's model). Pointing high-volume, low-judgment subagents at a smaller model is the main way to control cost.

Key Takeaways

A subagent runs a self-contained side task in its own isolated context window and returns only a summary, keeping your main session clean.
It is defined by a Markdown file with YAML frontmatter; only name and description are required, and the body becomes its system prompt.
Store project subagents in .claude/agents/ (check into git) and personal ones in ~/.claude/agents/; managed and project scopes win on name clashes.
The two dials that matter: model (use a cheaper one for grunt work) and tools (an allowlist or denylist that enforces real boundaries).
Reach for a subagent when the work is verbose, disposable, and self-contained; stay in the main thread for iterative, context-sharing, or quick tasks.
It is delegation, not a collaborating team. Nesting tops out at five fixed levels, and every subagent burns tokens, so use them where isolation genuinely pays.

If you are wiring Claude Code into how you actually ship, that is most of what I write about. Join the newsletter for the working setups, not the launch-day hype.

Last Updated

Jun 22, 2026

CategoryAI

Claude Code Subagents: How They Work and When to Use Them

The 20-second version

What a subagent actually is

The file that defines one

Where subagents live, and who wins

Model and tools: the two dials that matter

When to use a subagent, and when not to

Ask: is the output verbose and disposable?

Ask: is the task self-contained?

Ask: do I need a hard tool boundary?

Subagent vs skill vs fork vs agent team

The power features, and their limits

More from AI

Shopify Review (Verified August 2026)

ChatGPT Pricing (2026): Plus Is the $20 Sweet Spot

Best AI Wearables in 2026: Friend, Omi, Plaud, Bee, Meta and Oura (Compared)

Best AI Summarizer 2026: NotebookLM vs Claude vs ChatGPT vs QuillBot (Compared)

Best AI Research Tools in 2026: Elicit vs Consensus vs Scite vs NotebookLM vs Perplexity (Compared)

Best AI Transcription in 2026: Sonix vs Otter vs Descript vs Fireflies (Compared)

Ollama Alternatives in 2026: LM Studio, vLLM, llama.cpp, and Jan (Verified July 2026)

Best AI Search Engine 2026: Perplexity vs Google AI Mode vs ChatGPT Search vs Brave vs Consensus (Compared)

One letter, every Sunday. Working systems, not hot takes.