
Production Claude Code in May 2026: the Skills, Hooks, and Orchestration Layout That Actually Ships

Claude Code 2.1.130-138 shipped skills effort levels, plugin URL flags, and MCP OAuth fixes in nine days. Here is the production .claude/ directory layout, the hooks pattern that catches 80% of QA failures, and where the whole thing breaks.

Level: Advanced · Tools: claude-code, mcp, typescript
Omid Saffari

Founder & CEO, AI Entrepreneur


Claude Code 2.1.130 through 2.1.138 landed between May 1 and May 9, 2026. Nine days, one cluster of production-grade primitives: skills effort levels, the --plugin-url flag, gateway model discovery, and an MCP OAuth refresh fix for a bug that had been blocking serious deployments for six weeks. HN converged on "orchestration beats autonomy" as the correct mental model. That is right, but the discussion stopped at the thesis: nobody published the actual repo layout. I run Claude Code stacks for paying clients and on the omidsaffari-admin project, and this is the directory, the hooks, the settings, and the places where the whole thing falls apart.

The production directory layout, end to end

The .claude/ directory is where the orchestration lives. Everything outside it is your application. Inside it, five things matter: skills/, commands/, settings.json, settings.local.json, and CLAUDE.md. Here is the exact tree from a client engagement at dvnc.dev:

```text
.claude/
  CLAUDE.md
  settings.json
  settings.local.json   # gitignored
  skills/
    core.md
    typescript.md
    review.md
    test.md
  commands/
    pr-ready.md
    db-migrate.md
    typecheck.md
    sync-env.md
```

CLAUDE.md is the project constitution. It tells Claude Code what this codebase is, what it must never do, and which skills are in scope. Keep it under 400 words. The model reads it on every session start; padding it with policies nobody checks is how you get a 900-token context tax on every invocation.
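The 400-word budget is easy to enforce mechanically. A minimal sketch, assuming a CI step reads CLAUDE.md and passes its text in; `countWords` and `checkWordBudget` are hypothetical helper names, not part of Claude Code:

```typescript
// Hypothetical CI helper: count whitespace-separated words in CLAUDE.md text.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Returns an error message when the text exceeds the budget, otherwise null.
function checkWordBudget(text: string, maxWords = 400): string | null {
  const words = countWords(text);
  return words > maxWords
    ? `CLAUDE.md has ${words} words; budget is ${maxWords}.`
    : null;
}
```

Failing the build on a non-null result keeps the file from growing one well-meaning policy at a time.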

settings.json is committed. settings.local.json is not. The split matters because settings.local.json carries developer-specific allowlists and local tool paths that have no business in version control. The pattern is:

.claude/settings.json
```json
{
  "model": "claude-opus-4-5",
  "skills": ["core", "typescript", "review", "test"],
  "permissions": {
    "allow": [
      "Bash(npm run *)",
      "Bash(git diff *)",
      "Bash(git status)",
      "Bash(git log --oneline *)",
      "Read(**)",
      "Write(src/**)",
      "Write(tests/**)"
    ],
    "deny": [
      "Bash(git push *)",
      "Bash(rm -rf *)",
      "Write(.env*)"
    ]
  },
  "hooks": {
    "PreToolUse": ".claude/hooks/pre-tool-use.ts",
    "PostToolUse": ".claude/hooks/post-tool-use.ts",
    "SessionStart": ".claude/hooks/session-start.ts"
  }
}
```

The deny list is not aspirational. git push is denied outright because Claude Code pushing directly to main on a client repo is an incident, not a feature. .env* stays denied because I have watched a model helpfully write a .env.production with placeholder values over a real one.

settings.local.json typically looks like:

.claude/settings.local.json
```json
{
  "permissions": {
    "allow": [
      "Bash(psql -U $PGUSER *)",
      "Bash(open *)"
    ]
  }
}
```

Permissions in settings.local.json merge with settings.json. The local file wins on conflict.
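To make the merge rule concrete, here is a sketch that takes the two sentences above at face value: lists are unioned, and when the same rule appears on both sides with opposite decisions, the local file's decision wins. The `Permissions` shape and `mergePermissions` are illustrative names, and the precedence is an assumption worth verifying against your installed version.

```typescript
interface Permissions {
  allow?: string[];
  deny?: string[];
}

// Unions both files' lists; a shared rule is dropped only when the local
// file makes the opposite decision about the exact same rule string.
function mergePermissions(
  shared: Permissions,
  local: Permissions
): { allow: string[]; deny: string[] } {
  const localAllow = new Set(local.allow ?? []);
  const localDeny = new Set(local.deny ?? []);

  const allow = (shared.allow ?? []).filter((rule) => !localDeny.has(rule));
  const deny = (shared.deny ?? []).filter((rule) => !localAllow.has(rule));

  return {
    allow: Array.from(new Set(allow.concat(local.allow ?? []))),
    deny: Array.from(new Set(deny.concat(local.deny ?? []))),
  };
}
```

Under this reading a local allow entry can cancel a committed deny entry, which is exactly why local files deserve some form of audit.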

Skills versus slash commands: where each one wins

Skills and slash commands solve different problems and conflating them is the most common layout mistake.

A skill is a steering primitive. It tells the model what kind of thinker to be for this session. typescript.md does not list commands; it establishes priors: prefer unknown over any, never widen a union in a type guard, keep function signatures under four parameters. When a skill fires, it shapes every tool call the model makes for its duration. Skills are declarative.

.claude/skills/typescript.md
```markdown
---
name: typescript
effort: normal
triggers:
  - "*.ts"
  - "*.tsx"
---

You are working in a strict TypeScript codebase.

Rules:
- No `any`. Use `unknown` and narrow explicitly.
- Prefer `satisfies` over `as` for type assertions.
- Keep functions under 40 lines. Extract if needed.
- Every public function gets a JSDoc block with @param and @returns.
- Discriminated unions over boolean flag parameters.
```

The effort field is new in 2.1.130. It accepts low, normal, and high. low means the model skips extended reasoning passes and returns faster at lower token cost. high means it reasons through edge cases before writing code. For a review.md skill I set effort: high. For test.md running unit test generation I set effort: normal. The practical difference on a complex type refactor is roughly 40 seconds and $0.08 per invocation. On a test stub generation pass, low is fine and saves both.
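A typo in the effort field is the kind of mistake that is cheap to catch before a session starts. A minimal sketch of a lint step, assuming the frontmatter format shown in typescript.md above; `readEffort` is a hypothetical helper and deliberately parses only this one field rather than full YAML:

```typescript
type Effort = "low" | "normal" | "high";

function isEffort(value: string): value is Effort {
  return value === "low" || value === "normal" || value === "high";
}

// Reads `effort:` from the frontmatter (the block between the first two
// `---` lines). Returns null when there is no frontmatter or no effort
// field, and throws when the value is not one of the three accepted levels.
function readEffort(skillFile: string): Effort | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(skillFile);
  if (!match) return null;
  const frontmatter = match[1] ?? "";
  const line = frontmatter
    .split(/\r?\n/)
    .find((l) => l.trim().startsWith("effort:"));
  if (line === undefined) return null;
  const value = line.slice(line.indexOf(":") + 1).trim();
  if (!isEffort(value)) {
    throw new Error(`Invalid effort "${value}"; expected low, normal, or high.`);
  }
  return value;
}
```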

A slash command is a deterministic workflow. /pr-ready runs in sequence: typecheck, lint, test, generate a summary of changed files, and print the result. It does not make decisions; it executes steps. The model is the executor, not the planner.

.claude/commands/pr-ready.md
```markdown
Run the following in order. Stop and report if any step fails.

1. `npm run typecheck` -- report any errors verbatim
2. `npm run lint` -- report any warnings at error severity
3. `npm run test -- --coverage` -- report coverage delta vs main
4. `git diff --stat main` -- summarize files changed in 3 sentences
5. Print "READY" if all steps passed, "BLOCKED: <step>" if any failed
```
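The control flow the command asks for is plain stop-on-first-failure. As a sketch of that contract only (the `Step` shape and `runInOrder` are hypothetical; the real command is executed by the model, not by this function):

```typescript
interface Step {
  name: string;
  run: () => boolean; // true when the step passed
}

// Runs steps in order and stops at the first failure, mirroring the
// "Stop and report if any step fails" instruction in pr-ready.md.
function runInOrder(steps: Step[]): string {
  for (const step of steps) {
    if (!step.run()) return `BLOCKED: ${step.name}`;
  }
  return "READY";
}
```

Writing the contract down this way makes it obvious what the model has no latitude over: order, early exit, and the two possible final strings.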

The boundary is clear: if you are giving the model a set of rules to apply when it thinks, use a skill. If you are giving it a sequence of commands to run mechanically, use a slash command.

One non-obvious interaction: a skill can switch off the reasoning pass entirely for specific file patterns. A skill with effort: low and triggers: ["*.svg"] will handle SVG file reads without spinning up a reasoning pass. For projects with large asset trees this is a meaningful token reduction.

The hooks pattern that catches 80% of QA failures

Three hook events matter in practice: PreToolUse, PostToolUse, and SessionStart. The other events exist but they fire in situations where you either cannot recover (session crash) or do not need to act (tool cancel).

PreToolUse is your write guard. It fires before any tool executes. The job is to block destructive writes before they happen, not clean them up after.

.claude/hooks/pre-tool-use.ts
```typescript
import { HookContext } from "@anthropic/claude-code";

export default async function preToolUse(ctx: HookContext) {
  const { tool, input } = ctx;

  if (tool === "Write") {
    const path = input.file_path as string;

    // Block writes to migration files unless the command is db-migrate
    if (path.includes("/migrations/") && ctx.command !== "db-migrate") {
      return ctx.block(
        `Migration write blocked outside /db-migrate. Run /db-migrate to modify migrations.`
      );
    }

    // Block writes to generated files
    if (path.includes(".generated.") || path.includes("/__generated__/")) {
      return ctx.block(
        `Do not write generated files directly. Modify the source and run the generator.`
      );
    }
  }

  if (tool === "Bash") {
    const cmd = (input.command as string).trim();
    if (cmd.startsWith("git push")) {
      return ctx.block(`git push is not allowed. Open a PR instead.`);
    }
  }
}
```
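The guard rules are easier to trust when they can be unit tested without a hook runtime. One way to get there, sketched under the assumption that the hook simply delegates to pure functions; `writeBlockReason` and `isGitPush` are hypothetical names:

```typescript
// Returns the reason a write should be blocked, or null when it is allowed.
// Same rules as the hook: migrations only via db-migrate, never generated files.
function writeBlockReason(filePath: string, activeCommand?: string): string | null {
  if (filePath.includes("/migrations/") && activeCommand !== "db-migrate") {
    return "Migration write blocked outside /db-migrate.";
  }
  if (filePath.includes(".generated.") || filePath.includes("/__generated__/")) {
    return "Generated files must be produced by the generator.";
  }
  return null;
}

// Matches `git push` at the start of a command, tolerating extra whitespace.
// It does not catch a push chained after `&&` or `;`, so the deny list in
// settings.json still has to carry its share of the weight.
function isGitPush(command: string): boolean {
  return /^\s*git\s+push(\s|$)/.test(command);
}
```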

PostToolUse runs after a tool completes. The job here is to catch the failure the model would otherwise silently move past. TypeScript errors are the main case: Claude Code will write a file, see the write succeed, and continue. It does not automatically re-run the type checker. You have to.

.claude/hooks/post-tool-use.ts
```typescript
import { HookContext } from "@anthropic/claude-code";
import { execSync } from "child_process";

export default async function postToolUse(ctx: HookContext) {
  const { tool, input } = ctx;

  if (tool === "Write") {
    const path = input.file_path as string;

    if (path.endsWith(".ts") || path.endsWith(".tsx")) {
      try {
        // No `| head` here: in a pipeline the shell reports head's exit code,
        // which would hide tsc's failure and keep execSync from throwing.
        execSync("npx tsc --noEmit --pretty false", {
          stdio: "pipe",
          cwd: process.cwd(),
        });
      } catch (err) {
        // tsc prints diagnostics to stdout; keep only the first 30 lines.
        const stdout = (err as { stdout?: Buffer }).stdout;
        const output = (stdout?.toString() ?? String(err))
          .split("\n")
          .slice(0, 30)
          .join("\n");
        ctx.warn(
          `TypeScript errors after writing ${path}:\n${output}\nFix these before continuing.`
        );
      }
    }
  }
}
```

ctx.warn does not block execution. It injects the error output into the model's context so the next prompt has the type errors visible. In practice the model fixes them on the next pass 80% of the time without a human prompt.
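A project-wide tsc run reports errors from every file, including ones the model did not touch. One possible refinement, sketched here as an assumption rather than part of the hook above, is to parse the `--pretty false` output and surface only the diagnostics for the file that was just written. `parseTscOutput` and `errorsForFile` are hypothetical helpers.

```typescript
interface TscError {
  file: string;
  line: number;
  column: number;
  code: string;
  message: string;
}

// Parses `tsc --pretty false` lines of the form
//   path/to/file.ts(12,5): error TS2322: Message text
function parseTscOutput(output: string): TscError[] {
  const pattern = /^(.+?)\((\d+),(\d+)\): error (TS\d+): (.*)$/;
  const errors: TscError[] = [];
  for (const line of output.split(/\r?\n/)) {
    const m = pattern.exec(line);
    if (!m) continue; // blank lines and indented continuation lines
    errors.push({
      file: m[1] ?? "",
      line: Number(m[2]),
      column: Number(m[3]),
      code: m[4] ?? "",
      message: m[5] ?? "",
    });
  }
  return errors;
}

// tsc usually prints paths relative to the working directory, while the
// hook may receive an absolute path, so match on the path suffix.
function errorsForFile(output: string, writtenPath: string): TscError[] {
  return parseTscOutput(output).filter(
    (e) => writtenPath === e.file || writtenPath.endsWith("/" + e.file)
  );
}
```

The tradeoff is that a write can break a different file that imports it, so filtering this way narrows the signal as well as the noise.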

SessionStart is the hook that keeps breaking, and it broke harder after Claude Code on the Web launched. The Web client does not inherit your shell environment. It does not have your $PATH, your .nvmrc, or your $DATABASE_URL. If your SessionStart hook calls which psql and psql is not in the Web client's PATH, the hook throws, and Claude Code starts the session with no environment context at all.

The fix is to make SessionStart defensive about environment:

.claude/hooks/session-start.ts
```typescript
import { HookContext } from "@anthropic/claude-code";
import { execSync } from "child_process";
import { existsSync, readFileSync } from "fs";

export default async function sessionStart(ctx: HookContext) {
  const lines: string[] = [];

  // Node version -- safe to check, node is always present
  try {
    const node = execSync("node --version", { stdio: "pipe" }).toString().trim();
    lines.push(`Node: ${node}`);
  } catch {
    lines.push("Node: unknown");
  }

  // TypeScript version -- optional
  try {
    const tsc = execSync("npx tsc --version", { stdio: "pipe" }).toString().trim();
    lines.push(`TypeScript: ${tsc}`);
  } catch {
    lines.push("TypeScript: not found in PATH");
  }

  // Database -- only report if env var exists, do not try to connect
  const dbUrl = process.env.DATABASE_URL;
  lines.push(`Database: ${dbUrl ? "env var present" : "NOT SET -- check .env"}`);

  // Git branch
  try {
    const branch = execSync("git rev-parse --abbrev-ref HEAD", { stdio: "pipe" })
      .toString()
      .trim();
    lines.push(`Branch: ${branch}`);
  } catch {
    lines.push("Branch: not a git repo");
  }

  // Active skill set from settings
  if (existsSync(".claude/settings.json")) {
    const settings = JSON.parse(readFileSync(".claude/settings.json", "utf-8"));
    lines.push(`Skills: ${(settings.skills ?? []).join(", ")}`);
  }

  ctx.info(`Session context:\n${lines.join("\n")}`);
}
```

Every external call is wrapped in try/catch. Every missing value has a graceful fallback. The hook reports what it finds rather than asserting what it expects to find. This version runs identically in the local CLI and in Claude Code on the Web.
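The repeated try/catch blocks follow one pattern: run a probe, fall back to a label if it throws. A sketch of that pattern factored out, with hypothetical names (`withFallback`, `buildContext`) and probes passed in as functions so the logic can be tested without a shell:

```typescript
interface Probe {
  label: string;
  run: () => string; // may throw, e.g. when a binary is missing from PATH
  fallback: string;
}

// Runs `probe` and returns its result, or `fallback` if it throws.
function withFallback<T>(probe: () => T, fallback: T): T {
  try {
    return probe();
  } catch {
    return fallback;
  }
}

// Builds the "Label: value" lines that the hook reports as session context.
function buildContext(probes: Probe[]): string {
  return probes
    .map((p) => `${p.label}: ${withFallback(p.run, p.fallback)}`)
    .join("\n");
}
```

In the hook itself, each `run` would wrap an `execSync` call such as `() => execSync("node --version", { stdio: "pipe" }).toString().trim()`.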

What 2.1.130 through 2.1.138 actually changed in May 2026

Four changes landed that matter for production use. They are not equally important.

Effort levels (2.1.130) are the most immediately useful. Before this, skills had no cost control: every skill invocation spun up the same reasoning budget regardless of task complexity. Setting effort: low on skills that handle mechanical tasks (test stub generation, import sorting, changelog formatting) dropped token spend by about 30% on the client stacks I measured. The non-obvious part: effort levels interact with max_thinking_tokens in settings.json. When both are set, the effort level picks the reasoning budget a skill asks for and max_thinking_tokens is the hard limit on what it gets. Set both or set neither; setting only effort produces confusing behavior where a high effort skill silently underperforms because max_thinking_tokens is still at the default 1000.
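The failure mode is easier to see as arithmetic. This sketch models the interaction as a simple clamp; the per-level budgets are invented purely for illustration (the real numbers are not given here), and only the default of 1000 comes from the paragraph above.

```typescript
type Effort = "low" | "normal" | "high";

// Hypothetical per-level budgets, chosen only to make the clamp visible.
const EFFORT_BUDGET: Record<Effort, number> = {
  low: 0,
  normal: 4000,
  high: 16000,
};

// The skill asks for a budget via its effort level; max_thinking_tokens
// clamps what it actually gets. The default mirrors the 1000 quoted above.
function effectiveThinkingTokens(effort: Effort, maxThinkingTokens = 1000): number {
  return Math.min(EFFORT_BUDGET[effort], maxThinkingTokens);
}
```

Under these assumed numbers, `effectiveThinkingTokens("high")` and `effectiveThinkingTokens("normal")` both come out at 1000: the skill file says high, the session behaves like normal, and nothing in the log says why.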

The --plugin-url flag (2.1.132) lets you point Claude Code at a URL rather than a local MCP server binary. In practice this means you can run a shared MCP server on a Hetzner CX22 box and have every developer on a project connect to it without local installs. It also enabled gateway model discovery in the same release: if the server at your plugin URL exposes a /models endpoint, Claude Code will populate the model picker from it automatically. For teams routing through a LiteLLM gateway, this is significant; you no longer maintain a static model list in settings.json.

MCP OAuth refresh (2.1.137) fixed the bug that had been dropping authenticated MCP connections after 60 minutes. The failure was subtle: the connection would appear active, tool calls would succeed for cached resources, and only requests that required a fresh API call would silently 401. The session log showed nothing useful. The fix is verified against the standard OAuth 2.0 token refresh flow. If you were working around the bug with a cron job that restarted your MCP server every 45 minutes, you can stop.
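For readers who have not looked at the refresh flow itself, this is roughly what a client has to get right. The sketch follows the refresh_token grant from RFC 6749 section 6; the `TokenState` shape, the 60-second skew, and the helper names are assumptions, and this is not Claude Code's internal implementation.

```typescript
interface TokenState {
  accessToken: string;
  refreshToken: string;
  expiresAtMs: number; // absolute expiry time in epoch milliseconds
}

// True when the token has expired or will expire within `skewMs`.
// Refreshing slightly early avoids sending a request with a dying token.
function needsRefresh(token: TokenState, nowMs: number, skewMs = 60000): boolean {
  return nowMs >= token.expiresAtMs - skewMs;
}

// Form-encoded body for the refresh_token grant (RFC 6749, section 6).
function refreshRequestBody(token: TokenState, clientId: string): string {
  const fields: Array<[string, string]> = [
    ["grant_type", "refresh_token"],
    ["refresh_token", token.refreshToken],
    ["client_id", clientId],
  ];
  return fields
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
}
```

The body is sent as application/x-www-form-urlencoded to the authorization server's token endpoint; the symptom described above is what it looks like when the check in `needsRefresh` never runs.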

The changelog entry that looks minor but is not: 2.1.134 changed how Claude Code resolves model names when a gateway is present. Previously, if you set model: claude-opus-4-5 in settings.json and your gateway had that model available under a different name, Claude Code would fail silently and fall back to a default. Now it queries the gateway's /models endpoint and resolves aliases. The practical implication is that settings.json model names can now match your gateway's display names rather than Anthropic's canonical names, which matters when you are routing through a proxy that renames models for cost reporting.
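Alias resolution of this kind boils down to a two-pass lookup. A sketch under stated assumptions: the `GatewayModel` shape, including the `aliases` field, is hypothetical and not a documented /models response format, and `opus-prod` is an invented gateway name.

```typescript
interface GatewayModel {
  id: string;          // the name the gateway routes on
  aliases?: string[];  // other names it answers to (assumed field)
}

// Resolves a configured model name against the gateway's list.
// An exact id match wins over an alias; null means "not available".
function resolveModel(requested: string, models: GatewayModel[]): string | null {
  const exact = models.find((m) => m.id === requested);
  if (exact) return exact.id;
  const aliased = models.find((m) => (m.aliases ?? []).includes(requested));
  return aliased ? aliased.id : null;
}
```

Returning null rather than a default is the point: the silent fallback described above is what you get when the "not found" case is papered over.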

Where this layout breaks

The directory and hooks pattern I described works well for solo engagements and small teams. It starts showing seams at around five developers and breaks outright at fifty.

The immediate failure mode for larger teams is settings.json merge conflicts. When everyone is adding allowlist entries, the permissions array becomes a conflict surface on every branch. The fix is to move permissions into per-developer settings.local.json files and commit only the minimal shared allow list in settings.json. This works until your team needs to audit what permissions are actually in use, at which point local files are invisible to any central audit.

The skills directory scales better than permissions, but skill authorship becomes contentious. Who owns typescript.md? When a senior engineer changes the union type rule, every open PR that has staged changes against that skill gets inconsistent behavior depending on when Claude Code loads the skill file. The answer is to version skill files (e.g., typescript-v2.md) and update settings.json as a deliberate migration, not an in-place edit. This is tedious and nobody does it until they get burned once.

The cost shape at scale is the deeper issue. At one developer, effort: high on review skills is fine at maybe $4 a day. At fifty developers, that number is $200 a day and the usage patterns are invisible without centralized logging. The daily cap pattern I use on client projects is a PreToolUse hook that reads from a shared Redis counter and blocks if the day's spend exceeds a threshold. The threshold is set per environment: $15 on local dev, $50 on staging. It is blunt, but a $612 bill at 2am from a runaway agent session is what it prevents.
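The counter logic behind a daily cap is small. In this sketch an in-memory Map stands in for the shared Redis counter so the example runs on its own; the class name, the UTC day key, and the method names are all assumptions, and a real deployment would replace the Map with atomic Redis increments.

```typescript
// In-memory stand-in for a shared per-day spend counter.
class DailySpendCap {
  private readonly totals = new Map<string, number>();
  private readonly limitUsd: number;

  constructor(limitUsd: number) {
    this.limitUsd = limitUsd;
  }

  // One counter per UTC day, e.g. "2026-05-10".
  private static dayKey(now: Date): string {
    return now.toISOString().slice(0, 10);
  }

  // Adds a call's cost to that day's total and returns the new total.
  record(costUsd: number, now: Date): number {
    const key = DailySpendCap.dayKey(now);
    const total = (this.totals.get(key) ?? 0) + costUsd;
    this.totals.set(key, total);
    return total;
  }

  // True once the day's spend exceeds the limit; a PreToolUse hook
  // would block the tool call when this returns true.
  isOverLimit(now: Date): boolean {
    return (this.totals.get(DailySpendCap.dayKey(now)) ?? 0) > this.limitUsd;
  }
}
```

Usage would look like `new DailySpendCap(15)` on local dev and `new DailySpendCap(50)` on staging, with `record` called wherever per-call cost is known.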

The other place this layout breaks is Claude Code on the Web when your workflow depends on local filesystem state. Commands that call git diff, read from .env, or check lockfile changes all assume a local checkout. In a Web session you either need to provide that context explicitly in CLAUDE.md or accept that those commands will partially fail. SessionStart helps here, but it cannot fully substitute for a real checkout.

For a 50-engineer org, I would replace the flat skills/ directory with a registry served by the plugin URL endpoint, version skills explicitly with semver, add centralized spend logging to every PostToolUse hook, and require PR reviews on settings.json changes the same way you would require them on CI config. None of that is necessary for a solo engagement or a three-person client project, and adding it early is overengineering.

Next quarter I am going to look hard at moving the hooks to a shared MCP server rather than local TypeScript files. The --plugin-url flag makes that feasible now in a way it was not in April. The payoff is that hooks become deployable artifacts with their own test suite and version history, rather than files that live in the client repo and accumulate bespoke exceptions. The tradeoff is that every session requires a network round-trip for hook execution. At 5ms per call in the same region, that is acceptable.


The Claude workflow build-out on the Anime.js project covers how the two-phase approach and minimal-base principle apply when Claude Code is the primary builder rather than a reviewer. The hooks pattern described here is the verification layer that sits on top of that workflow: the build-out tells Claude Code what to make, the hooks tell it when it has made something wrong.

Key Takeaways

  • The .claude/ directory has five parts that matter: CLAUDE.md, settings.json, settings.local.json, skills/, and commands/. Permissions split between committed and local files.
  • Skills steer the model's reasoning style; slash commands run deterministic sequences. Do not conflate them.
  • The effort field in skill frontmatter (added in 2.1.130) controls reasoning budget. Set it deliberately on each skill file, and whenever you set it, set max_thinking_tokens as well.
  • SessionStart must be defensive: wrap every shell call in try/catch and report what it finds rather than asserting what it expects. This is the only way it survives Claude Code on the Web.
  • MCP OAuth refresh is fixed in 2.1.137. The --plugin-url flag enables shared MCP servers and gateway model discovery. Both change how serious orchestration stacks should be wired.
  • The layout breaks at team scale due to settings.json conflicts, invisible local permissions, and untracked spend. Version skills, centralize spend logging, and require reviews on settings changes before you grow past five developers.
Last Updated

May 10, 2026

Category

Stack
