The March 31, 2026 npm sourcemap incident revealed Claude Code internals. 4-phase execution, 7 modes, and the 11-step Agent Loop analyzed.
On March 31, 2026, security researcher Chaofan Shou noticed something unusual in an npm package. Anthropic had shipped @anthropic-ai/claude-code v2.1.88 with a .npmignore misconfiguration, bundling a 59.8MB sourcemap containing the entire TypeScript source. Within hours, the post on X accumulated 28.8 million views. The bell had rung, and it could not be un-rung.
What followed was extraordinary. Sigrid Jin launched Claw Code, a clean-room Python/Rust reimplementation, at 4 AM KST, and it hit 50,000 GitHub stars in two hours. The site ccunpacked.dev went live with visual architecture breakdowns. Anthropic responded with DMCA takedowns against 8,100+ repository forks and switched from npm to a native installer.
In the same time window, a supply-chain attack compounded the situation. A malicious axios package containing a RAT (Remote Access Trojan) was distributed via npm, amplifying security concerns across the ecosystem. The sourcemap inclusion itself is suspected to stem from a Bun runtime bundling bug.
But the real story is what the code revealed about how a production agent system is actually built. This post is the first in the “Claude Code Anatomy” series, dissecting the full architecture from the leaked source.
Tech Stack
| Layer | Technology | Role |
|---|---|---|
| Language | TypeScript (strict) | Full codebase, Zod runtime validation |
| UI Framework | React + Ink | Terminal UI as React components (TUI) |
| State | Zustand | Immutable state tree, DeepImmutable type |
| Runtime | Bun | Build and execution |
| Layout | Yoga (flexbox) | Terminal flexbox engine |
| API | SSE (Server-Sent Events) | Token streaming |
| Shell parsing | Tree-sitter | 23-step AST security analysis |
The choice of React for a terminal application is the most surprising. Ink renders React components to ANSI escape codes via Yoga’s flexbox engine, with a custom reconciler, double-buffered output, and frame throttling. The entire application compiles to a single ~800KB bundle (main.tsx).
Source Directory Structure
File counts by directory from the leaked source:
| Directory | Files | Role |
|---|---|---|
| utils/ | 564 | Utilities, helper functions |
| components/ | 389 | React TUI components |
| commands/ | 189 | Slash command definitions |
| tools/ | 184 | 52 built-in tool implementations |
| services/ | 130 | API, auth, telemetry |
| hooks/ | 104 | React/state hooks |
| ink/ | 96 | Ink TUI extensions |
| bridge/ | 31 | claude.ai integration bridge |
| Other | 197+ | Tests, config, types |
| Total | ~1,884 | |
utils/ accounts for 30% of all files. This reflects a production reality: complexity concentrates not in core logic, but in surrounding infrastructure.
The 4-Phase Execution Model
Every Claude Code session follows four phases.
flowchart LR
subgraph P1["Phase 1: Startup"]
A1["Parallel I/O\nPrefetch"]
A2["Auth\n5 methods"]
A3["Model Resolution\nTool Assembly"]
A1 --> A2 --> A3
end
subgraph P2["Phase 2: Query Loop"]
B1["Preprocessing\nsnip/compact"]
B2["API Streaming\nSSE"]
B3["Error Recovery\nWithhold"]
B4["Tool Execution"]
B1 --> B2 --> B3 --> B4
end
subgraph P3["Phase 3: Tool Execution"]
C1["52 Built-in Tools\n10-step pipeline"]
end
subgraph P4["Phase 4: Display"]
D1["React Reconciler\nYoga Flexbox\nDouble Buffer"]
end
P1 --> P2 --> P3 --> P4
P3 -->|"tool_use present"| P2
Phase 1: Startup
A 6-step optimized initialization:
- Parallel I/O Prefetch — MDM subprocess + macOS Keychain reads run concurrently (~65ms saved out of ~135ms)
- Conditional Module Loading — Feature-gated code (COORDINATOR_MODE, KAIROS) loads only when active; dead code removed at build time
- Early Settings — CLI flags, bare mode parsing
- Authentication — 5 sequential attempts: OAuth, API Key, AWS Bedrock, Google Vertex, Azure Foundry
- Model Resolution — Tier-based auto-selection (Max/Team Premium selects Opus, others select Sonnet)
- Initial State — REPL or Headless mode entry
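The authentication and model resolution steps can be sketched as an ordered fallback chain. This is a minimal illustration, not the leaked implementation: `AuthCredential`, `AuthProvider`, `resolveAuth`, `resolveModel`, and the tier names other than Max and Team Premium are all hypothetical, and only the ordering and the Opus/Sonnet split come from the list above.

```typescript
// Hypothetical names throughout; only the ordering comes from the text above.
type AuthMethod = "oauth" | "api-key" | "bedrock" | "vertex" | "foundry";

interface AuthCredential {
  method: AuthMethod;
  token: string;
}

// Each provider returns a credential, or null when it is not configured.
interface AuthProvider {
  method: AuthMethod;
  tryAuth: () => AuthCredential | null;
}

// Try providers strictly in order and stop at the first success.
function resolveAuth(providers: AuthProvider[]): AuthCredential | null {
  for (const provider of providers) {
    const credential = provider.tryAuth();
    if (credential !== null) return credential;
  }
  return null; // the caller decides how to report "no usable credentials"
}

// Tier-based selection: Max and Team Premium get Opus, everyone else Sonnet.
// The "pro" and "free" tier names are placeholders for "others".
type Tier = "max" | "team-premium" | "pro" | "free";

function resolveModel(tier: Tier): "opus" | "sonnet" {
  return tier === "max" || tier === "team-premium" ? "opus" : "sonnet";
}

// Example: OAuth is not configured, so the API key wins before Bedrock is tried.
const exampleProviders: AuthProvider[] = [
  { method: "oauth", tryAuth: () => null },
  { method: "api-key", tryAuth: () => ({ method: "api-key", token: "sk-example" }) },
  { method: "bedrock", tryAuth: () => ({ method: "bedrock", token: "aws-example" }) },
];
const exampleCredential = resolveAuth(exampleProviders);
```

Because the chain stops at the first success, a later provider is never consulted once an earlier one returns a credential.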
Phase 2: Query Loop
Each turn handles five stages.
Preprocessing compresses the conversation before the API call. Four mechanisms in order: Snip Compact (remove old messages), Microcompact (shrink tool_use blocks), Context Collapse (abbreviate context), Auto-Compact (summarize when approaching context_window minus 13,000 tokens).
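The Auto-Compact trigger is simple arithmetic, sketched below. The 13,000-token reserve comes from the analysis; the function names, the 200,000-token window used in the example, and the choice to treat "approaching" as "at or past the threshold" are assumptions made for illustration.

```typescript
// Reserve taken from the text above; everything else here is illustrative.
const AUTO_COMPACT_RESERVE_TOKENS = 13_000;

// Token count at which summarization would kick in for a given window size.
function autoCompactThreshold(contextWindow: number): number {
  return contextWindow - AUTO_COMPACT_RESERVE_TOKENS;
}

// Assumption: "approaching" is modeled as reaching or passing the threshold.
function shouldAutoCompact(usedTokens: number, contextWindow: number): boolean {
  return usedTokens >= autoCompactThreshold(contextWindow);
}

// With an assumed 200,000-token window the threshold is 187,000 tokens.
const exampleThreshold = autoCompactThreshold(200_000);
```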
API Streaming sends messages + system prompt + tool schemas via SSE. On overload, the system falls back to alternate models automatically.
Error Withholding buffers recoverable errors instead of surfacing them. A 413 Prompt Too Long triggers collapse drain, then reactive compact, then user notification. Max Output Tokens escalates from 8K to 64K with up to 3 retries.
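The 413 recovery order can be read as a ladder in which each rung runs only if the previous one did not shrink the prompt enough. The sketch below assumes exactly that reading; `RecoveryStep`, `recoverFromPromptTooLong`, and the `fits` callback are hypothetical names, not identifiers from the source.

```typescript
// Hypothetical recovery ladder for a 413 "Prompt Too Long" response.
type RecoveryStep = "collapse-drain" | "reactive-compact" | "notify-user";

const RECOVERY_ORDER: RecoveryStep[] = ["collapse-drain", "reactive-compact", "notify-user"];

interface RecoveryResult {
  surfacedToUser: boolean;
  attempted: RecoveryStep[];
}

// `fits` reports whether the prompt fits once the given step has run.
function recoverFromPromptTooLong(fits: (step: RecoveryStep) => boolean): RecoveryResult {
  const attempted: RecoveryStep[] = [];
  for (const step of RECOVERY_ORDER) {
    attempted.push(step);
    if (step === "notify-user") {
      // Last rung: nothing left to try, so the error is finally surfaced.
      return { surfacedToUser: true, attempted };
    }
    if (fits(step)) {
      // Recovered silently; the user never sees the withheld error.
      return { surfacedToUser: false, attempted };
    }
  }
  return { surfacedToUser: true, attempted };
}

// Example: collapse drain is not enough, reactive compact is.
const exampleRecovery = recoverFromPromptTooLong((step) => step === "reactive-compact");
```

The point of withholding is visible in the return value: the error reaches the user only when every cheaper recovery has been exhausted.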
Tool Execution partitions tools by safety: read-only tools run up to 10 in parallel, write tools run sequentially. Large results save to disk with only a reference returned.
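One way to picture the safety partitioning is as a batching function: consecutive read-only calls are grouped into batches of at most 10, and each write gets a batch of its own. The analysis does not say whether the original preserves call order this way, so treat the grouping rule, `ToolCall`, and `partitionToolCalls` as assumptions.

```typescript
interface ToolCall {
  id: string;
  tool: string;
  readOnly: boolean;
}

// Limit taken from the text: up to 10 read-only tools run in parallel.
const MAX_PARALLEL_READS = 10;

// Each returned batch may run concurrently; batches themselves run in order.
function partitionToolCalls(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let reads: ToolCall[] = [];
  const flush = (): void => {
    if (reads.length > 0) {
      batches.push(reads);
      reads = [];
    }
  };
  for (const call of calls) {
    if (!call.readOnly) {
      flush(); // pending reads finish before a write starts
      batches.push([call]); // writes always run alone
      continue;
    }
    reads.push(call);
    if (reads.length === MAX_PARALLEL_READS) flush();
  }
  flush();
  return batches;
}

// Example: two reads, one write, one read -> three batches of sizes 2, 1, 1.
const exampleBatches = partitionToolCalls([
  { id: "a", tool: "Read", readOnly: true },
  { id: "b", tool: "Grep", readOnly: true },
  { id: "c", tool: "Edit", readOnly: false },
  { id: "d", tool: "Read", readOnly: true },
]);
```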
Post-processing runs stop hooks and checks the token budget and the max-turn limit. If the response contains tool_use blocks, the loop repeats.
Phase 3: Tool Execution
52 built-in tools pass through a common 10-step pipeline: name lookup, interrupt check, input validation, PreToolUse hook, permission check, execution, result mapping, size check, PostToolUse hook, telemetry.
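The ten steps can be sketched as an ordered list that halts at the first failing step. The step names are taken from the sentence above; the handler contract (return `false` to stop) and the names `runPipeline`, `Handlers`, and `PipelineOutcome` are hypothetical.

```typescript
// Step order as listed in the text.
const STEP_ORDER = [
  "name lookup",
  "interrupt check",
  "input validation",
  "PreToolUse hook",
  "permission check",
  "execution",
  "result mapping",
  "size check",
  "PostToolUse hook",
  "telemetry",
] as const;

type StepName = (typeof STEP_ORDER)[number];

// A handler returns false to stop the pipeline at that step (assumed contract).
type Handlers = Partial<Record<StepName, () => boolean>>;

interface PipelineOutcome {
  completed: StepName[];
  stoppedAt: StepName | null;
}

function runPipeline(handlers: Handlers): PipelineOutcome {
  const completed: StepName[] = [];
  for (const step of STEP_ORDER) {
    const handler = handlers[step];
    const ok = handler ? handler() : true; // steps without a handler pass through
    if (!ok) return { completed, stoppedAt: step };
    completed.push(step);
  }
  return { completed, stoppedAt: null };
}

// Example: a denied permission check stops the run before "execution".
const deniedRun = runPipeline({ "permission check": () => false });
```

Placing the permission check before execution means a denial leaves the tool body untouched, which is the property the ordering is meant to guarantee.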
Phase 4: Display
Output does not go through console.log. It passes through a full rendering pipeline: the React reconciler, Yoga flexbox layout, a double buffer that emits only changed cells, and finally ANSI sequences. Int32Array-based ASCII buffers reportedly improve cache performance by about 50x, and CharPool/StylePool interning pools reduce memory use.
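The "changed cells only" idea behind the double buffer can be shown with two flat Int32Array frames and a diff that emits a cursor move plus one character per changed cell. This is a deliberately small sketch: `diffFrames` and `frameFromText` are hypothetical, cells hold only a code point (no style IDs), and adjacent changes are not coalesced as a production renderer likely would.

```typescript
// Build a frame of `size` cells, padded with spaces (code point 32).
function frameFromText(text: string, size: number): Int32Array {
  const frame = new Int32Array(size).fill(32);
  for (let i = 0; i < Math.min(text.length, size); i++) {
    frame[i] = text.charCodeAt(i);
  }
  return frame;
}

// Compare the previous and next frames and emit ANSI output for changed cells only.
function diffFrames(prev: Int32Array, next: Int32Array, width: number): string {
  let out = "";
  for (let i = 0; i < next.length; i++) {
    const cell = next[i];
    if (cell === undefined || cell === prev[i]) continue; // unchanged: emit nothing
    const row = Math.floor(i / width) + 1; // ANSI cursor positions are 1-based
    const col = (i % width) + 1;
    out += `\x1b[${row};${col}H${String.fromCodePoint(cell)}`;
  }
  return out;
}

// Example: only the five cells that differ between the two lines are rewritten.
const prevFrame = frameFromText("hello world", 20);
const nextFrame = frameFromText("hello there", 20);
const framePatch = diffFrames(prevFrame, nextFrame, 20);
```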
7 Execution Modes
A single binary, seven distinct modes with different tool access, UI behavior, and permission models.
| Mode | Description | UI | Key Characteristic |
|---|---|---|---|
| REPL | Interactive terminal | Full TUI | Default mode, full tools |
| Headless | Programmatic SDK | None | CI/CD and scripting |
| Coordinator | Multi-agent leader | Minimal | Each worker gets isolated git worktree |
| Bridge | claude.ai connection | Bidirectional | Up to 32 parallel sessions |
| Kairos | Always-on assistant | Background | Dream mode, GitHub webhooks, cron |
| Daemon | tmux background | None | Survives terminal close |
| Viewer | Read-only observation | Read-only | No tool execution |
graph TB
Q["query(deps)"]
Q --> R["REPL\nFull UI + all tools"]
Q --> H["Headless\nNo UI + full tools"]
Q --> C["Coordinator\nDelegation + workers"]
Q --> B["Bridge\nWeb sync"]
Q --> K["Kairos\nBackground + Dream"]
Q --> D["Daemon\ntmux + UDS"]
Q --> V["Viewer\nRead-only"]
All seven modes share the same query() loop. Differences are handled via Dependency Injection — query() receives a deps parameter that swaps tool pools, permission contexts, and rendering targets.
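A stripped-down version of this injection pattern is shown below. The shape of `QueryDeps`, the tool names, and the example deps objects are assumptions; the analysis only states that `query()` receives a `deps` parameter covering tool pools, permission contexts, and rendering targets.

```typescript
interface QueryDeps {
  tools: string[]; // tool pool available in this mode
  canUseTool: (tool: string) => boolean; // permission context
  render: (text: string) => void; // rendering target
}

// One loop body for every mode; behavior differs only through `deps`.
function query(prompt: string, deps: QueryDeps): string[] {
  const usable = deps.tools.filter((tool) => deps.canUseTool(tool));
  deps.render(`${prompt}: ${usable.length} tool(s) available`);
  return usable;
}

const renderedLines: string[] = [];

// REPL-like deps: full tool pool, output goes to the (simulated) terminal UI.
const replDeps: QueryDeps = {
  tools: ["Read", "Edit", "Bash"],
  canUseTool: () => true,
  render: (text) => {
    renderedLines.push(text);
  },
};

// Viewer-like deps: same loop, but no tool may execute.
const viewerDeps: QueryDeps = {
  tools: ["Read", "Edit", "Bash"],
  canUseTool: () => false,
  render: (text) => {
    renderedLines.push(text);
  },
};

const replTools = query("fix the bug", replDeps);
const viewerTools = query("fix the bug", viewerDeps);
```

Adding an eighth mode under this design would mean writing a new deps object rather than touching the loop.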
The 11-Step Agent Loop
The heart of the system. Every interaction follows this exact sequence.
flowchart TB
S1["1. User Input\nTextInput.tsx"]
S2["2. Message Creation\nmessages.ts"]
S3["3. History Append\nhistory.ts"]
S4["4. System Prompt Assembly\ncontext.ts"]
S5["5. API Streaming\nquery.ts SSE"]
S6["6. Token Parsing\nQueryEngine.ts"]
S7["7. Tool Detection\ntools.ts"]
S8["8. Tool Execution Loop\nStreamingToolExecutor.ts"]
S9["9. Response Rendering"]
S10["10. Post-Sampling Hooks\nautoCompact.ts"]
S11["11. Await Next Input"]
S1 --> S2 --> S3 --> S4 --> S5 --> S6 --> S7
S7 -->|"tool_use detected"| S8
S8 -->|"result -> history -> re-call API"| S5
S7 -->|"no tool_use"| S9 --> S10 --> S11
S11 -->|"new input"| S1
Steps 1-3: Input Processing. The user types a message in TextInput.tsx. It is wrapped into a structured message object by createUserMessage() (messages.ts, line 460), then pushed to the in-memory conversation array in history.ts. Before the API call, the transcript is saved to disk, a crash-safety measure that ensures no conversation state is lost.
Step 4: System Prompt Assembly. context.ts merges two sources:
- System Context (memoized): Git branch, default branch, git status (max 2,000 chars), recent commits, git username
- User Context (memoized): CLAUDE.md files, today’s date
Both are computed once per session and reused across turns — a direct cost optimization since these sections benefit from API prompt caching (1-hour cache window).
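Memoizing a context section amounts to caching the first result and returning it on every later turn. The sketch below assumes a zero-argument `memoize` helper and a stand-in for the git lookup; only the 2,000-character cap on git status comes from the analysis.

```typescript
// Cache the first result of `compute` and reuse it on every later call.
function memoize<T>(compute: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) cached = { value: compute() };
    return cached.value;
  };
}

const GIT_STATUS_MAX_CHARS = 2000; // cap mentioned in the text

let gitReads = 0; // counts how often the expensive lookup actually runs

// Stand-in for shelling out to git; a real implementation would read the repository.
const getSystemContext = memoize(() => {
  gitReads += 1;
  const rawStatus = "M src/index.ts\n".repeat(500);
  return { branch: "main", status: rawStatus.slice(0, GIT_STATUS_MAX_CHARS) };
});

// Two turns, one lookup: the second call returns the cached object.
const firstTurnContext = getSystemContext();
const secondTurnContext = getSystemContext();
```

Returning the same value every turn is also what keeps the prompt prefix byte-identical, which prompt caching depends on.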
Steps 5-6: API Call and Streaming. query.ts sends the payload via SSE. QueryEngine.ts receives tokens in real-time, maintaining mutableMessages[], totalUsage, and permissionDenials[] as mutable state. A continuous budget check halts the session if maxBudgetUsd is exceeded.
Steps 7-8: The Agentic Loop. When the model’s response contains tool_use blocks, tools.ts identifies the tool via findToolByName(), verifies access with canUseTool(), then hands off to StreamingToolExecutor.ts. Tool results are added to history and the API is re-called — creating the characteristic loop: AI decides, tool executes, result feeds back, AI decides again. This continues until the model responds with text only.
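The decide, execute, feed back cycle fits in a short loop. The following sketch uses a scripted stand-in for the model and a synchronous tool map; `agentLoop`, `ContentBlock`, `ChatMessage`, and the `maxTurns` default are illustrative, and none of the real permission or streaming machinery is modeled.

```typescript
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; toolName: string; input: string };

interface ChatMessage {
  role: "user" | "assistant" | "tool_result";
  content: ContentBlock[];
}

type Model = (transcript: ChatMessage[]) => ContentBlock[];
type ToolFn = (input: string) => string;

function agentLoop(userText: string, model: Model, tools: Map<string, ToolFn>, maxTurns = 10): ChatMessage[] {
  const transcript: ChatMessage[] = [{ role: "user", content: [{ type: "text", text: userText }] }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = model(transcript);
    transcript.push({ role: "assistant", content: reply });
    const results: ContentBlock[] = [];
    for (const block of reply) {
      if (block.type !== "tool_use") continue;
      const tool = tools.get(block.toolName);
      results.push({ type: "text", text: tool ? tool(block.input) : `unknown tool: ${block.toolName}` });
    }
    if (results.length === 0) break; // a text-only reply ends the loop
    transcript.push({ role: "tool_result", content: results }); // feed results back
  }
  return transcript;
}

// Scripted stand-in for the model: asks for one tool, then answers in text.
const scriptedModel: Model = (transcript) =>
  transcript.length === 1
    ? [{ type: "tool_use", toolName: "Read", input: "README.md" }]
    : [{ type: "text", text: "The README describes the project." }];

const exampleTools = new Map<string, ToolFn>([["Read", (path) => `contents of ${path}`]]);
const exampleTranscript = agentLoop("Summarize the README", scriptedModel, exampleTools);
```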
The StreamingToolExecutor overlaps API streaming with tool execution. While the AI is still generating, completed tool_use blocks begin execution immediately. If a tool fails, sibling executions are cancelled.
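Sibling cancellation can be sketched with the standard `AbortController`: every tool in a batch receives the same signal, and the first failure aborts the rest. This covers only the cancellation half of the behavior, not the overlap with API streaming, and `runSiblings`, `slowTool`, and `failingTool` are hypothetical names.

```typescript
type ToolTask = (signal: AbortSignal) => Promise<string>;

// Run sibling tools concurrently; the first failure aborts the others.
async function runSiblings(tasks: ToolTask[]): Promise<PromiseSettledResult<string>[]> {
  const controller = new AbortController();
  const guarded = tasks.map((task) =>
    task(controller.signal).catch((error: unknown) => {
      controller.abort(); // tell siblings that are still running to stop
      throw error;
    }),
  );
  return Promise.allSettled(guarded);
}

// Hypothetical tool that finishes after `ms` unless it is cancelled first.
function slowTool(label: string, ms: number): ToolTask {
  return (signal) =>
    new Promise<string>((resolve, reject) => {
      const timer = setTimeout(() => resolve(`${label} done`), ms);
      signal.addEventListener("abort", () => {
        clearTimeout(timer);
        reject(new Error(`${label} cancelled`));
      });
    });
}

// Hypothetical tool that fails immediately.
const failingTool: ToolTask = () => Promise.reject(new Error("grep failed"));

// Usage: the slow read is cancelled as soon as the failing sibling rejects.
const siblingRun = runSiblings([slowTool("read", 50), failingTool]);
```

Using `Promise.allSettled` rather than `Promise.all` keeps one result per tool, so the caller can still report which sibling failed and which were cancelled.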
Steps 9-11: Completion. Response rendering via Ink + Yoga flexbox. Post-sampling hooks handle context compression (Auto-Compact), memory extraction, and dream mode processing. The REPL then awaits the next input.
State Management: DeepImmutable
All global state lives in a single AppState object typed as DeepImmutable and managed through Zustand, which guards against accidental mutation. State changes trigger automatic side effects: permission changes notify CCR/SDK listeners, model changes are saved to settings, and settings changes invalidate auth caches.
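A recursive readonly type plus a subscribe-based store captures the idea. The `DeepImmutable` definition below is one common way to write such a type and may differ from the leaked one; the store is a small hand-rolled stand-in rather than Zustand itself, and the state fields are invented for the example.

```typescript
// Recursive readonly type; functions pass through unchanged.
type DeepImmutable<T> = T extends (...args: never[]) => unknown
  ? T
  : T extends object
    ? { readonly [K in keyof T]: DeepImmutable<T[K]> }
    : T;

// Hypothetical slice of application state.
interface AppStateShape {
  model: string;
  permissions: { mode: "ask" | "auto"; allowedTools: string[] };
}
type AppState = DeepImmutable<AppStateShape>;

type Listener = (next: AppState, prev: AppState) => void;

// Minimal store standing in for Zustand: replace the state, then notify listeners.
function createStore(initial: AppState) {
  let state = initial;
  const listeners = new Set<Listener>();
  return {
    getState: (): AppState => state,
    setState(update: (prev: AppState) => AppState): void {
      const prev = state;
      state = update(prev);
      listeners.forEach((listener) => listener(state, prev));
    },
    subscribe(listener: Listener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

const store = createStore({ model: "sonnet", permissions: { mode: "ask", allowedTools: [] } });

// Side effect on model change; the array stands in for "save to settings".
const savedModels: string[] = [];
store.subscribe((next, prev) => {
  if (next.model !== prev.model) savedModels.push(next.model);
});

store.setState((prev) => ({ ...prev, model: "opus" }));
// store.getState().model = "haiku"; // would not compile: `model` is readonly
```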
Three Design Principles
1. Safety First
Bash commands are parsed into ASTs rather than matched against string patterns, defaults are fail-closed, and the security module weighs in at 888KB across 23 steps. Permissions are validated on every tool execution.
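To see why an AST helps, consider a fail-closed classifier over a tiny hand-built tree. Nothing here reproduces the 23-step analysis or uses Tree-sitter: `ShellNode`, `classify`, and the allow-list are hypothetical, and the tree is written out by hand instead of being parsed.

```typescript
// Hand-built stand-in for a parsed shell command.
type ShellNode =
  | { kind: "command"; name: string; args: Array<string | ShellNode> }
  | { kind: "pipeline"; stages: ShellNode[] }
  | { kind: "substitution"; body: ShellNode }
  | { kind: "unrecognized" };

type Verdict = "allow" | "ask";

// Illustrative allow-list of read-only commands.
const READ_ONLY_COMMANDS = new Set(["ls", "cat", "grep"]);

function classify(node: ShellNode): Verdict {
  switch (node.kind) {
    case "command": {
      // Nested nodes inside arguments must be safe as well.
      const nestedOk = node.args.every((arg) => typeof arg === "string" || classify(arg) === "allow");
      return READ_ONLY_COMMANDS.has(node.name) && nestedOk ? "allow" : "ask";
    }
    case "pipeline":
      return node.stages.every((stage) => classify(stage) === "allow") ? "allow" : "ask";
    default:
      return "ask"; // fail closed: substitutions and unrecognized nodes need approval
  }
}

// cat notes.txt | grep todo
const safeTree: ShellNode = {
  kind: "pipeline",
  stages: [
    { kind: "command", name: "cat", args: ["notes.txt"] },
    { kind: "command", name: "grep", args: ["todo"] },
  ],
};

// cat notes.txt | grep $(rm -rf build): starts with "cat", but hides a substitution.
const sneakyTree: ShellNode = {
  kind: "pipeline",
  stages: [
    { kind: "command", name: "cat", args: ["notes.txt"] },
    {
      kind: "command",
      name: "grep",
      args: [{ kind: "substitution", body: { kind: "command", name: "rm", args: ["-rf", "build"] } }],
    },
  ],
};

const safeVerdict = classify(safeTree);
const sneakyVerdict = classify(sneakyTree);
```

A prefix check on the raw string would see `cat` and wave the second command through; walking the tree exposes the nested substitution, and the default branch ensures anything the classifier does not understand falls back to asking.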
2. Performance Through Parallelism and Caching
Parallel tool execution (up to 10), API streaming overlapped with tool runs, memoized system prompts for cache hits, double-buffered ANSI rendering, parallel I/O prefetch (65ms saved).
3. Extensibility Without Modification
52 tools share a common interface. 7 modes share one query() loop via DI. MCP supports 5 transports. 44 feature flags gate unreleased functionality without branching.
Sources & Limitations
This series synthesizes the following publicly available analyses and does not directly contain leaked source code.
| Source | URL | Focus |
|---|---|---|
| ccunpacked.dev | ccunpacked.dev | Visual architecture guide, tool/command catalog |
| Wikidocs Analysis | wikidocs.net/338204 | Detailed technical analysis (execution flow, state, rendering) |
| PyTorch KR | discuss.pytorch.kr | Community analysis + HN discussion synthesis |
| Claw Code | github.com/ultraworkers/claw-code | Clean-room reimplementation (Rust/Python), PARITY.md gap analysis |
Analysis date: April 2, 2026. Anthropic issued DMCA takedowns on 8,100+ forks and discontinued npm distribution shortly after the leak, so some sources may have changed accessibility. Features behind feature flags are unreleased and may be modified or deprecated before launch.