What the Claude Code Leak Revealed: Anatomy of an AI Agent

MJ · 5 min read

The March 31, 2026 npm sourcemap incident revealed Claude Code internals. 4-phase execution, 7 modes, and the 11-step Agent Loop analyzed.

On March 31, 2026, security researcher Chaofan Shou noticed something unusual in an npm package. Anthropic had shipped @anthropic-ai/claude-code v2.1.88 with a .npmignore misconfiguration, bundling a 59.8MB sourcemap containing the entire TypeScript source. Within hours, the post on X accumulated 28.8 million views. The bell had rung, and it could not be un-rung.

What followed was extraordinary. Sigrid Jin launched Claw Code, a clean-room Python/Rust reimplementation, at 4 AM KST — it hit 50,000 GitHub stars in two hours. The site ccunpacked.dev went live with visual architecture breakdowns. Anthropic responded with DMCA takedowns against 8,100+ repository forks and switched from npm to a native installer.

In the same time window, a supply-chain attack compounded the situation. A malicious axios package containing a RAT (Remote Access Trojan) was distributed via npm, amplifying security concerns across the ecosystem. The sourcemap inclusion itself is suspected to stem from a Bun runtime bundling bug.

But the real story is what the code revealed about how a production agent system is actually built. This post is the first in the “Claude Code Anatomy” series, dissecting the full architecture from the leaked source.


Tech Stack

| Layer | Technology | Role |
|---|---|---|
| Language | TypeScript (strict) | Full codebase, Zod runtime validation |
| UI Framework | React + Ink | Terminal UI as React components (TUI) |
| State | Zustand | Immutable state tree, DeepImmutable type |
| Runtime | Bun | Build and execution |
| Layout | Yoga (flexbox) | Terminal flexbox engine |
| API | SSE (Server-Sent Events) | Token streaming |
| Shell parsing | Tree-sitter | 23-step AST security analysis |

React for a terminal application is the most surprising choice. Ink renders React components to ANSI escape codes via Yoga’s flexbox engine, with a custom reconciler, double-buffered output, and frame throttling. The entire application compiles to a single ~800KB bundle (main.tsx).


Source Directory Structure

File counts by directory from the leaked source:

| Directory | Files | Role |
|---|---|---|
| utils/ | 564 | Utilities, helper functions |
| components/ | 389 | React TUI components |
| commands/ | 189 | Slash command definitions |
| tools/ | 184 | 52 built-in tool implementations |
| services/ | 130 | API, auth, telemetry |
| hooks/ | 104 | React/state hooks |
| ink/ | 96 | Ink TUI extensions |
| bridge/ | 31 | claude.ai integration bridge |
| Other | 197+ | Tests, config, types |
| Total | ~1,884 | |

utils/ accounts for 30% of all files. This reflects a production reality: complexity concentrates not in core logic, but in surrounding infrastructure.


The 4-Phase Execution Model

Every Claude Code session follows four phases.

```mermaid
flowchart LR
    subgraph P1["Phase 1: Startup"]
        A1["Parallel I/O\nPrefetch"]
        A2["Auth\n5 methods"]
        A3["Model Resolution\nTool Assembly"]
        A1 --> A2 --> A3
    end
    subgraph P2["Phase 2: Query Loop"]
        B1["Preprocessing\nsnip/compact"]
        B2["API Streaming\nSSE"]
        B3["Error Recovery\nWithhold"]
        B4["Tool Execution"]
        B1 --> B2 --> B3 --> B4
    end
    subgraph P3["Phase 3: Tool Execution"]
        C1["52 Built-in Tools\n10-step pipeline"]
    end
    subgraph P4["Phase 4: Display"]
        D1["React Reconciler\nYoga Flexbox\nDouble Buffer"]
    end
    P1 --> P2 --> P3 --> P4
    P3 -->|"tool_use present"| P2
```

Phase 1: Startup

A 6-step optimized initialization:

  1. Parallel I/O Prefetch — MDM subprocess + macOS Keychain reads run concurrently (~65ms saved out of ~135ms)
  2. Conditional Module Loading — Feature-gated code (COORDINATOR_MODE, KAIROS) loads only when active; dead code removed at build time
  3. Early Settings — CLI flags, bare mode parsing
  4. Authentication — 5 sequential attempts: OAuth, API Key, AWS Bedrock, Google Vertex, Azure Foundry
  5. Model Resolution — Tier-based auto-selection (Max/Team Premium selects Opus, others select Sonnet)
  6. Initial State — REPL or Headless mode entry
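The parallel-prefetch step above can be sketched with Promise.all. Everything here (function names, payloads, delays) is illustrative, not the leaked code:

```typescript
// Illustrative stand-ins for the MDM subprocess and macOS Keychain reads.
function readMdmPolicy(): Promise<string> {
  return new Promise(resolve => setTimeout(() => resolve("mdm-policy"), 5));
}

function readKeychain(): Promise<string> {
  return new Promise(resolve => setTimeout(() => resolve("keychain-creds"), 6));
}

// Both reads start immediately and are awaited together, so total
// latency is roughly max(a, b) rather than a + b.
async function startupPrefetch(): Promise<{ mdm: string; creds: string }> {
  const [mdm, creds] = await Promise.all([readMdmPolicy(), readKeychain()]);
  return { mdm, creds };
}
```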

Phase 2: Query Loop

Each turn handles five stages.

Preprocessing compresses the conversation before the API call. Four mechanisms in order: Snip Compact (remove old messages), Microcompact (shrink tool_use blocks), Context Collapse (abbreviate context), Auto-Compact (summarize when approaching context_window minus 13,000 tokens).
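A minimal sketch of the Auto-Compact trigger, assuming a simple margin check against the context window (the helper name is an assumption):

```typescript
// Auto-Compact fires when usage approaches contextWindow - 13,000 tokens.
const AUTO_COMPACT_MARGIN = 13_000;

function shouldAutoCompact(usedTokens: number, contextWindow: number): boolean {
  return usedTokens >= contextWindow - AUTO_COMPACT_MARGIN;
}
```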

API Streaming sends messages + system prompt + tool schemas via SSE. On overload, the system falls back to alternate models automatically.

Error Withholding buffers recoverable errors instead of surfacing them. A 413 Prompt Too Long triggers collapse drain, then reactive compact, then user notification. Max Output Tokens escalates from 8K to 64K with up to 3 retries.
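The leaked description gives only the endpoints of the output-token escalation (8K to 64K, up to 3 retries); a doubling schedule, sketched below with hypothetical helpers, is one assumption consistent with it:

```typescript
// The limit doubles on each retry, capped at 64K.
function nextMaxTokens(current: number, cap = 64_000): number {
  return Math.min(current * 2, cap);
}

function escalationSchedule(start = 8_000, retries = 3): number[] {
  const schedule: number[] = [];
  let limit = start;
  for (let i = 0; i < retries; i++) {
    limit = nextMaxTokens(limit);
    schedule.push(limit);
  }
  return schedule;
}
// escalationSchedule() → [16000, 32000, 64000]
```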

Tool Execution partitions tools by safety: read-only tools run up to 10 in parallel, write tools run sequentially. Large results save to disk with only a reference returned.
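The safety partition can be sketched as a simple filter over tool calls (the interface and names are assumptions):

```typescript
interface ToolCall {
  name: string;
  readOnly: boolean;
}

// Read-only calls may run concurrently (up to 10 in the real system);
// write calls go into a sequential queue.
function partitionBySafety(calls: ToolCall[]): { parallel: ToolCall[]; sequential: ToolCall[] } {
  return {
    parallel: calls.filter(c => c.readOnly),
    sequential: calls.filter(c => !c.readOnly),
  };
}
```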

Post-processing runs stop hooks, checks token budget and max turns. If tool_use blocks exist, the loop repeats.

Phase 3: Tool Execution

52 built-in tools pass through a common 10-step pipeline: name lookup, interrupt check, input validation, PreToolUse hook, permission check, execution, result mapping, size check, PostToolUse hook, telemetry.
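The 10 steps can be modeled as a sequence of stage functions run in order; the identifiers below are illustrative stand-ins, not the leaked ones:

```typescript
type Stage = (ctx: { tool: string; log: string[] }) => void;

// One function per pipeline step, executed in a fixed order.
const pipeline: Stage[] = [
  ctx => ctx.log.push(`lookup:${ctx.tool}`), // name lookup
  ctx => ctx.log.push("interrupt-check"),
  ctx => ctx.log.push("validate-input"),
  ctx => ctx.log.push("pre-tool-use-hook"),
  ctx => ctx.log.push("permission-check"),
  ctx => ctx.log.push("execute"),
  ctx => ctx.log.push("map-result"),
  ctx => ctx.log.push("size-check"),
  ctx => ctx.log.push("post-tool-use-hook"),
  ctx => ctx.log.push("telemetry"),
];

function runPipeline(tool: string): string[] {
  const ctx = { tool, log: [] as string[] };
  for (const stage of pipeline) stage(ctx);
  return ctx.log;
}
```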

Phase 4: Display

Not console.log but a full rendering pipeline: React Reconciler, Yoga flexbox layout, a double buffer (changed cells only), and ANSI sequences. Int32Array-based ASCII buffers yield a 50x cache-performance improvement, while CharPool/StylePool interning pools optimize memory.
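The changed-cells-only idea reduces to a frame diff; a cell here is just a string, where the real renderer compares styled character cells:

```typescript
// Compare the previous and next frames and return the indices of
// cells that changed — only those need new ANSI output.
function diffFrames(prev: string[], next: string[]): number[] {
  const changed: number[] = [];
  for (let i = 0; i < next.length; i++) {
    if (prev[i] !== next[i]) changed.push(i);
  }
  return changed;
}
```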


7 Execution Modes

A single binary, seven distinct modes with different tool access, UI behavior, and permission models.

| Mode | Description | UI | Key Characteristic |
|---|---|---|---|
| REPL | Interactive terminal | Full TUI | Default mode, full tools |
| Headless | Programmatic SDK | None | CI/CD and scripting |
| Coordinator | Multi-agent leader | Minimal | Each worker gets isolated git worktree |
| Bridge | claude.ai connection | Bidirectional | Up to 32 parallel sessions |
| Kairos | Always-on assistant | Background | Dream mode, GitHub webhooks, cron |
| Daemon | tmux background | None | Survives terminal close |
| Viewer | Read-only observation | Read-only | No tool execution |
```mermaid
graph TB
    Q["query(deps)"]
    Q --> R["REPL\nFull UI + all tools"]
    Q --> H["Headless\nNo UI + full tools"]
    Q --> C["Coordinator\nDelegation + workers"]
    Q --> B["Bridge\nWeb sync"]
    Q --> K["Kairos\nBackground + Dream"]
    Q --> D["Daemon\ntmux + UDS"]
    Q --> V["Viewer\nRead-only"]
```

All seven modes share the same query() loop. Differences are handled via Dependency Injection — query() receives a deps parameter that swaps tool pools, permission contexts, and rendering targets.
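A minimal sketch of the deps-swapping pattern, with hypothetical interfaces; the point is that mode behavior lives in the injected dependencies, not in query() itself:

```typescript
interface QueryDeps {
  tools: string[];
  render(text: string): void;
  canUseTool(name: string): boolean;
}

// A stand-in for the shared loop: the logic is identical across modes;
// only the injected deps differ.
function query(input: string, deps: QueryDeps): string {
  const reply = `echo: ${input}`;
  deps.render(reply);
  return reply;
}

const replDeps: QueryDeps = {
  tools: ["Bash", "Read", "Edit"],
  render: text => console.log(text), // full TUI in reality
  canUseTool: () => true,
};

const viewerDeps: QueryDeps = {
  tools: [],
  render: () => {},          // read-only observer renders elsewhere
  canUseTool: () => false,   // no tool execution in Viewer mode
};
```

Swapping replDeps for viewerDeps changes tool access and rendering without touching the loop.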


The 11-Step Agent Loop

The heart of the system. Every interaction follows this exact sequence.

```mermaid
flowchart TB
    S1["1. User Input\nTextInput.tsx"]
    S2["2. Message Creation\nmessages.ts"]
    S3["3. History Append\nhistory.ts"]
    S4["4. System Prompt Assembly\ncontext.ts"]
    S5["5. API Streaming\nquery.ts SSE"]
    S6["6. Token Parsing\nQueryEngine.ts"]
    S7["7. Tool Detection\ntools.ts"]
    S8["8. Tool Execution Loop\nStreamingToolExecutor.ts"]
    S9["9. Response Rendering"]
    S10["10. Post-Sampling Hooks\nautoCompact.ts"]
    S11["11. Await Next Input"]

    S1 --> S2 --> S3 --> S4 --> S5 --> S6 --> S7
    S7 -->|"tool_use detected"| S8
    S8 -->|"result -> history -> re-call API"| S5
    S7 -->|"no tool_use"| S9 --> S10 --> S11
    S11 -->|"new input"| S1
```

Steps 1-3: Input Processing. The user types a message in TextInput.tsx. It is wrapped into a structured message object by createUserMessage() (messages.ts, line 460), then pushed to the in-memory conversation array in history.ts. Before the API call, the transcript is saved to disk — a crash-safety measure that ensures no conversation state is lost.

Step 4: System Prompt Assembly. context.ts merges two sources:

  • System Context (memoized): Git branch, default branch, git status (max 2,000 chars), recent commits, git username
  • User Context (memoized): CLAUDE.md files, today’s date

Both are computed once per session and reused across turns — a direct cost optimization since these sections benefit from API prompt caching (1-hour cache window).
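The once-per-session memoization can be sketched as a tiny closure (the helper name is an assumption):

```typescript
// Run the expensive context build once; every later turn reuses the
// cached result, which also keeps the prompt bytes stable so the API's
// prompt cache can hit.
function memoizeOnce<T>(compute: () => T): () => T {
  let cached: T | undefined;
  let done = false;
  return () => {
    if (!done) {
      cached = compute();
      done = true;
    }
    return cached as T;
  };
}

let builds = 0;
const getSystemContext = memoizeOnce(() => {
  builds++; // counts how often the expensive build actually runs
  return "git-branch: main";
});
```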

Steps 5-6: API Call and Streaming. query.ts sends the payload via SSE. QueryEngine.ts receives tokens in real-time, maintaining mutableMessages[], totalUsage, and permissionDenials[] as mutable state. A continuous budget check halts the session if maxBudgetUsd is exceeded.

Steps 7-8: The Agentic Loop. When the model’s response contains tool_use blocks, tools.ts identifies the tool via findToolByName(), verifies access with canUseTool(), then hands off to StreamingToolExecutor.ts. Tool results are added to history and the API is re-called — creating the characteristic loop: AI decides, tool executes, result feeds back, AI decides again. This continues until the model responds with text only.
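The decide-execute-feed-back cycle can be sketched as a bounded loop over a model callback (all types and names here are assumptions, not the leaked API):

```typescript
interface ModelReply {
  text?: string;
  toolUse?: { name: string; input: string }[];
}

// Call the model; if it requests tools, run them, feed results back,
// and call again — until the reply is text only (or a turn cap hits).
function runAgentLoop(
  callModel: (history: string[]) => ModelReply,
  runTool: (name: string, input: string) => string,
  maxTurns = 10,
): string {
  const history: string[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = callModel(history);
    if (!reply.toolUse || reply.toolUse.length === 0) {
      return reply.text ?? "";
    }
    for (const call of reply.toolUse) {
      history.push(runTool(call.name, call.input)); // result feeds back
    }
  }
  throw new Error("max turns exceeded");
}
```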

The StreamingToolExecutor overlaps API streaming with tool execution. While the AI is still generating, completed tool_use blocks begin execution immediately. If a tool fails, sibling executions are cancelled.

Steps 9-11: Completion. Response rendering via Ink + Yoga flexbox. Post-sampling hooks handle context compression (Auto-Compact), memory extraction, and dream mode processing. The REPL then awaits the next input.


State Management: DeepImmutable

All global state lives in a single AppState object typed as DeepImmutable. Managed through Zustand, enforcing no accidental mutations. State changes trigger automatic side effects: permission changes notify CCR/SDK listeners, model changes save to settings, settings changes invalidate auth caches.


Three Design Principles

1. Safety First

Bash commands parsed into ASTs (not string patterns), fail-closed defaults, 888KB/23-step security module. Permission validation on every tool execution.

2. Performance Through Parallelism and Caching

Parallel tool execution (up to 10), API streaming overlapped with tool runs, memoized system prompts for cache hits, double-buffered ANSI rendering, parallel I/O prefetch (65ms saved).

3. Extensibility Without Modification

52 tools share a common interface. 7 modes share one query() loop via DI. MCP supports 5 transports. 44 feature flags gate unreleased functionality without branching.



Sources & Limitations

This series synthesizes the following publicly available analyses and does not directly contain leaked source code.

| Source | URL | Focus |
|---|---|---|
| ccunpacked.dev | ccunpacked.dev | Visual architecture guide, tool/command catalog |
| Wikidocs Analysis | wikidocs.net/338204 | Detailed technical analysis (execution flow, state, rendering) |
| PyTorch KR | discuss.pytorch.kr | Community analysis + HN discussion synthesis |
| Claw Code | github.com/ultraworkers/claw-code | Clean-room reimplementation (Rust/Python), PARITY.md gap analysis |

Analysis date: April 2, 2026. Anthropic issued DMCA takedowns on 8,100+ forks and discontinued npm distribution shortly after the leak, so some sources may have changed accessibility. Features behind feature flags are unreleased and may be modified or deprecated before launch.
