The Memory System I Built Looked Like Claude Code's Internal Design

MJ · 9 min read

TL;DR: Claude Code's internal memory system (4-type persistent memory, Auto-Dream, Auto-Compact) versus my independently built 3-layer system (documents → index → semantic search). validate_placement() is the differentiator Claude Code doesn't have.

I spent six months building a memory system for my Claude Code workflow. Three layers: structured documents, a cross-project index, and a semantic search MCP server backed by vector embeddings. Session lifecycle hooks. A placement validator that tells the AI where information belongs.

Then Claude Code’s source leaked on March 31, 2026 — 512,000 lines of TypeScript via an accidental source map inclusion in the npm package. And inside that source, I found Auto-Dream: a memory consolidation engine with four typed memory categories, post-session extraction, and automatic pruning.

The architectures converge in ways that cannot be coincidence. They also diverge in ways that reveal what each side optimized for. This post maps both systems side by side.


Part 1: Claude Code’s Internal Memory System

The memory/ Directory

Claude Code stores persistent memory at ~/.claude/projects/{slug}/memory/. The structure is deliberately simple:

~/.claude/projects/{slug}/
├── MEMORY.md           # Index (max 200 lines / 25KB)
└── memory/
    ├── user-prefs.md   # User type
    ├── feedback-*.md   # Feedback type
    ├── project-*.md    # Project type
    └── ref-*.md        # Reference type

MEMORY.md serves as the index. It has a hard cap of 200 lines / 25KB — because it gets injected into the context window at every session start. Size is cost.
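The cap is easy to enforce mechanically. A minimal validator, with the limits taken from the numbers above (the function itself is my sketch, not CC's code):

```python
def memory_index_ok(text: str, max_lines: int = 200, max_bytes: int = 25_000) -> bool:
    # MEMORY.md is injected at every session start, so size is cost:
    # reject an index that exceeds either the line cap or the byte cap.
    return (
        len(text.splitlines()) <= max_lines
        and len(text.encode("utf-8")) <= max_bytes
    )
```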

Four Memory Types

Claude Code classifies memory into four types:

| Type | Content | Lifecycle | Example |
|---|---|---|---|
| user | Role, expertise, preferences | Long-term (rarely changes) | "Prefers Python, concise code, Korean responses" |
| feedback | Corrections + WHY + how to apply | Medium-term (merged when patterns emerge) | See structured example below |
| project | Goals, deadlines, decisions | Medium-term (project lifespan) | "Phase 2 deadline: 2026-05-15" |
| reference | External system pointers | Long-term (while reference exists) | "API spec: docs/api-v2.md" |

The feedback type has a particularly precise structure. It is not just “do this instead.” It leads with the rule, then explains why, then shows how to apply:

Rule: Specify file names explicitly when using git add
Why: git add . or -A may include .env, credentials, or other sensitive files
How to apply: Use git add specific-file.ts format. List multiple files individually

The project type converts relative dates to absolute dates. “Due next week” becomes “Due 2026-04-13.” The reference point for “next week” shifts between sessions; an absolute date does not.
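The conversion is trivial to implement. A sketch, assuming "next week" means exactly seven days out (the mapping is hypothetical; the post only specifies that relative dates become absolute):

```python
from datetime import date, timedelta

# Hypothetical phrase table; extend as needed.
RELATIVE_OFFSETS = {"today": 0, "tomorrow": 1, "next week": 7}

def absolutize(phrase: str, today: date) -> str:
    # Convert a known relative phrase to an ISO date; pass anything else through.
    if phrase in RELATIVE_OFFSETS:
        return (today + timedelta(days=RELATIVE_OFFSETS[phrase])).isoformat()
    return phrase
```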

What NOT to save is defined just as explicitly:

| Excluded | Reason |
|---|---|
| Code patterns | Already in the codebase; search for it |
| Architecture details | Belongs in docs/ directory |
| Git history | Available via git log |
| Debugging recipes | One-off; if recurring, codify it |
| Anything in CLAUDE.md | Duplicate storage causes inconsistency |

This exclusion list matters more than the inclusion list. A memory system’s quality is determined by what it refuses to remember, not what it stores.

Auto-Dream: Post-Session Memory Consolidation

Auto-Dream is Claude Code’s most significant memory innovation. It remains unreleased but is fully implemented in the source.

When a session ends:

  1. A forked sub-agent spawns (independent of the main context)
  2. The sub-agent reviews the entire conversation from the beginning
  3. It extracts content matching the four memory types
  4. It writes organized entries to memory/
  5. It cleans up stale context — entries that are no longer valid
  6. It strengthens associations between related memory entries

The “sub-agent” design is the key insight. Memory consolidation does not consume the main session’s context window. It mirrors sleep-dependent memory consolidation in neuroscience — the brain processes and organizes memories during sleep, not during waking hours.

Auto-Dream connects to KAIROS, a proactive monitoring system that runs on a 5-minute cron. KAIROS watches for filesystem changes and can trigger memory consolidation independently of session boundaries.

Auto-Compact: Runtime Context Management

Where Auto-Dream handles post-session long-term memory, Auto-Compact manages in-session short-term context.

When the context window fills:

  1. Remove images first (low information density per token)
  2. Group by API round
  3. Generate summary via forked sub-agent
  4. Replace previous messages with summary
  5. Restore top 5 referenced files (50K token budget)
  6. Re-inject active skills (25K budget, 5K per skill)

Steps 5 and 6 are where the sophistication lies. Context is reduced, but the most-referenced files and active skills are restored: discard the bulk, preserve the essentials.
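Step 5's budgeted restore can be sketched as a greedy loop. The 50K budget and top-5 count are from the source analysis; the function itself is my reconstruction:

```python
def restore_top_files(ranked_files, token_budget=50_000, max_files=5):
    # ranked_files: (path, token_count) pairs, most-referenced first.
    # Greedily restore up to max_files files without exceeding the token budget.
    restored, used = [], 0
    for path, tokens in ranked_files[:max_files]:
        if used + tokens > token_budget:
            break
        restored.append(path)
        used += tokens
    return restored
```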

The circuit breaker: MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3. If compaction fails three times in a row, it stops trying. Before these three lines were added, approximately 250,000 API calls per day were wasted in failure loops.
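The breaker really is only a few lines. A minimal sketch; the constant name is from the leaked source, the class around it is mine:

```python
MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3  # constant name from the leaked source

class AutoCompactBreaker:
    """Stop retrying compaction after N consecutive failures."""

    def __init__(self, limit: int = MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES):
        self.limit = limit
        self.failures = 0

    def run(self, compact):
        if self.failures >= self.limit:
            return None  # breaker open: skip the call entirely
        try:
            result = compact()
        except Exception:
            self.failures += 1
            return None
        self.failures = 0  # success resets the streak
        return result
```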

Context Injection and Post-Sampling Hooks

At every session start, two functions fire:

| Function | Injected Content | Caching |
|---|---|---|
| getSystemContext() | System prompt, tool definitions, policies | Memoized per session |
| getUserContext() | CLAUDE.md rules, MEMORY.md, memory files | Memoized per session |

Memoization prevents redundant reads within the same session. It checks file mtime — if unchanged, the previous result is reused. This integrates with API-side prompt caching (1-hour cache), reducing token costs significantly.
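The mtime check can be sketched with a module-level cache. A simplified version, not CC's actual implementation:

```python
import os

_cache: dict[str, tuple[float, str]] = {}  # path -> (mtime, content)

def read_memoized(path: str) -> str:
    # Re-read a file only when its mtime has changed; otherwise reuse the
    # cached content, mirroring the per-session memoization described above.
    mtime = os.path.getmtime(path)
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]
    with open(path, encoding="utf-8") as f:
        content = f.read()
    _cache[path] = (mtime, content)
    return content
```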

Post-Sampling Hooks execute after every response generation:

  • Auto-Compact trigger (context limit check)
  • Memory extraction (write session findings to memory/)
  • Dream Mode trigger (at session end, launch Auto-Dream)

```mermaid
flowchart TB
    subgraph "Claude Code Memory Architecture"
        direction TB
        subgraph "Session Start"
            SC["getSystemContext()\nSystem prompt + tools\n(memoized)"]
            UC["getUserContext()\nCLAUDE.md + MEMORY.md\n(memoized)"]
        end
        subgraph "During Session"
            AC["Auto-Compact\nContext compression\nTop 5 files restored (50K)\nSkills re-injected (25K)"]
            CB["Circuit Breaker\nMAX_FAILURES = 3"]
        end
        subgraph "Post-Sampling Hooks"
            PSH["auto-compact check\nmemory extraction\ndream mode trigger"]
        end
        subgraph "Post Session"
            AD["Auto-Dream\nSub-agent consolidation\nKAIROS integration"]
        end
        subgraph "Persistent Storage"
            MD["memory/ 4 types\nuser / feedback / project / reference"]
            CM["CLAUDE.md\nProject rules"]
            MM["MEMORY.md\nIndex (max 200 lines / 25KB)"]
        end
    end
    SC --> AC
    UC --> AC
    AC --> CB
    AC --> PSH
    PSH --> AD
    AD -->|"extract/cleanup/strengthen"| MD
    MD --> MM
    CM --> UC
    MM --> UC
```

Part 2: My 3-Layer Memory System

The Problem

I have been using Claude Code as my primary development tool for six months. I run seven or more projects simultaneously — a SaaS product, a technical blog, open-source tools, consulting engagements. The core problems:

  1. Session amnesia: Yesterday’s decisions need re-explaining today
  2. Project isolation: Experience from Project A never transfers to Project B
  3. State tracking gaps: “What was I working on?” requires manual reconstruction every session
  4. Information placement confusion: Does this belong in CLAUDE.md, STATUS.md, or memory/?

In the Korean AI community, where Claude Code adoption has been particularly intense among solo builders and consultants, these problems are compounded by the sheer pace of project switching. The tooling ecosystem is evolving weekly, and losing track of context across sessions means losing competitive advantage.

Layer 1: Document Tier (Per-Project)

project-root/
├── CLAUDE.md      # Rules only (coding conventions, safety rules, env)
├── STATUS.md      # Current state (in-progress, blockers, next steps)
└── docs/          # Architecture, decisions (Tree tier only)
    └── architecture.md

CLAUDE.md contains rules only. “Use TypeScript”, “Never force push”, “Show SQL before schema changes.” No current state, no roadmap.

STATUS.md contains current state. What is in progress, what is blocked, what comes next. Updated at every session end.

Projects graduate through three documentation tiers:

| Tier | Condition | Required Documents | Graduation Trigger |
|---|---|---|---|
| Seed | Initial idea in _ideas/ | README.md only | First meaningful commit |
| Sapling | First deploy, real usage | CLAUDE.md + STATUS.md | Multi-component, revenue connection |
| Tree | Long-term operation | + docs/architecture.md | (terminal tier) |

This graduation path naturally scales documentation with project maturity. An early idea project has only a README (Seed). A technical blog with deployments is a Sapling. A revenue-generating SaaS product is a Tree.
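The graduation logic reduces to a small function. A sketch of my tiers, with the triggers collapsed into two booleans (my simplification, not a spec):

```python
REQUIRED_DOCS = {
    "Seed": ["README.md"],
    "Sapling": ["README.md", "CLAUDE.md", "STATUS.md"],
    "Tree": ["README.md", "CLAUDE.md", "STATUS.md", "docs/architecture.md"],
}

def doc_tier(deployed: bool, multi_component_or_revenue: bool) -> str:
    # Seed -> Sapling -> Tree: documentation requirements grow with maturity.
    if multi_component_or_revenue:
        return "Tree"
    if deployed:
        return "Sapling"
    return "Seed"
```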

Layer 2: Memory Index (Cross-Project)

~/.claude/projects/{workspace}/memory/
├── MEMORY.md              # Router: links to all files + project table
├── project-a-status.md    # Per-project status files
├── project-b-status.md
├── project-c-status.md
├── doc-standards.md        # Documentation standards
├── claude-md-design.md     # CLAUDE.md design principles
├── decision-frameworks.md  # Decision-making frameworks
├── skills-guide.md         # /wrap, /dashboard skill specs
├── feedback-*.md           # Working style feedback
└── ... (16+ files)

MEMORY.md is the router. It contains links to every memory file plus a project status table. The critical difference from Claude Code’s memory: mine is cross-project while CC’s is per-project isolated.

My memory types extend beyond Claude Code’s four:

| Type | CC Equivalent | My Addition |
|---|---|---|
| user | user | Same (role, preferences, working style) |
| feedback | feedback | Same (corrections + reasoning) |
| project status | project | Split into per-project files |
| standards | (none) | Documentation standards, design principles |
| patterns | (none) | Technical patterns (Next.js static, content scaling) |
| references | reference | Same |

Layer 3: Memory Hub (Semantic Search)

Where Layers 1 and 2 are structured text-based memory, Layer 3 adds vector-based semantic search.

Backend architecture:

  • Mem0 OSS: Memory management framework
  • Qdrant: Local vector DB (embedding storage and retrieval)
  • SQLite: History and session journal storage

Six MCP tools:

| Tool | Purpose | When Used |
|---|---|---|
| search_memory | Vector search over memory/ files | Session start, context exploration |
| log_session | Write session journal (summary, decisions, unfinished, related) | Session end (/wrap) |
| extract_facts | Auto-extract facts from session | Session end (/wrap) |
| check_stale | Detect N-day inactive items | Session start (/dashboard) |
| validate_placement | Verify where info should be stored | When saving information |
| index_markdown | Index memory/ files into vector DB | After memory file updates |

Two-tier data model:

  • Tier 1 (Confirmed): Markdown-based, human-curated, memory/*.md files
  • Tier 2 (Supplementary): Mem0 auto-managed, vector DB stored, auxiliary reference

To date: over a hundred past sessions backfill-indexed, dozens of facts extracted from existing memory files into the vector DB.

Session Lifecycle and Custom Skills

```mermaid
flowchart TB
    subgraph "My 3-Layer Memory System"
        direction TB
        subgraph "/dashboard (Session Start)"
            D1["STATUS.md + last 10 git commits"]
            D2["search_memory(project_name)"]
            D3["check_stale(14 days)"]
            D4["Generate briefing"]
        end
        subgraph "During Session"
            W1["Reference memory/ files"]
            W2["Update memory files as needed"]
        end
        subgraph "/wrap (Session End)"
            E1["Update STATUS.md"]
            E2["Update memory/ status files"]
            E3["log_session() - journal entry"]
            E4["extract_facts() - fact extraction"]
            E5["Update Doc Sync dates"]
        end
    end
    D1 --> D2 --> D3 --> D4
    D4 --> W1 --> W2
    W2 --> E1 --> E2 --> E3 --> E4 --> E5
```

/dashboard starts a session: reads STATUS.md and recent git log, searches Memory Hub for related context, detects 14-day inactive items, and generates a briefing.

/wrap ends a session. In Full mode, four parallel agents execute simultaneously — STATUS.md update, memory file update, session journal logging, fact extraction. This parallel execution is structurally similar to Claude Code’s Coordinator Mode (leader-worker pattern).
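Full mode's parallel fan-out can be sketched with a thread pool. The four task names mirror /wrap; the callables here are placeholders, not the real agents:

```python
from concurrent.futures import ThreadPoolExecutor

def wrap_full(tasks: dict):
    # Submit every session-end task concurrently and collect results by name.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        return {name: future.result() for name, future in futures.items()}
```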

/monthly runs a monthly review: check_stale(30) across all projects for health assessment.


Part 3: Structural Comparison

This is the centerpiece: row by row, where the two systems correspond and where they diverge.

| My System | CC Internal | Role |
|---|---|---|
| CLAUDE.md (rules) | CLAUDE.md (same concept) | Per-project permanent rules |
| STATUS.md (state) | Session History (conversation array) | Current work state tracking |
| memory/*.md (index) | memory/ (4 types: user/feedback/project/reference) | Persistent memory across sessions |
| MEMORY.md (router) | MEMORY.md (index, max 200 lines) | Memory file navigation |
| Memory Hub semantic search | Auto-Dream (memory consolidation engine) | Automated memory processing |
| /wrap (session end) | Post-Sampling Hooks (auto-compact, dream) | Session-end cleanup |
| /dashboard (session start) | Context Injection (getSystemContext + getUserContext) | Session-start context loading |
| check_stale(14 days) | KAIROS (proactive monitoring, 5-min cron) | Stale information detection |
| validate_placement() | (Not in CC; my differentiator) | Information placement validation |
| extract_facts() | Auto-Dream extract (similar goal) | Fact extraction from sessions |
| log_session() | Session JSONL files (auto-saved) | Session journal recording |
| 3-tier docs (Seed/Sapling/Tree) | (no equivalent; CC is a single product) | Documentation scaling with maturity |
| /wrap 4-agent parallel | Coordinator Mode (leader-worker pattern) | Parallel task execution |

```mermaid
flowchart LR
    subgraph "My System"
        direction TB
        MC["CLAUDE.md\nRules"]
        MS["STATUS.md\nState"]
        MM["memory/*.md\nPersistent memory (16+)"]
        MH["Memory Hub MCP\nMem0 + Qdrant\nSemantic search"]
        MW["/wrap + /dashboard\nSession lifecycle\n4-agent parallel"]
        MV["validate_placement()\nPlacement validation"]
    end
    subgraph "Claude Code Internal"
        direction TB
        CC["CLAUDE.md\nRules"]
        CH["Session History\nConversation array"]
        CM["memory/ 4 types\nuser/feedback/project/ref"]
        CA["Auto-Dream\nSub-agent consolidation\nKAIROS integration"]
        CP["Context Injection\n+ Post-Sampling Hooks"]
        CX["(not present)"]
    end
    MC <-.->|"identical"| CC
    MS <-.->|"analogous"| CH
    MM <-.->|"structural match"| CM
    MH <-.->|"functional match"| CA
    MW <-.->|"role match"| CP
    MV <-.->|"differentiator"| CX
```

Part 4: What I Got Right, What CC Does Better

What I Got Right

The 3-layer approach: documents + index + semantic

Text files alone are limited (keyword-only search). A vector DB alone has no structure (no categorization). Claude Code reached the same conclusion: CLAUDE.md (rules) + memory/ (index) + Auto-Dream (automation).

Session lifecycle hooks

“Load context at session start, persist state at session end” is a foundational pattern for agent memory. My /dashboard and /wrap map directly to CC’s Context Injection and Post-Sampling Hooks.

Stale detection

Outdated information polluting current context is a core memory system problem. My check_stale(14) was solving the same problem as KAIROS’s periodic monitoring.

Explicit memory types with structured frontmatter

Each memory file declares name, description, and type in frontmatter. The same principle as CC’s four-type classification — structured metadata enables search and management.
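Reading that frontmatter needs no YAML dependency. A minimal parser, sufficient for flat `key: value` headers like name/description/type (a sketch, not my production code):

```python
def parse_frontmatter(text: str) -> dict:
    # Read flat `key: value` pairs between the opening `---` markers.
    if not text.startswith("---"):
        return {}
    header = text.split("---", 2)[1]
    meta = {}
    for line in header.strip().splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta
```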

Defining what NOT to save

Not storing code patterns, architecture details, or git history in memory. Nearly identical to CC’s “What NOT to save” list. Good memory systems are defined by their exclusion criteria, not their inclusion criteria.

What CC Does Better

Circuit Breaker

MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3 in Auto-Compact. My system has no failure isolation. If Memory Hub search fails, it just fails. Three lines of code preventing 250,000 wasted API calls per day. This pattern needs immediate adoption.

Sub-agent summarization

Both Auto-Compact and Auto-Dream use forked sub-agents — independent contexts that do not consume the main session’s context window. My /wrap uses four parallel agents but does not fork for independent summarization. After long sessions, context is already tight when cleanup work begins.

Managing 14 cache-break vectors

Claude Code aggressively uses prompt caching. System prompt, tool definitions, memory — if any of these change, the cache breaks and costs spike. The source identifies 14 cache-break vectors with stabilization strategies for each. My system has no cache management at this level.

Prompt caching integration

getSystemContext() and getUserContext() are memoized per session and integrated with the API’s 1-hour prompt cache. My MCP setup has no equivalent caching layer.

Immutable state

Zustand + DeepImmutable types prevent state mutation at compile time. My system is file-based and inherently mutable — if two sessions simultaneously modify the same memory file, conflicts can occur.

My Advantages (Things CC Does Not Have)

validate_placement() — Information placement validation

validate_placement("Decided to use Next.js instead of Astro 5", "CLAUDE.md")
-> "Inappropriate for CLAUDE.md. Decision records belong in STATUS.md or docs/decisions/"

Claude Code classifies memory into four types, but which specific file within a type is left to model judgment. My system validates this explicitly. As the memory system grows, this function’s value compounds.
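The validation itself is rule-based. A hypothetical sketch with two illustrative rules; the real rule set behind validate_placement() is larger:

```python
def validate_placement(content: str, target: str):
    # Returns (ok, message). Rules here are illustrative, not exhaustive.
    lowered = content.lower()
    if "decided" in lowered and target == "CLAUDE.md":
        # Decision records are state, not rules.
        return (False, "Inappropriate for CLAUDE.md. Decision records belong "
                       "in STATUS.md or docs/decisions/")
    if lowered.startswith(("always", "never", "use ")) and target != "CLAUDE.md":
        # Imperative rules belong with the other permanent rules.
        return (False, f"Rules belong in CLAUDE.md, not {target}")
    return (True, "ok")
```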

Explicit session journal

Claude Code’s Auto-Dream extracts memory automatically. My system explicitly records decisions made, incomplete items, and related projects. Incomplete items in particular are difficult to detect reliably through automatic extraction.

Cross-project stale detection

check_stale(14) runs across all projects. Claude Code’s memory is per-project isolated, so while working on Project B, you cannot know that Project A has been neglected for two weeks. My system surfaces this.
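check_stale reduces to an mtime scan over the shared memory/ directory. A sketch, assuming per-project files follow the *-status.md naming shown earlier:

```python
import os
import time

def check_stale(memory_dir: str, days: int = 14) -> list[str]:
    # Flag per-project status files untouched for `days`; cross-project
    # because all projects share one memory/ directory.
    cutoff = time.time() - days * 86_400
    stale = [
        name
        for name in os.listdir(memory_dir)
        if name.endswith("-status.md")
        and os.path.getmtime(os.path.join(memory_dir, name)) < cutoff
    ]
    return sorted(stale)
```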

3-tier documentation graduation (Seed to Sapling to Tree)

Documentation scales with project maturity. Claude Code has no need for this (it is a single product), but when operating seven or more simultaneous projects, graduated documentation is essential to avoid over-engineering young ideas or under-documenting mature systems.

Semantic search (vector-based)

Claude Code’s memory is file-scan based — it reads MEMORY.md, follows links, reads files. My system uses Qdrant vector DB to search “past experiences similar to this problem” semantically. Auto-Dream may close this gap in the future, but the current source shows file-based processing, not vector search.


Why This Convergence Is Not Coincidence

Two systems built independently — one by Anthropic’s engineering team inside a production CLI serving millions, one by a solo AI consultant iterating through months of daily use — arrived at strikingly similar architectures.

This points to convergent constraints: the problem space of persistent agent memory is narrow enough that independent implementations converge.

```mermaid
graph TD
    P1["Problem: LLMs have no persistent memory"] --> S1["Solution: File-based persistent storage"]
    P2["Problem: Loading everything every time is cost-prohibitive"] --> S2["Solution: Index + detail two-stage approach"]
    P3["Problem: Mixing rules and state causes staleness"] --> S3["Solution: Hierarchical separation"]
    P4["Problem: Mid-session cleanup breaks flow"] --> S4["Solution: Process at session boundaries"]
    P5["Problem: Storing everything degrades search quality"] --> S5["Solution: Explicit exclusion criteria"]

    S1 --> C["Convergence: Same architecture"]
    S2 --> C
    S3 --> C
    S4 --> C
    S5 --> C
```

Both systems start from these five constraints and arrive at the same architecture.

But the divergence is equally instructive:

| Dimension | My System Optimized For | CC Optimized For |
|---|---|---|
| Reliability | Human oversight (explicit commands) | Automation (no human dependency) |
| Placement accuracy | Validation (validate_placement) | Model judgment (no validation) |
| Cross-project visibility | Global router (MEMORY.md) | Per-project isolation |
| Context management | Trust the model | Active circuit breaking |
| Memory cleanup | On-demand (check_stale) | Automatic (retention policies) |

I optimized for accuracy and control. Claude Code optimized for reliability and automation. Neither is universally better.

The ideal system would combine both: automatic lifecycle hooks with explicit placement validation. Automatic memory consolidation with human-auditable session journals. Per-project isolation with cross-project semantic routing.

Neither system is complete. Claude Code lacks placement validation and cross-project awareness. My system lacks circuit breakers and cache optimization. The future is a synthesis — and the convergence itself is the strongest validation that both approaches are on the right track.


Sources & Limitations

This series synthesizes the following publicly available analyses and does not directly contain leaked source code.

| Source | URL | Focus |
|---|---|---|
| ccunpacked.dev | ccunpacked.dev | Visual architecture guide, tool/command catalog |
| Wikidocs Analysis | wikidocs.net/338204 | Detailed technical analysis (execution flow, state, rendering) |
| PyTorch KR | discuss.pytorch.kr | Community analysis + HN discussion synthesis |
| Claw Code | github.com/ultraworkers/claw-code | Clean-room reimplementation (Rust/Python), PARITY.md gap analysis |

Analysis date: April 2, 2026. Anthropic issued DMCA takedowns on 8,100+ forks and discontinued npm distribution shortly after the leak, so some sources may have changed accessibility. Features behind feature flags are unreleased and may be modified or deprecated before launch.
