
Hub-and-Spoke Memory: How I Gave Claude Code Persistent Context Across 200+ Sessions

Updated Mar 8, 2026 · 10 min read

I've been building a SaaS product with Claude Code for three months. Over 200 sessions. Hundreds of files across a Next.js app, Supabase backend, Stripe integration, MCP server, Chrome extension, Obsidian plugin, 875 tests.

Around session 30, I hit a wall. Claude would start a new session, read CLAUDE.md, and still not know enough. It didn't know which features were shipped, which were in progress, what technical decisions we'd made, or what bugs we'd already fixed. I'd spend the first 15 minutes of every session re-explaining context. Multiply that by 200 sessions and you've wasted 50 hours just on context recovery.

So I built a memory system. Not a database. Not a vector store. Just files. Structured files that Claude reads at the start of every session, organized in a hub-and-spoke pattern.

It solved the problem completely. Here's how it works.

The problem with a single CLAUDE.md

Most people put everything in one CLAUDE.md file. Project setup, architecture notes, coding standards, current tasks, historical decisions, debugging tips. The file grows to 500+ lines. Then two things happen:

It blows up the context window. CLAUDE.md loads into every conversation. A 500-line file with detailed architecture notes, pricing logic, and editor internals consumes thousands of tokens before you've said a word. That's context budget you'll need later for actual code.

Claude can't prioritize. When everything is in one file, nothing stands out. The instruction to "always run tests before committing" sits next to a paragraph about the Stripe webhook retry logic from two months ago. Claude gives equal weight to all of it, which means it gives appropriate weight to none of it.

The insight is simple: Claude doesn't need to know everything at once. It needs to know what's relevant right now, and where to find the rest.

The hub-and-spoke pattern

The architecture has two parts:

The hub is a single file (MEMORY.md) that's auto-loaded into every conversation. It stays under 200 lines. It contains standing instructions, the current sprint status, and an index of spoke files with "when to read" guidance.

The spokes are topic-specific files that hold the real detail. Architecture decisions, pricing logic, editor internals, publishing infrastructure, analytics setup, distribution status. Each spoke owns one domain of knowledge.

Claude reads the hub automatically. It reads spokes on demand, only when the current task touches that domain.

Here's what my hub file looks like (simplified):

# Project Memory

## Standing Instructions
- Never make assumptions. Always ask when unsure.
- Tests are mandatory for every code change.
- Always update docs after completing each phase.

## After Context Loss / New Session Recovery
MANDATORY: Do NOT write any code until ALL steps are completed.
1. Read this file (auto-loaded)
2. Run git log --oneline -20 to verify latest commits
3. Read the relevant spoke file(s) from the index below
4. Ask the user what they want to work on

## Current Sprint
### Active
- Obsidian plugin: PR #10520, waiting for review
- ChatGPT App: Submitted, awaiting review

### Recently Completed
- Blog: 119 posts published
- PostHog: 25 events, 3 dashboards
- v1.1-v1.5 features: all shipped

## Spoke File Index
| File | When to read |
|------|-------------|
| architecture.md | Every session |
| product-status.md | "What's shipped?", phase status |
| pricing-and-billing.md | Pricing, Stripe, auth |
| publishing.md | Published pages, sharing, folders |
| editor-technical.md | Editor bugs, toolbar, sync |
| extensions-technical.md | KaTeX, Mermaid, Chart.js |
| distribution.md | MCP, Chrome ext, Obsidian plugin |
| analytics.md | PostHog, Sentry, SEO |

The spoke index is the key design element. It tells Claude which file to read based on what you're working on. If I say "let's fix a publishing bug," Claude knows to read publishing.md before touching any code.
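The routing the index encodes can be sketched in a few lines. This is a hypothetical illustration, not part of the actual system; the keyword lists and the `spokes_for_task` helper are invented for the example:

```python
# Hypothetical sketch: map a task description to the spoke files worth
# reading first, mirroring the "when to read" column of the hub index.
# The keyword lists are illustrative, not taken from the real project.
SPOKE_INDEX = {
    "architecture.md": ["structure", "setup", "convention"],
    "pricing-and-billing.md": ["pricing", "stripe", "auth", "billing"],
    "publishing.md": ["publish", "sharing", "folder"],
    "editor-technical.md": ["editor", "toolbar", "sync"],
    "distribution.md": ["mcp", "chrome", "obsidian", "plugin"],
}

def spokes_for_task(task: str) -> list[str]:
    """Return spoke files whose keywords appear in the task description."""
    task_lower = task.lower()
    return [
        spoke
        for spoke, keywords in SPOKE_INDEX.items()
        if any(kw in task_lower for kw in keywords)
    ]

print(spokes_for_task("let's fix a publishing bug"))  # → ['publishing.md']
```

In practice Claude does this matching itself by reading the index table; the point is that the "when to read" column gives it an unambiguous rule to apply.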

What goes in spoke files

Each spoke file is a self-contained reference for one domain. Here's the structure I've found works:

Current state. What's deployed, what's working, what's broken. Not what we planned to build six weeks ago. What's true right now.

Key technical decisions. Not every decision. The ones that would cause problems if Claude didn't know them. "The template engine uses scoped CSS via generateTemplateCSS() in a .template-preview container" saves 20 minutes of debugging. "We discussed three options for the sidebar" does not.

File paths and architecture. Where things live in the codebase. Claude can find files on its own, but telling it "the conversion pipeline is in src/lib/engine/ with per-destination rehype plugins in src/lib/engine/plugins/rehype/" means it reads the right files first instead of searching.

Known gotchas. Things that have bitten us. "The Turndown service is a singleton with 14 custom rules. If you create a new instance, you lose all the rules." This is the kind of detail that gets lost in compacting and costs an hour to rediscover.

What NOT to include. Session-specific details, in-progress work, speculative plans. If it changes every week, it doesn't belong in a spoke file. Spokes should be stable knowledge that's true across many sessions.

Here's a trimmed example from my architecture spoke:

# Architecture

## Project Setup
- Framework: Next.js 16.1 with Turbopack, App Router
- Styling: Tailwind CSS v4 with @theme inline
- Package manager: npm (run from /app)
- Build: npx next build
- Tests: Vitest 4.x, 875 tests across 59 files

## Directory Structure
- src/components/paste-engine/ — Editor UI (9 sub-components)
- src/lib/engine/ — Conversion pipeline and plugins
- src/lib/templates/ — Template engine: types, CSS, fonts, registry
- src/lib/supabase/ — DB client, types, helpers
- src/lib/stripe/ — Stripe config, checkout, sync

## Key Decisions
- Conversion runs client-side through rehype plugins
- Each destination (Docs, Word, Slack) has its own plugin
- Template CSS is scoped to .template-preview container
- All file downloads gated behind Pro; clipboard copy is free

The recovery protocol

The most important part of the system is the recovery protocol in the hub. When Claude starts a new session or recovers from compacting, it follows these steps in order:

1. Read MEMORY.md (happens automatically)
2. Check git log to verify what actually shipped (not what the memory says shipped)
3. Read the relevant spoke file(s) for the current task
4. Ask the user what to work on (don't assume from the memory)

Step 2 matters more than it seems. Memory files can drift from reality. Maybe I shipped a feature but forgot to update the spoke file. Maybe a bug was fixed but the "known gotchas" still lists it. The git log is the ground truth. The memory system is the map. The map is useful, but you verify it against the territory.
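Step 2 can even be made mechanical. Here is a minimal sketch of the cross-check, assuming the log text comes from `git log --oneline -20`; the `unverified_claims` helper and the sample inputs are invented for illustration:

```python
# Hypothetical sketch of step 2: cross-check what the memory claims
# shipped against recent commit subjects. In practice log_text would be
# the output of `git log --oneline -20`.
def unverified_claims(ship_claims: list[str], log_text: str) -> list[str]:
    """Return claims with no supporting commit subject in the log."""
    # Drop the short hash at the start of each `--oneline` entry.
    subjects = [line.split(" ", 1)[-1].lower() for line in log_text.splitlines()]
    return [
        claim
        for claim in ship_claims
        if not any(claim.lower() in subject for subject in subjects)
    ]

log = """a1b2c3d Ship PostHog dashboards
e4f5a6b Fix editor toolbar sync"""
print(unverified_claims(["posthog dashboards", "stripe checkout"], log))
# → ['stripe checkout']
```

Anything the check flags is a claim to verify by hand before trusting the memory file.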

Step 4 prevents a subtle failure mode. Without it, Claude reads the memory, sees "Obsidian plugin: waiting for review," and starts working on the Obsidian plugin. But maybe I'm done thinking about the plugin today and want to work on something completely different. The recovery protocol forces a handoff: Claude gets the context, then asks for direction.

How it performs in practice

Before this system, I'd lose 10-15 minutes per session to context recovery. After implementing it, new sessions start productively within 1-2 minutes. Claude reads the hub, reads the relevant spoke, and we're working.

Some specific improvements:

No more re-explaining architecture. Claude reads architecture.md and knows the project structure, conventions, and key file paths. It navigates the codebase correctly from the first command.

Decisions stick. When we decided that all in-app buttons use a specific set of Tailwind classes, that went into the architecture spoke. Claude has applied those classes correctly across 200+ sessions without me repeating the rule.

Debugging is faster. The "known gotchas" in each spoke file prevent Claude from hitting the same issues twice. The singleton Turndown instance, the template CSS scoping, the proxy middleware ordering: these are documented once and never re-discovered.

Sprint tracking works. The hub's "current sprint" section means Claude always knows what's in flight. When I start a session, it already knows which PRs are open, which features shipped, and what's blocked.

Keeping the memory accurate

The system only works if the files are accurate. Stale memory is worse than no memory, because Claude will confidently act on outdated information.

My rule: update the memory files as part of completing each task. Finished a feature? Update product-status.md. Fixed a bug? Update the relevant spoke's gotchas section. Shipped a release? Update the hub's sprint section.

I also have a standing instruction in the hub: "Only update docs with verified information. Verify via git log or file reads before writing." This prevents Claude from writing to memory files based on assumptions. If Claude thinks a feature was shipped, it checks git log first.

The archive spoke handles historical information. When sprint items are completed, they move from the hub's "active" section to the archive spoke. This keeps the hub small while preserving the history.
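The archival step is a small text transform. A sketch, where the section name follows the example hub above and `extract_section` with its line-based parsing is an illustrative assumption, not the real implementation:

```python
# Hypothetical sketch of archival: pull the bullet items out of one hub
# section (e.g. "Recently Completed") so they can be appended to the
# archive spoke. The markdown parsing here is deliberately simplistic.
def extract_section(hub_text: str, heading: str) -> tuple[list[str], str]:
    """Return (items under heading, hub text with those items removed)."""
    kept, items, in_section = [], [], False
    for line in hub_text.splitlines():
        if line.startswith("#"):
            # Entering a new section; track whether it's the target one.
            in_section = line.lstrip("# ").strip() == heading
        if in_section and line.startswith("- "):
            items.append(line)
        else:
            kept.append(line)
    return items, "\n".join(kept)
```

The extracted items get appended to archive.md, and the slimmed-down text is written back to the hub.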

Why files and not a database

You might wonder why this is just markdown files in a directory instead of something more sophisticated. A few reasons:

Claude reads markdown natively. No parsing layer, no API calls, no serialization. Claude opens the file and understands it. Markdown is the native language of LLMs.

Git tracks changes. Every memory update is a commit. If a spoke file gets corrupted or a bad update goes in, git log shows when it happened and git diff shows what changed. Version control for AI memory comes free.

Human-readable and editable. I can open any spoke file in my editor and update it myself. I can review what Claude wrote to the memory. There's no opaque database or vector store between me and the knowledge.

No infrastructure. No server, no embedding model, no retrieval pipeline. Just files on disk that Claude reads with the Read tool. The system works offline, has zero latency, and never goes down.

Token-efficient. A spoke file is typically 50-100 lines. That's maybe 500-1,000 tokens. Claude loads only the spokes it needs for the current task. Compare this to a 500-line monolithic CLAUDE.md that loads 5,000 tokens into every conversation whether it's relevant or not.
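The arithmetic behind that comparison is rough but easy to reproduce. A back-of-the-envelope sketch, assuming about 10 tokens per markdown line (a heuristic, not a real tokenizer):

```python
# Rough estimate of per-session memory overhead, assuming ~10 tokens
# per markdown line. This is a heuristic, not a tokenizer.
TOKENS_PER_LINE = 10

def session_overhead(hub_lines: int, spoke_lines: list[int]) -> int:
    """Estimate memory tokens loaded for one session."""
    return (hub_lines + sum(spoke_lines)) * TOKENS_PER_LINE

monolith = session_overhead(500, [])     # one 500-line CLAUDE.md
hub_spoke = session_overhead(200, [80])  # 200-line hub plus one spoke
print(monolith, hub_spoke)  # → 5000 2800
```

The exact numbers depend on the model's tokenizer, but the ratio is what matters: the hub-and-spoke layout loads a fraction of the monolith's tokens, and the fraction shrinks as the project grows.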

How to set this up for your project

If you want to try this, here's the minimum viable setup:

1. Create the memory directory.

.claude/memory/
├── MEMORY.md          # The hub (auto-loaded)
├── architecture.md    # Project setup and structure
└── decisions.md       # Key technical decisions

Put this in your .claude/ project directory so it's loaded automatically. You can add more spoke files as your project grows.
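If you want to script the scaffolding, a minimal sketch follows. The paths and stub headings match the example tree above; nothing here is a required convention:

```python
# Minimal scaffold for the memory layout shown above. Stub headings and
# file names are assumptions matching the example, not a fixed convention.
from pathlib import Path

STUBS = {
    "MEMORY.md": "# Project Memory\n",
    "architecture.md": "# Architecture\n",
    "decisions.md": "# Key Decisions\n",
}

def scaffold(root: str = ".claude/memory") -> list[str]:
    """Create the memory directory and stub files; return created paths."""
    base = Path(root)
    base.mkdir(parents=True, exist_ok=True)
    created = []
    for name, heading in STUBS.items():
        path = base / name
        if not path.exists():  # never clobber an existing memory file
            path.write_text(heading)
            created.append(str(path))
    return created
```

Running it twice is safe: existing files are left untouched, so you can re-run it whenever you add new stub entries.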

2. Write the hub.

Keep it under 200 lines. Include: standing instructions, recovery protocol, current status, and the spoke index with "when to read" guidance.

3. Write your first spoke.

Start with architecture. Document your tech stack, directory structure, key file paths, and coding conventions. This single file will save you more time than anything else.

4. Add the recovery protocol.

Put this in your hub:

## New Session Recovery
1. Read this file (auto-loaded)
2. Run git log --oneline -10
3. Read the relevant spoke file(s) from the index
4. Ask what to work on

5. Grow organically.

Don't try to document everything upfront. Add spokes as you encounter repeated context loss. If you find yourself re-explaining the billing logic for the third time, that's your signal to create a pricing spoke.

After a month, you'll likely have 5-8 spoke files covering the domains you actually work in. The ones you never read will become obvious candidates for archival or deletion.

The constraint that makes it work

The entire system rests on one constraint: the hub must stay small. Under 200 lines. If the hub grows beyond that, you've defeated the purpose. The hub is a routing table, not a knowledge base.

When I'm tempted to add detail to the hub, I ask: "Does Claude need this in every session, or only when working in a specific area?" If the answer is "specific area," it goes in a spoke. The hub gets a one-line entry in the index pointing to the spoke.
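The 200-line constraint is also easy to enforce automatically. A hypothetical guard you could run in CI or a pre-commit hook; `check_hub` is an invented helper, and the limit is just the article's rule:

```python
# Hypothetical lint for the hub-size constraint: fail fast if MEMORY.md
# grows past the limit. The 200-line cap is the article's rule of thumb.
HUB_LINE_LIMIT = 200

def check_hub(text: str, limit: int = HUB_LINE_LIMIT) -> tuple[bool, int]:
    """Return (within_limit, line_count) for the hub file's contents."""
    count = len(text.splitlines())
    return count <= limit, count
```

Wired into a pre-commit hook, this turns "the hub must stay small" from a habit into an invariant.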

This constraint is what makes the system scale. My project has 12 spoke files covering architecture, pricing, publishing, editor internals, extensions, distribution, analytics, launch planning, social media, blogging, and an archive. The hub is still 57 lines. Claude loads 57 lines of hub context plus maybe 80 lines from the relevant spoke. That's roughly 1,500 tokens of memory overhead per session, which is a fraction of what a monolithic CLAUDE.md would consume.

The knowledge is all there. It's just organized so that Claude only loads what it needs.
