
Claude Code Analysis - architectural reverse-engineering of the leaked source

82 docs and 15 diagrams mapping every major subsystem of Claude Code's accidentally exposed 512K-line TypeScript source - YOLO classifier, 93% context compaction, prompt-cache layout, 88+ feature flags, the custom React-Fiber terminal renderer.


On March 31, 2026, Claude Code's npm package shipped with an unobfuscated .map file that pointed at the unminified TypeScript sources sitting on Anthropic's R2 bucket. Security researcher Chaofan Shou pulled the lot - 512,000+ lines across 1,902 files - and turned the result into 82 architectural docs and 15 diagrams covering every major subsystem. The repo is the closest thing the rest of us have to reading Claude Code's actual source.

It is not affiliated with Anthropic. It is also not a leak in the usual sense: the package was public, the sourcemap was public, the R2 bucket was public. The "discovery" was noticing.

What's in the 82 docs

Eight sections, roughly:

| Section | Docs | Lines | What it covers |
|---|---|---|---|
| Architecture & Bootstrap | 12 | ~14.2k | Startup, entry points |
| Master Extraction | 8 | ~9.1k | How the source was recovered |
| Prompts & Instructions | 11 | ~18.5k | System prompt hierarchy |
| Core Systems | 19 | ~24.8k | API, context, query engine |
| Security & Permissions | 10 | ~11.2k | Safety machinery |
| Tools & Plugins | 14 | ~19.6k | Command execution |
| Agent Orchestration | 8 | ~11.2k | Multi-agent coordination |
| Feature Evolution | 6 | ~4.1k | Unreleased capabilities |

The bits worth pulling out of that pile:

The YOLO classifier. Two-stage. Stage 1 is a 64-token fast scan; if anything looks suspicious, stage 2 runs a 4,096-token reasoning pass at zero temperature and decides allow / deny / prompt. The system "errs on the side of blocking." Backed by 44 gitleaks rules for secret scanning and a custom bash AST parser that flags 15 dangerous node types before the command ever runs.
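
The two-stage shape is easier to see in code. What follows is a minimal sketch, not the recovered implementation: the Classifier interface, the classify function and the toy stand-in are all hypothetical, and the choices to return "allow" when stage 1 finds nothing and "deny" when stage 2 fails are assumptions read into the description. Only the token budgets and the zero temperature come from the text.

```typescript
// Hypothetical names throughout; only the two-stage shape, the 64 / 4,096
// token budgets and the zero temperature come from the write-up.
type Verdict = "allow" | "deny" | "prompt";

interface Classifier {
  // Cheap scan: returns true when something looks suspicious.
  fastScan(command: string, maxTokens: number): boolean;
  // Slower reasoning pass that produces the final verdict.
  reason(command: string, maxTokens: number, temperature: number): Verdict;
}

function classify(command: string, c: Classifier): Verdict {
  // Stage 1: 64-token fast scan. Assumption: a clean scan means allow.
  if (!c.fastScan(command, 64)) return "allow";
  try {
    // Stage 2: 4,096-token reasoning pass at temperature 0.
    return c.reason(command, 4096, 0);
  } catch {
    // Assumption: "errs on the side of blocking" means failures deny.
    return "deny";
  }
}

// Toy stand-in for the model calls, for illustration only.
const toy: Classifier = {
  fastScan: (cmd) => /rm\s+-rf|curl .*\|\s*sh/.test(cmd),
  reason: (cmd) => (cmd.includes("rm -rf /") ? "deny" : "prompt"),
};

const harmless = classify("ls -la", toy);          // stage 1 only
const risky = classify("rm -rf ./build", toy);     // escalates to stage 2
```

The point of the split is cost: most commands never pay for the long reasoning pass, and the expensive stage only runs on the small fraction the fast scan flags.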

93% context compaction. When context pressure crosses 93%, one of six compaction strategies fires. The lightweight one drops old tool results; the aggressive one forks a subprocess that summarises the conversation into a 9-section standardised format. The user sees nothing. If you've ever wondered why a long session "forgot" something from earlier in the session, this is why.
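
As a rough illustration of the lightweight strategy, the sketch below stubs out tool results, oldest first, once usage crosses the 93% line. The Message shape, the helper names, the one-token stub and the keepRecent guard are assumptions made for this example; the repo's docs describe the real strategies in more detail.

```typescript
// Hypothetical message shape and helpers; only the 93% threshold and the
// "drop old tool results" idea come from the write-up.
interface Message {
  role: "user" | "assistant" | "tool";
  tokens: number;
  text: string;
}

const COMPACTION_THRESHOLD = 0.93;

function usedTokens(history: Message[]): number {
  return history.reduce((sum, m) => sum + m.tokens, 0);
}

// Stubs out tool results, oldest first, until usage is back at or under the
// threshold. The newest `keepRecent` messages are never touched (assumption).
function compactToolResults(
  history: Message[],
  contextWindow: number,
  keepRecent = 2,
): Message[] {
  const out = history.map((m) => ({ ...m })); // copy, leave the input alone
  let used = usedTokens(out);
  for (let i = 0; i < out.length - keepRecent; i++) {
    if (used / contextWindow <= COMPACTION_THRESHOLD) break;
    const m = out[i];
    if (m === undefined || m.role !== "tool") continue;
    used -= m.tokens - 1; // the stub is counted as a single token
    m.tokens = 1;
    m.text = "[tool result dropped]";
  }
  return out;
}
```

Because the stub replaces the content silently, anything the model later needs from that tool result is simply gone, which matches the "forgot something" symptom described above.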

The prompt cache layout. 7 static sections cached globally, 13 dynamic sections that bust the cache. There's deliberate cache-busting after MCP tool instructions to force a new cache block - which is the subtlety most third-party MCP cost analyses miss.
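
Why the ordering matters can be shown with a toy model of prefix caching, where a section is only reusable if everything before it is byte-identical to the previous request. This is a simplified sketch under that assumption; the Section type, the reusablePrefix function and the section names and texts are placeholders, not Claude Code's actual prompt sections.

```typescript
// Toy model: a request is an ordered list of sections, and the cache can
// only reuse the longest identical prefix. All names here are hypothetical.
interface Section {
  name: string;
  text: string;
}

// Number of leading sections that are identical in both requests.
function reusablePrefix(prev: Section[], next: Section[]): number {
  let i = 0;
  while (i < prev.length && i < next.length) {
    const a = prev[i];
    const b = next[i];
    if (a === undefined || b === undefined || a.text !== b.text) break;
    i++;
  }
  return i;
}

// Placeholder sections: two "static" ones first, two "dynamic" ones last.
const before: Section[] = [
  { name: "identity", text: "static placeholder A" },
  { name: "tool-rules", text: "static placeholder B" },
  { name: "cwd", text: "cwd: /repo" },
  { name: "date", text: "date: 2026-03-31" },
];

const withText = (name: string, text: string): Section[] =>
  before.map((s) => (s.name === name ? { ...s, text } : s));

// A change in the last dynamic section keeps the first three reusable.
const afterDateChange = reusablePrefix(before, withText("date", "date: 2026-04-01")); // 3
// A change in an early section invalidates everything after it.
const afterRulesChange = reusablePrefix(before, withText("tool-rules", "edited")); // 1
```

Under this model, anything that varies per session belongs as late in the prompt as possible, which is the layout the docs describe.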

88+ build-time feature flags. experimental_agents, enable_voice_input, unsafe_bash_allowed, plugin_marketplace, plus 600+ runtime gates prefixed tengu_ evaluated through GrowthBook. A surprisingly clear map of where the product is heading next.
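
The build-time versus runtime split can be sketched as two lookups behind one function. Everything below is illustrative: the flag values are placeholders (the write-up names the flags, not their settings), tengu_example_gate is an invented gate name, and the RuntimeGates map is a stand-in for a remote lookup rather than the GrowthBook SDK's API.

```typescript
// Build-time flags are fixed when the bundle is produced.
// Placeholder values: the write-up does not say how each flag is set.
const BUILD_FLAGS: Readonly<Record<string, boolean>> = {
  experimental_agents: false,
  enable_voice_input: false,
  unsafe_bash_allowed: false,
  plugin_marketplace: false,
};

// Hypothetical stand-in for remotely evaluated gates; not a real SDK type.
type RuntimeGates = ReadonlyMap<string, boolean>;

function isEnabled(name: string, gates: RuntimeGates): boolean {
  // Runtime gates carry the tengu_ prefix and can change without a release.
  if (name.startsWith("tengu_")) {
    return gates.get(name) ?? false;
  }
  // Everything else is resolved from the constants baked into the build.
  return BUILD_FLAGS[name] ?? false;
}

// "tengu_example_gate" is an invented name used only for this example.
const gates: RuntimeGates = new Map<string, boolean>([["tengu_example_gate", true]]);
const runtimeOn = isEnabled("tengu_example_gate", gates);
const buildOff = isEnabled("unsafe_bash_allowed", gates);
```

Defaulting unknown names to false keeps a missing or unreachable gate from switching a feature on by accident.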

The terminal renderer. 7,743 lines of custom React Fiber reconciler with packed-Int32Array screen buffers and frame-diffing that drops idle output from ~10KB to ~50 bytes per frame. Worth reading if you've ever built a TUI and wondered what the upper bound looks like.
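
The frame-diffing idea itself is small enough to sketch. The bit layout below (21 bits of code point, 8 bits of style id) and the Patch type are assumptions chosen for the example, not the packing the recovered renderer uses.

```typescript
// Hypothetical packing: low 21 bits hold the code point, the next 8 bits a
// style id. One Int32 per screen cell.
function packCell(codePoint: number, style: number): number {
  return ((style & 0xff) << 21) | (codePoint & 0x1fffff);
}

interface Patch {
  index: number; // cell position in the flat buffer
  cell: number;  // new packed value
}

// Compares two frames cell by cell and returns only the cells that changed.
function diffFrames(prev: Int32Array, next: Int32Array): Patch[] {
  if (prev.length !== next.length) throw new Error("frame size mismatch");
  const patches: Patch[] = [];
  for (let i = 0; i < next.length; i++) {
    const cell = next[i];
    if (cell !== undefined && cell !== prev[i]) patches.push({ index: i, cell });
  }
  return patches;
}

// An 80x24 screen of blanks where a single cell changes between frames.
const prevFrame = new Int32Array(80 * 24).fill(packCell(32, 0));
const nextFrame = prevFrame.slice();
nextFrame[5] = packCell(65, 1); // "A" with style 1

const changed = diffFrames(prevFrame, nextFrame); // one patch, at index 5
```

An idle screen produces an empty patch list, so the renderer has almost nothing to write, which is the effect behind the per-frame savings quoted above.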

Multi-agent IPC. File-based, at ~/.claude/work/ipc/, polled every 500 ms. The repo documents a race where a malicious task could hijack the permission bridge - good context if you're letting agents spawn other agents.

Install as a Claude Code skill

The repo ships a skill that wires the docs into Claude Code as a knowledge source for code audits:

# -f makes curl fail on an HTTP error instead of saving the error page as SKILL.md
mkdir -p ~/.claude/skills/internals && curl -fsSL \
  https://raw.githubusercontent.com/thtskaran/claude-code-analysis/master/.claude/skills/internals/SKILL.md \
  -o ~/.claude/skills/internals/SKILL.md

The skill runs a ReAct loop: read code, fetch the matching doc, scratchpad, re-analyse with the new context, follow doc chains until "every open question is answered." Useful if you're auditing your own agent-style codebase against patterns Anthropic actually shipped.

When to reach for it

  • You're building a coding agent and want a reference for how a production one is wired (prompt caching, compaction, tool permission gates, multi-agent IPC).
  • You're reviewing your own MCP server or skill and want to compare against the YOLO/permission gate patterns Claude Code uses internally.
  • You write about Claude Code and want primary-source citations rather than guesswork.

When not to

  • You want quick answers about using Claude Code. The repo is for understanding internals, not for everyday usage.
  • You assume the docs reflect current behaviour. They reflect the code that was recovered on Mar 31 - feature flags flip, prompt sections move, and Anthropic ships fast.

What to be careful about

The repo is a derivative of code Anthropic accidentally exposed; copying its source verbatim into a competing product is a different conversation than reading the docs. The author is clear that the artifact is research, not a redistribution. Treat it as you would a leaked-then-publicly-discussed corporate codebase: read, learn, don't paste.

The "every major subsystem" framing is also a slight overclaim. The docs are dense and accurate where they exist, but they cover the codebase as it stood on a single day. If you need current behaviour, verify against your own session traces before trusting the writeup.
