design-extract - pull a design language from any site

npx CLI plus Claude Code plugin that extracts colors, typography, spacing, and shadows from a live site into a structured design-token report.

Saved Apr 25, 2026 · 5 min read

#design-tokens #claude-code #css #scraping #design-system

A nuance up front: the project lives at Manavarya09/design-extract on GitHub but ships as designlang on npm. Same tool. The pitch is unusually concrete - point a headless browser at any URL, get back the design system the way a developer would write it: tokens, Tailwind config, shadcn theme, Figma variables, motion tokens, brand voice, and a paste-ready prompt pack you can drop into v0, Lovable, Cursor, or Claude Artifacts.

What separates it from the long list of "extract colors and fonts from a website" tools is what comes after the tokens: layout patterns, responsive behaviour at four breakpoints, hover and focus states, WCAG contrast scoring, multi-page consistency checks, drift checks against a live source-of-truth, visual diffs, and a shareable graded report card.

Quick start

A single command with no install step runs the headless browser and writes 17+ output files:

npx designlang https://stripe.com           # extract everything
npx designlang grade https://stripe.com     # shareable HTML report card
npx designlang clone https://stripe.com     # working Next.js starter
npx designlang --full https://stripe.com    # screenshots + responsive + interactions

If you'd rather have it always available:

npm i -g designlang

Or as an agent skill:

npx skills add Manavarya09/design-extract

The skill version registers the tool with 40+ supported agents (Cursor, Claude Code, Codex, etc.) so the agent can invoke it directly during a coding session.

What you actually get

A run writes its output to ./design-extract-output/. The main files:

File	What it is
`*-design-language.md`	19-section markdown - feed any LLM to recreate the design
`*-design-tokens.json`	W3C DTCG tokens (primitive + semantic + composite layers)
`*-tailwind.config.js`	Drop-in Tailwind theme
`*-shadcn-theme.css`	shadcn/ui `globals.css` variables
`*-figma-variables.json`	Figma Variables import (light + dark)
`*-variables.css`	CSS custom properties
`*-anatomy.tsx`	Typed React stubs for every detected component + variants
`*-motion-tokens.json`	Durations, easings, springs, scroll-linked flag
`*-voice.json`	Brand voice - tone, pronoun posture, CTA verbs
`*-prompts/`	Paste-ready prompts for v0, Lovable, Cursor, Claude Artifacts
`*-mcp.json`	Disk-backed MCP server payload
`*-grade.html`	Shareable Design Report Card - letter grade + evidence

The *-anatomy.tsx and *-design-language.md are the two outputs most worth your attention. The first gives an agent typed component stubs to extend. The second is a 19-section description of the design system in prose, which any LLM can use as a starting point for "build me a page in this style."
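
To make the token file less abstract, here is a minimal sketch of reading a DTCG-shaped token tree and flattening it into CSS custom properties. It is not designlang's code: the sample tokens, the `flatten`, `resolve`, and `to_css_variables` names, and the variable naming scheme are all assumptions for illustration, and real files may use richer value objects than the plain strings shown here. The only DTCG conventions relied on are `$value` leaves, `$`-prefixed metadata keys, and `{group.token}` alias references.

```python
# Hypothetical sample in DTCG shape: leaves carry "$value"; "$type" may sit on a group.
SAMPLE = {
    "color": {
        "$type": "color",
        "brand": {"$value": "#635bff"},
        "link": {"$value": "{color.brand}"},  # alias to another token
    },
    "font": {"weight": {"bold": {"$type": "fontWeight", "$value": 600}}},
}


def flatten(node, path=()):
    """Yield (dotted.path, raw value) for every leaf token in the tree."""
    if "$value" in node:
        yield ".".join(path), node["$value"]
        return
    for key, child in node.items():
        if key.startswith("$") or not isinstance(child, dict):
            continue  # skip group-level metadata such as $type
        yield from flatten(child, path + (key,))


def resolve(value, table):
    """Follow "{a.b}" aliases until a concrete value is reached."""
    seen = set()
    while isinstance(value, str) and value.startswith("{") and value.endswith("}"):
        ref = value[1:-1]
        if ref in seen or ref not in table:
            raise ValueError(f"unresolvable alias: {value}")
        seen.add(ref)
        value = table[ref]
    return value


def to_css_variables(tokens):
    """Render the flattened tokens as a :root block of custom properties."""
    table = dict(flatten(tokens))
    lines = [
        f"  --{name.replace('.', '-')}: {resolve(value, table)};"
        for name, value in table.items()
    ]
    return ":root {\n" + "\n".join(lines) + "\n}"


print(to_css_variables(SAMPLE))
```

The same walk is a reasonable starting point for checking which extracted tokens you actually want to keep before committing the file.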

What it captures beyond colors and fonts

This is the part that distinguishes designlang from the dozens of similar extractors:

Layout system - grids, flex containers, container widths, gaps. Not just spacing tokens.
Responsive - crawls 4 breakpoints, reports what changes (--responsive).
Interaction states - programmatically hovers and focuses, captures the deltas (--interactions, --deep-interact).
Motion language - durations, easing families, spring detection, scroll-linked flag, plus a "feel" fingerprint (springy / smooth / mechanical / mixed).
Component anatomy - slot trees with variant × size × state matrices, emitted as typed .tsx.
Brand voice - tone, pronoun posture, heading style, CTA verb inventory.
Page intent + section roles - landing / pricing / docs / etc., with semantic regions (hero, feature-grid, pricing-table, cta).
WCAG - every fg/bg pair scored, with a remediation palette suggesting nearest passing colours.
Multi-page consistency - auto-discovers canonical pages, reconciles shared vs per-route tokens.
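
For a sense of what scoring a fg/bg pair involves, the sketch below computes the WCAG 2.x contrast ratio from the published relative-luminance formula and then searches for a passing foreground. The contrast maths is the standard definition; the `nearest_passing` search (stepping the colour toward black or white) is a deliberately crude stand-in and an assumption, not how designlang builds its remediation palette.

```python
def _channels(hex_color):
    """Split "#rrggbb" into three 0-255 integers."""
    h = hex_color.lstrip("#")
    return [int(h[i:i + 2], 16) for i in (0, 2, 4)]


def _linear(channel_8bit):
    """Linearise one sRGB channel as specified by WCAG 2.x."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    r, g, b = _channels(hex_color)
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg, bg):
    """(lighter + 0.05) / (darker + 0.05); ranges from 1 to 21."""
    hi, lo = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)


def _mix(hex_color, toward, t):
    """Blend every channel a fraction t of the way toward 0 or 255."""
    mixed = [round(c + (toward - c) * t) for c in _channels(hex_color)]
    return "#" + "".join(f"{c:02x}" for c in mixed)


def nearest_passing(fg, bg, target=4.5, steps=100):
    """Smallest shift of fg toward black or white that reaches the target ratio."""
    if contrast_ratio(fg, bg) >= target:
        return fg
    for i in range(1, steps + 1):
        for toward in (0, 255):
            candidate = _mix(fg, toward, i / steps)
            if contrast_ratio(candidate, bg) >= target:
                return candidate
    return None  # no candidate reached the target


fg, bg = "#999999", "#ffffff"
print(round(contrast_ratio(fg, bg), 2), "->", nearest_passing(fg, bg))
```

The default target of 4.5 is the AA threshold for normal-size text; large text uses 3.0 and AAA uses 7.0.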

The CI-ready commands

The interesting part for ongoing use is the second tier of commands, several of which are meant to run in CI rather than once at extraction time:

designlang grade https://stripe.com           # shareable report card
designlang clone https://stripe.com           # working Next.js app
designlang apply https://stripe.com -d ./app  # auto-detect framework, write tokens
designlang brands stripe.com vercel.com linear.app   # N-brand matrix
designlang drift https://yourapp.com --tokens ./src/tokens.json   # CI drift check
designlang lint ./src/tokens/design-tokens.json     # CI-ready linter
designlang visual-diff https://staging.app https://app   # single-file HTML diff
designlang mcp                                # stdio MCP server for Cursor / Claude Code

The drift and lint commands exit non-zero on failure - drop them into a CI step and you get an actual gate against design regression.
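
What "gate" means in practice: compare the committed tokens with what the live site reports and return a non-zero status on any difference. The sketch below shows that logic on two flat token maps. It is an illustration of the idea only; the `find_drift` and `gate` names and the sample values are hypothetical, and designlang's own comparison rules are not documented here.

```python
import sys


def find_drift(committed, live):
    """Compare two flat {token_name: value} maps and describe every difference."""
    problems = []
    for name in sorted(committed.keys() | live.keys()):
        if name not in live:
            problems.append(f"missing on live site: {name}")
        elif name not in committed:
            problems.append(f"not in committed tokens: {name}")
        elif committed[name] != live[name]:
            problems.append(f"changed: {name} {committed[name]} -> {live[name]}")
    return problems


def gate(committed, live):
    """Report drift on stderr and return a CI-style status: 0 clean, 1 drifted."""
    problems = find_drift(committed, live)
    for line in problems:
        print(line, file=sys.stderr)
    return 1 if problems else 0


# Hypothetical values: the brand colour has drifted on the deployed site.
committed = {"color.brand": "#635bff", "space.md": "16px"}
live = {"color.brand": "#5b54f0", "space.md": "16px"}
status = gate(committed, live)
print("exit status:", status)
```

In a real CI step the returned status would be handed to `sys.exit`, which is what makes the pipeline fail.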

The mcp command is the agent-facing surface: launch designlang as an MCP server and any MCP-aware client (Cursor, Claude Code, Claude Desktop) can query tokens, semantic regions, components, and contrast pairs directly without you copy-pasting JSON.

How the extraction actually works

The mechanism is more disciplined than typical scrape-and-guess approaches:

Crawl - headless Chromium via Playwright, waits for network idle and fonts.
Extract - one page.evaluate() walks up to 5,000 DOM elements, collects 25+ computed properties, inline SVGs, font sources, image metadata.
Process - 17 extractor modules parse, deduplicate, cluster, and classify the raw data into tokens, components, and themes.
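
The Process step is where raw computed styles turn into tokens. As a rough sketch of the "deduplicate, cluster" part for colours only: count each computed value, then fold near-identical colours into the most frequent one. The regex, the per-channel `tolerance`, and the `cluster_colors` name are assumptions made for this example; the actual extractor modules may work quite differently.

```python
import re
from collections import Counter

# Matches the leading channels of computed values like "rgb(99, 91, 255)"
# or "rgba(0, 0, 0, 0.5)".
RGB = re.compile(r"rgba?\((\d+),\s*(\d+),\s*(\d+)")


def parse_rgb(css_value):
    """Return an (r, g, b) tuple, or None for values this sketch does not handle."""
    match = RGB.match(css_value)
    return tuple(int(group) for group in match.groups()) if match else None


def cluster_colors(raw_values, tolerance=12):
    """Deduplicate colours, merging any within `tolerance` per channel."""
    counts = Counter(c for c in map(parse_rgb, raw_values) if c is not None)
    clusters = []  # each entry: [representative_rgb, total_count]
    for color, n in counts.most_common():  # most frequent colours become representatives
        for cluster in clusters:
            if max(abs(a - b) for a, b in zip(color, cluster[0])) <= tolerance:
                cluster[1] += n
                break
        else:
            clusters.append([color, n])
    return [("#%02x%02x%02x" % rgb, n) for rgb, n in clusters]


# Hypothetical raw data, as if collected from computed styles across many elements.
raw = ["rgb(99, 91, 255)"] * 3 + [
    "rgb(100, 92, 255)",  # off-by-one variant of the same purple
    "rgb(10, 37, 64)",
    "rgb(10, 37, 64)",
    "transparent",        # unparsed, so ignored
]
print(cluster_colors(raw))
```

Seeding clusters from the most frequent values is a design choice: the colour a site uses most is usually the intended token, and stray variants get absorbed into it.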

The headless-browser approach means the extractor sees the rendered page, not the source HTML. That's the difference between picking up the actual design system and picking up whatever the marketing CMS happened to author this morning.

Auth, multi-platform, agent rules

The CLI flags reward reading once:

--cookie, --cookie-file, --header for protected pages (supports JSON, Playwright storageState, and Netscape cookies.txt formats).
--insecure for self-signed certs (dev environments, internal staging).
--selector ".pricing-card" to extract only from a portion of the page.
--platforms web,ios,android,flutter,wordpress,all to emit native targets in addition to web.
--emit-agent-rules to write Cursor, Claude Code, and generic agent rule files alongside the tokens.
--system-chrome to use the installed Chrome instead of bundled Chromium (saves the 150 MB download).
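
The cookie formats carry much the same information in different shapes. If you only have a Netscape cookies.txt export, the sketch below shows how it maps onto the `cookies` array of a Playwright storageState file. The `netscape_to_storage_state` helper and the sample cookie are hypothetical, and since cookies.txt has no SameSite column, the "Lax" value is an assumption. designlang accepts cookies.txt directly, so this is only to show what the formats contain.

```python
import json


def netscape_to_storage_state(text):
    """Convert Netscape cookies.txt content into a Playwright-style storageState dict."""
    cookies = []
    for line in text.splitlines():
        # Some exporters mark HttpOnly cookies with this prefix on the domain field.
        http_only = line.startswith("#HttpOnly_")
        if http_only:
            line = line[len("#HttpOnly_"):]
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # not a cookie row
        domain, _include_subdomains, path, secure, expires, name, value = fields
        cookies.append({
            "name": name,
            "value": value,
            "domain": domain,
            "path": path,
            "expires": int(expires) or -1,  # 0 in cookies.txt means a session cookie
            "httpOnly": http_only,
            "secure": secure.upper() == "TRUE",
            "sameSite": "Lax",  # assumption: not recorded in cookies.txt
        })
    return {"cookies": cookies, "origins": []}


# Hypothetical export with a single session cookie.
SAMPLE = "\n".join([
    "# Netscape HTTP Cookie File",
    "\t".join([".example.com", "TRUE", "/", "TRUE", "0", "session", "abc123"]),
])
state = netscape_to_storage_state(SAMPLE)
print(json.dumps(state, indent=2))
```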

When to reach for it

Cloning the look-and-feel of a reference site as the starting point for a new project.
Auditing your own deployed app for design drift against your tokens.
Onboarding an agent to a codebase by giving it typed component anatomy and a brand-voice file as context.
Multi-brand comparisons or pitch-deck analysis where you need a structured view of how different sites are built.

When not to

Heavily client-rendered sites whose content loads only after an auth flow that the headless browser can't complete cleanly. The cookie support helps, but it remains a constraint.
Workloads where you need pixel-perfect visual cloning. designlang reads the system (tokens and patterns), not the exact composition of any one page.

Caveats worth flagging

Headless extraction is a moving target. Sites that lazy-load critical CSS, gate content behind interaction, or fingerprint headless browsers can produce incomplete output - use --full and the auth flags to mitigate, but expect to verify on important targets.

The output is a strong starting point, not a finished design system. The *-design-language.md is the file to hand to an LLM as the source of truth; *-design-tokens.json is the file to commit. The rest are intermediate artifacts you'll prune.

The repo is permissively licensed (described as Apache 2.0 / MIT-style); confirm the exact terms in its LICENSE file. Development is active - the README's "v12.1" version reference suggests fast iteration, so check the changelog before pinning a version.
