design-extract - pull a design language from any site
npx CLI plus Claude Code plugin that extracts colors, typography, spacing, and shadows from a live site into a structured design-token report.
A nuance up front: the project lives at Manavarya09/design-extract on GitHub but ships as designlang on npm. Same tool. The pitch is unusually concrete - point a headless browser at any URL, get back the design system the way a developer would write it: tokens, Tailwind config, shadcn theme, Figma variables, motion tokens, brand voice, and a paste-ready prompt pack you can drop into v0, Lovable, Cursor, or Claude Artifacts.
What separates it from the long list of "extract colors and fonts from a website" tools is what comes after the tokens: layout patterns, responsive behaviour at four breakpoints, hover and focus states, WCAG contrast scoring, multi-page consistency checks, drift checks against a live source-of-truth, visual diffs, and a shareable graded report card.
Quick start
One command, no install, runs the headless browser and writes 17+ output files:
npx designlang https://stripe.com # extract everything
npx designlang grade https://stripe.com # shareable HTML report card
npx designlang clone https://stripe.com # working Next.js starter
npx designlang --full https://stripe.com # screenshots + responsive + interactions
If you'd rather have it always available:
npm i -g designlang
Or as an agent skill:
npx skills add Manavarya09/design-extract
The skill version registers the tool with 40+ supported agents (Cursor, Claude Code, Codex, etc.) so the agent can invoke it directly during a coding session.
What you actually get
A run writes its output to ./design-extract-output/. The main files:
| File | What it is |
|---|---|
| *-design-language.md | 19-section markdown - feed any LLM to recreate the design |
| *-design-tokens.json | W3C DTCG tokens (primitive + semantic + composite layers) |
| *-tailwind.config.js | Drop-in Tailwind theme |
| *-shadcn-theme.css | shadcn/ui globals.css variables |
| *-figma-variables.json | Figma Variables import (light + dark) |
| *-variables.css | CSS custom properties |
| *-anatomy.tsx | Typed React stubs for every detected component + variants |
| *-motion-tokens.json | Durations, easings, springs, scroll-linked flag |
| *-voice.json | Brand voice - tone, pronoun posture, CTA verbs |
| *-prompts/ | Paste-ready prompts for v0, Lovable, Cursor, Claude Artifacts |
| *-mcp.json | Disk-backed MCP server payload |
| *-grade.html | Shareable Design Report Card - letter grade + evidence |
The *-anatomy.tsx and *-design-language.md are the two outputs most worth your attention. The first gives an agent typed component stubs to extend. The second is a 19-section description of the design system in prose, which any LLM can use as a starting point for "build me a page in this style."
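To make the token file less abstract, here is a minimal sketch of walking a DTCG-style tree into flat dotted names. It relies only on what the DTCG format defines (`$value` marks a token, `$type` may be set on a group and inherited). The sample tree, its names and values, and the helper `flattenTokens` are hypothetical; the actual layout of the file designlang writes is not shown in the source and may differ.

```typescript
// Sketch only: flatten a DTCG-style token tree into dotted paths.
// The helper names and the sample data are hypothetical, not designlang's API.

type TokenTree = { [key: string]: unknown };

interface FlatToken {
  path: string;
  type: string | undefined;
  value: unknown;
}

function isRecord(x: unknown): x is Record<string, unknown> {
  return typeof x === "object" && x !== null && !Array.isArray(x);
}

function flattenTokens(tree: TokenTree, prefix = "", inheritedType?: string): FlatToken[] {
  const out: FlatToken[] = [];
  // A $type declared on a group applies to the tokens nested under it.
  const declared = tree["$type"];
  const groupType = typeof declared === "string" ? declared : inheritedType;

  for (const [key, node] of Object.entries(tree)) {
    // Skip metadata keys ($type, $description, ...) and non-object values.
    if (key.startsWith("$") || !isRecord(node)) continue;
    const path = prefix ? `${prefix}.${key}` : key;

    if ("$value" in node) {
      // A node with $value is a token; its own $type wins over the group's.
      const own = node["$type"];
      out.push({ path, type: typeof own === "string" ? own : groupType, value: node["$value"] });
    } else {
      // Otherwise it is a group: recurse.
      out.push(...flattenTokens(node, path, groupType));
    }
  }
  return out;
}

// Hypothetical sample, made up for illustration.
const sampleTokens: TokenTree = {
  color: {
    $type: "color",
    brand: { $value: "#3366ff" },
    text: { $value: "{color.brand}" }, // alias to another token
  },
  font: { weight: { bold: { $type: "fontWeight", $value: 700 } } },
};

const flatTokens = flattenTokens(sampleTokens);
```

A flat list like this is often easier to diff or feed to a linter than the nested tree. Alias values such as `{color.brand}` are left unresolved here.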
What it captures beyond colors and fonts
This is the part that distinguishes designlang from the dozens of similar extractors:
- Layout system - grids, flex containers, container widths, gaps. Not just spacing tokens.
- Responsive - crawls 4 breakpoints, reports what changes (--responsive).
- Interaction states - programmatically hovers and focuses, captures the deltas (--interactions, --deep-interact).
- Motion language - durations, easing families, spring detection, scroll-linked flag, plus a "feel" fingerprint (springy / smooth / mechanical / mixed).
- Component anatomy - slot trees with variant × size × state matrices, emitted as typed .tsx.
- Brand voice - tone, pronoun posture, heading style, CTA verb inventory.
- Page intent + section roles - landing / pricing / docs / etc., with semantic regions (hero, feature-grid, pricing-table, cta).
- WCAG - every fg/bg pair scored, with a remediation palette suggesting nearest passing colours.
- Multi-page consistency - auto-discovers canonical pages, reconciles shared vs per-route tokens.
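For the WCAG item in the list above, scoring a foreground/background pair comes down to the contrast-ratio formula from WCAG 2.x. The sketch below implements that published formula; the function names are hypothetical, and nothing here is taken from designlang's source, which may handle alpha, gradients, or other colour formats differently.

```typescript
// Sketch only: WCAG 2.x contrast ratio for "#rrggbb" colours.
// Function names are hypothetical, not designlang's API.

type Rgb = [number, number, number];

// Parse "#rrggbb" (leading "#" optional) into 0-255 channel values.
function parseHex(hex: string): Rgb {
  const cleaned = hex.trim().replace(/^#/, "");
  if (!/^[0-9a-fA-F]{6}$/.test(cleaned)) {
    throw new Error(`expected #rrggbb, got "${hex}"`);
  }
  const n = parseInt(cleaned, 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

// Linearize one sRGB channel as specified in the WCAG definition.
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance: weighted sum of the linearized channels.
function relativeLuminance(rgb: Rgb): number {
  const [r, g, b] = rgb;
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio, from 1 (identical colours) to 21 (black on white).
function contrastRatio(fg: string, bg: string): number {
  const a = relativeLuminance(parseHex(fg));
  const b = relativeLuminance(parseHex(bg));
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG AA thresholds: 4.5:1 for normal text, 3:1 for large text.
function passesAA(fg: string, bg: string, largeText = false): boolean {
  return contrastRatio(fg, bg) >= (largeText ? 3 : 4.5);
}

// #767676 on white passes AA for normal text; #777777 falls just short.
const grayPasses = passesAA("#767676", "#ffffff");
const lighterGrayPasses = passesAA("#777777", "#ffffff");
```

A remediation step like the one described could then search nearby colours until `passesAA` returns true, though how designlang picks the "nearest" passing colour is not stated in the source.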
The CI-ready commands
The interesting part for ongoing use is the second tier of commands, several of which are meant to run in CI rather than once at extraction time:
designlang grade https://stripe.com # shareable report card
designlang clone https://stripe.com # working Next.js app
designlang apply https://stripe.com -d ./app # auto-detect framework, write tokens
designlang brands stripe.com vercel.com linear.app # N-brand matrix
designlang drift https://yourapp.com --tokens ./src/tokens.json # CI drift check
designlang lint ./src/tokens/design-tokens.json # CI-ready linter
designlang visual-diff https://staging.app https://app # single-file HTML diff
designlang mcp # stdio MCP server for Cursor / Claude Code
The drift and lint commands exit non-zero on failure - drop them into a CI step and you get an actual gate against design regression.
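The idea behind a drift gate can be sketched in a few lines: compare the committed tokens with the values extracted from the live site and map any mismatch to a non-zero exit code. This is an illustration of the concept only. The flat name-to-value shape, the normalization rule, and the names `findDrift` and `exitCodeFor` are assumptions; the source says only that designlang's drift command exits non-zero on failure, not how it compares values.

```typescript
// Sketch only: a token drift check that maps to a CI exit code.
// Data shape, normalization, and names are assumptions for illustration.

type FlatTokens = Record<string, string>;

interface DriftIssue {
  token: string;
  expected: string;
  actual: string | undefined; // undefined when the live site lacks the token
}

// Compare case- and whitespace-insensitively so "#FFF " and "#fff" match.
function normalizeValue(value: string): string {
  return value.trim().toLowerCase();
}

function findDrift(expected: FlatTokens, actual: FlatTokens): DriftIssue[] {
  const issues: DriftIssue[] = [];
  for (const token of Object.keys(expected).sort()) {
    const want = expected[token];
    const got = actual[token];
    if (want === undefined) continue;
    if (got === undefined || normalizeValue(got) !== normalizeValue(want)) {
      issues.push({ token, expected: want, actual: got });
    }
  }
  return issues;
}

// 0 lets the CI step pass; 1 fails it.
function exitCodeFor(issues: DriftIssue[]): number {
  return issues.length === 0 ? 0 : 1;
}

// Hypothetical data: the radius has drifted, the colour only differs in case.
const committedTokens: FlatTokens = { "color.brand": "#3366FF", "radius.md": "8px" };
const liveTokens: FlatTokens = { "color.brand": "#3366ff", "radius.md": "4px" };
const driftIssues = findDrift(committedTokens, liveTokens);
const driftExitCode = exitCodeFor(driftIssues);
```

In a real pipeline the CI runner does this mapping for you: any step whose command exits non-zero fails the job.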
The mcp command is the agent-facing surface: launch designlang as an MCP server and any MCP-aware client (Cursor, Claude Code, Claude Desktop) can query tokens, semantic regions, components, and contrast pairs directly without you copy-pasting JSON.
How the extraction actually works
The mechanism is more disciplined than typical scrape-and-guess approaches:
- Crawl - headless Chromium via Playwright, waits for network idle and fonts.
- Extract - one page.evaluate() walks up to 5,000 DOM elements, collects 25+ computed properties, inline SVGs, font sources, image metadata.
- Process - 17 extractor modules parse, deduplicate, cluster, and classify the raw data into tokens, components, and themes.
The headless-browser approach means the extractor sees the rendered page, not the source HTML. That's the difference between picking up the actual design system and picking up whatever the marketing CMS happened to author this morning.
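The "Process" step is easier to picture with a toy version. The sketch below takes colour strings in the form browsers report from computed styles (`rgb(...)` / `rgba(...)`), normalizes them to hex, drops fully transparent values, and ranks what remains by frequency. It is a simplified stand-in for the dedupe-and-rank idea, not one of designlang's 17 extractor modules; the function names and the ranking rule are assumptions.

```typescript
// Sketch only: deduplicate and rank computed colour values.
// Names and ranking rule are hypothetical, not designlang's implementation.

interface ColorCount {
  hex: string;
  count: number;
}

// Convert "rgb(r, g, b)" or "rgba(r, g, b, a)" to "#rrggbb".
// Returns null for unrecognized input or fully transparent colours.
function rgbToHex(computed: string): string | null {
  const text = computed.trim();
  if (!/^rgba?\(/.test(text)) return null;
  const nums = (text.match(/[\d.]+/g) ?? []).map(Number);
  const [r, g, b, alpha = 1] = nums;
  if (r === undefined || g === undefined || b === undefined) return null;
  if (alpha === 0) return null;
  return (
    "#" +
    [r, g, b].map((c) => Math.round(c).toString(16).padStart(2, "0")).join("")
  );
}

// Count occurrences per normalized colour, most frequent first.
function rankColors(computedValues: string[]): ColorCount[] {
  const counts = new Map<string, number>();
  for (const value of computedValues) {
    const hex = rgbToHex(value);
    if (hex === null) continue;
    counts.set(hex, (counts.get(hex) ?? 0) + 1);
  }
  return Array.from(counts, ([hex, count]) => ({ hex, count })).sort(
    (a, b) => b.count - a.count || a.hex.localeCompare(b.hex)
  );
}

// Hypothetical raw values, as they might come back from the page.
const rankedColors = rankColors([
  "rgb(255, 255, 255)",
  "rgba(0, 0, 0, 0)",
  "rgb(255, 255, 255)",
  "rgb(10, 20, 30)",
]);
```

A real extractor would also need to merge near-identical shades and handle newer colour syntaxes, both of which this sketch ignores.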
Auth, multi-platform, agent rules
The CLI flags are worth reading through once:
- --cookie, --cookie-file, --header for protected pages (supports JSON, Playwright storageState, and Netscape cookies.txt formats).
- --insecure for self-signed certs (dev environments, internal staging).
- --selector ".pricing-card" to extract only from a portion of the page.
- --platforms web,ios,android,flutter,wordpress,all to emit native targets in addition to web.
- --emit-agent-rules to write Cursor, Claude Code, and generic agent rule files alongside the tokens.
- --system-chrome to use the installed Chrome instead of bundled Chromium (saves the 150 MB download).
When to reach for it
- Cloning the look-and-feel of a reference site as the starting point for a new project.
- Auditing your own deployed app for design drift against your tokens.
- Onboarding an agent to a codebase by giving it typed component anatomy and a brand-voice file as context.
- Multi-brand comparisons or pitch-deck analysis where you need a structured view of how different sites are built.
When not to
- Sites that are heavily client-rendered with content that loads only after auth flows the headless browser can't complete cleanly. The cookie support helps, but it's still a constraint.
- Workloads where you need pixel-perfect visual cloning. designlang reads the system (tokens and patterns), not the exact composition of any one page.
Caveats worth flagging
Headless extraction is a moving target. Sites that lazy-load critical CSS, gate content behind interaction, or fingerprint headless browsers can produce incomplete output - use --full and the auth flags to mitigate, but expect to verify on important targets.
The output is a strong starting point, not a finished design system. The *-design-language.md is the file to hand to an LLM as the source of truth; *-design-tokens.json is the file to commit. The rest are intermediate artifacts you'll prune.
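The repo describes its license as permissive (Apache 2.0 / MIT-style); confirm the exact terms in the repository before redistributing. Development is active: the README's "v12.1" version reference suggests fast iteration, so check the changelog before pinning a version.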
Related entries
stealth-browser-mcp - anti-bot browser automation MCP
MCP server that drives a Chrome devtools session past anti-bot systems. Lets agents write network hooks and clone UIs from natural language.
trace-mcp - framework-aware codebase MCP for coding agents
MCP server with 138 tools and cross-language framework awareness (58 integrations across 81 languages). Indexes Laravel/Inertia/Vue, Rails/Hotwire, Django/HTMX edges so agents skip re-deriving call graphs. Decision memory links architectural choices to the code they're about. Local-first ONNX embeddings, optional LSP enrichment.
wanman - worktree-isolated multi-agent runtime for Claude Code and Codex
Multi-agent runtime that spawns each Claude Code or Codex agent in its own git worktree and home directory. JSON-RPC subprocess control, task pooling, artifact storage. Solves the share-a-directory failure mode that breaks most multi-agent harnesses.
PostTrainBench - can a CLI agent post-train a base LLM in 10 hours?
Benchmark measuring whether Claude Code, Codex CLI, Gemini CLI, and OpenCode can autonomously improve 4 small base models (Qwen3-1.7B/4B, SmolLM3-3B, Gemma-3-4B) on 7 evals (AIME, BFCL, GPQA, GSM8K, HealthBench, HumanEval, Arena Hard) within a single H100 GPU and 10 hours. Includes agent-as-judge anti-reward-hacking and baseline-replacement penalties for tampering.