
dario - use Claude Max/Pro as an API

Local proxy that exposes your Claude Max or Pro subscription as an Anthropic-compatible API. No API key, no per-token billing - just point your tools at localhost.


Dario is a local LLM router with one job: every tool on your machine points at the same URL, and dario figures out which provider each request is for. It speaks both the Anthropic Messages API and the OpenAI Chat Completions API on http://localhost:3456, so the same proxy serves Cursor, Aider, Continue, Zed, Claude Code, OpenHands, Cline, the Claude Agent SDK, and your own scripts without any of them caring which vendor is upstream.

The headline feature that earns the project its stars: dario can put a Claude Max subscription behind any of those tools by sending requests in the exact shape Claude Code itself sends - so the upstream subscription-billing path handles the request instead of per-token API billing. Cursor, Aider, Zed, and Continue can all share your $200/mo plan instead of each demanding its own API key.

Disclaimer up front: dario is independent, unofficial, and third-party. The README links to a DISCLAIMER.md that's worth reading before installing. This is not an Anthropic-supported integration.

30 seconds to running

# 1. Install
npm install -g @askalf/dario

# 2. Log in to your Claude Max subscription
dario login                      # or `dario login --manual` for SSH / headless

# 3. Start the local proxy
dario proxy

# 4. Point any Anthropic-compat tool at it
export ANTHROPIC_BASE_URL=http://localhost:3456
export ANTHROPIC_API_KEY=dario
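Those two env vars are all a scripting client needs. As a minimal sketch (assuming dario serves the standard `/v1/messages` path, as any Anthropic-compatible endpoint does), here is what a hand-built request aimed at the proxy looks like:

```python
import json

# Any Anthropic-Messages-style client works once it points at dario.
# This sketch only builds the request; the payload shape follows the
# public Messages API, and the key value is a placeholder dario ignores.
BASE_URL = "http://localhost:3456"

def messages_request(model: str, prompt: str, max_tokens: int = 1024):
    """Build the (url, headers, body) triple for a Messages call."""
    url = f"{BASE_URL}/v1/messages"
    headers = {
        "content-type": "application/json",
        "x-api-key": "dario",          # any non-empty value works
        "anthropic-version": "2023-06-01",
    }
    body = json.dumps({
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body
```

With `dario proxy` running, the triple can be POSTed with any HTTP client; the placeholder key is irrelevant because dario swaps in the real auth upstream.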

For OpenAI, Groq, OpenRouter, Ollama, vLLM, LiteLLM, etc., add a backend line and reuse the same proxy port:

dario backend add openai     --key=sk-proj-...
dario backend add groq       --key=gsk_...    --base-url=https://api.groq.com/openai/v1
dario backend add openrouter --key=sk-or-...  --base-url=https://openrouter.ai/api/v1
dario backend add local      --key=anything   --base-url=http://127.0.0.1:11434/v1

export OPENAI_BASE_URL=http://localhost:3456/v1
export OPENAI_API_KEY=dario

Switching providers becomes a model-name change in your tool: claude-opus-4-7, gpt-4o, llama-3.3-70b. Force a specific backend with a prefix when names are ambiguous: openai:gpt-4o, claude:opus, groq:llama-3.3-70b, local:qwen-coder.
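The routing rule is simple enough to sketch. This is an illustrative reconstruction of the prefix-then-pattern logic, not dario's actual tables:

```python
# Explicit "backend:" prefixes win; otherwise the model name decides.
# Backend names and patterns are invented examples for illustration.
KNOWN_BACKENDS = {"openai", "claude", "groq", "local", "openrouter"}

def resolve_backend(model: str) -> tuple:
    """Return (backend, bare_model) for a possibly-prefixed model name."""
    # 1. An explicit prefix like "openai:gpt-4o" always wins.
    if ":" in model:
        prefix, bare = model.split(":", 1)
        if prefix in KNOWN_BACKENDS:
            return prefix, bare
    # 2. Otherwise, guess from the model name itself.
    if model.startswith("claude") or model in {"opus", "sonnet", "haiku"}:
        return "claude", model
    if model.startswith(("gpt-", "o1-", "o3-")):
        return "openai", model
    if "llama" in model:
        return "groq", model
    return "local", model
```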

What it actually does on a request

Dario reads each request, decides which backend owns it based on the model name (or an explicit prefix), and forwards in that backend's native protocol:

| Client speaks | Model | Routes to | Behaviour |
| --- | --- | --- | --- |
| Anthropic Messages | claude-* / opus / sonnet / haiku | Claude backend | OAuth swap + Claude Code template replay -> api.anthropic.com |
| Anthropic Messages | gpt-*, llama-*, etc. | OpenAI-compat backend | Anthropic -> OpenAI translation, forwarded |
| OpenAI Chat Completions | gpt-* / o1-* / o3-* | OpenAI-compat backend | Passthrough: auth swap, body byte-for-byte |
| OpenAI Chat Completions | claude-* | Claude backend | OpenAI -> Anthropic translation, then the Claude path |

The tool doesn't know it's being translated. The backend doesn't know there's a proxy. Dario is the seam.
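The translation half of that seam is easiest to see in the OpenAI -> Anthropic direction. A simplified sketch of the mapping - field names follow the two public APIs, but the real translator also handles tools, images, and streaming deltas:

```python
# Sketch of an OpenAI Chat Completions body -> Anthropic Messages body.
# Illustrative only: Anthropic takes `system` as a top-level field and
# requires `max_tokens`, while OpenAI carries system turns in `messages`.
def openai_to_anthropic(body: dict) -> dict:
    system = [m["content"] for m in body["messages"] if m["role"] == "system"]
    msgs = [m for m in body["messages"] if m["role"] != "system"]
    out = {
        "model": body["model"],
        "max_tokens": body.get("max_tokens", 1024),  # Anthropic requires it
        "messages": msgs,
    }
    if system:
        out["system"] = "\n".join(system)
    if "temperature" in body:
        out["temperature"] = body["temperature"]
    return out
```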

Wire fidelity is the part most projects skip

The Claude backend isn't just an OAuth swap. It's a full Claude Code wire-level template: every observable axis of an outbound request - body field order, header insertion order, static header values, the anthropic-beta flag set, TLS ClientHello, inter-request timing, stream-consumption shape, session-id lifecycle - is captured from your installed Claude Code binary and mirrored on outbound requests.

Six axes the project has tightened (v3.22 through v3.28) and how they're tunable:

| Axis | What it does | How to tune |
| --- | --- | --- |
| Body key order | Top-level JSON key order replayed byte-for-byte from CC's wire serialization | Automatic once a live capture exists |
| TLS ClientHello | Classifies the runtime as bun-match / bun-bypassed / node-only (Bun = BoringSSL; Node = OpenSSL, distinct JA3) | --strict-tls refuses to start unless bun-match |
| Inter-request timing | Configurable floor + uniform jitter to dissolve the 500ms minimum-inter-arrival edge | --pace-min, --pace-jitter |
| Stream-consumption shape | Drains upstream to EOF when the consumer disconnects mid-stream | --drain-on-close (off by default - don't silently burn tokens) |
| Session-id lifecycle | Tunable SessionRegistry with jitter, max-age, per-client bucketing | --session-idle-rotate, --session-rotate-jitter, --session-max-age |
| MCP / sub-agent reach | Makes dario itself addressable from CC and MCP clients | dario subagent install, dario mcp |
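The inter-request timing axis is the most self-contained of these. A sketch of a floor-plus-jitter pacer, assuming the flag semantics above (the concrete implementation is a guess):

```python
import random

# Floor + uniform jitter: instead of a fixed 500ms minimum inter-arrival
# gap, each request waits at least pace_min plus a random extra, so the
# gap distribution has no hard edge. Flag names are from the article;
# the logic here is an assumption.
def next_delay(elapsed_since_last: float, pace_min: float = 0.5,
               pace_jitter: float = 0.25) -> float:
    """Seconds to sleep before the next outbound request."""
    floor = pace_min + random.uniform(0.0, pace_jitter)
    return max(0.0, floor - elapsed_since_last)
```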

The Claude binary itself is interrogated on startup: dario spawns it against a loopback capture endpoint, reads the outbound request, extracts the live template, and caches it at ~/.dario/cc-template.live.json with a 24-hour TTL. When upstream Claude Code releases a new version, dario picks it up automatically rather than going stale for 48 hours.
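The TTL half of that cache logic is simple. A sketch, assuming freshness is judged from the cache file's mtime (the real criteria may differ):

```python
import time
from pathlib import Path

# 24-hour TTL on the cached live template, per the article; the file
# path is from the article, the mtime-based check is an assumption.
TTL_SECONDS = 24 * 3600
CACHE = Path.home() / ".dario" / "cc-template.live.json"

def template_is_fresh(path: Path, now=None) -> bool:
    """True if a cached capture exists and is younger than the TTL."""
    if not path.exists():
        return False
    now = time.time() if now is None else now
    return (now - path.stat().st_mtime) < TTL_SECONDS
```

A stale or missing cache triggers a fresh loopback capture of the installed Claude Code binary on the next startup.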

Multi-account pool mode

If you have multiple Claude subscriptions (e.g. work + personal), pool mode is the workflow that makes them feel like one resource:

dario accounts add work
dario accounts add personal

Routing logic prefers the account with the most headroom. Two specifically useful behaviours:

  • Session stickiness pins a multi-turn conversation to one account so the Anthropic prompt cache survives the run.
  • In-flight 429 failover retries the same request against a different account before your client sees an error.

For long agent runs that occasionally bounce off a per-account rate limit, this is the difference between "agent died at hour 4" and "agent kept going."
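Both behaviours can be sketched in a few lines; the data shapes here are illustrative, not dario's internals:

```python
# Sticky sessions + headroom preference + in-flight 429 failover.
# `accounts` maps account name -> remaining headroom (illustrative);
# `sticky` maps session id -> pinned account.
def pick_account(accounts: dict, sticky: dict, session_id: str) -> str:
    pinned = sticky.get(session_id)
    if pinned and accounts.get(pinned, 0) > 0:
        return pinned                        # keep the prompt cache warm
    choice = max(accounts, key=accounts.get)  # most headroom wins
    sticky[session_id] = choice
    return choice

def send_with_failover(accounts: dict, sticky: dict, session_id: str,
                       send) -> str:
    """Retry a 429 against the remaining accounts before surfacing it."""
    tried = set()
    while len(tried) < len(accounts):
        candidates = {k: v for k, v in accounts.items() if k not in tried}
        name = pick_account(candidates, sticky, session_id)
        if send(name) != 429:
            return name
        tried.add(name)
        sticky.pop(session_id, None)         # pin broke; repin next pick
    raise RuntimeError("all accounts rate-limited")
```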

Shim mode

If you don't want a proxy on the wire at all, shim is an in-process globalThis.fetch patch injected via NODE_OPTIONS=--require. No HTTP hop, no port to bind, no BASE_URL to set:

dario shim -- claude --print "hi"

Claude Code thinks it's talking to api.anthropic.com directly. The translation happens in the Node process itself.
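The same idea in miniature: patch the process's own HTTP entry point so upstream-bound URLs are rewritten in place. This Python sketch is an analogy for the `globalThis.fetch` patch, not dario's code:

```python
import urllib.request

# Shim mode has no proxy hop: the rewrite happens inside the process.
# Upstream and local endpoints are from the article; the patch target
# (urllib.request.urlopen) is the Python stand-in for fetch.
UPSTREAM = "https://api.anthropic.com"
LOCAL = "http://localhost:3456"

def rewrite(url: str) -> str:
    """Send Anthropic API traffic to the local endpoint; pass others through."""
    if url.startswith(UPSTREAM):
        return LOCAL + url[len(UPSTREAM):]
    return url

_original_urlopen = urllib.request.urlopen

def _shimmed_urlopen(url, *args, **kwargs):
    # Only string URLs handled here; Request objects would need the same.
    target = rewrite(url) if isinstance(url, str) else url
    return _original_urlopen(target, *args, **kwargs)

urllib.request.urlopen = _shimmed_urlopen  # every in-process caller rerouted
```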

Agent compatibility

Dario also solves a second class of problem. Cline, Roo Code, Cursor, Windsurf, Continue.dev, GitHub Copilot, OpenHands, OpenClaw, Hermes - each ships its own tool schemas and validators. A naive proxy passes those through, and the upstream either rejects them or accepts them at the per-token API rate because the request shape isn't recognisably Claude Code.

Dario carries a universal TOOL_MAP (~66 schema-verified entries) that pre-maps every major coding agent's tool names to Claude Code's native set on the outbound path and rebuilds to your agent's exact expected shape on the inbound path. No --preserve-tools flag, no wire-shape loss, no validator errors.
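The mechanism is a plain two-way lookup. The entries below are invented examples, not dario's actual TOOL_MAP:

```python
# (agent, agent_tool_name) -> Claude Code native tool name.
# Illustrative entries only; the real map has ~66 schema-verified rows.
TOOL_MAP = {
    ("cline", "read_file"): "Read",
    ("cline", "write_to_file"): "Write",
    ("continue", "builtin_read_file"): "Read",
}
REVERSE = {(agent, native): name
           for (agent, name), native in TOOL_MAP.items()}

def outbound(agent: str, tool: str) -> str:
    """Agent's tool name -> Claude Code native name (outbound path)."""
    return TOOL_MAP.get((agent, tool), tool)

def inbound(agent: str, native: str) -> str:
    """Claude Code native name -> the agent's expected name (inbound path)."""
    return REVERSE.get((agent, native), native)
```

Per-agent keys matter: two agents can expect different names for the same native tool, so the inbound rebuild must know who is asking.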

dario as MCP server, dario as sub-agent

Two integrations introduced in v3.26 and v3.27 that close the loop:

dario subagent install   # registers a first-party sub-agent at ~/.claude/agents/dario.md
dario mcp                # turns dario into a read-only MCP server

The first lets Claude Code delegate dario diagnostics and template refreshes in-session. The second exposes dario's state - auth, pool, backends, template, runtime - to any MCP-aware editor (Claude Desktop, Cursor, Zed) so you can introspect the proxy without leaving the IDE.

When to reach for it

  • You pay for Claude Max but only use it in Claude Code, while every other tool on your machine wants its own API key.
  • You run multiple AI coding tools and the per-tool provider config has become its own job.
  • You hit Claude rate limits on long agent runs and want pooled subscriptions with sticky sessions.
  • You want one stable local endpoint every script and editor can target, regardless of which provider is actually upstream.
  • You care about wire-level fidelity and want each axis of CC compatibility to be tunable rather than implicit.

When not to

  • Production deployments needing vendor-managed SLAs. Use the provider APIs directly.
  • Hosted, multi-tenant, managed routing platforms. Dario is a local, single-user tool.
  • Workloads where you specifically need a chat UI. Use claude.ai or chatgpt.com.
  • Team or organisation-wide governance with audit trails and RBAC. That's a ThinkWatch or Lunar shape, not a dario shape.

Where it sits in the landscape

Three nearby projects worth comparing:

  • llm-openai-via-codex - the Simon Willison plugin that does the same trick for Codex subscriptions through the llm CLI. Smaller scope, simpler shape, Apache-2.0.
  • ThinkWatch - enterprise gateway. Multi-user, RBAC, audit logs, ClickHouse. Different problem.
  • Lunar - AI-native gateway focused on outbound traffic. Closer to ThinkWatch in deployment shape than to dario.

Dario is the local, single-user, wire-faithful one. The other three solve adjacent problems for different audiences.

Trade-offs and audit story

The README is unusually direct about being auditable: ~10,750 lines of TypeScript across ~24 files, zero runtime dependencies (npm ls --production confirms), credentials at ~/.dario/ with 0600 permissions, 127.0.0.1-only by default, every release SLSA-attested via GitHub Actions, nothing phones home.

That's a small enough surface to read in a weekend, which is the right answer for a tool sitting between every model request and every provider API.

dario doctor is the single paste-ready health report - run it before filing any issue. The discussions linked from the README (issues #68, #13, #39, #1, #14) are the deepest reading on why the project works and where its current edges are.

License: included in the repo. Independent, unofficial, third-party - those words are in the README's first sentence for a reason.

Related entries


Claudraband - persistent, resumable Claude Code sessions over HTTP and ACP

Wraps the real Claude Code TUI with a session lifecycle layer. Resumable non-interactive workflows, HTTP daemon for remote/headless control, ACP server for editor integrations (Zed, Toad). Drives your existing Claude Code install rather than reimplementing it - keeps skills, hooks, MCPs, and approvals intact.

Why I saved this: Different from the Claude SDK - Claudraband drives the real CLI from outside, so user-installed skills/hooks/MCPs all still work. The ACP support is the easy path to editor integrations.

wanman - worktree-isolated multi-agent runtime for Claude Code and Codex

Multi-agent runtime that spawns each Claude Code or Codex agent in its own git worktree and home directory. JSON-RPC subprocess control, task pooling, artifact storage. Solves the share-a-directory failure mode that breaks most multi-agent harnesses.

Why I saved this: The 'one-man train' framing is load-bearing - humans observe rather than approve every step. Worktree-per-agent isolation is the upgrade most multi-agent harnesses skip.