ESP-Claw - AI agent framework for IoT

Espressif's chat-coding agent framework for ESP32 devices. Brings tool-calling LLM agents to embedded targets with C-level memory budgets.

Saved Apr 25, 20264 min readView source ↗

#iot #esp32 #embedded #ai-agent #c

ESP-Claw is Espressif's chat-coding agent framework for ESP32-class microcontrollers. The premise is bold and the execution is unusual: take the OpenClaw "agents talking to tools, evolving over time" pattern, reimplement it in C with embedded-friendly memory budgets, and run the whole loop - sensing, decision-making, and execution - locally on a $5 chip.

The framing the project leads with is the right one: traditional IoT stops at connectivity. Devices can connect, but they cannot think; they can execute commands, but they cannot decide. ESP-Claw moves the agent runtime down onto the device itself, turning passive executors into active decision-making nodes.

Four design ideas

The README organizes the project around four pillars worth understanding before you flash a board:

Chat as Creation - device behavior is defined through conversation. IM chat plus dynamic Lua loading lets non-programmers describe what they want and have the device adopt the new behavior live.
Event Driven - any sensor reading, MQTT message, or external event can trigger the agent loop. Response latency is in the milliseconds, not seconds, because the model call isn't always in the loop.
Structured Memory - memories are organized in a structured form, not dumped into a prompt. Privacy stays on-device.
MCP Communication - speaks the standard Model Context Protocol both as server (exposes the device's tools to upstream agents) and as client (consumes tools from other MCP servers).

Quick start: flash from the browser

Several supported boards (M5Stack CoreS3, common breadboards, and others under application/edge_agent/boards/) can be flashed entirely from the browser - configuration and flash, no toolchain install required.

The flow:

Visit the project's online flashing page.
Plug your ESP32-S3 board in via USB.
Pick the right board profile.
Configure your LLM provider and IM platform.
Flash. Done.

For unsupported boards or for ESP32-P4 (and friends), build locally from ./application/edge_agent/ using the standard ESP-IDF flow. The README points to the project docs for board adaptation specifics.

Supported LLMs and IMs

LLM providers - OpenAI-style and Anthropic-style APIs both work natively. Tested providers include OpenAI's GPT models, Alibaba Cloud Bailian's Qwen, Anthropic's Claude, and DeepSeek. Custom endpoints are supported.

The README's recommendation for the self-programming loop to actually work: a model with strong tool use and instruction following. They suggest gpt-5.4, qwen3.6-plus, claude4.6-sonnet, deepseek-v4-pro, or anything of comparable capability. Smaller models can drive the simpler event loops but won't reliably do the chat-as-creation Lua-modification path.

IM platforms - Telegram, QQ, Feishu, and WeChat are supported out of the box. The IM layer is extensible, so adding others is a known path.

What "Chat as Creation" actually means

The interesting bit. Most IoT-meets-LLM projects use the LLM to translate intent into pre-baked function calls. ESP-Claw goes further: when the user describes a behavior the device doesn't already implement, the agent generates new Lua to define it, and that Lua is loaded into the device runtime live - no firmware reflash, no compile cycle.

That's the part that's hard to do in C on an ESP32-class device with a few hundred KB of RAM. The Lua VM, the agent loop, the memory store, and the IM transport all have to fit in budget while still leaving headroom for whatever the user invents in chat.

Why this is interesting beyond IoT

It's a real-world demonstration that the "agent as runtime" pattern scales down to embedded targets. If it works on an ESP32-S3, the same architecture is viable on every embedded surface that has enough flash to hold a Lua VM and a network stack - which is most of them, at this point.

The companion projects worth knowing about:

OpenClaw - the original concept ESP-Claw inherits the vocabulary and design from.
MimiClaw - separate ESP32 implementation that ESP-Claw drew on for agent loop, IM, and related capabilities on-chip.

When to reach for it

You're building consumer IoT products where the LLM-as-controller experience is the differentiator.
You want a research platform for "agents on the edge" without building a firmware framework from scratch.
You're already in the Espressif ecosystem and want first-party agent tooling.

When not to

Devices that aren't ESP32 series. The implementation is C plus Espressif-specific drivers; porting is non-trivial.
Workloads where you need the agent to operate without a network. The LLM call still goes out to a cloud provider unless you front-end it with a local model gateway.
Anything where guaranteed latency or hard-real-time behavior matters. The agent loop fires fast in normal operation, but a model call is still a model call.

Trade-offs and rough edges

The project is under active development - the README warns that the demo is migrating from basic_demo to edge_agent, and PRs against the new path are preferred. Expect the API surface to keep moving.

There's a privacy story baked in (structured memory stays on-device), but the LLM call itself is whatever the configured provider sees. If you care about not exposing prompts to a cloud provider, point ESP-Claw at a self-hosted endpoint. The custom-endpoint support is exactly that escape hatch.

Espressif maintains it. License is in-repo (Apache-style based on the badge). For a category that's ~98% prototypes, having a chip vendor ship a real, supported framework with browser-flashing and chat-as-creation working out of the box is the part that makes this worth taking seriously.

Related entries

GitHub Library

Foundry - foundation layer for agentic intelligence

Python agent runtime and framework aimed at production agentic systems. Early but already has 800+ stars and a clear shape around runtime primitives.

#agent-framework #agent-runtime #ai-agent #python #multi-agent

GitHub Library

OQP - verification protocol for AI agents

MCP-compatible spec defining four endpoints (capabilities, workflows, execute, assess-risk) so agents can prove a shipped change satisfies business requirements before it goes live.

#mcp #agent-security #evals #verification #ai-agent

GitHub Library

LABE - legal action boundary eval

Public benchmark that tests an agent at the moment it's about to take a high-impact legal action. Same harness, baseline vs verified, measures unjustified action drops and goal-completion gains.

#evals #agent-security #ai-agent #benchmark

GitHub Library

a2a-java - Java SDK for Agent2Agent protocol

Official Java SDK implementing the Agent2Agent (A2A) protocol for inter-agent messaging and capability discovery. Provides client and server implementations for JVM agent stacks.

#multi-agent #java #protocol #ai-agent