SmolVM - one-command sandbox for Claude Code and Codex

A sandboxed VM with Claude Code and Codex pre-installed and ready to run, plus git credentials wired up. Removes the 'press enter to accept' loop without exposing the host.

Saved Apr 28, 2026

#claude-code #codex #sandbox #security #vm

From the wider web

How to Safely Run AI-Generated Code with SmolVM (Open-Source MicroVM Sandbox)
dev.to · Apr 21, 2026
GitHub - smol-machines/smolvm: Tool to build & run portable, lightweight, self-contained virtual machines.
github.com · Apr 18, 2026

Related entries

GitHub Library · Featured

AgentBox - SDK to run coding agents in any sandbox

One SDK to run Claude Code, Codex, or OpenCode inside Docker, E2B, Modal, Daytona, or Vercel sandboxes - boots each agent's native server (JSON-RPC, HTTP/SSE) instead of using non-interactive --print mode.

#claude-code #codex #opencode #sandbox #agent-security

GitHub Tool · Featured

wanman - worktree-isolated multi-agent runtime for Claude Code and Codex

Multi-agent runtime that spawns each Claude Code or Codex agent in its own git worktree and home directory. JSON-RPC subprocess control, task pooling, artifact storage. Solves the share-a-directory failure mode that breaks most multi-agent harnesses.

Why I saved this - The 'one-man train' framing is load-bearing: humans observe rather than approve every step. Worktree-per-agent isolation is the upgrade most multi-agent harnesses skip.

#claude-code #codex #multi-agent #orchestration #typescript

GitHub Tool · Featured

PostTrainBench - can a CLI agent post-train a base LLM in 10 hours?

Benchmark measuring whether Claude Code, Codex CLI, Gemini CLI, and OpenCode can autonomously improve 4 small base models (Qwen3-1.7B/4B, SmolLM3-3B, Gemma-3-4B) on 7 evals (AIME, BFCL, GPQA, GSM8K, HealthBench, HumanEval, Arena Hard) using a single H100 GPU within 10 hours. Includes agent-as-judge anti-reward-hacking checks and baseline-replacement penalties for tampering.

Why I saved this - Current leader: Opus 4.6 via Claude Code at 23.2 average. The reward-hacking safeguards (eval tampering and model-substitution detection, baseline-replacement penalty) are the part most agent benchmarks skip.

#evals #claude-code #codex #gemini-cli #opencode

GitHub Tool · Featured

pentest-ai-agents - Claude Code subagents for offensive security

Specialized Claude Code subagents that turn the CLI into a pentest assistant: plan engagements, analyze recon, research exploits, build detections, audit STIGs, and write reports.

Why I saved this - Authorized-use scope is explicit in the README - it is a research harness, not a 'jailbreak the agent' kit.

#claude-code #security #pentest #ctf #ai-agent