Collection · 52 entries
Security tools for AI coding agents
Sandboxes, scanners, proxies, and governance toolkits that keep autonomous agents from doing damage.
The 'agent security' problem is really three problems stacked: input (prompt injection, untrusted data crossing the agent loop), execution (what the agent can run, where, with what permissions), and output (data leaving the system through tool calls). The tools below tackle different layers: Destructive Command Guard and Zerobox at execution, AgentShield and the Microsoft Agent Governance Toolkit at config and policy, CrabTrap and LLM Anonymization at the output / network boundary.
pentest-ai-agents - Claude Code subagents for offensive security
Specialized Claude Code subagents that turn the CLI into a pentest assistant: plan engagements, analyze recon, research exploits, build detections, audit STIGs, and write reports.
SmolVM - one-command sandbox for Claude Code and Codex
Pre-installed sandboxed VM with Claude and Codex ready to run, plus git credentials wired up. Removes the 'press enter to accept' loop without exposing the host.
ThinkWatch - enterprise AI and MCP bastion host
Rust gateway in front of OpenAI, Anthropic, Gemini, and self-hosted LLMs (plus MCP servers) with RBAC, audit logs, rate limits, and cost tracking. The boring layer enterprises actually need.
Security-Detections-MCP - detection engineering over MCP
MCP server tailored for defenders - exposes detection-engineering primitives so agents can author, refactor, and validate SIEM/EDR detections.
skill-doctor - inspector for coding-agent skills
Local tool that audits installed agent skills for conflicts, precedence issues, and risk. Helps surface why a particular skill is (or isn't) firing.
OQP - verification protocol for AI agents
MCP-compatible spec defining four endpoints (capabilities, workflows, execute, assess-risk) so agents can prove a shipped change satisfies business requirements before it goes live.
Kontext CLI - credential broker for AI coding agents
Go CLI that brokers GitHub, Stripe, and database credentials to coding agents per-session with audit trails, replacing copy-pasted .env keys with scoped tokens.
smolvm - portable lightweight VMs in a single file
Open-source CLI for sub-second VMs on macOS (Hypervisor.framework) and Linux (KVM) via libkrun. Sandboxes untrusted code with hardware isolation and packs stateful environments into a single .smolmachine file.
mcp-shark - Wireshark for Model Context Protocol
Electron capture and inspection tool for Model Context Protocol traffic. Records every HTTP request/response between an IDE and its MCP servers for forensic analysis.
AgentBox - SDK to run coding agents in any sandbox
One SDK to run Claude Code, Codex, or OpenCode inside Docker, E2B, Modal, Daytona, or Vercel sandboxes - boots each agent's native server (JSON-RPC, HTTP/SSE) instead of using non-interactive --print mode.
LABE - legal action boundary eval
Public benchmark that tests an agent at the moment it's about to take a high-impact legal action. Runs the same harness in baseline and verified modes, then measures the drop in unjustified actions and the gain in goal completion.
CubeSandbox - sub-60ms self-hosted E2B alternative
Open-source sandbox runtime for LLM-generated code built on RustVMM and KVM. Targets sub-60ms cold starts with full kernel isolation, designed as a self-hostable replacement for closed E2B-style services.
mcp-shodan - Shodan MCP server for AI agents
MCP server exposing Shodan APIs for IP reconnaissance, DNS lookups, and CVE/CPE vulnerability intelligence. Plugs into Claude Code, Codex, Gemini CLI, and Claude Desktop.
ccmd - TUI to audit and clean developer caches
Rust terminal UI for exploring cache directories on macOS and Linux. Scans cached packages for known CVEs, finds outdated deps, and reclaims disk space.
claude-code-organizer - dashboard for CC configs
npx dashboard to manage Claude Code memories, configs, and MCP servers. Includes a tool-poisoning scanner, context token budget tracker, duplicate cleanup, and scope management.
lilith-zero - Rust security runtime for MCP
Transport-layer security middleware for LLM agent systems that enforces deterministic policies to mitigate data exfiltration and unauthorized tool calls. OS, language, and framework agnostic.
sandboxed.sh - self-hosted agent sandbox orchestrator
Self-hosted Rust orchestrator that runs Claude Code and OpenCode inside isolated Linux workspaces, with skills, configs, and encrypted secrets stored in a git repo.
code-on-incus - per-agent isolated VMs with active defense
Gives each AI agent its own Incus machine with root, Docker, and systemd. Built-in detector stops threats automatically when an agent goes off-script.
sandstorm - run Claude agents in cloud sandboxes
FastAPI service for running Claude Code agents in secure E2B cloud sandboxes via API, CLI, or Slack. Single call, full agent, no infrastructure.
pipelock - MCP firewall for AI agents
Go-based agent firewall that controls egress from MCP servers, blocking SSRF, DLP leaks, and prompt-injection vectors at the network layer. Acts as a fetch proxy for tool calls.
Agentjail - self-hosted Linux sandbox for AI code
Minimal Linux sandbox for running untrusted code, designed for AI agents and build systems. Self-hosted alternative to Freestyle.sh-style code execution services.
Mulder - MCP server for digital forensics
Containerized MCP server that exposes Volatility, Sleuthkit, Plaso, and other forensic tools as typed tool calls, with append-only audit log and citation-backed findings.
Aegis - runtime policy enforcement for AI agents
TypeScript policy engine that wraps agent execution with cryptographic audit trails, human-in-the-loop approvals, and a kill switch, with no code changes to the agent itself.
Trace-core - linter for AI-written failure patterns
Static analyzer that catches the 24 specific failure patterns characteristic of AI-generated code, like phantom imports and almost-right API usage.
forgemax - sandboxed local MCP gateway
Rust MCP gateway that collapses N servers and M tools into two tools by following the Code Mode pattern, cutting tool-list overhead to roughly 1,000 tokens.
agent-vault - credential proxy for AI agents
Infisical's HTTP credential proxy and vault that brokers secrets to AI agents without ever exposing them in the prompt or environment.
skylos - PR gate for AI-generated code
CLI that gates pull requests by detecting dead code, leaked secrets, and AI-code regressions across Python, TS/JS, Java, and Go. Designed to catch the failure modes of AI-generated PRs.
Arcjet JS - AI security building blocks for Node
JS/TS SDK for runtime AI security: prompt-injection defense, bot blocking, rate limits, and budget protection wired into Next.js, Bun, and Node servers. Aimed at apps where agents call your tools.
RedAI - validate vulnerabilities in live targets
Security agent that runs scanner agents to surface candidate vulnerabilities, then has validator agents reproduce each one against a running instance. Outputs only confirmed exploitable findings.
cloudlist - cloud asset inventory CLI
ProjectDiscovery CLI that lists assets across multiple cloud providers from one config. Useful as a recon and inventory step in security workflows.
pinact - pin GitHub Actions to commit SHAs
CLI that edits GitHub workflow and composite action files to pin every action and reusable workflow to a commit SHA, with version annotations and update support. Plugs the supply-chain hole left by tag-only references.
proxelar - programmable MITM proxy in Rust
Rust MITM proxy that intercepts HTTP/HTTPS in forward and reverse modes with TLS interception, a TUI, and a web GUI. Useful for inspecting what agents and apps are actually sending over the wire.
qtap - eBPF agent for pre-encrypted egress
eBPF agent that captures pre-encrypted network traffic from containers and processes, attributing every egress to its originating process. Aimed at observability and exfiltration detection in agent runtimes.
stereOS - hardened Linux for AI agents
Nix-based Linux distribution purpose-built for running AI agents. Hardened defaults and an immutable base aimed at sandboxing autonomous coding agents.
routiium - LLM gateway with tool-result guard
Self-hosted, OpenAI-compatible LLM gateway that guards not just user prompts but tool return values - the harder injection surface in agent loops with web fetch, MCP, and shell tools.
gpg-tui - terminal UI for GnuPG keys
Rust TUI that wraps gpg key management: list, generate, sign, export, and edit trust without memorizing flag combinations.
jwt-cli - fast CLI to decode and encode JWTs
Single Rust binary that decodes, encodes, and validates JWTs from the command line. Standard tool when debugging auth flows.
Nexus - governance gateway for LLM and MCP traffic
Rust gateway that fronts LLMs and MCP servers with policy enforcement and observability. Aimed at securing agent traffic in larger deployments.
vibe-check-mcp - mentor feedback for agents
MCP server that gives agents mid-task mentor feedback to break tunnel-vision, over-engineering, and reasoning lock-in. Targets long-horizon coding workflows.
cordum - agent control plane with policy gates
Open agent control plane in Go that enforces pre-execution policy, approval gates, and audit trails over LangChain, CrewAI, MCP, or any framework.
secure-exec - npm-compatible Node sandboxing
Lightweight library for sandboxing Node.js code execution from agents without containers or VMs, using runtime isolation. Built for code interpreter use cases.
onecli - credential vault for AI agents
Open-source credential vault that gives AI agents access to third-party services without exposing raw API keys. MCP integration for Claude Code and similar agents.
DeepZero - automated kernel driver vuln research
Vulnerability research framework that parses, decompiles, and analyzes Windows kernel drivers for exploitable IOCTLs using AI agents. Sleep through fuzzing campaigns.
vulnhawk - AI-powered SAST scanner
Static analysis scanner that finds auth bypass, IDOR, and business logic bugs that Semgrep and CodeQL miss. Ships as a free GitHub Action covering Python, JS/TS, Go, PHP, and Ruby.
lunar - agent-native MCP gateway
MCP gateway focused on governance and security: policy enforcement, request inspection, and rate-limiting between agents and MCP servers. Sits between the model and the tool surface.
Keeper - embeddable secret store for Go
Embeddable Go secret store using Argon2id and XChaCha20-Poly1305 by default, with four security levels, audit chains, and crash-safe rotation. Vault when Vault is overkill.
AgentShield - security scanner for AI agents
CLI, GitHub Action, and GitHub App that scan agent configs, MCP servers, and tool permissions for vulnerabilities. Detects skill poisoning and prompt-injection vectors.
LLM Anonymization - pentest data scrubber
Reverse proxy for Claude Code that strips IPs, hashes, credentials, and PII before requests hit Anthropic. Dual-layer detection: local Ollama LLM plus regex.
CrabTrap - LLM-as-a-judge proxy for agent security
Brex's HTTP proxy that uses an LLM judge to vet agent traffic in production. Drop it in front of any agent runtime to block exfiltration and jailbreaks.
Agent Governance Toolkit
Microsoft's policy engine for autonomous agents: zero-trust identity, execution sandboxing, and reliability checks. Maps to all ten categories of the OWASP Agentic Top 10.
Destructive Command Guard
Rust CLI that blocks dangerous git and shell commands before an agent can run them. Pattern-matched safety net for autonomous coding agents.
Zerobox - process sandboxing on the Codex runtime
Lightweight, cross-platform process sandbox in Rust. Wraps any command with file, network, and credential controls - built on OpenAI Codex's runtime primitives.
Frequently asked
Where should I start with agent security?
If you're running a coding agent locally, Destructive Command Guard or Zerobox give you immediate execution-layer guardrails. If you're shipping an agent product, look at CrabTrap (LLM-as-a-judge proxy) and the Microsoft Agent Governance Toolkit for the policy layer.
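To make "execution-layer guardrail" concrete, here is a minimal Python sketch of the pattern-matching idea: check each command against a deny-list before the agent is allowed to run it. It is an illustration only, not how Destructive Command Guard or Zerobox is implemented. The DENY_PATTERNS list and the check_command helper are hypothetical names, and a real guard would need a far larger, curated rule set.

```python
import re
from typing import Optional, Tuple

# Hypothetical deny-list for illustration; not taken from any tool listed here.
DENY_PATTERNS = [
    # rm with a flag group containing both "r" and "f" (e.g. -rf, -fr)
    ("recursive force delete", re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+")),
    ("git hard reset", re.compile(r"\bgit\s+reset\s+--hard\b")),
    ("git force push", re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)")),
    ("git clean", re.compile(r"\bgit\s+clean\s+-\w*f")),
]


def check_command(command: str) -> Tuple[bool, Optional[str]]:
    """Return (allowed, reason). reason names the first rule that matched."""
    for label, pattern in DENY_PATTERNS:
        if pattern.search(command):
            return False, label
    return True, None


if __name__ == "__main__":
    for cmd in ["git status", "rm -rf /tmp/build", "git push -f"]:
        allowed, reason = check_command(cmd)
        print(cmd, "->", "allowed" if allowed else f"blocked ({reason})")
```

Pattern matching like this is a safety net rather than a boundary: an agent can reach the same effect through a script or an alias, which is why the sandbox entries in this collection also restrict file, network, and credential access.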
How do these compare to OWASP's Agentic Top 10?
The Microsoft Agent Governance Toolkit explicitly maps to all 10 OWASP Agentic categories. Most other tools here cover specific risks - sandboxing for excessive agency, anonymization for sensitive data exposure, MCP scanners for supply chain compromise.
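As a rough illustration of "anonymization for sensitive data exposure", the sketch below replaces a few sensitive token types with placeholders before text leaves the machine. It covers only a regex pass, under the assumption that three simple rules are enough for a demo. The RULES list and the scrub function are hypothetical and do not reflect the actual rules or API of LLM Anonymization or any other tool in this collection.

```python
import re

# Hypothetical rules for illustration; real scrubbers need broader coverage.
RULES = [
    ("IPV4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("SHA256", re.compile(r"\b[a-fA-F0-9]{64}\b")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
]


def scrub(text: str) -> str:
    """Replace every match of each rule with a bracketed label."""
    for label, pattern in RULES:
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(scrub("host 10.0.0.5 owner alice@example.com"))
```

Regexes only catch values with a predictable shape. Free-form PII such as names or street addresses needs a second detection layer, which is the role the local LLM plays in the dual-layer design described in the LLM Anonymization entry.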
Related collections
Claude Code tools, plugins, and integrations
The best tools, MCP servers, and harnesses for getting more out of Claude Code - orchestration, observability, telemetry, and remote control.
MCP servers and Model Context Protocol tools
Production MCP servers, gateways, frameworks, and clients - everything in this directory that speaks the Model Context Protocol.