Sub-Millisecond VM Sandboxes for AI Agents, 200x Faster
Zeroboot uses copy-on-write VM forking to spin up isolated sandboxes in 0.79ms with just 265KB RAM — 200x faster and 500x lighter than alternatives.
What Is Zeroboot?
Zeroboot is an open-source Rust project that creates isolated virtual machine sandboxes for AI agent code execution in under 1 millisecond. Instead of booting a new VM from scratch every time an AI agent needs to run code, Zeroboot snapshots a fully-loaded template VM once, then forks it using copy-on-write memory mapping — the same principle behind Unix fork(), but applied to entire KVM virtual machines.
The result: each sandbox is a real hardware-isolated VM (not a container), yet it spawns in 0.79ms and uses only ~265KB of resident memory. That's roughly 200x faster and 500x more memory-efficient than existing solutions.
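The fork() analogy is easy to see in miniature at the process level, where the kernel applies the same copy-on-write rule Zeroboot extends to whole KVM guests. A minimal sketch (standard Unix fork, not Zeroboot's code):

```python
import os

# A parent process stands in for the warm "template": state is loaded once.
data = bytearray(b"template state")

pid = os.fork()
if pid == 0:
    # Child ("fork"): this write triggers copy-on-write, so only the
    # touched pages are duplicated; everything else stays shared.
    data[0:8] = b"private "
    os._exit(0)

os.waitpid(pid, 0)
# The parent's copy is untouched by the child's write.
print(bytes(data))  # b'template state'
```

Zeroboot's insight is that the same page-level sharing works for a guest's entire memory image, not just a process address space.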
Why This Matters
AI agents increasingly need to execute arbitrary code — running data analysis, calling APIs, installing packages, or testing generated scripts. The standard approach is to spin up a container or lightweight VM for each execution. But when an AI agent is making dozens of tool calls per conversation, even 150ms of startup latency adds up fast, and 128MB of RAM per sandbox limits how many you can run concurrently.
Zeroboot changes the math entirely. With sub-millisecond spawn times and kilobyte-level memory overhead, you can realistically run thousands of concurrent sandboxes on a single machine. The benchmarks show 1,000 forks completing in just 815ms.
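A back-of-envelope check of that math, using only the figures quoted above:

```python
# Published figures: 1,000 forks in 815ms, ~265KB resident per sandbox
forks = 1000
batch_ms = 815
mem_kb = 265

per_fork_ms = batch_ms / forks    # amortized cost per fork
fleet_mb = forks * mem_kb / 1024  # RAM for 1,000 live sandboxes

print(f"{per_fork_ms:.3f}ms per fork, ~{fleet_mb:.0f}MB for the whole fleet")
# 0.815ms per fork, ~259MB for the whole fleet
```

In other words, a thousand live sandboxes fit in roughly the memory budget of two sandboxes from a 128MB-per-instance alternative.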
Benchmarks: How It Compares
| Metric | Zeroboot | E2B | Microsandbox | Daytona |
|---|---|---|---|---|
| Spawn p50 | 0.79ms | ~150ms | ~200ms | ~27ms |
| Spawn p99 | 1.74ms | ~300ms | ~400ms | ~90ms |
| Memory per instance | ~265KB | ~128MB | ~50MB | ~50MB |
| Fork + Execute (Python) | ~8ms | — | — | — |
| 1000 concurrent forks | 815ms | — | — | — |
How It Works: Three Steps
1. Template Creation (one-time, ~15 seconds)
Firecracker boots a micro-VM with your chosen runtime — Python with numpy/pandas, Node.js, or whatever you need. Once the runtime is warm and modules are loaded, Zeroboot captures the entire memory and CPU state as a snapshot.
2. Fork (~0.8ms)
When a sandbox is needed, Zeroboot creates a new KVM virtual machine and maps the template's memory snapshot using mmap(MAP_PRIVATE). This is the copy-on-write magic: reads hit the shared snapshot file, while writes trigger per-fork page faults that allocate fresh pages only for the data that changes. CPU state — segment registers, XSAVE state, LAPIC, MSRs — is restored in a precise sequence.
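The MAP_PRIVATE behavior at the heart of step 2 can be demonstrated with an ordinary file standing in for the memory snapshot. This is a sketch of the principle only, not Zeroboot's restore path:

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE

# A file plays the role of the template's memory snapshot.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"A" * PAGE)
    snap = f.name

fd = os.open(snap, os.O_RDONLY)
# Two private mappings of the same snapshot: reads share the page cache,
# while a write faults in a private copy of just the dirtied page.
fork1 = mmap.mmap(fd, PAGE, flags=mmap.MAP_PRIVATE,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
fork2 = mmap.mmap(fd, PAGE, flags=mmap.MAP_PRIVATE,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)

fork1[0] = ord("B")  # copy-on-write: only fork1's mapping changes
print(chr(fork1[0]), chr(fork2[0]))  # B A
```

Neither write ever reaches the snapshot file, which is why one template can back thousands of forks while each fork pays only for the pages it dirties.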
3. Isolation (hardware-enforced)
Each fork runs as a separate KVM virtual machine with Intel VT-x/AMD-V hardware isolation. This isn't container-level separation — it's the same isolation boundary used by cloud providers between tenants.
Try It Right Now
Zeroboot provides a demo API key so you can test immediately:
curl -X POST https://api.zeroboot.dev/v1/exec \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer zb_demo_hn2026' \
-d '{"code":"import numpy as np; print(np.random.rand(3))"}'
Or use the Python SDK:
from zeroboot import Sandbox
sb = Sandbox("zb_live_your_key")
result = sb.run("print(1 + 1)")
print(result.stdout) # "2"
TypeScript is also supported:
import { Sandbox } from "@zeroboot/sdk";
const result = await new Sandbox("zb_live_your_key").run("console.log(1+1)");
console.log(result.stdout); // "2"
API Endpoints
The API surface is minimal and well-designed:
- POST /v1/exec — Execute code in a freshly forked sandbox. Supports `python`, `node`, and `javascript` runtimes with configurable timeouts.
- POST /v1/exec/batch — Run multiple code snippets in parallel, each in its own isolated VM.
- GET /v1/health — Template status and readiness check.
- GET /v1/metrics — Prometheus-format performance metrics.
Every response includes detailed timing: `fork_time_ms`, `exec_time_ms`, and `total_time_ms`, so you can see exactly where time is spent.
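Those timing fields make it easy to separate fork overhead from actual execution. A quick sketch (the field values here are illustrative, not a real response):

```python
import json

# Illustrative response body; only the three timing fields are documented.
body = '{"stdout": "2\\n", "fork_time_ms": 0.8, "exec_time_ms": 7.1, "total_time_ms": 8.2}'
resp = json.loads(body)

# Everything that isn't guest execution is sandbox overhead.
overhead_ms = resp["total_time_ms"] - resp["exec_time_ms"]
print(f"sandbox overhead: {overhead_ms:.1f}ms of {resp['total_time_ms']}ms total")
```

Tracking that difference over time is a cheap way to notice template or host regressions.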
Under the Hood: Clever Engineering
Several implementation details stand out in the architecture documentation:
- Vmstate parsing: Firecracker's binary snapshot format has variable-length sections with version-dependent offsets. Instead of hardcoding offsets, Zeroboot uses the IOAPIC base address (`0xFEC00000`) as an anchor point to auto-detect field positions.
- Entropy seeding: Guest VMs need entropy, but `getrandom()` blocks in Firecracker VMs until the CRNG is initialized. Zeroboot injects entropy via the `RNDADDENTROPY` ioctl.
- CPU feature masking: Firecracker filters CPUID, which can cause numpy to crash with SIGILL. Zeroboot disables runtime CPU detection with `NPY_DISABLE_CPU_FEATURES`.
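The payload for the `RNDADDENTROPY` ioctl is a Linux `struct rand_pool_info`: an entropy credit in bits, a buffer size in bytes, then the seed itself. Packing it looks like this (a sketch of the struct layout only; the actual ioctl needs write access to `/dev/random` and CAP_SYS_ADMIN, so it is not invoked here):

```python
import os
import struct

seed = os.urandom(64)
# struct rand_pool_info { int entropy_count; int buf_size; __u32 buf[]; }
payload = struct.pack("ii", len(seed) * 8, len(seed)) + seed

# Inside the guest, this payload would be passed to
#   ioctl(fd_of_dev_random, RNDADDENTROPY, payload)
# which both mixes the seed in and credits the entropy count,
# unblocking getrandom() without waiting for the CRNG to self-initialize.
print(len(payload))  # 8-byte header + 64-byte seed = 72
```

Crediting the entropy count is the key detail: writing to `/dev/random` alone mixes in data but does not credit entropy, so `getrandom()` would keep blocking.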
Who Should Care
- AI agent builders: If your agents execute code (tool use, data analysis, code generation), this is the fastest isolation primitive available.
- Platform engineers: Running code execution as a service? 265KB per sandbox means dramatically lower infrastructure costs.
- Security-conscious teams: Hardware-level VM isolation is a stronger boundary than containers, with negligible performance penalty.
Current Status
Zeroboot is a working prototype — the fork primitive, benchmarks, and API are real and validated, but the project is not yet production-hardened. It's written in Rust (80.7% of the codebase), licensed under Apache 2.0, and actively accepting contributions. Having already earned 443 stars since its March 15 release, it's clearly resonating with the developer community.
The GitHub repo includes full source code, SDKs for Python and TypeScript, deployment guides, and architecture documentation.