Jan 29, 2026

Firecracker, gVisor, Containers, and WebAssembly – Comparing Isolation Technologies for AI Agents

AUTHOR

James A. Wondrasek

You’re building AI agents that execute code, which puts you squarely inside the AI agent sandboxing problem. Luis Cardoso’s field guide breaks down four ways to isolate that code. Each approach has tradeoffs. Pick the wrong one and you’re either shipping security holes or tanking performance.

The decision matters. E2B chose Firecracker and gets 125ms cold starts. Modal chose gVisor. Understanding why they chose differently will save you from making expensive mistakes.

This article walks through hardware virtualisation, userspace kernels, shared kernels, and runtime sandboxes. By the end you’ll have a decision framework that maps your threat model to the right technology.

Why Does Isolation Technology Choice Matter for AI Agent Security?

Your choice of isolation tech sets your security boundary. Hardware virtualisation like Firecracker and Kata Containers stops kernel exploits dead. Userspace kernels like gVisor shrink the syscall attack surface. Shared kernels—that’s regular containers—leave every tenant vulnerable to kernel bugs. Runtime sandboxes like WebAssembly lock down capabilities through WASI interfaces.

Here’s why this matters: the Linux kernel gets hit with 300+ CVEs annually. When you’re running containers, one kernel compromise hits every container on that host. This isn’t theoretical—CVE-2019-5736 demonstrated exactly this kind of runc escape in live production systems.

AI agents run arbitrary code from LLMs. An adversarial prompt could generate malicious code specifically designed to exploit these vulnerabilities. That’s why AI agents executing arbitrary code require isolation boundaries way stronger than what standard containers give you.

The security hierarchy is simple: hardware virtualisation at the top, userspace kernels next, then shared kernels, then nothing at the bottom. Stronger isolation costs you cold start time and memory overhead. That’s the tradeoff you’ll encounter when understanding the sandboxing problem for production deployment.

How Does Hardware Virtualisation Work in Firecracker and Kata Containers?

Hardware virtualisation uses CPU extensions—Intel VT-x and AMD-V—to carve out isolated kernel spaces. Firecracker launches lightweight VMs with dedicated kernels in 125ms using KVM. Kata Containers wraps OCI containers in VMs, using Firecracker, QEMU, or Cloud Hypervisor as the VMM backend.

KVM is the Linux kernel module that makes hardware virtualisation work. MicroVMs throw away everything you don’t need—no graphics, no USB, no sound. Firecracker supports only three things: VirtIO network, VirtIO block storage, and serial console. That minimal attack surface is the whole point.

Firecracker’s binary is only 3MB. It gives you VM-grade isolation at container-like speeds. Memory overhead is 5MB per microVM instead of gigabytes for traditional VMs.
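To make that control surface concrete, here is a minimal sketch of driving Firecracker's REST API over its unix socket. It assumes a firecracker process is already running with --api-sock /tmp/firecracker.socket, and vmlinux.bin and rootfs.ext4 are placeholder paths for your own guest kernel and root filesystem.

```python
import json
import socket

SOCKET = "/tmp/firecracker.socket"  # firecracker --api-sock /tmp/firecracker.socket

def api_put(path: str, body: dict) -> str:
    """PUT a JSON body to Firecracker's REST API over its unix socket."""
    payload = json.dumps(body)
    request = (
        f"PUT {path} HTTP/1.1\r\n"
        "Host: localhost\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n\r\n"
        f"{payload}"
    )
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(SOCKET)
        s.sendall(request.encode())
        return s.recv(4096).decode()  # expect "HTTP/1.1 204 No Content" on success

# Point the microVM at a guest kernel and root filesystem, then boot it.
api_put("/boot-source", {"kernel_image_path": "vmlinux.bin",
                         "boot_args": "console=ttyS0 reboot=k panic=1"})
api_put("/drives/rootfs", {"drive_id": "rootfs", "path_on_host": "rootfs.ext4",
                           "is_root_device": True, "is_read_only": False})
api_put("/actions", {"action_type": "InstanceStart"})
```

Three PUT requests and the microVM is booting. There is no device negotiation because there are almost no devices, which is exactly why the attack surface stays small.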

Kata Containers takes a different approach. It’s OCI-compliant, which means each container runs in its own VM while keeping your Docker and Kubernetes workflows intact. When you start a Kata container, it spins up a lightweight VM using trimmed QEMU or Cloud Hypervisor, boots a minimal Linux guest kernel, and runs kata-agent to handle container runtime instructions.

Each MicroVM has its own virtual CPU and memory. Even if someone compromises a guest, they can’t touch the host or other VMs.

Cold start performance tells you everything: Firecracker takes 125ms, traditional VMs take 1-2 seconds, containers take around 50ms. But then there’s snapshot restoration—preload your kernel and filesystem, then restore in a few milliseconds for near-instant scaling. These performance characteristics determine which technology fits your latency budgets.

AWS Lambda uses Firecracker to run thousands of microVMs per host. That’s production-scale proof right there.

What Is a Userspace Kernel and How Does gVisor Provide Isolation?

A userspace kernel intercepts your application’s system calls and reimplements kernel functionality in userspace. gVisor’s core is Sentry, written in Go, which grabs syscalls via ptrace or KVM platforms before they hit the host kernel. This shrinks the kernel attack surface by exposing only a minimal subset of syscalls.

Think of it as a software firewall between applications and the kernel. Sentry reimplements Linux system calls in Go, simulating how they behave in user space. Only when absolutely necessary does Sentry make a tightly controlled syscall to the host kernel.

The security model is different from hardware virtualisation. If malicious code compromises a gVisor process, it doesn’t compromise the host kernel—you’re still at process-level isolation. With gVisor, the attack surface is much smaller: that malicious code has to exploit gVisor’s userspace implementation first.

gVisor gives you two modes: ptrace for debugging and compatibility, KVM for production performance. The tradeoff is performance—syscall overhead runs 20-50% slower than native containers for I/O-heavy workloads.

Compatibility is the other catch. Sentry implements about 70-80% of Linux syscalls. Applications needing special or low-level system calls—advanced ioctl usage, eBPF—will hit unsupported errors. You’re not running systemd or Docker-in-Docker with gVisor.

But gVisor is OCI-compatible. It works with standard container images as a drop-in replacement for runc. Modal’s gVisor implementation provides ML and AI workload isolation with network filesystem support.
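To see what drop-in replacement looks like in practice, here is a hedged sketch using the Docker Python SDK. It assumes runsc has already been registered as a runtime with the Docker daemon; the image and command are arbitrary examples.

```python
import docker  # pip install docker

client = docker.from_env()

# Run untrusted code under gVisor instead of runc. Assumes runsc is
# registered as a runtime in /etc/docker/daemon.json on this host.
output = client.containers.run(
    "python:3.12-slim",
    ["python", "-c", "print('hello from inside gVisor')"],
    runtime="runsc",        # gVisor's OCI runtime
    network_disabled=True,  # untrusted code gets no network
    remove=True,
)
print(output.decode())
```

The only change from a plain Docker run is the runtime parameter. That is what makes gradual migration practical.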

Why Are Standard Containers Insufficient for Hostile AI-Generated Code?

Containers share the host kernel. Any kernel vulnerability allows container escape that affects all tenants. That makes shared-kernel architecture a non-starter for executing untrusted AI-generated code.

Containers isolate workloads with Linux namespaces and cgroups, but they all share the host kernel, so the vulnerabilities we mentioned earlier hit every tenant at once. CVE-2022-0492 was a cgroup escape. These aren’t edge cases.

LLM-generated code could be adversarially crafted to exploit known kernel vulnerabilities. In multi-tenant environments, one compromised container can pivot to the host, then attack other tenants.

Docker achieves speed because it doesn’t boot a separate kernel. Containers start in milliseconds with minimal memory use. That’s the tradeoff—speed for security.

seccomp system call filtering helps. It reduces the kernel attack surface but doesn’t eliminate it. Same with AppArmor and SELinux—they give you mandatory access control but don’t prevent kernel exploits.
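As an illustration of the mechanism, here is a sketch using libseccomp's Python bindings. The allowlist is deliberately tiny and illustrative (a real Python process needs a much longer one), but it shows the default-deny posture that reduces kernel attack surface.

```python
import seccomp  # libseccomp's Python bindings

# Default-deny: any syscall outside the allowlist kills the process.
f = seccomp.SyscallFilter(defaction=seccomp.KILL)
for name in ("read", "write", "brk", "mmap", "munmap",
             "exit", "exit_group", "rt_sigreturn"):
    f.add_rule(seccomp.ALLOW, name)
f.load()  # the filter applies to this process from here on
# Note: a real interpreter needs a much longer allowlist than this sketch.
```

Even a tight filter like this only shrinks the set of kernel entry points an exploit can reach. It does not fix the kernel bugs behind them.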

Containers are fine for internal tools, trusted code, and non-production environments. But production environments running AI agents should use isolation stronger than containers.

Where Does WebAssembly Fit in the Isolation Landscape?

WebAssembly provides runtime-based isolation through capability-based security. WASM modules execute in sandboxed environments with no default system access. They need explicit capabilities through WASI interfaces—read this file, open that socket.

The comparison matters: a WebAssembly instance emulates a process while a container emulates a private operating system. WASM has no ambient authority. You explicitly grant each capability.
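Here is a short sketch of that capability grant using wasmtime's Python bindings. The module can only see the one pre-opened directory, mapped to /data inside the sandbox; agent_tool.wasm is a placeholder for any WASI-targeted module.

```python
from wasmtime import Engine, Linker, Module, Store, WasiConfig  # pip install wasmtime

engine = Engine()
linker = Linker(engine)
linker.define_wasi()  # wire up the WASI host functions

store = Store(engine)
wasi = WasiConfig()
wasi.preopen_dir("./sandbox", "/data")  # the ONLY filesystem the module can see
wasi.inherit_stdout()
store.set_wasi(wasi)

# agent_tool.wasm is a placeholder for any WASI-targeted module.
module = Module.from_file(engine, "agent_tool.wasm")
instance = linker.instantiate(store, module)
instance.exports(store)["_start"](store)  # run the module's entry point
```

Delete the preopen_dir line and the module has no filesystem at all. That is what no ambient authority means in practice.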

Performance characteristics are different too. WebAssembly runtime performance via AOT is within 10% of native. Cold start is microseconds. Disk footprint is several MBs compared to several GBs for containers.

Cross-platform portability is high—WASM works across CPUs while containers aren’t portable across architectures. Security-wise, WASM gives you capability-based security with sandbox and protected memory. Containers depend on host OS user privilege.

WASM has important limitations for AI agents though. No persistent filesystem. Limited syscall support. Requires application rewrite.

Use cases are specific: stateless functions, edge computing like Cloudflare Workers, portable sandboxed execution. Not suitable for AI workloads needing persistent filesystem and full OS integration.

WebAssembly runtimes include WasmEdge, wasmtime, and V8 isolates. If your workload needs OS integration, use containers, microVMs, or gVisor. If you need portability-focused stateless functions, WASM makes sense.

How Do Cold Start Times Compare Across Isolation Technologies?

Containers start in around 50ms. Firecracker microVMs start in 125ms. gVisor-wrapped containers add 20-50% overhead. Traditional VMs take 1-2 seconds. WebAssembly starts in microseconds.

For high-throughput serverless workloads processing thousands of requests per second, these millisecond differences add up fast. At 1000 req/sec with a 100ms latency budget, 50ms overhead eats half your budget. Understanding cold start time implications helps you set realistic performance targets.

Containers baseline at 50ms cold start because they share the host kernel—no kernel boot overhead. Firecracker averages 125ms, including minimal kernel boot and device initialisation.

But snapshot restoration changes this calculation. Firecracker can snapshot a booted microVM and restore it in a few milliseconds for hot-path scaling. That’s how you get near-instant scaling when you need it.
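A rough sketch of that snapshot workflow against Firecracker's API, again speaking raw HTTP over the unix socket. Socket and file paths are placeholders, and the mem_backend request shape assumes a recent Firecracker release.

```python
import json
import socket

def api(sock_path: str, method: str, path: str, body: dict) -> str:
    """Minimal HTTP call to a Firecracker API server over a unix socket."""
    payload = json.dumps(body)
    request = (
        f"{method} {path} HTTP/1.1\r\n"
        "Host: localhost\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n\r\n"
        f"{payload}"
    )
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall(request.encode())
        return s.recv(4096).decode()

# On the warm microVM: pause it and write a full snapshot to disk.
api("/tmp/fc-warm.socket", "PATCH", "/vm", {"state": "Paused"})
api("/tmp/fc-warm.socket", "PUT", "/snapshot/create",
    {"snapshot_type": "Full", "snapshot_path": "vm.snap", "mem_file_path": "vm.mem"})

# On a freshly launched firecracker process: restore and resume immediately.
api("/tmp/fc-clone.socket", "PUT", "/snapshot/load",
    {"snapshot_path": "vm.snap",
     "mem_backend": {"backend_type": "File", "backend_path": "vm.mem"},
     "resume_vm": True})
```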

gVisor overhead is 20-50% slower than native containers due to syscall interception. Traditional VMs need 1-2 seconds for full kernel boot and device emulation. WebAssembly has no OS overhead—runtime-only means microsecond instantiation.

Beyond cold start times, memory footprint also affects how densely you can pack deployments. Firecracker uses 5MB. gVisor uses around 30MB. Containers use roughly 10MB. Traditional VMs use hundreds of MB.
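A back-of-envelope density calculation from those overheads. The 64GB host is hypothetical, and the 512MB figure for traditional VMs is our stand-in for "hundreds of MB":

```python
# Isolation overhead only; real density is bounded by workload memory too.
HOST_RAM_MB = 64 * 1024  # hypothetical 64GB host
overhead_mb = {"firecracker": 5, "container": 10, "gvisor": 30, "traditional_vm": 512}

for tech, mb in overhead_mb.items():
    print(f"{tech}: up to {HOST_RAM_MB // mb:,} instances per host")
```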

AWS Lambda enables sub-second cold starts at massive scale—thousands of concurrent invocations. That’s Firecracker in production.

Which Isolation Technology Should I Choose for My Threat Model?

Map your threat level to the technology. Low-threat internal tools? Use containers. Medium-threat multi-tenant SaaS? Use gVisor. High-threat untrusted code execution? Use Firecracker or Kata. Portability-focused stateless functions? Use WebAssembly.

The decision framework starts with your threat model. Define your adversary—internal dev vs external user vs hostile AI. Assess attack sophistication—opportunistic vs targeted. Evaluate data sensitivity—public vs regulated.

Low threat scenarios work with containers. Internal development tools, trusted user code, non-production environments. You’re trading maximum isolation for performance and simplicity.

Medium threat scenarios suit gVisor. Multi-tenant SaaS with untrusted user code, cost-sensitive deployments, Kubernetes integration. Modal chose this path for ML workloads—the performance-security tradeoff works for them.

High threat scenarios need Firecracker or Kata. AI agents executing LLM-generated code, financial and healthcare data, compliance requirements like HIPAA and PCI-DSS. E2B’s Firecracker foundation for AI code execution assumes hostile intent in their threat model.

Portability focus points to WebAssembly. Stateless edge functions, cross-platform execution, microsecond cold start requirements.

Compliance matters. Some regulations mandate hardware isolation. Others allow userspace approaches. Check your requirements.

MicroVMs typically cost 10-20% more than containers at scale. But you’re justifying the cost with risk reduction—multi-tenant security breaches cost millions.

The choice matrix is simple: threat level, performance requirements, and compatibility needs together point at the recommended technology.
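Here is that matrix as a toy Python function. The thresholds and labels are illustrative, not prescriptive:

```python
def recommend(threat: str, needs_os: bool, latency_budget_ms: float) -> str:
    """Toy version of the choice matrix above; thresholds are illustrative."""
    if not needs_os:
        return "webassembly"      # stateless, portable, microsecond starts
    if threat == "high":
        # Hardware isolation; ~125ms cold start unless snapshots cover the gap.
        return "firecracker/kata" if latency_budget_ms >= 125 else "firecracker + snapshots"
    if threat == "medium":
        return "gvisor"           # userspace kernel, OCI drop-in
    return "containers"           # trusted, internal workloads only

print(recommend("high", needs_os=True, latency_budget_ms=100))
# -> firecracker + snapshots
```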

How Do I Migrate From Containers to MicroVMs Without Downtime?

Use Kata Containers for zero-downtime migration. It’s OCI-compliant, so your existing container images work unchanged. It integrates with Kubernetes via CRI. You gradually shift workloads from runc to kata-runtime without touching your applications.

OCI compatibility is what makes this easy. Kata accepts standard Docker images without modification. No rewrite required.

Kubernetes integration uses RuntimeClass. Install Kata RuntimeClass, label pods to use kata-runtime via runtimeClassName. That’s your migration mechanism.
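Here is a minimal sketch of that mechanism using the official Kubernetes Python client. The handler name "kata" is an assumption that must match your nodes' containerd configuration, and the pod is a throwaway example:

```python
from kubernetes import client, config  # pip install kubernetes

config.load_kube_config()

# Register a RuntimeClass whose handler matches the node's containerd config.
client.NodeV1Api().create_runtime_class(client.V1RuntimeClass(
    metadata=client.V1ObjectMeta(name="kata"),
    handler="kata",
))

# Opt a single pod into kata-runtime; everything else stays on runc.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="agent-sandbox"),
    spec=client.V1PodSpec(
        runtime_class_name="kata",
        containers=[client.V1Container(name="agent", image="python:3.12-slim",
                                       command=["sleep", "infinity"])],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

Rolling back is just removing runtimeClassName from the pod spec, which is what makes the per-pod toggle a safe migration lever.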

The gradual strategy minimises risk. Start with your least-critical workloads. Monitor performance—track cold start times, memory usage, syscall overhead, failure rates. Expand to production once you’re confident. Keep runc available as fallback. Use Kubernetes RuntimeClass to toggle per-pod if you need to roll back.

You need performance benchmarks before full migration. Benchmark cold start, I/O throughput, memory overhead. You need these numbers to make informed decisions.

Firecracker direct migration is different. It requires the firecracker-containerd shim, and more application changes are needed.

gVisor is another OCI-compliant option. You can use runsc as a drop-in runc replacement for gradual migration. Same pattern—gradual rollout, performance monitoring, rollback capability. When you’re ready to implement Firecracker isolation in production, these migration patterns provide a safe path forward.

Northflank processes 2M+ microVMs monthly with both Kata and gVisor options. That’s production-scale migration in action.

FAQ Section

What is the difference between microVMs and traditional virtual machines?

MicroVMs strip away unnecessary device emulation—graphics, USB, sound—to boot in milliseconds with minimal memory overhead. Firecracker supports only network, block storage, and serial console compared to QEMU’s hundreds of emulated devices. This gets you 125ms cold start vs 1-2 seconds for traditional VMs while keeping hardware isolation.

Can I run Docker containers inside Firecracker microVMs?

Yes, through Kata Containers. It runs a minimal VM using Firecracker as VMM, then executes your Docker container inside the VM. This gives you OCI compatibility so existing images work without modification while gaining hardware isolation security benefits.

Does gVisor support all Linux system calls?

No. gVisor implements about 70-80% of Linux syscalls in userspace, focusing on common application needs. Advanced features like systemd, Docker-in-Docker, and certain networking capabilities may not work. Check gVisor’s syscall compatibility list before you migrate workloads.

How much does switching to microVMs increase infrastructure costs?

Firecracker adds roughly 5MB memory overhead per instance vs containers, plus slight CPU overhead for hardware virtualisation. At scale—thousands of instances—this adds up to 10-20% cost increase. But this prevents multi-tenant security breaches that could cost millions.

Which isolation technology does AWS Lambda use?

AWS Lambda uses Firecracker microVMs to isolate customer functions. This lets them run thousands of untrusted user-submitted functions per host with VM-grade security and sub-second cold starts—production-scale microVM deployment.

Can WebAssembly replace containers for AI agent workloads?

Not for most AI workloads. WASM lacks persistent filesystem and full OS integration, and requires application rewrite. Use WASM for stateless edge functions with microsecond cold starts. For AI agents needing file persistence and system access, use containers, microVMs, or gVisor.

What are the main security differences between gVisor and Firecracker?

Firecracker uses hardware virtualisation for kernel-level isolation—the strongest boundary you can get. gVisor uses userspace syscall interception, which reduces but doesn’t eliminate the host kernel attack surface. Compromise of a Firecracker VM can’t reach the host kernel. Compromise of gVisor is a process-level escape.

How do I choose between Kata Containers and pure Firecracker?

Choose Kata if you need OCI compatibility and Kubernetes integration with zero application changes. Choose pure Firecracker if you can modify applications and want maximum control over VMM configuration. Kata uses Firecracker as backend, adding orchestration convenience.

Does gVisor work with Kubernetes?

Yes. gVisor integrates via Kubernetes RuntimeClass using runsc (gVisor’s OCI runtime). Install gVisor on nodes, create RuntimeClass definition, specify runtimeClassName in pod spec. Modal uses this approach for ML workload isolation.

What is the cold start difference between Firecracker and full VMs?

Firecracker boots in roughly 125ms by minimising device emulation. Traditional VMs (QEMU with full device support) take 1-2 seconds. Both provide hardware isolation, but Firecracker’s minimal design trades device compatibility for speed.

Can I use Firecracker with Docker Compose?

Not directly. Firecracker requires an integration layer like firecracker-containerd. For Docker workflow compatibility with microVM security, use Kata Containers, which provides an OCI-compliant runtime that works with Docker and Docker Compose.

What syscall overhead does gVisor add compared to native containers?

gVisor adds 20-50% overhead for I/O-heavy workloads due to syscall interception and userspace handling. CPU-bound workloads see less impact. Profile your specific workload before choosing gVisor—Modal found an acceptable tradeoff for ML workloads.
