AI Agent Sandboxing Explained — Why Docker Is Not Enough and What Actually Works

Apr 28, 2026

AUTHOR

James A. Wondrasek

If your AI agents are generating and executing code inside standard Docker containers, you have a real architectural problem. Langflow CVE-2025-3248 (CVSS 9.8, added to the CISA Known Exploited Vulnerabilities catalogue) and the Cursor MCP RCE are not hypothetical future scenarios. They happened to real teams running the same tools your engineers are probably using right now.

The core problem is structural. Docker containers share the host OS kernel. A container escape — formally classified as MITRE ATT&CK technique T1611 — does not give an attacker access to a container. It gives them access to your host. On Docker-only deployments, the blast radius of a successful AI agent exploit is completely unconstrained.

Three isolation technologies actually solve this: Firecracker microVMs, gVisor (user-space kernel), and Kata Containers. Each takes a different architectural approach with different trade-offs on isolation strength, overhead, and operational complexity. The Kubernetes Agent Sandbox project (github.com/kubernetes-sigs/agent-sandbox) provides the orchestration layer that ties them together for production AI agent workloads.

This article covers the threat model, the isolation technologies, a concrete comparison, and a build-vs-buy framework. For broader context, see the complete guide to securing AI agents.

Why are standard Docker containers not secure enough for AI agents?

Docker containers share the host OS kernel. That single fact is the entire argument.

Docker provides isolation through Linux namespaces, cgroups, and capabilities. These mechanisms work fine for their intended purpose — isolating trusted, vetted application code. The problem is using them for something they were never designed to handle.

When an AI agent executes code at runtime — code produced by an LLM in response to user input — that code was not reviewed by your engineers. It was generated by a system that can be manipulated through prompt injection and tool chain attacks, the primary threats sandboxing exists to contain. You have to treat the isolation boundary as adversarial. Docker treats it as a resource partition. That is a fundamentally different thing.

All containers on a host share the same Linux kernel. A kernel vulnerability provides a direct path to the host. These are not exotic research artefacts — November 2025 brought three runc vulnerabilities (CVE-2025-31133, CVE-2025-52565, CVE-2025-52881) affecting Docker, Kubernetes, containerd, and CRI-O simultaneously. A 2025 Veracode report found 45% of AI-generated code fails security tests. That code needs to be treated as an execution surface, not a trusted application environment.

Langflow CVE-2025-3248 shows exactly what happens when it is not. Horizon3.ai discovered that Langflow’s /api/v1/validate/code endpoint accepted a Python code string and passed it to exec() — no authentication, no sandboxing. Reverse shells, file exfiltration, lateral movement — all achievable through a single HTTP request. CISA added it to the KEV catalogue on 5 May 2025. On a Docker-only deployment, that means unauthenticated host-level code execution.

What is AI agent sandboxing and how do Firecracker, gVisor, and Kata Containers each approach it?

AI agent sandboxing means executing AI-generated or AI-directed code inside an isolated environment with strict limits on filesystem access, network communication, and system calls. The goal is containment — stopping a compromised agent from affecting the host or other workloads when other controls fail. See the complete guide to securing AI agents for the full control stack.

Three isolation technologies have reached production maturity in 2026. They differ in where the boundary sits and what that costs you.

Firecracker microVM

Firecracker, built by AWS for Lambda and Fargate, creates a dedicated Linux kernel for each execution environment via KVM hardware virtualisation. A kernel exploit inside a Firecracker microVM cannot reach the host kernel — they are separated by hardware virtualisation boundaries enforced by the CPU itself.

The numbers: approximately 125ms boot time, less than 5 MiB memory overhead per microVM. It achieves this through a minimalist VMM written in Rust that exposes only what the workload needs and strips everything else — no BIOS emulation, no legacy hardware. AWS Lambda and Fargate run on Firecracker in production.

gVisor

gVisor is a user-space kernel — the “Sentry” process — developed by Google. It intercepts all application system calls before they reach the host kernel and re-implements them in user space. No dedicated VM per workload; instead, the system call surface exposed to the host kernel is dramatically reduced.

The trade-off: gVisor does not provide full kernel-level isolation — a sophisticated exploit targeting gVisor itself is theoretically possible, though rare in practice. Overhead runs around 10–30% on I/O-heavy workloads, with minimal impact on compute-heavy tasks. gVisor integrates natively into Kubernetes via the runsc RuntimeClass.

Kata Containers

Kata Containers combines OCI-compatible container APIs with hardware virtualisation via pluggable VMMs — Firecracker, Cloud Hypervisor, or QEMU. VM-grade isolation, Kubernetes-native API. The kata-runtime RuntimeClass drops into existing Kubernetes deployments without any API changes. And Kata and Firecracker are not competing — Kata can use Firecracker as its VMM, giving you hardware isolation with Kubernetes-native APIs. For teams already running Kubernetes, that combination is the most practical production choice.

Firecracker vs gVisor vs Kata Containers: which provides the right isolation for production AI agents?

Here is how the three compare on the dimensions that actually matter for a production decision.

Isolation level: Firecracker and Kata both provide hardware isolation — dedicated kernels via KVM. gVisor provides syscall interception via a user-space kernel. For workloads executing user-supplied or high-risk LLM-generated code, hardware isolation is the right call. gVisor suits lower-risk workloads where overhead matters.

Boot time and overhead: gVisor wins — no VM boot means near-instant startup. Firecracker boots in approximately 125ms with less than 5 MiB overhead. Kata comes in at approximately 150–300ms with 100–300 MiB per pod. At scale, Kata’s per-pod memory footprint needs to be planned for.

Kubernetes compatibility: All three integrate via RuntimeClass. Switching between gVisor and Kata requires changing one field (runtimeClassName) in a pod spec — the API change is minimal; infrastructure setup is where the real work sits.
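As a concrete sketch of that one-field switch — the handler names (runsc, kata) follow common upstream defaults but must match the runtimes actually configured on your nodes, and the image name is a placeholder:

```yaml
# RuntimeClass objects, typically installed once per cluster by the
# platform team. The handler must match a runtime configured in
# containerd or CRI-O on each node.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
---
# A pod opts in to sandboxed execution with a single field.
# Switching isolation technology means changing only this value.
apiVersion: v1
kind: Pod
metadata:
  name: agent-executor
spec:
  runtimeClassName: gvisor   # or: kata
  containers:
    - name: executor
      image: registry.example.com/agent-executor:latest  # placeholder image
```

The infrastructure behind those two RuntimeClass objects — installing runsc or the Kata runtime and wiring them into the node's container runtime — is where the real effort goes; the manifests themselves are trivial.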

Production maturity: Firecracker — very high (AWS Lambda, Fargate). gVisor — high (Google GKE Sandbox). Kata Containers — high (CNCF project, widely adopted).

Here is the decision guidance:

  1. User-supplied or high-risk LLM-generated code: hardware isolation — Kata Containers (ideally with Firecracker as the VMM) on Kubernetes, or Firecracker directly outside it.

  2. Lower-risk workloads where startup latency and memory overhead dominate: gVisor via the runsc RuntimeClass.

  3. Already running Kubernetes: prefer Kata or gVisor through RuntimeClass over standalone Firecracker, so your API surface stays unchanged.

Calibrate the isolation technology to the injection risk profile of the workload — prompt injection is the injection-side motivation for runtime containment.

What does the Kubernetes Agent Sandbox project provide and how do you get started?

The kubernetes-sigs/agent-sandbox project (github.com/kubernetes-sigs/agent-sandbox) is a SIG Apps initiative launched at KubeCon Atlanta in November 2025. It adds a Sandbox CRD — a new Kubernetes primitive built specifically for AI agent workloads.

Standard Kubernetes Pods and Deployments were designed for stateless, replicated services. AI agents are stateful singletons that need stable identity, persistent storage, and fine-grained lifecycle controls. Wiring together a StatefulSet, a headless Service, and a PVC for every agent instance is a substantial operational burden. The Sandbox CRD replaces that pattern.

Three CRDs do the work. Sandbox is the core resource — a stateful agent workload with stable identity and lifecycle management. SandboxTemplate is cluster-scoped, defined by platform engineers — think StorageClass for compute isolation. It specifies isolation technology (RuntimeClass), resource limits, storage, and network egress policy. SandboxClaim is namespace-scoped, submitted by agent controllers — think PersistentVolumeClaim. Developers declare what they need; the platform provides it.
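In manifests, that pattern could look something like the following sketch. The API group, version, and field names here are illustrative guesses — the project is pre-GA, and the authoritative CRD schema lives in the repository:

```yaml
# Cluster-scoped template, defined by platform engineers.
# Think StorageClass for compute isolation. Field names are
# illustrative; check kubernetes-sigs/agent-sandbox for the
# current schema.
apiVersion: agents.x-k8s.io/v1alpha1   # assumed group/version
kind: SandboxTemplate
metadata:
  name: high-isolation
spec:
  runtimeClassName: kata        # isolation technology
  resources:
    limits:
      cpu: "2"
      memory: 4Gi
  storage:
    size: 10Gi
  networkPolicy: deny-all-egress
---
# Namespace-scoped claim, submitted by an agent controller.
# Think PersistentVolumeClaim: declare what you need, the
# platform provides it.
apiVersion: agents.x-k8s.io/v1alpha1   # assumed group/version
kind: SandboxClaim
metadata:
  name: research-agent-1
  namespace: agents
spec:
  templateRef:
    name: high-isolation
```

The point of the split is governance: platform teams own the templates (and therefore the isolation and egress policy), while developers only ever reference them by name.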

The project also includes WarmPools (pre-warmed pods that reduce cold-start latency to under a second), scheduled shutdown, and a Python SDK for agent frameworks. Every lifecycle event generates an attributable audit trail — sandboxing telemetry that serves as an observability input for your SOC.

To get started: Kubernetes cluster with gVisor or Kata Containers installed as RuntimeClasses, then kubectl apply the CRD manifests from the repository.

Worth noting — it is not yet GA. This is right for teams with Kubernetes operations experience and a tolerance for early-adopter friction.

The Langflow and Cursor incidents: what production RCEs tell us about the cost of skipping sandboxing

These incidents are not hypothetical. They are named, attributed, actively exploited vulnerabilities in tools used by sophisticated development teams.

Langflow CVE-2025-3248 (CVSS 9.8)

Langflow’s /api/v1/validate/code endpoint accepted a Python code string, passed it to exec(), and required no authentication. Network access was the only prerequisite. This is the Configuration-as-Code-Execution anti-pattern — and it is not unique to Langflow. At least eight critical RCE CVEs were published against Langflow, LangChain, and n8n between 2024 and early 2026, all sharing the same root cause. n8n received CVE-2025-3455 (CVSS 9.8) for exactly the same pattern.

On Docker-only: unauthenticated host-level code execution. On microVM isolation: a contained incident. That is the difference.

Cursor MCP RCE (CVE-2025-54135 / CVE-2025-54136)

AIM Security discovered CurXecute: prompt injection through any external content source — Slack messages, GitHub issues, search results — could instruct Cursor to modify mcp.json, with the malicious config executing before the user could reject it. Check Point Research discovered MCPoison: a rug pull attack where an attacker commits a benign MCP config, gets it approved, then swaps in a malicious payload. Cursor trusted the approved key name, not the command content.

The attack vector here is not a traditional software vulnerability — it is the designed behaviour of an AI agent executing tool calls, exploited through injected instructions. Cursor, VS Code, Windsurf, and Gemini-CLI are all affected.

What these incidents confirm

Both incidents would have been significantly mitigated by microVM-level isolation. If the RCE succeeds but the execution environment is a Kata Container or Firecracker microVM with deny-by-default egress, the attacker’s code executes inside an isolated kernel — contained and recoverable rather than a full host compromise. That is how sandboxing limits the blast radius when tool chain attacks succeed.

Build vs buy: how do you choose between Cloudflare Sandbox SDK, Northflank, Kubernetes Agent Sandbox, and open-source alternatives?

Three factors shape this decision: your existing infrastructure, your DevOps capacity, and the severity of your threat profile.

Option 1 — Self-hosted Kubernetes Agent Sandbox

Right for: Teams already running Kubernetes with dedicated platform engineering capacity. High-isolation requirements. Cost-sensitive at volume.

Operational cost: Not trivial. RuntimeClass configuration, KVM host setup, CRD installation, upgrade management. Building from scratch takes months. Budget for it honestly.

Not right for: Teams without dedicated Kubernetes operations. The complexity absorbs engineering time that should go to product.

Option 2 — Cloudflare Sandbox SDK

Cloudflare Sandbox SDK provides APIs for executing commands, managing files, and running background processes from Workers applications.

Right for: Cloud-native teams without Kubernetes. Teams building agents as Workers or edge functions. Startups prioritising speed to production.

Isolation model: Container-based isolation on Cloudflare’s edge network. Not equivalent to dedicated-kernel hardware virtualisation. Well-suited to bounded, short-lived execution.

Operational cost: Near-zero. Pay-per-execution. Trade-off: Not suitable for workloads requiring full Linux syscall access or long-running sessions.

Option 3 — Northflank (managed Kata + gVisor)

Right for: Teams that want Kata Containers and gVisor isolation without running their own Kubernetes sandbox infrastructure. Strong isolation requirements, insufficient DevOps headcount.

Isolation model: Identical to self-hosted Kata + gVisor — the difference is who operates the infrastructure. True BYOC across AWS, GCP, Azure, and bare-metal.

Operational cost: Higher per-unit cost at scale; zero infrastructure management; onboarding in hours. For a 200-sandbox deployment, Northflank PaaS runs approximately $7,200/month versus $16,819/month (E2B) or $24,491/month (Modal).

Option 4 — Dagger/container-use and Lightning AI/litsandbox

container-use (github.com/dagger/container-use) provides containerised agent execution via Dagger pipelines. litsandbox (Lightning AI) provides sandboxed Python execution. Both are useful stepping stones for teams building proof-of-concept infrastructure without Kubernetes. Neither provides microVM-grade isolation for high-risk production workloads.

SMB recommendation

Start with the managed option that fits your cloud provider. Northflank if your agents run Python, shell, or arbitrary Linux processes. Cloudflare Sandbox SDK if your agents run as Workers or edge functions.

Operational simplicity matters more than per-unit cost at this scale. A self-hosted Kubernetes Agent Sandbox that absorbs two engineers’ time is not cheaper than Northflank — it is more expensive, with more risk attached.

Graduate to self-hosted only when you have dedicated Kubernetes operations capability and the volume to justify the cost difference. For teams currently on Docker, the incremental hardening path is sensible — harden what you have, add gVisor, then move to Kata Containers for your highest-risk workloads. For the complete picture see AI agent security across supply chain, identity, and SOC.

Where does sandboxing fit in your defence-in-depth stack?

Sandboxing does not prevent prompt injection, tool poisoning, or identity credential theft. These attacks operate at layers above or below the sandbox boundary.

What sandboxing provides is containment. When prompt injection — the injection-side motivation for runtime containment — succeeds and an agent is manipulated into running attacker-controlled code, sandboxing determines whether the result is a contained, recoverable incident or a host-level compromise.

The full defence-in-depth stack for AI agent workloads:

  1. Identity and authentication: Scoped, revocable credentials. No broad-access service accounts. Short-lived tokens with limited scope per task.

  2. Prompt injection defences: Input sanitisation, output validation, trust boundaries between agent tiers. Treat all incoming data as potentially hostile.

  3. Tool chain security: Supply chain controls on MCP servers and external tools.

  4. Sandboxing: microVM or gVisor isolation for code execution. Limits consequences when the layers above fail.

  5. Network egress controls: Deny-by-default outbound. Allowlist only required API endpoints.

  6. Runtime monitoring: Sandboxing telemetry as an observability input for your SOC. Sandbox environments generate high-quality signals — unexpected syscalls, blocked network connections, unusual resource consumption. Define baseline behaviour per agent and alert on deviation.

  7. Least-privilege RBAC: Sandbox service accounts with no Kubernetes API access beyond what the agent actually requires.
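Layer 5 above is the easiest to express concretely. A hedged sketch of a deny-by-default egress policy for sandboxed agent pods — the namespace, label selector, and allowlisted address are placeholders for your own values:

```yaml
# Selecting a pod with policyTypes: [Egress] and no broad egress
# rules denies all outbound traffic by default; each rule below
# then allowlists one specific destination.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-sandbox-egress
  namespace: agents
spec:
  podSelector:
    matchLabels:
      app: agent-executor
  policyTypes:
    - Egress
  egress:
    # Allow DNS resolution inside the cluster.
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
    # Allowlist a single required API endpoint; everything else
    # is dropped. 203.0.113.10 is a documentation-range example.
    - to:
        - ipBlock:
            cidr: 203.0.113.10/32
      ports:
        - protocol: TCP
          port: 443
```

Blocked connection attempts from a policy like this are exactly the high-quality SOC signal layer 6 describes.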

Sandboxing alone is not security. But it is the layer that turns a catastrophic incident into a contained, recoverable one. When other controls fail — and they will — microVM isolation determines whether you are handling a contained sandbox breach or a full cluster compromise. See AI agent security across supply chain, identity, and SOC for the complete control stack.

Frequently asked questions

Is Docker safe enough to sandbox an AI agent that writes and runs its own code?

No. Docker containers share the host OS kernel — a kernel exploit in the agent’s generated code gives the attacker host-level access, not just container access. Use Docker as your base image format, but deploy through a Kata or gVisor RuntimeClass in Kubernetes.

What is the difference between gVisor and Firecracker?

gVisor intercepts system calls in user space via the Sentry process — no dedicated VM kernel, lower overhead, slightly weaker isolation. Firecracker creates a dedicated VM kernel via KVM — strongest isolation, approximately 125ms boot, less than 5 MiB overhead, higher operational complexity. Kata Containers can use Firecracker as its VMM, giving you hardware isolation with Kubernetes-native APIs — and that is the most practical production combination.

What is a kernel escape (container escape) and why does it matter for AI agents?

A kernel escape is an attack where code inside a container exploits a vulnerability in the container runtime or Linux kernel to break out and access the host OS. MITRE ATT&CK classifies it as T1611. It matters for AI agents because this is exactly what Langflow CVE-2025-3248 and the Cursor MCP RCE demonstrated in production. MicroVM isolation (Kata, Firecracker) prevents kernel escapes by giving each workload its own dedicated kernel.

What is the Langflow CVE-2025-3248 vulnerability and what does it mean for my team?

Langflow CVE-2025-3248 is a pre-authentication RCE (CVSS 9.8) discovered by Horizon3.ai. The code validation endpoint executed user-supplied Python via exec() with no authentication. CISA added it to the KEV catalogue on 5 May 2025, confirming active exploitation. If your team runs Langflow or similar workflow tools on Docker without sandbox isolation, an unauthenticated attacker has a path to host-level code execution. Patch immediately and add microVM-level isolation to your AI workflow runtime.

Where is the Kubernetes Agent Sandbox project and what will I find there?

github.com/kubernetes-sigs/agent-sandbox — a kubernetes-sigs project under SIG Apps, launched November 2025 at KubeCon Atlanta. The repository has CRD manifests for Sandbox, SandboxTemplate, and SandboxClaim; example templates with gVisor and Kata RuntimeClass configurations; integration docs; and a Python SDK. Not yet GA — right for teams with Kubernetes operations experience and tolerance for early-adopter friction.

What is a RuntimeClass in Kubernetes and how do I use it for sandboxing?

RuntimeClass is the Kubernetes API object that specifies an alternate container runtime via the runtimeClassName field in the pod spec. It is the common entry point for both gVisor and Kata Containers — switching between the two requires changing that one field. The Kubernetes API change is minimal; infrastructure setup (KVM access, runtime installation) is where the operational work sits.

Should I use Northflank or Cloudflare Sandbox SDK, and what is the key difference?

Northflank provides managed Kata Containers and gVisor isolation — VM-grade, Linux-native, suitable for full-process AI agents requiring full Linux syscall access. Cloudflare Sandbox SDK uses container-based isolation on the Cloudflare edge — managed, serverless, zero infrastructure, but suited to Workers and edge function models with bounded execution time. Use Northflank if your agents run Python, shell, or arbitrary Linux processes. Use Cloudflare Sandbox SDK if your agents run as Workers or edge functions.

What are Dagger/container-use and Lightning AI/litsandbox?

container-use (github.com/dagger/container-use) provides containerised execution for AI agents via Dagger pipelines — for teams not yet running Kubernetes who want structured isolation without cluster management. Lightning AI’s litsandbox provides sandboxed Python execution — open-source, Python-focused, lower infrastructure cost. Both are stepping stones for budget-constrained teams building proof-of-concept infrastructure. Neither provides microVM-grade isolation for high-risk production workloads.

What is the incremental hardening path for teams already running AI agents on Docker?

Three stages. First, harden what you have — drop unnecessary capabilities, apply seccomp profiles, run as non-root. Second, add gVisor as the lowest-friction upgrade — full Docker image compatibility, syscall interception, no rearchitecture required. Third, upgrade to Kata Containers for your highest-risk workloads — hardware-level isolation for user-supplied or high-risk LLM-generated code. Each stage is additive.
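The first two stages of that path fit in a single pod spec — a sketch, with placeholder names, and a capability set you should tune to the workload:

```yaml
# Stage one: harden the existing deployment — non-root, default
# seccomp profile, no privilege escalation, all capabilities dropped.
apiVersion: v1
kind: Pod
metadata:
  name: agent-executor
spec:
  # Stage two is one additional line once gVisor is installed
  # on the nodes:
  # runtimeClassName: gvisor
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: executor
      image: registry.example.com/agent-executor:latest  # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
```

Stage three — moving the highest-risk workloads to Kata — reuses the same spec with a different runtimeClassName, which is why the stages are additive rather than rework.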

Does sandboxing protect against prompt injection attacks?

No. Prompt injection operates at the input/context layer before code execution reaches the sandbox. What sandboxing provides is containment — when a successful injection causes the agent to execute malicious code, that code runs inside an isolated kernel, not on the host. Prompt injection defences prevent the attack; sandboxing limits the damage if those defences fail.

What is the SandboxTemplate / SandboxClaim pattern in Kubernetes Agent Sandbox?

Think StorageClass / PersistentVolumeClaim. SandboxTemplate is cluster-scoped, defined by platform engineers — it specifies isolation technology (RuntimeClass), resource limits, storage, and network egress policy. SandboxClaim is namespace-scoped, submitted by agent controllers — it references a template and requests an isolated environment. Platform teams define the policies; workloads request instances declaratively.

How does Firecracker achieve 125ms boot times with full VM isolation?

Firecracker uses a minimalist VMM written in Rust that exposes only what the workload needs — virtio block, virtio net, serial — and strips everything else. Each microVM boots a minimal Linux kernel from a pre-built snapshot. Memory stays under 5 MiB because Firecracker does not emulate BIOS, legacy hardware, or the unnecessary devices traditional VMs carry.
