May 26, 2026

The 35% Problem — One-Third of Organisations Can’t Kill a Rogue Agent

A Writer enterprise survey conducted in April–May 2026, and widely cited in the practitioner community, found that 35% of organisations cannot shut down a rogue AI agent once it’s deployed. The survey’s methodology has not been independently verified, but a second source points the same way: Kiteworks’ 2026 Data Security Forecast, drawn from 225 security leaders, found that 60% lack the basic containment controls to stop a misbehaving agent quickly.

Both figures point to the same structural problem. Organisations are deploying agents that act autonomously and operate at machine speed — while assuming they have the same stop mechanism they’d use for a software bug. They don’t. An AI agent kill switch is a three-layer control architecture. Most organisations have, at best, one layer. For broader context on how we got here, see the agentic governance gap.

What Does It Mean That 35% of Organisations Can’t Kill a Rogue Agent?

Here’s what “can’t kill” means in practice. Roughly one-third of organisations lack a functioning operational control: the ability to halt a deployed agent’s execution and revoke its access in a controlled, auditable way. That breaks down into three distinct failure modes. First, no documented shutdown procedure exists at all. Second, a procedure exists but only terminates the front-end orchestrator, leaving sub-agents running with live credentials. Third, credential revocation is manual, built for quarterly access reviews rather than real-time threat response, and simply too slow for the speed at which agents operate.

That second failure mode has a name: the ghost agent problem. When a multi-agent orchestrator is stopped, the sub-agents it dispatched retain whatever credentials they were granted at dispatch time. They keep running. With live access. Unmonitored. Gartner projects 40% of enterprise applications will embed AI agents by end of 2026, up from less than 5% in 2025. That 35% governance gap is only going to get wider.

What Is an AI Agent Kill Switch — and Why Isn’t It Just One Thing?

Here’s where most people get tripped up. An AI agent kill switch is not a single off button. It’s a three-layer control architecture: credential revocation (IAM-level), session termination (runtime-level), and full agent deactivation (deployment-level). Each layer leaves different parts of the system running if you implement it alone. And probabilistic guardrails — confidence scores, content filters — are not kill switches. Deterministic controls are the only reliable stopping mechanism.

Layer 1 — Credential revocation: Invalidates API keys, OAuth tokens, and service account credentials. Prevents downstream authentication, but does not terminate in-flight execution or stop sub-agents that have already cached credentials.

Layer 2 — Session termination: Halts the agent’s active execution context. Does not propagate to spawned sub-agents unless process isolation is in place.

Layer 3 — Full agent deactivation: Removes the agent from the deployment environment and propagates termination signals through the whole agent hierarchy. The only layer that addresses ghost agents.
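The difference between the layers is easier to see in a toy model. The sketch below is illustrative only: the `Agent` class, its fields, and the function names are hypothetical and do not correspond to any vendor's API. It assumes each sub-agent holds its own copy of credentials and its own execution context, which is the condition under which ghost agents appear.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy model of a deployed agent (all fields are hypothetical)."""
    name: str
    credentials_valid: bool = True   # cleared by Layer 1
    session_active: bool = True      # cleared by Layer 2
    deployed: bool = True            # cleared by Layer 3
    children: list["Agent"] = field(default_factory=list)


def revoke_credentials(agent: Agent) -> None:
    """Layer 1: invalidate this agent's credentials; execution continues."""
    agent.credentials_valid = False


def terminate_session(agent: Agent) -> None:
    """Layer 2: halt this agent's execution; children are not touched."""
    agent.session_active = False


def deactivate(agent: Agent) -> None:
    """Layer 3: apply every layer to the agent and all of its descendants."""
    revoke_credentials(agent)
    terminate_session(agent)
    agent.deployed = False
    for child in agent.children:
        deactivate(child)


def ghost_agents(root: Agent) -> list[str]:
    """Names of descendants still executing with valid credentials."""
    ghosts = []
    for child in root.children:
        if child.session_active and child.credentials_valid:
            ghosts.append(child.name)
        ghosts.extend(ghost_agents(child))
    return ghosts


orchestrator = Agent(
    "orchestrator",
    children=[Agent("crm-worker"), Agent("billing-worker")],
)

# Layers 1 and 2 applied to the orchestrator alone leave both sub-agents live.
revoke_credentials(orchestrator)
terminate_session(orchestrator)
after_layers_1_and_2 = ghost_agents(orchestrator)

# Layer 3 walks the hierarchy, so nothing is left running.
deactivate(orchestrator)
after_layer_3 = ghost_agents(orchestrator)
```

In this model, stopping only the orchestrator leaves `crm-worker` and `billing-worker` running. Only the recursive walk in `deactivate` reaches them.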

The guardrails-vs-kill-switch confusion is widespread, and it matters. A system prompt is a natural-language instruction: it can be overridden by prompt injection or ignored across multi-turn conversations, and it has no enforcement power over credential usage. A circuit breaker halts execution when a threshold is crossed, regardless of model reasoning. That distinction is also a regulatory one: under the EU AI Act, organisations deploying high-risk AI systems must maintain documented shutdown procedures by 2 August 2026. Probabilistic guardrails have never been sufficient for compliance, for the same structural reasons existing governance fails.

How Do You Detect a Rogue Agent Before You Can Shut It Down?

There’s a step that comes before the kill switch, and it’s the one most organisations skip. Detecting a rogue agent requires behavioural baseline monitoring — establishing a normal operating pattern and triggering automated alerts when behaviour deviates. Without it, the kill switch has no trigger. You find out about rogue agents through downstream damage, often hours after the initial deviation.

Behavioural baseline monitoring is essentially UEBA adapted for non-human identities. Each agent gets a fingerprint — API call patterns, tool usage, data access volume — and anomalies surface as alerts. Detection time maps directly to blast radius. Four hours unchecked versus ten minutes is not a marginal difference. Automated escalation that fires without a human present needs to be built in from day one, not bolted on later.
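As a rough illustration of what a behavioural fingerprint check could look like, the sketch below flags any metric that sits more than a chosen number of standard deviations from its baseline. The metric names, sample values, and the z-score threshold of 3.0 are all assumptions made for the example. Production UEBA tooling typically uses richer models than a single z-score.

```python
from statistics import mean, stdev


def is_anomalous(baseline, observed, z_threshold=3.0):
    """True if `observed` deviates from the baseline mean by more than
    `z_threshold` sample standard deviations."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > z_threshold


def deviations(fingerprint, current, z_threshold=3.0):
    """Return the names of metrics whose current value is anomalous."""
    return [
        metric
        for metric, history in fingerprint.items()
        if metric in current and is_anomalous(history, current[metric], z_threshold)
    ]


# Hypothetical per-agent fingerprint built from recent normal operation.
fingerprint = {
    "api_calls_per_min": [40, 42, 38, 41, 39],
    "records_read_per_min": [100, 110, 90, 105, 95],
}
current = {"api_calls_per_min": 400, "records_read_per_min": 104}

alerts = deviations(fingerprint, current)
```

Here only `api_calls_per_min` is flagged. The list returned by `deviations` is the kind of signal that an automated escalation could act on without a human present.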

What Is Blast Radius and Why Does It Determine Which Kill Switch Layer You Need?

Blast radius is the scope of damage a rogue agent can cause before it’s stopped — data exposure, downstream system corruption, and financial commitments made with legitimate credentials. Which kill switch layer you actually need comes down to what your agent has access to.

Research from Obsidian Security found 90% of AI agents hold excessive privileges and move 16x more data than human users. That gap between the access an agent needs and the access it actually holds sets the floor for your blast radius.

Think about what a rogue agent with CRM, email, and billing API access can do in two hours before anyone notices. It can send thousands of customer emails, corrupt records, and initiate fraudulent billing actions — all using legitimate credentials. No network anomaly flags it. If your agent has narrow read-only access, session termination may be sufficient. If it has broad write access with sub-agent spawning, you need all three layers. For a look at what vendor tooling is available, see the AI control plane vendors are building to address this gap.
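The rule of thumb above can be written down as a small decision function. This is a sketch, not a standard. The two endpoints (narrow read-only access, and broad write access with sub-agent spawning) come from the text. The middle cases are an assumption: any agent that spawns sub-agents gets all three layers, and a write-capable agent that does not spawn gets credential revocation plus session termination.

```python
LAYERS = ("credential_revocation", "session_termination", "full_deactivation")


def required_layers(has_write_access: bool, spawns_sub_agents: bool) -> list[str]:
    """Map an agent's access profile to the kill switch layers it needs."""
    if spawns_sub_agents:
        # Only full deactivation propagates through the agent hierarchy.
        return list(LAYERS)
    if has_write_access:
        # Assumption: write access without spawning needs layers 1 and 2.
        return list(LAYERS[:2])
    # Narrow read-only access: session termination may be sufficient.
    return [LAYERS[1]]


read_only_agent = required_layers(has_write_access=False, spawns_sub_agents=False)
orchestrating_agent = required_layers(has_write_access=True, spawns_sub_agents=True)
```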

What Does Enterprise Kill Switch Governance Actually Look Like? (KPMG)

KPMG is notable here because it’s the only published enterprise practitioner case study for kill switch governance as of mid-2026. Its framework operates through graduated escalation: automated pause, then human-authorised suspension, then full deactivation with forensic logging. Every agent carries a unique identifier, enabling complete action logging and inter-agent tracking. Red-teaming is standard before deployment.

Most guidance is theoretical. KPMG’s is not.
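Graduated escalation of this kind can be modelled as a small state machine. The sketch below is not KPMG's implementation, which has not been published as code. The class, the stage names, and the rule that both suspension and deactivation need a named human approver are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    RUNNING = 0
    PAUSED = 1        # automated pause
    SUSPENDED = 2     # human-authorised suspension
    DEACTIVATED = 3   # full deactivation


@dataclass
class AgentEscalation:
    agent_id: str                      # unique identifier per agent
    stage: Stage = Stage.RUNNING
    audit_log: list = field(default_factory=list)

    def escalate(self, authorised_by: Optional[str] = None) -> Stage:
        """Move one stage up the ladder and record who triggered it."""
        if self.stage == Stage.DEACTIVATED:
            return self.stage
        next_stage = Stage(self.stage + 1)
        # Assumption: anything beyond an automated pause needs a named human.
        if next_stage >= Stage.SUSPENDED and not authorised_by:
            raise PermissionError(f"{next_stage.name} needs a named human approver")
        self.stage = next_stage
        self.audit_log.append({
            "agent_id": self.agent_id,
            "stage": next_stage.name,
            "actor": authorised_by or "automation",
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return next_stage
```

A caller would invoke `escalate()` with no argument when automation trips, and `escalate("name-of-approver")` for the later stages. The `audit_log` entries stand in for the forensic trail.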

What Happens When You Don’t Have a Kill Switch? (EchoLeak and DPD)

Two real-world examples illustrate what the absence of a kill switch looks like in practice.

EchoLeak (CVE-2025-32711): A zero-click indirect prompt injection in Microsoft 365 Copilot. Malicious instructions hidden in externally supplied content (reported as a crafted inbound email) caused the agent to exfiltrate internal data using legitimate credentials, bypassing Microsoft’s prompt-injection classifier. The agent wasn’t compromised. From its own perspective it behaved correctly, following instructions it was never supposed to receive. The flaw was patched reactively. NIST called indirect prompt injection “generative AI’s greatest security flaw.” Without kill switch capability, that kind of attack exfiltrates internal data before any human reviewer detects it.

DPD (January 2024): DPD’s customer service chatbot went off-policy after a system update. A customer asked it to write a poem criticising the company — it did. No alert fired. The failure surfaced through customer-posted screenshots and the chatbot kept running until it was manually disabled. Without real-time behavioural monitoring, a drifting agent just keeps going until someone externally notices.

Two different failure modes. Two different kill switch requirements.

Who Is Legally Liable When an AI Agent Causes Harm? (Moffatt v. Air Canada)

This one should make your legal team sit up. Moffatt v. Air Canada (2024) established that organisations are liable for non-deterministic promises made by their AI agents, even when those promises contradict internal policy. “The agent went rogue” is not a legal defence.

Jake Moffatt asked Air Canada’s chatbot about bereavement fares. It told him he could apply for a retroactive discount within 90 days, a policy that did not exist. Air Canada argued the chatbot was “a separate legal entity.” The BC Civil Resolution Tribunal awarded Moffatt $812.02 and was unambiguous that Air Canada is responsible for all information on its website, whether it comes from a static page or a chatbot. If you can’t kill a rogue agent, you also can’t demonstrate adequate response controls in a legal proceeding.

How Do You Build a Human-on-the-Loop Model That Doesn’t Become a Governance Bottleneck?

The instinct when something goes wrong is to add human approval steps. That instinct will kill your throughput. A human-on-the-loop (HOTL) model keeps humans in the kill chain without requiring approval for every agent action. Humans set thresholds; automated circuit breakers enforce them; humans only intervene when an automated halt occurs. The key is moving human decision-making upstream — to threshold-setting — not inline at the point of action.

HITL requires approval at critical decision points. That’s viable at low volumes and a bottleneck at scale. HOTL defines acceptable thresholds per agent and escalates to a human — with an agent state snapshot — when the circuit breaker fires.

For organisations without a dedicated AI security team, the minimum viable architecture is straightforward: a circuit breaker, programmatic credential revocation on circuit break, an escalation alert to the CTO or on-call engineer, and a post-incident review before re-enabling. Okta, Microsoft Entra, and Google IAM all provide revocation APIs that can be wired to a circuit breaker. It’s tooling, not headcount. For the governance framework context, see what the WEF readiness framework says boards should demand. For the broader context this article sits within, see the structural reasons existing governance fails agents, which are distinct from the kill switch gap specifically.
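The four parts of that minimum viable architecture can be sketched in a few dozen lines. Everything here is hypothetical: `revoke_credentials` and `send_alert` are injected callables standing in for an identity provider's revocation call and an on-call paging tool, and no real vendor API is shown. The action-count threshold is an assumption, and time-window resets are omitted to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CircuitBreaker:
    agent_id: str
    max_actions: int                             # threshold set upstream by a human
    revoke_credentials: Callable[[str], None]    # stand-in for an IdP revocation call
    send_alert: Callable[[dict], None]           # stand-in for on-call paging
    actions_seen: int = 0
    tripped: bool = False

    def allow(self, action: str) -> bool:
        """Return True if the action may proceed; trip the breaker otherwise."""
        if self.tripped:
            return False
        self.actions_seen += 1
        if self.actions_seen > self.max_actions:
            self._trip(last_action=action)
            return False
        return True

    def _trip(self, last_action: str) -> None:
        """Halt, revoke, and escalate with a snapshot of agent state."""
        self.tripped = True
        self.revoke_credentials(self.agent_id)
        self.send_alert({
            "agent_id": self.agent_id,
            "actions_seen": self.actions_seen,
            "last_action": last_action,
        })

    def reset_after_review(self, reviewer: str) -> None:
        """Re-enable only once a named reviewer signs off on the incident."""
        if not reviewer:
            raise ValueError("a named reviewer is required before re-enabling")
        self.tripped = False
        self.actions_seen = 0


revoked, alerts = [], []
breaker = CircuitBreaker(
    agent_id="agent-7",
    max_actions=2,
    revoke_credentials=revoked.append,
    send_alert=alerts.append,
)
decisions = [breaker.allow(a) for a in ("read_crm", "send_email", "send_email")]
```

The agent runtime would call `allow()` before every tool invocation. The human only appears at two points: when choosing `max_actions`, and when calling `reset_after_review`.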

FAQ

Can a company really not shut down its own AI agent?

Yes — for approximately 35% of organisations according to the Writer survey (April–May 2026), and 60% according to Kiteworks’ 2026 Data Security Forecast. Common failure modes: no documented shutdown procedure; a shutdown that only stops the front-end orchestrator; manual credential revocation too slow for agentic systems.

What is the difference between an AI agent guardrail and a kill switch?

A guardrail is a probabilistic output filter — it reduces the likelihood of harmful responses but can’t guarantee a stop. A kill switch is a deterministic control that halts execution regardless of model reasoning. They solve different problems.

What is a ghost agent and why does it make kill switches harder?

A ghost agent is a sub-agent spawned by an orchestrator that keeps running with live credentials after the orchestrator is stopped. It receives no termination signal unless the kill mechanism was specifically designed to propagate through the full agent hierarchy.

What is EchoLeak and what does it show about AI agent security?

EchoLeak (CVE-2025-32711) was a zero-click indirect prompt injection in Microsoft 365 Copilot. Malicious instructions hidden in externally supplied content caused the agent to exfiltrate internal data using legitimate credentials. The lesson is that kill switch triggers must watch data-plane behaviour, not just network-layer threats.

What is the legal precedent for AI agent liability?

Moffatt v. Air Canada (2024). The BC Civil Resolution Tribunal held Air Canada liable for incorrect chatbot information, finding that the airline is responsible for all information on its website, whether it comes from a static page or a chatbot. You cannot disclaim responsibility by pointing to agent autonomy.

What does the EU AI Act require for AI agent shutdown?

Organisations deploying Annex III high-risk AI systems must maintain documented shutdown procedures by 2 August 2026. FinTech, HealthTech, and HR tech should treat this as an active compliance deadline.

What is the minimum viable kill switch for a small organisation?

An automated circuit breaker, programmatic credential revocation on circuit break, an escalation alert with agent state snapshot, and a post-incident review before restart. Implementable with Okta, Microsoft Entra, or Google IAM — no dedicated security function required.

Why do organisations find it hard to stop a rogue agent quickly?

Three reasons: agents operate at machine speed while shutdown procedures operate at human speed; credential revocation in most IAM systems is manual; and multi-agent architectures create hidden dependencies where stopping one agent doesn’t stop the agents it spawned.

What is behavioural baseline monitoring for AI agents?

It establishes a normal operating pattern for each deployed agent — API call frequency, tool usage, data access volume — and triggers alerts on statistical deviation. Without it, the kill switch has no trigger mechanism.

How does human-on-the-loop differ from human-in-the-loop for stopping rogue agents?

HITL requires human approval at each critical decision point, which is effective at low volumes and a bottleneck at scale. HOTL moves human decision-making to threshold-setting, with circuit breakers handling enforcement and humans handling only escalations.

What did the DPD chatbot incident reveal about AI governance?

DPD’s chatbot went off-policy after a system update. No alert fired. The failure was discovered through customer-posted screenshots. Without real-time behavioural monitoring, agents that drift keep operating until someone externally notices.

How do the three kill switch layers work together?

Credential revocation prevents downstream authentication. Session termination halts current execution. Full agent deactivation propagates termination signals through the agent hierarchy, addressing ghost agents. All three are required in any multi-agent deployment.