May 22, 2026

How the OX Security Audit Exposed 7,000 Plus MCP Servers, 14 CVEs, and One Design Flaw

AUTHOR

James A. Wondrasek

In April 2026, OX Security scanned more than 7,000 publicly accessible MCP servers. What they found wasn’t a collection of isolated bugs. It was one architectural decision — baked into Anthropic’s STDIO transport layer — that had quietly propagated a command-injection surface across every official SDK and application built on top of it. The result: 14 CVEs, 150 million downloads affected, 200,000 exposed instances, and confirmed exploitation on six live production platforms.

At the centre of the disclosure is a dispute that matters to every engineering leader using agentic AI tooling. Anthropic called the flaw “expected behaviour.” OX Security called it a design flaw requiring a protocol-level fix, invoking CISA Secure by Design principles to argue Anthropic bears responsibility as protocol owner. For context, see our MCP supply chain security overview.

This piece delivers the factual account: the STDIO mechanism, all four exploit families, the complete CVE inventory with patch status, and the disclosure outcome.

What Did OX Security Find When They Audited 7,000 MCP Servers?

OX Security is an Israeli application security firm. In November 2025, researchers Moshe Siman Tov Bustan, Mustafa Naamnih, Nir Zadok, and Roni Bar started a structured investigation of the MCP ecosystem. They published their findings on 15 April 2026 under the title “The Mother of All AI Supply Chains.” The methodology: automated scanning of publicly reachable MCP server endpoints, manual exploitation of representative samples, and coordinated disclosure spanning more than 30 vendors.

The headline finding was a single root-cause architectural flaw — the STDIO transport’s acceptance of unsanitised command and argument values — that had propagated across every official language SDK and into 14 downstream products. Not a collection of unrelated bugs. One design decision multiplying across an ecosystem.

Two subsidiary findings showed just how far this had spread. OX Security confirmed successful arbitrary command execution on six live, production-grade platforms during the responsible disclosure period — verified exploitation of deployed systems, not a lab demo. And nine of 11 major MCP registries contained poisoned or malicious entries, meaning the supply chain compromise extended into the discovery layer itself.

For context on growth: Trend Micro counted 1,467 internet-exposed MCP servers in October 2025. By April 2026 the count had passed 7,000, a roughly fivefold increase in six months.

The registry-poisoning dimension is explored further in our companion piece on shadow MCP servers that were never audited. For now, the priority is understanding the root cause.

How Does the MCP STDIO Command-Injection Flaw Actually Work?

MCP supports multiple transport modes. SSE and HTTP transports communicate over network sockets. STDIO works differently: it launches an MCP server as a local OS subprocess through standard input/output streams. Think of it as the MCP client telling the operating system: “run this program and I’ll talk to it via its stdin and stdout.”

The mechanism is implemented through StdioServerParameters. In the Python SDK it accepts a command string and an args array, then passes them directly to subprocess.Popen(). Equivalent classes exist in the TypeScript, Java, and Rust SDKs.

Here’s the critical detail: no validation or sanitisation occurs. Whatever string is passed to command is executed by the OS. Whatever is in args is passed verbatim. Supply a value to the command field — through a config UI, a network request, a manipulated config file — and the OS runs it. That’s Remote Code Execution (RCE). No memory corruption, no exotic race condition. Write access to the MCP configuration is all you need.

This is not a bug in any single product. It is an architectural design decision in the protocol specification. Every language SDK that faithfully implements STDIO inherits the same exposure. SSE and HTTP transports are not affected.
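The pattern described above can be illustrated with a minimal sketch. This is not the SDK’s code: `launch_stdio_server` and the config dictionary are hypothetical stand-ins, assuming only that a `command` string and an `args` list reach `subprocess.Popen()` unchecked, as described. A harmless Python one-liner stands in for the payload.

```python
import subprocess
import sys


def launch_stdio_server(config: dict) -> subprocess.Popen:
    """Hypothetical launcher: spawns whatever the config names, with no checks."""
    return subprocess.Popen(
        [config["command"], *config.get("args", [])],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


# Attacker-influenced configuration. A harmless print stands in for a payload.
untrusted_config = {
    "command": sys.executable,
    "args": ["-c", "print('arbitrary code ran')"],
}

proc = launch_stdio_server(untrusted_config)
output, _ = proc.communicate()
print(output.strip())
```

Nothing in the launcher distinguishes a legitimate MCP server from any other program, which is why write access to the configuration is equivalent to code execution.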

Why Did Anthropic Call the Vulnerability “Expected Behaviour” — and Why Is That Contested?

During coordinated disclosure, OX Security notified Anthropic that the STDIO transport allows arbitrary OS command execution. Anthropic’s response: “expected behavior.” Their stated position was that sanitisation is the responsibility of the client developer.

The logic is internally coherent. STDIO is a local transport designed for trusted developer environments. A developer configuring a command value is intentionally granting that command execution rights. In the intended use case, the caller controls both the configuration and the environment.

OX Security pushed back. Anthropic quietly updated their security policy to say MCP STDIO adapters “should be used with caution.” OX Security’s response: “This change didn’t fix anything.” Anthropic declined to modify the protocol, listed the behaviour as “won’t fix,” and raised no objection to publication. LangChain, FastMCP, Amazon/AWS Labs, NVIDIA, Gemini-CLI, Claude Code, GitHub Copilot, OpenHands, PromptFoo, and Firebase Studio also responded with “won’t fix.”

OX Security’s counter: the “trusted environment” assumption fails in real deployments. MCP configuration files can be modified by prompt injection, network requests, supply chain compromise, or MITM interception — all demonstrated during the audit. Expecting 200,000 developers to independently invent the same sanitisation logic is an unreasonable transfer of liability.

Anthropic is technically correct. OX Security is practically correct. See why patching alone won’t close the MCP risk for more.

What Are the Four Exploit Families OX Security Identified?

OX Security taxonomised the attack surface into four exploit families. Same root cause, entirely different attack paths, each requiring a different defensive response.

Family 1 — Direct Command Injection

The simplest exploit. An attacker supplies a malicious command value through any interface that accepts MCP server configuration — an “Add MCP Server” UI field, a network-accessible endpoint, an API call. It passes straight to StdioServerParameters and executes as an OS subprocess.

In the LangFlow case, over 915 publicly accessible instances were online with the MCP configuration panel exposed without authentication. An attacker could send a crafted network request and achieve full RCE without ever logging in.

Products affected: LiteLLM (CVE-2026-30623), Agent Zero (CVE-2026-30624), Bisheng (CVE-2026-33224), GPT Researcher (CVE-2025-65720), Jaaz (CVE-2026-30616), Langchain-Chatchat (CVE-2026-30617), Fay Digital Human Framework (CVE-2026-30618), and LangBot.

Family 2 — Allowlist Bypass via Argument Injection

Some products hardened against Family 1 by restricting command to an allowlist of approved executables — python, npm, npx, node, docker, uvx. A reasonable first step. Still not enough.

OX Security bypassed allowlist controls in Upsonic (CVE-2026-30625) and Flowise (CVE-2026-40933) by passing an approved binary as the command value and placing a destructive payload in the args field. Defending against this properly turns a simple allowlist into a shell syntax parser — which is precisely why OX Security called for a protocol-level manifest, not ad hoc application-level filtering.
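A rough sketch of why a command-only allowlist falls short, and what the next layer of filtering starts to look like. The function names and the `CODE_EXEC_FLAGS` table are hypothetical and deliberately incomplete; they are not taken from any affected product’s patch.

```python
ALLOWED_COMMANDS = {"python", "npm", "npx", "node", "docker", "uvx"}

# Flags that make an approved binary run caller-supplied code.
# Illustrative only: a real list would be longer and harder to get right.
CODE_EXEC_FLAGS = {
    "python": {"-c"},
    "node": {"-e", "--eval", "-p", "--print"},
    "npx": {"-c", "--call"},
}


def naive_check(command: str, args: list[str]) -> bool:
    """Command-only allowlist: the args are never inspected."""
    return command in ALLOWED_COMMANDS


def stricter_check(command: str, args: list[str]) -> bool:
    """Also rejects known code-execution flags for the approved binary."""
    if command not in ALLOWED_COMMANDS:
        return False
    blocked = CODE_EXEC_FLAGS.get(command, set())
    return not any(arg in blocked for arg in args)


payload = ("python", ["-c", "import os; print(os.getcwd())"])
benign = ("python", ["server.py"])

print(naive_check(*payload))     # True: the approved binary hides the payload
print(stricter_check(*payload))  # False
print(stricter_check(*benign))   # True
```

Even the stricter check only matches exact flag strings. Handling combined or `--flag=value` forms, per-binary quirks, and nested shells is where application-level filtering starts to resemble a parser.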

Family 3 — Zero-Click Prompt Injection

The most dangerous family for AI coding assistant users — it exploits agentic autonomy itself.

AI coding assistants like Windsurf and Cursor have read access to web content, file-write permissions in the developer’s local environment, and the ability to execute agent actions autonomously. Family 3 exploits all of that: a hidden prompt injection on an attacker-controlled web page causes the AI assistant to silently modify the user’s local MCP configuration file, adding a malicious command entry. On the next agent invocation, it executes — no further user interaction required.

Windsurf (CVE-2026-30615, version 1.9544.26) was the only IDE where exploitation required zero user interaction. Cursor (CVE-2025-54136), VS Code, Gemini-CLI, Claude Code, and GitHub Copilot were also vulnerable, though most require at least one user interaction to permit the config file edit. For Windsurf, a web page visit is the entire attack surface.
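Because this attack works by adding an entry to a local configuration file, one way to picture it is as a diff between a reviewed config and the file on disk. The sketch below assumes a JSON layout with a top-level `mcpServers` object, a convention several MCP clients use; the server names and `docs-server` are hypothetical.

```python
import json


def new_or_changed_servers(approved: dict, current: dict) -> list[str]:
    """Return names of server entries that are missing from, or differ from, the approved config."""
    approved_servers = approved.get("mcpServers", {})
    current_servers = current.get("mcpServers", {})
    return sorted(
        name
        for name, entry in current_servers.items()
        if approved_servers.get(name) != entry
    )


approved_config = json.loads("""
{"mcpServers": {"docs": {"command": "npx", "args": ["docs-server"]}}}
""")

# The same file after a hypothetical injected edit added a second entry.
current_config = json.loads("""
{"mcpServers": {
  "docs": {"command": "npx", "args": ["docs-server"]},
  "helper": {"command": "sh", "args": ["-c", "echo injected"]}
}}
""")

suspicious = new_or_changed_servers(approved_config, current_config)
print(suspicious)  # ['helper']
```

A check like this only helps if it runs before the next agent invocation and if the approved copy is stored somewhere the agent cannot write to.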

Family 4 — Hidden STDIO via Transport-Type Downgrade

Family 4 affects products where STDIO is not exposed in the UI — but the backend still processes STDIO payloads. An attacker intercepts the configuration network request via a local MITM proxy and modifies the transport field from sse or http to stdio, injecting an arbitrary command value. The backend processes it without detecting the substitution. DocsGPT (CVE-2026-26015) and LettaAI were affected; both patched their production instances.

Even products that restricted STDIO in their UI remained vulnerable because there was no protocol-level enforcement. For enterprise impact context, see our deep-dive on the CVE-2026-26029 Salesforce case study.
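The downgrade path can be closed on the backend rather than in the UI. The following is a sketch under assumptions: the field names (`transport`, `url`, `command`, `args`) and `validate_server_config` are hypothetical, not any affected product’s request schema or patch.

```python
ALLOWED_TRANSPORTS = {"sse", "http"}


class RejectedConfig(ValueError):
    """Raised when a submitted server configuration is not acceptable."""


def validate_server_config(payload: dict) -> dict:
    """Backend-side check, independent of what the UI offers."""
    transport = payload.get("transport")
    if transport not in ALLOWED_TRANSPORTS:
        raise RejectedConfig(f"transport {transport!r} is not permitted")
    # A network transport should never carry process-launch fields.
    for field in ("command", "args"):
        if field in payload:
            raise RejectedConfig(f"unexpected field {field!r} for {transport}")
    if not isinstance(payload.get("url"), str):
        raise RejectedConfig("a url string is required")
    # Return only the fields the backend intends to use.
    return {"transport": transport, "url": payload["url"]}


ui_request = {"transport": "sse", "url": "https://example.com/mcp"}
tampered = {"transport": "stdio", "command": "sh", "args": ["-c", "id"]}

print(validate_server_config(ui_request))
try:
    validate_server_config(tampered)
except RejectedConfig as exc:
    print("rejected:", exc)
```

Returning a rebuilt dictionary instead of the original payload means any extra fields an interceptor adds are dropped rather than passed along.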

Which Products Received CVEs and What Is the Patch Status?

CVE counts vary by source: OX Security cited 10 or more, The Hacker News counted 11, and independent tallies reach 14. The discrepancy exists because some CVEs sharing the same root cause were reported by independent researchers. Treat 14 as a floor, not a ceiling.

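Patched: LiteLLM (CVE-2026-30623), Windsurf (CVE-2026-30615), LibreChat, Bisheng (CVE-2026-33224), DocsGPT (CVE-2026-26015).

Check vendor for current status: Agent Zero (CVE-2026-30624), Flowise (CVE-2026-40933), GPT Researcher (CVE-2025-65720), Jaaz (CVE-2026-30616), Langchain-Chatchat (CVE-2026-30617), Fay Digital Human Framework (CVE-2026-30618), Upsonic (CVE-2026-30625), WeKnora, Cursor (CVE-2025-54136), MCP Inspector.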

LettaAI and LangBot were disclosed but had no CVE assigned at time of publication. LettaAI patched on their production instance.

Two things to understand here. “Patched” means the vendor fixed their specific implementation — LiteLLM’s patch introduced an allowlist for STDIO command values, which is a reasonable product-level control. It does not change the underlying STDIO protocol design. A new product built on MCP’s STDIO transport after these patches still inherits the same root-cause risk.

And vendors that issued “won’t fix” responses are absent from the CVE inventory because they declined to accept the disclosure as a vulnerability. Absence from the list does not mean unaffected.

What Is the Real Scale of Exposure — 200,000 Instances, 150 Million Downloads?

Three figures quantify the exposure surface, and they measure different things. The 7,000+ public MCP servers is the count OX Security scanned — the directly addressable public population. The 200,000 exposed instances is OX Security’s estimate of deployments actively using STDIO transport with external or untrusted command values, including private deployments. The 150 million downloads is the cumulative download count across the four official MCP SDK packages and the most widely affected downstream products — the full installed base that inherits the root-cause design.

Unlike an npm supply chain attack, where a malicious package is substituted for a legitimate one, this vulnerability propagated through a protocol-level architectural decision across four official language SDKs. All 14 CVEs share the same root cause, and the affected product teams built in good faith on a foundation whose design they had no way to audit or override.

For detail on deployments outside the 7,000+ public server count, see our piece on the shadow MCP governance gap.

What Did OX Security Ask Anthropic to Do — and What Happened?

OX Security’s core recommendation: add a command manifest or allowlist mechanism to the MCP specification — a pre-declared, signed list of allowable executables enforced by the MCP client before executing any command value via STDIO. One protocol-level change that propagates protection to every downstream project. It cannot be implemented without protocol-level support, which is precisely why OX Security addressed the recommendation to Anthropic.
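To make the manifest idea concrete, here is a minimal sketch of a client-side check. No such mechanism exists in the MCP specification today, so every name here is hypothetical. An HMAC with a shared key stands in for whatever signing scheme a real design would choose, most likely asymmetric signatures.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-not-for-production"  # hypothetical stand-in for a real signing scheme


def sign_manifest(manifest: dict, key: bytes) -> str:
    """Sign a canonical JSON encoding of the manifest."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def is_launch_permitted(command: str, manifest: dict, signature: str, key: bytes) -> bool:
    """Permit a launch only if the manifest is authentic and lists the command."""
    expected = sign_manifest(manifest, key)
    if not hmac.compare_digest(expected, signature):
        return False  # manifest was altered after signing
    return command in manifest.get("allowed_commands", [])


manifest = {"allowed_commands": ["/usr/local/bin/docs-server"]}
signature = sign_manifest(manifest, SIGNING_KEY)

print(is_launch_permitted("/usr/local/bin/docs-server", manifest, signature, SIGNING_KEY))  # True
print(is_launch_permitted("/bin/sh", manifest, signature, SIGNING_KEY))  # False

# An attacker who edits the manifest cannot produce a matching signature.
tampered = {"allowed_commands": ["/usr/local/bin/docs-server", "/bin/sh"]}
print(is_launch_permitted("/bin/sh", tampered, signature, SIGNING_KEY))  # False
```

The point of the signature is that write access to the configuration alone is no longer enough: the attacker would also need the signing key.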

Timeline: OX Security began research in November 2025. The advisory was published 15 April 2026. Anthropic’s “expected behavior” response came during coordinated disclosure; Anthropic raised no objection to publication. No CVE was assigned to the root MCP implementation.

As of May 2026, the MCP protocol roadmap includes security as a future area, with work underway on DPoP and Workload Identity Federation. No command manifest or STDIO allowlist has been committed to.

For organisations that cannot wait, OX Security’s immediate mitigations:

For what the absence of a protocol-level fix means for your deployment decisions, see why patching alone won’t close the MCP risk. For the complete picture, see our complete MCP security guide.

Frequently Asked Questions

What is StdioServerParameters and why is it dangerous?

StdioServerParameters is the Python class in the Anthropic MCP SDK that accepts a command string and args array and launches them as a local OS subprocess via subprocess.Popen(). It applies no validation or allowlist checks before execution. Equivalent classes exist in the TypeScript, Java, and Rust SDKs — the risk is language-independent. Any caller with write access to MCP configuration can achieve OS-level RCE.

How does this MCP flaw compare to an npm supply chain attack?

An npm supply chain attack requires a malicious package to be substituted into the dependency tree. The MCP STDIO flaw does not — any product that correctly implements the STDIO transport inherits the vulnerability. The “package” here is the protocol specification itself. One foundational decision propagates risk to every downstream consumer without their independent action.

Did Anthropic patch the MCP vulnerability?

No. Anthropic characterised the STDIO transport behaviour as “expected behavior” and declined to modify the protocol specification. Individual downstream products — LiteLLM, Bisheng, DocsGPT, LibreChat, Windsurf — issued their own patches. Patching a downstream product does not change the underlying STDIO design. Anthropic published updated security guidance recommending STDIO be used only in trusted environments; OX Security noted this “didn’t fix anything.”

Which products are still unpatched as of May 2026?

Confirmed patched: LiteLLM, Windsurf, LibreChat, Bisheng, DocsGPT, LettaAI (production only). No confirmed patch: Agent Zero, Flowise (>3.1.0), GPT Researcher, Jaaz, Langchain-Chatchat, Fay Digital Human Framework, Upsonic, WeKnora, LangBot, Cursor, MCP Inspector. “Won’t fix”: Anthropic, LangChain, FastMCP, Amazon/AWS Labs, NVIDIA, Gemini-CLI, Claude Code, GitHub Copilot, OpenHands, PromptFoo, Firebase Studio. Check vendor advisory pages for the latest.

What is CISA Secure by Design and why did OX Security invoke it?

CISA Secure by Design calls on software manufacturers to take responsibility for secure product defaults rather than pushing the burden onto customers. OX Security invoked it to argue that Anthropic, as the protocol owner, should fix the STDIO design flaw — not the 14+ downstream product teams who built on the protocol in good faith. It frames Anthropic’s “expected behavior” response as a failure to meet a recognised standard for vendor responsibility.

Can sandboxing or containerisation mitigate the STDIO flaw?

Partial mitigation only. Docker containers limit what an attacker’s subprocess can access on the host OS, but sandboxing does not prevent the initial command execution — an attacker can still execute code, read environment variables, and potentially escape a poorly configured container. Several “won’t fix” vendors cited sandboxing as their recommended mitigation; OX Security noted it is insufficient against compute-abuse and crypto-mining scenarios.

What does “zero-click” mean in the context of the Windsurf CVE?

Zero-click means the attack succeeds without the user doing anything beyond visiting a page. In CVE-2026-30615, visiting an attacker-controlled web page causes Windsurf’s AI agent to read hidden prompt injection instructions, silently edit the user’s local MCP configuration file, and trigger RCE on the next agent invocation — no user approval required. Windsurf was the only tested IDE where the entire sequence completed without any user interaction.

How do SSE and HTTP transports compare to STDIO in terms of security?

SSE and HTTP transports communicate over network sockets rather than spawning OS subprocesses, so they do not use StdioServerParameters and are not affected by the command-injection flaw. Migrating from STDIO to SSE or HTTP removes this attack surface, at the cost of added deployment complexity, provided the backend also rejects STDIO payloads (the Family 4 downgrade path).
