The Real Reason Enterprise AI Fails — It Is Not the Data

When your AI pilot stalls, the conversation usually goes the same way. Someone says the data wasn’t ready. The team nods. Leadership accepts it. The pilot gets shelved — or more likely it enters a kind of organisational limbo where it isn’t quite dead but it isn’t going anywhere either.

The statistics on enterprise AI failure are consistent. McKinsey finds only 6% of organisations qualify as AI high performers, despite 88% having adopted AI in at least one function. BCG found 74% of companies have yet to show any tangible value from their AI efforts. Bad data doesn’t explain numbers like those.

BCG’s research found that only 30% of AI success hinges on the technology: 10% on algorithms and 20% on data and technology infrastructure. The other 70% comes down to people, processes, and organisational design. Most teams are focused on the 30% of the problem that looks familiar — the model, the architecture, the pipeline — while the real 70% goes completely unaddressed.

This article unpacks why the “bad data” narrative is really pointing to an organisational problem most companies haven’t yet named.

Why Is “Bad Data” a Symptom and Not the Root Cause of AI Failure?

Gartner cites poor data quality as a factor in 85% of AI project failures. That statistic is real. The causal story built around it usually isn’t.

Data problems in most failed pilots trace back to leadership decisions that were avoided. Cleansing data pipelines was never funded. Permissions fragmentation was never resolved. Data governance was never properly resourced. Bain’s research found pilots often succeed precisely because they’re built on offline, non-production datasets that someone manually cleaned. When you try to scale across the enterprise, those underlying data issues resurface — because nobody made the call to fix them.

Daniel Clydesdale-Cotter at RT Insights put it plainly: “When AI stalls, the blame lands on regulation, the models, or ‘our data isn’t ready.’ Safe targets, all of them. Nobody gets fired for bad data. But these explanations let everyone off the hook for the actual problem.”

IDC explicitly framed it as a question of “organisational readiness in terms of data, processes and IT infrastructure” — not data quality per se. Organisational readiness is a different order of problem from data quality. It requires leadership commitments, not engineering solutions.

When someone says “our data wasn’t ready,” what they’re usually describing is a series of avoided leadership decisions.

What Is the BCG 10-20-70 Principle and Why Does It Reframe Everything?

BCG’s 10-20-70 Principle comes from their “Widening AI Value Gap” research (Build for the Future, 2025). The finding: optimal AI investment weighting is 10% on algorithms, 20% on data and technology infrastructure, and 70% on people, processes, and cultural transformation.

That’s counter-intuitive if you’re focused on technical execution. The part of the problem that looks most familiar accounts for less than a third of what actually determines success.

Here’s a concrete example. A 150-person FinTech company spends 80% of its AI budget on model development and 20% on workflow integration. The model works. The adoption doesn’t. Customer support staff don’t trust the outputs. Managers haven’t changed their workflows to act on AI recommendations. Nobody was assigned to own the business result. The 70% was never funded. The pilot succeeds as a demo and then stalls.

BCG’s future-built companies achieve five times the revenue gains and three times the cost reductions that everyone else gets from AI. Those future-built companies — the 5% generating transformative value — have learned to invest in the 70%. The 60% generating minimal value keep over-investing in the 10%.

The gap between future-built companies and AI pilot purgatory is widening because one side has figured this out and the other hasn’t. For the full statistical picture of how this pattern plays out across enterprises, the comprehensive AI pilot purgatory resource covers it.

What Is Outcome Ownership and Why Does Its Absence Keep Pilots in Purgatory?

Outcome ownership means a business leader — not a data scientist, not an AI engineer — is explicitly accountable for the business result of an AI initiative.

Most AI projects don’t have one. The structural gap is straightforward: a data scientist is assigned to the experiment, but no business leader is assigned to own the result. When the experiment ends, there’s no named owner to fund, defend, or operationalise the move to production. The project remains technically alive and organisationally orphaned.

RT Insights puts it plainly: “Getting to production means someone has to own the outcome.” When it stays in the hands of specialists, it stays in pilot purgatory.

The distinction matters in practice. If the success metric is “model accuracy of 92%”, the AI team owns that. If it’s “reduce contract review cycles from two weeks to two days”, that requires a business owner — someone whose performance depends on the outcome, not just the output.

Without that named owner, the organisational machinery for moving to production simply doesn’t exist. This is an organisational design question, not a personnel one — and it’s exactly what how to structure AI outcome ownership is about.

Why Do Leaders Approve AI Pilots They Know Are Underfunded?

IDC Group VP Ashish Nadkarni described the dynamic directly: “These POCs are highly underfunded or not funded at all. Most of the time the POC happens not because of a strong business case. It’s trickle-down economics to me.”

Approving an underfunded pilot is lower-risk than either refusing to participate in the AI wave or requesting a realistic budget that might get knocked back. The pilot becomes a hedge — visible AI activity without organisational commitment.

RT Insights calls this leadership avoidance. The decisions that would actually fix AI failure — funding data governance, restructuring workflows, assigning business ownership, committing to change management — are all politically difficult. It’s structurally easier to approve an underfunded pilot than to have those conversations.

The result: pilots succeed as demos and stall when the organisational transformation required for production is never funded. Until outcome ownership is structurally assigned, those incentives will keep producing purgatory.

AI Centre of Excellence vs. Distributed Ownership: Which Model Ships More AI?

There are two dominant approaches for organising AI in a mid-market company. A centralised AI Centre of Excellence (CoE) provides governance, tooling, and expertise across business units. Distributed ownership embeds AI capability directly into business units, with central platform support.

The CoE model has real advantages: centralised expertise, consistent governance, reduced duplication, easier compliance oversight. The failure mode is structural. Business units submit requests. The CoE builds and deploys. But because the business unit didn’t build it, they don’t own it. Outcome ownership never transfers. The CoE owns the system in production permanently, and accountability stays with the AI team rather than the business function.

The federated model, where business units own their AI outcomes with platform support from the centre, resolves this directly. The team lead accountable for the business function is also accountable for the AI system serving it.

For a 50–500 person company, the practical answer is a federated model: central platform infrastructure (MLOps, data governance, model registry) and distributed business outcome ownership. The condition that determines whether any model works is where outcome ownership lives. The CTO pilot triage framework for evaluating existing structures starts with exactly that question.

What Do the 6% of AI High Performers Do Differently on the Organisational Dimension?

McKinsey’s State of AI 2025: 88% of organisations report AI use in at least one business function, but only 39% report any impact on enterprise-level EBIT. The distinguishing factor between organisations that ship and those that don’t isn’t technology. It’s organisation.

Three traits consistently separate AI high performers from the rest. First: clear outcome ownership. A named business leader is accountable for the business result of every AI initiative. Second: cross-functional accountability. Engineering and business teams share ownership of the outcome metric throughout the pilot and into production. Third: funded change management. The 70% of BCG’s framework — workflow redesign, training, adoption management — is budgeted and staffed, not treated as an afterthought after deployment.

McKinsey found high performers are three times more likely to strongly agree that senior leaders demonstrate ownership of and commitment to their AI initiatives. That’s not about leadership cheerleading. It’s about structural accountability.

The difference between the 6% and the 94% is not about having better data scientists. The high performers have built organisational structures that allow technical capability to translate into production outcomes. The leverage point isn’t in the codebase. It’s in the governance model. And defining production readiness criteria before you start begins with organisational design, not model selection.

FAQ

What percentage of enterprise AI projects fail and why?

IDC/Lenovo’s CIO Playbook 2025 found 88% of AI POCs fail to reach widescale deployment. MIT research found 95% of enterprise generative AI pilots fail to deliver measurable financial returns. S&P Global found 42% of companies scrapped most of their AI initiatives in 2025. BCG’s 10-20-70 Principle identifies the root cause: 70% of AI success depends on people, process, and cultural transformation — the majority of failures are organisational, not technical.

What is the BCG 10-20-70 Principle in simple terms?

10% of AI success depends on the algorithm or model. 20% depends on data and technology. 70% depends on people, process, and cultural transformation. BCG identifies the inversion of this ratio — over-investing in the technical 30% while under-investing in the human 70% — as the primary reason 60% of companies are generating minimal value from AI investments despite substantial spend.

What does “organisational readiness for AI” actually mean?

IDC’s framing: “The high number of AI POCs but low conversion to production indicates the low level of organisational readiness in terms of data, processes and IT infrastructure.” Agility at Scale breaks it into three deltas — technical (infrastructure), governance (oversight and accountability), and operations (MLOps, monitoring, incident response). All three need to be in place before you can call something production-ready.

What is “pilot fatigue” and how do you recognise it?

Pilot fatigue sets in when an organisation has invested real resources in AI pilots that haven’t shipped. The symptoms: sponsors losing confidence, AI teams disengaging, a growing list of “completed” pilots with no production deployments, and mounting budget resistance to new AI proposals. Beam AI’s analysis found 42% of enterprises deployed AI without seeing any ROI, with an additional 29% reporting only modest gains.

How is the AI Centre of Excellence model different from distributed AI ownership?

In a CoE model, a central AI or data science team governs and delivers all AI use cases across business units. The failure mode: outcome ownership stays with the AI team, not the business function.

In a distributed model, business units own their AI initiatives with central platform support. The advantage: outcome ownership is embedded in the business function accountable for the result. The risk is governance fragmentation if the central platform is too thin.

What does “outcome ownership” mean for an AI project?

A specific, named business leader is accountable for the business result — measured in business terms, not technical terms. “Reduce contract review cycles by 60%” is a business outcome. “Model accuracy of 92%” is a technical output. RT Insights defines it as assigning the business leader who carries accountability for production results, not just the technical team building the system.

Why do organisations blame data problems for AI failure even when the real cause is organisational?

“Bad data” is a socially safe explanation. It’s technical, impersonal, and implies a fixable problem rather than a leadership or governance failure. Bain’s research reinforces this: data problems persist not because data engineering is impossible but because “ownership is often unclear, defaulting to system administrators or data platform teams; without business-aligned ownership, governance lacks direction.”

What is the difference between an AI proof of concept and production deployment?

A proof of concept is a time-boxed experiment in a controlled environment — typically on clean or synthetic data — designed to validate technical feasibility. A production deployment operates at scale with real users, real data, and real business consequences. Agility at Scale frames the distance between them as three deltas — technical, governance, and operations — that a pilot rarely addresses.

How does executive sponsorship affect AI project outcomes?

McKinsey found AI high performers are three times more likely than peers to have senior leaders who demonstrate ownership of and commitment to their AI initiatives. Board-level pressure to “do AI” is not the same as genuine executive sponsorship. Pressure without follow-through — funding change management, restructuring workflows, assigning outcome ownership — produces underfunded pilots that are structurally set up to stall.

Why do only 6% of organisations qualify as AI high performers despite 88% adoption?

Widespread adoption doesn’t produce widespread scaling. Most organisations are still in experimentation or piloting phases and haven’t built the structures to scale. The three structural differentiators: redesigned workflows, committed leadership with demonstrated ownership, and funded investment across all the elements required for production — not just the technical ones.

What should you do first if AI projects keep stalling in purgatory?

RT Insights recommends starting with measurable business outcomes — “reduce customer service response time by 40 percent” rather than “implement AI.” If you can’t articulate the cost of the non-AI alternative, the problem hasn’t been defined clearly enough. Then run the 10-20-70 audit: estimate your actual allocation across algorithms, data and technology, and people and process. Where the split is inverted, start by reallocating — not by improving the technology.
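One way to make that audit concrete is a short script that puts your actual allocation next to the 10-20-70 weighting and flags the inverted categories. The sketch below is illustrative: the category names, example figures, and the flagging threshold are assumptions, and only the 10/20/70 targets come from BCG's framework.

```python
# A minimal 10-20-70 audit sketch. Category names, example figures, and the
# 15-point "inverted" threshold are illustrative assumptions; only the
# 10/20/70 target weighting comes from BCG's framework.

RECOMMENDED = {"algorithms": 0.10, "data_and_tech": 0.20, "people_and_process": 0.70}

def audit_allocation(spend_by_category: dict) -> None:
    total = sum(spend_by_category.values())
    for category, target in RECOMMENDED.items():
        actual = spend_by_category.get(category, 0.0) / total
        flag = "  <-- inverted" if abs(actual - target) > 0.15 else ""
        print(f"{category:<20} actual {actual:4.0%}  target {target:4.0%}{flag}")

# Example: a budget weighted towards the technical 30%.
audit_allocation({
    "algorithms": 400_000,          # model development
    "data_and_tech": 450_000,       # pipelines, infrastructure, tooling
    "people_and_process": 150_000,  # training, workflow redesign, change management
})
```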

For a comprehensive overview of why enterprise AI projects fail — covering the full scope of failure statistics, root causes, and the CTO decision framework — see the complete guide to the enterprise AI pilot purgatory problem.

Why 88 to 95 Percent of Enterprise AI Pilots Never Reach Production

U.S. businesses spent $35–40 billion on generative AI initiatives. MIT’s NANDA initiative found approximately 95% of those pilots delivered zero measurable returns. In the same period, S&P Global tracked the share of enterprises abandoning most of their AI programmes jumping from 17% to 42% in a single year.

That convergence has a name. Analysts at Astrafy and RT Insights call it “AI pilot purgatory” — the gap between a promising demo and an actual production deployment, where projects are neither cancelled nor shipped. This article is the statistical entry point to the enterprise AI pilot purgatory problem — a complete guide to why enterprise AI pilots fail and what organisations that do ship are doing differently.

The statistics are clear on scale. They’re less clear on mechanism. What keeps a pilot that works as a demo from crossing into production? This article establishes the scale; the mechanism is taken up later in the series.

What does “AI pilot purgatory” actually mean?

AI pilot purgatory is where an AI project lives after it’s cleared initial feasibility testing but before it ever reaches full production. Perpetually extended. Perpetually underfunded. Perpetually at risk of cancellation — without ever formally being cancelled.

Astrafy calls it “costly, enterprise-wide gridlock” where “the critical problem isn’t a lack of trying — it’s a failure to convert a working idea into a reliable, enterprise-grade business asset.” RT Insights puts it in terms most technical leaders will recognise immediately: “That pilot everyone loved in the boardroom? It’s still stuck in staging.”

What it looks like in practice: a team maintaining a working demo for the third quarter in a row. A budget line that keeps getting rolled over. A roadmap slot that never gets prioritised because there’s always something more urgent.

Purgatory is defined by what it’s missing — governance structures, production-grade data, budget commitment beyond the current quarter, and a named owner with actual authority over the production outcome, not just the technical build.

In mid-market companies, purgatory often comes down to single-person ownership. When the technical champion’s attention shifts, the project has no institutional home. McKinsey found nearly two-thirds of organisations remain stuck in “pilot mode,” unable to scale across the enterprise. The full picture of why 88 to 95 percent of enterprise AI pilots never reach production becomes clearer when you look at the research behind that range.

What are the six statistics that define the scale of enterprise AI failure?

The 88% and 95% failure figures are not contradictory estimates from competing research. They measure different points in the same lifecycle.

IDC/Lenovo (88%): The AI CIO Playbook 2025 found that for every 33 AI POCs an enterprise starts, only four reach production. IDC Group VP Ashish Nadkarni: “The high number of AI POCs but low conversion to production indicates the low level of organisational readiness in terms of data, processes and IT infrastructure.”

MIT NANDA (95%): The GenAI Divide report found 95% of generative AI pilots fail to deliver measurable ROI despite $35–40 billion in aggregate spending — not just failure to ship, but failure to return value from what does ship.

McKinsey (88% adoption, limited high performers): 88% of organisations report using AI in at least one business function, but only 39% report any EBIT impact, and most attribute less than 5% of EBIT to AI. Nearly two-thirds haven’t begun scaling AI across the enterprise. Adoption is not value.

PwC (56% of CEOs, no financial impact): PwC’s 29th Global CEO Survey found 56% report no significant financial benefit from AI investments. Only 12% report both cost reduction and revenue growth. The failure is visible right at the top.

S&P Global (17% to 42% abandonment): The share of enterprises abandoning most of their AI initiatives jumped from 17% in 2024 to 42% in 2025. Nearly half of all AI POCs are now scrapped before launch. Purgatory doesn’t persist indefinitely.

Gartner (40%+ agentic cancellation predicted): In June 2025, Gartner predicted over 40% of agentic AI projects will be cancelled by end of 2027 due to rising costs, unclear value, or poor risk controls. The pattern is not historical — it is repeating right now.

These six statistics don’t contradict each other. They measure different failure points — feasibility testing, ROI realisation, executive perception, adoption versus performance, abandonment — and together they describe a systemic problem, not a series of individual project setbacks. The complete guide to AI pilot failure examines how these dynamics play out across different organisational scales.

What is the difference between a POC, a pilot, and a production AI deployment?

The definitional confusion here is not a semantic nuisance. It is itself a cause of purgatory. When an organisation can’t tell the difference between “we have a POC” and “we have a production deployment,” it can’t accurately assess what transition work is actually required. That gap lets projects stall while everyone believes progress is happening.

POC (Proof of Concept): A short, low-resource test of whether an AI capability is technically feasible. Uses synthetic or sample data. Answers one question: can we build something that works at all? Duration: days to weeks.

Pilot: A time-boxed test using real users, real data, and real workflows — bounded scope. Success criterion: evidence of value at limited scale with stakeholder buy-in to expand.

Production Deployment: A fully operationalised AI system running at enterprise scale, integrated into core workflows, with active monitoring, governance, and formal accountability chains. Where only four of every 33 POCs ever arrive.

The most dangerous position is what Agility at Scale calls the “false production” trap: a company can truthfully say “we have AI in production” while operating a system at pilot scale with no governance and no plan to expand. Leadership believes the project has shipped. The engineering team knows it hasn’t. That gap is exactly how purgatory persists invisibly — and how IDC/Lenovo’s 88% and MIT NANDA’s 95% can both be simultaneously true.

Why do pilots succeed as demos but stall before they ship?

The demo works. The stakeholders nod. The brief is positive. And then months pass.

Demo conditions are not production conditions. Pilot data is pre-selected and often synthetic. Production data is owned by multiple teams, governed by compliance rules, and full of edge cases the demo never encountered.

But the structural gap alone doesn’t explain why organisations keep approving pilots without committing to production. IDC’s Ashish Nadkarni put it bluntly: “Most of these gen AI initiatives are born at the board level. These POCs are highly underfunded or not funded at all.” The pilot becomes an institutional hedge — it signals action without committing to the cost or accountability of production. No one explicitly kills the project. It just never moves forward.

At the pilot-to-production boundary, three structural blockers keep showing up. None of them are technology problems.

AI-ready data: Production AI requires governed, accessible, high-quality data that pilot environments never actually test against. Gartner cites poor data quality as a factor in 85% of AI project failures, and predicts that 60% of AI projects will be abandoned without AI-ready data.

AI governance: Production requires accountability structures, monitoring, and compliance integration that pilots skip entirely. In production, someone must own the system’s behaviour and its ongoing costs.

Organisational Change Management: Production requires workflow redesign, training, and stakeholder alignment that pilots never touch. BCG’s 10-20-70 principle is worth knowing here: AI success is 10% algorithms, 20% data and technology, 70% people, processes, and cultural change.

The absence of any one of these is enough to stall production indefinitely. The organisational root causes are examined in the next article in this series.

What is pilot fatigue and when does it become AI abandonment?

Deloitte’s State of AI in the Enterprise 2026 names the accumulated cost of repeated failed pilot cycles: pilot fatigue. The distinction from purgatory matters. Purgatory is a project state — a specific initiative is frozen. Pilot fatigue is an organisational response — the teams and leadership that have lived through repeated purgatory cycles become progressively less capable of running successful future pilots.

The progression is predictable. First pilot stalls — budget renewed, expectations quietly drop. Second pilot stalls — morale declines, champions disengage. Third pilot — executives stop attending reviews. By the time a fourth pilot is proposed, the organisation has lost the institutional knowledge and cultural appetite needed to make a production transition work.

AI abandonment is where severe pilot fatigue ends up. S&P Global’s 17% to 42% abandonment jump is the downstream expression: organisations that spent 12–24 months cycling through unproductive pilots concluded AI investment wasn’t generating returns and redirected resources elsewhere. PwC and S&P Global are describing the same organisations from two different vantage points — 56% of global CEOs reporting no financial impact, and 42% of enterprises abandoning most of their AI initiatives. Cause and effect.

For mid-market leaders, pilot fatigue is personal. In a 100-person company, the CTO who championed the AI programme and has nothing to show faces a credibility risk visible to every person in the organisation. Companies that walk away fall further behind those that don’t. The widening AI value gap is examined in a companion article. The complete enterprise AI pilot purgatory guide maps how pilot fatigue fits within the broader failure landscape.

Why are agentic AI projects failing at even higher rates than traditional AI pilots?

AI pilot purgatory is not a feature of generative AI specifically. It is a structural pattern that repeats with each wave of new AI capability, as organisations invest in the next generation before resolving the readiness problems that stalled the previous one.

Agentic AI — systems that execute multi-step tasks autonomously — is following the same pilot-heavy, production-light trajectory as generative AI. McKinsey found 62% of organisations are at least experimenting with AI agents. Gartner predicts over 40% of those projects will be cancelled by end of 2027.

The three structural blockers are amplified, not reduced. Agentic systems require more robust data governance because they act on data autonomously. More complex integration architecture because a single user query can trigger dozens of internal AI calls. More demanding change management because the workflows they automate are often more central to operations. Deloitte found that close to three-quarters of companies plan to deploy agentic AI within two years, yet only 21% have mature agent governance.

For a full treatment of agentic AI pilot failure and what separates those who succeed from those who cancel, see agentic AI pilot cancellation rates.

What the statistics do not explain — and where to look next

Here is what this article has established: six independent measurements of enterprise AI failure, a definition of AI pilot purgatory and its lifecycle stages, and evidence that purgatory progresses through pilot fatigue to abandonment at accelerating rates.

What the statistics don’t establish is the mechanism. They describe the scale precisely. They don’t explain why a pilot that succeeds as a demo consistently fails to cross into production.

McKinsey’s analysis of AI high performers found the distinction is not technical: high performers redesign workflows, maintain committed leadership, and invest at larger scale. PwC found companies with strong AI foundations are three times more likely to report meaningful financial returns. That’s an organisational readiness finding, not a technology finding.

The organisational root causes of AI pilot purgatory are examined in the next article — starting with the most commonly misdiagnosed one.

One final note on competitive position: organisations stuck in purgatory are not holding steady. BCG found that AI leaders achieve 1.5x higher revenue growth and 1.6x greater shareholder returns than laggards. The widening AI value gap examines that trajectory in full. For the comprehensive overview covering all failure dimensions, statistical evidence, and the CTO decision framework, see the enterprise AI pilot purgatory statistics and analysis guide.

Frequently Asked Questions

Is an 88% AI pilot failure rate the same as a 95% failure rate — which number is right?

Both are correct — they measure different things. IDC/Lenovo count POCs that never transition to production (88% fail to ship). MIT NANDA count pilots that reach some form of production but fail to generate measurable ROI (95% fail to return value). Complementary, not contradictory.

What exactly is AI pilot purgatory?

AI pilot purgatory is the state in which an AI project has passed initial feasibility testing but never achieves full production deployment — neither cancelled nor shipped, perpetually extended, consuming maintenance effort without delivering production value. The term is used by analysts at Astrafy and RT Insights.

What is pilot fatigue?

Pilot fatigue is Deloitte’s term for the organisational exhaustion that results from repeated AI pilot cycles producing no production outcomes. You see it as declining team morale, budget scepticism, and executive disengagement. Purgatory is a project state — the initiative is frozen. Fatigue is an organisational response — the teams and leadership are exhausted from trying.

Why did AI abandonment jump from 17% to 42% in one year?

S&P Global documented a more-than-doubling in enterprise AI abandonment in a single year. Organisations that spent 12–24 months cycling through unproductive pilots concluded AI investment wasn’t generating returns and redirected resources elsewhere. Nearly half of all AI POCs are now scrapped before launch.

Why are only 6% of companies AI high performers despite 88% claiming AI adoption?

McKinsey defines “high performers” as organisations demonstrating AI deployment at scale with measurable financial returns. The 88% adoption figure includes any AI use — isolated tools, unresolved pilots. Nearly two-thirds of McKinsey respondents haven’t begun scaling AI across the enterprise. The gap between adoption and high performance is the pilot-to-production transition problem, measured.

What does “AI-ready data” mean and why does it block production deployment?

AI-ready data is Gartner’s term for data meeting the quality, governance, and accessibility requirements for AI models to function in production. Pilot environments use pre-cleaned, selected data subsets. Production systems must consume real enterprise data governed by compliance rules and owned by multiple teams. Gartner cites poor data quality as a factor in 85% of AI project failures, and predicts that 60% of AI projects will be abandoned without AI-ready data.

What is the GenAI Divide?

The GenAI Divide is MIT NANDA’s framing for the structural gap between the roughly 5% of organisations achieving measurable ROI from generative AI and the 95% that do not. It’s not a gap in technical access or investment — the divide reflects differences in organisational readiness, data infrastructure, and change management capability.

Why does Gartner predict 40%+ of agentic AI projects will be cancelled by 2027?

Agentic AI faces the same pilot-to-production blockers as generative AI, but at greater complexity. Deloitte found that close to three-quarters of companies plan to deploy agentic AI within two years, yet only 21% have mature agent governance. Organisations investing in agentic AI without resolving their generative AI structural failures are reproducing the same pattern at higher stakes.

What the AI Inference Cost Crisis Means for Growing Software Companies

Running an AI product costs more than running a traditional SaaS product. Every query your users make, every document your product processes, every recommendation it surfaces triggers an inference computation that draws on GPU capacity. That compute runs perpetually, at scale, and it does not get cheaper as your product matures the way a traditional code deployment does.

AI companies at scaling stage spend an average of 23% of revenue on inference alone — nearly matching total engineering headcount as a cost line (ICONIQ State of AI 2026). Enterprise LLM API spend reached $8.4 billion in 2025, more than double the year before (Menlo Ventures State of GenAI 2025). And despite token prices falling across the market, total AI infrastructure budgets have grown, not shrunk.

This guide covers why the gap between AI and SaaS economics exists, how to diagnose your own position, and which decisions you face across infrastructure, pricing, and governance. Each section links to a deep-dive article. Start with the question most relevant to where you are now.

Why does running AI cost more than building it?

AI training is a one-time expense. You pay to build the model. Inference is the perpetual cost of running it — every user interaction triggers computation on expensive GPU hardware, billed by the token or by the second. Unlike SaaS software, where the same code serves millions of users at near-zero marginal cost, AI products consume hardware capacity with every query.

Once a model is deployed, inference accounts for 80–90% of its total lifetime compute cost (ByteIota). Inference crossed 55% of all AI cloud infrastructure spending in early 2026. The financial model for an AI product is closer to a professional services firm — where delivery cost scales with volume — than to a software company. The sections below break down each dimension of this cost structure, starting with how the numbers compare to traditional SaaS; for the macro forces driving the AI inference cost crisis — the hardware supply dynamics and hyperscaler CapEx that set the cost floor — see the market section further down.

How does the AI inference market compare to traditional SaaS economics?

Traditional SaaS companies achieve 70–90% gross margins because software delivery is nearly free at scale. AI companies average roughly 52% gross margins (ICONIQ State of AI 2026) because inference is a continuous cost of goods sold that does not amortise across users. Every additional user adds compute cost. That 20–40 percentage point gap is not a startup inefficiency — it is a structural feature of the AI delivery model.

Understanding the margin gap is the first step. But most companies first encounter the cost problem when a pilot moves to production.

Read the full analysis: Why AI Gross Margins Are So Much Lower Than SaaS and What That Means for Your Business.

Why do AI costs explode when a pilot goes into production?

Pilot costs and production costs measure different things. A pilot runs under controlled conditions with known inputs, capped usage, and no need for resilience or monitoring. Production adds all of those cost categories at once. 80% of enterprises miss their AI cost forecasts by more than 25% (Mavvrik/Benchmarkit), and 95% report overspending against AI infrastructure budgets.

The gap is not about the model getting more expensive. It is about the full production infrastructure stack — data pipelines, monitoring, logging, network egress, overprovisioning — being invisible during testing. The why AI bills explode between pilot and production article documents the specific mechanisms, including agentic AI call chains that multiply costs 5–20x per user action.
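A back-of-envelope estimator makes the size of that gap visible before the first production bill arrives. The sketch below is a simplification: the call-chain multiplier reflects the 5–20x range above, while the overhead percentage, the example volume, and the function itself are assumptions to replace with your own measurements.

```python
# Rough pilot-to-production cost estimator (a sketch, not a forecasting model).
# The agentic call-chain multiplier reflects the 5-20x range cited above; the
# infrastructure overhead figure and the example volume are illustrative.

def estimate_production_monthly_cost(
    pilot_cost_per_call: float,      # measured per-model-call cost during the pilot
    monthly_user_actions: int,       # expected production volume
    calls_per_action: float = 8.0,   # agentic chains often trigger 5-20 model calls per action
    infra_overhead: float = 0.35,    # monitoring, logging, egress, overprovisioning
) -> float:
    inference = pilot_cost_per_call * calls_per_action * monthly_user_actions
    return inference * (1 + infra_overhead)

# A pilot measuring $0.002 per call at a few hundred calls a day looks cheap.
# The same workload at production volume with agentic call chains does not.
print(f"${estimate_production_monthly_cost(0.002, 1_000_000):,.0f} per month")
```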

Read the full analysis: Why Your AI Bill Exploded Between Pilot and Production and How to Predict the Real Cost.

How do you decide between cloud, on-premises, and hybrid AI infrastructure?

Cloud APIs are the correct default for most growing companies: no capital expenditure, no GPU management overhead, instant access to the latest models, and flexible scale. The risks are vendor lock-in and per-token pricing exposure at high volumes.

But 67% of enterprises are actively planning to repatriate AI workloads to on-premises infrastructure, and 61% already run hybrid setups (Mavvrik). The decision comes down to your token volume, workload predictability, and how much of your cloud bill is going to inference. Once you know those numbers, the right architecture tends to be obvious.

Read the full analysis: Cloud vs On-Premises vs Hybrid AI Inference — A Decision Framework Based on Real Cost Data.

What are the fastest ways to reduce AI inference costs?

The fastest reductions come from changes that require no infrastructure work. Prompt caching can reduce costs 50–90% for use cases with repeated context — RAG pipelines, multi-turn conversations, document processing. A workload costing $10,000 per month can drop to $1,000 with caching alone.

Model routing — directing simple queries to cheaper models and reserving frontier models for complex requests — delivers another 30–60%. Start with caching and routing before committing to infrastructure changes like quantisation or custom model serving. The effort-to-impact ratio is what matters. The AI inference optimisation playbook sequences every technique by effort-to-impact ratio so you know exactly where to start.
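For concreteness, here is a minimal application-level sketch of those two techniques. The model names, the length-based routing heuristic, and the call_model placeholder are assumptions rather than any specific provider's API; production systems typically rely on provider-side prompt caching and a trained router rather than a local dictionary, but the structure is the same.

```python
import hashlib

# Minimal sketch of the two cheapest optimisations: a response cache plus a
# crude model router. Model names, the routing heuristic, and call_model are
# placeholders, not any specific provider's API.

CACHE = {}

def call_model(model: str, prompt: str) -> str:
    # Stand-in for your actual LLM API call.
    return f"[{model}] response"

def route(prompt: str) -> str:
    # Crude heuristic: short prompts go to the cheap model, long or
    # multi-document prompts go to the frontier model.
    return "cheap-small-model" if len(prompt) < 500 else "frontier-large-model"

def completion(prompt: str) -> str:
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in CACHE:          # repeated context costs nothing the second time
        return CACHE[key]
    response = call_model(route(prompt), prompt)
    CACHE[key] = response
    return response

print(completion("Summarise this support ticket: ..."))
```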

Read the full analysis: The AI Inference Optimisation Playbook — Caching, Quantization, and Model Routing in Priority Order.

How should AI product pricing account for variable inference costs?

Subscription pricing transfers inference cost risk entirely to you: if a customer uses the product heavily, you absorb the full cost with no additional revenue. That works if usage per customer is tightly constrained. For most AI products, it is not.

Three pricing archetypes handle this differently. Consumption-based (per query or token) passes cost variability to customers. Workflow-based (per completed task) ties price to something customers understand. Outcome-based (per result achieved) decouples your cost exposure from usage volume entirely — Intercom charges $0.99 per ticket their AI resolves, not per message or token. Outcome-based pricing jumped from 2% to 18% of AI companies in six months (ICONIQ). And 37% of companies plan to change their AI pricing model within the next 12 months. Understanding how to design AI product pricing for variable inference costs is essential before your margin problem becomes irreversible.
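A toy calculation shows how cost exposure shifts between the archetypes. In the sketch below, only the $0.99 per resolved ticket echoes the Intercom example above; the per-ticket inference cost, the subscription fee, the consumption rate, and the ticket volume are invented for illustration.

```python
# Toy gross-margin comparison for one heavy-usage customer under the three
# pricing archetypes. All inputs are illustrative except the $0.99 per
# resolved ticket, which echoes the Intercom example above.

inference_cost_per_ticket = 0.30   # assumed cost to run AI on one ticket
tickets_this_month = 4_000         # assumed volume for a heavy customer
monthly_cost = inference_cost_per_ticket * tickets_this_month

def report(label: str, revenue: float) -> None:
    margin = (revenue - monthly_cost) / revenue
    print(f"{label:<14} revenue ${revenue:>8,.0f}  inference ${monthly_cost:,.0f}  margin {margin:5.1%}")

report("subscription", 1_500.0)                   # flat fee: vendor absorbs all usage risk
report("consumption", 0.40 * tickets_this_month)  # priced per ticket processed
report("outcome", 0.99 * tickets_this_month)      # per ticket resolved (all assumed resolved here)
```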

Read the full analysis: How to Design AI Product Pricing That Survives Variable Inference Costs.

How do growing companies build governance over AI infrastructure spend?

AI cost governance does not require a dedicated team. It requires cost attribution — tagging every inference call by product feature, team, and customer — and budget alerts that surface overruns in real time rather than in the next billing cycle.

The gap between “we track costs” and “we govern costs” is wide: 94% of companies say they track AI costs, but only 34% have mature cost management (Mavvrik/Benchmarkit). If your monthly AI bill varies by more than 20% without a clear explanation, that gap is where the money is going. The how to build AI cost governance without a dedicated FinOps team guide translates enterprise FinOps practice to the 50–500 person company context.
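In practice, the minimum viable version is a thin wrapper around your inference calls that records cost against feature, team, and customer tags and compares running totals to a budget. The sketch below is illustrative: the tag names, budgets, and print-based alert are placeholders for whatever telemetry and alerting you already run.

```python
from collections import defaultdict

# Minimal cost-attribution sketch: tag every inference call by feature, team,
# and customer, and alert when a feature crosses its monthly budget. Tag names,
# budgets, and the print-based alert are placeholders for your own stack.

monthly_spend = defaultdict(float)
FEATURE_BUDGETS = {"search-summaries": 2_000.0, "support-copilot": 5_000.0}

def record_inference(feature: str, team: str, customer: str, cost_usd: float) -> None:
    for dimension, value in (("feature", feature), ("team", team), ("customer", customer)):
        monthly_spend[(dimension, value)] += cost_usd
    budget = FEATURE_BUDGETS.get(feature)
    if budget is not None and monthly_spend[("feature", feature)] > budget:
        print(f"ALERT: '{feature}' has exceeded its ${budget:,.0f} monthly budget")

# Call this from the same wrapper that makes the LLM request.
record_inference("support-copilot", "cx-engineering", "acme-corp", cost_usd=1.74)
```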

Read the full analysis: How to Build AI Infrastructure Cost Governance Without a Dedicated FinOps Team.

What macro forces are driving the AI inference cost crisis?

The crisis is structural, not cyclical. The AI inference market is projected to grow from $106 billion in 2025 to $255 billion by 2030 (TensorMesh). Hyperscaler capital expenditure hit $600 billion in 2026, with 75% tied to AI infrastructure (ByteIota). Energy demands add another cost floor: AI inference will consume 165–326 terawatt-hours annually by 2028.

Token prices will continue to fall, but total budgets will continue to grow as usage expands. Cheaper inference leads to more AI features, more users, and more total consumption — at a rate that outpaces the price reduction. Planning to wait for prices to drop to near-zero is not a viable strategy.

Given these structural dynamics, the question becomes where to start.

Read the full analysis: The AI Inference Market in 2025 — Hardware Consolidation, Pricing Wars, and What It Means for Buyers.

Where do you start if your AI costs are already out of control?

If your AI costs are already out of control, the first priority is visibility: you cannot optimise what you cannot measure. The second is diagnosis. Here is how to find the right starting point:

“I just got a much larger bill than expected after going to production” — Your problem is the pilot-to-production cost gap: Why Your AI Bill Exploded Between Pilot and Production.

“My AI product has good usage but low gross margins” — Your pricing model may not be recovering inference costs: Why AI Gross Margins Are Lower Than SaaS.

“I know costs are high but I don’t know where the spend is going” — You need cost attribution: How to Build AI Cost Governance.

“I’m spending too much on cloud APIs and wondering about hardware” — You need an infrastructure decision framework: Cloud vs On-Premises vs Hybrid.

“I just need to reduce costs now” — Start with the highest-impact, lowest-effort optimisations: The AI Inference Optimisation Playbook.

“I need to rethink my pricing” — Evaluate the three archetypes: How to Design AI Product Pricing.

“I want to understand the market before making decisions” — Start with the macro view: The AI Inference Market in 2025.

Resource Hub: AI Inference Cost Crisis Library

Understanding the Economics (Awareness)

Making Infrastructure and Pricing Decisions (Decision)

Reducing Costs and Building Governance (Implementation)

Frequently Asked Questions

What is AI inference, and why does it cost more than traditional software infrastructure?

AI inference is the process of running a trained AI model to generate outputs in response to live user inputs — every query, summary, or recommendation your product delivers triggers inference. Unlike traditional software, where the same code serves unlimited users at near-zero marginal cost, inference draws on real GPU compute capacity with every request. That compute is expensive and scales with usage, not with headcount or feature count.

Relevant deep dive: The AI Inference Market in 2025 covers the hardware economics that set the cost floor.

Why do 80% of enterprises miss their AI cost forecasts by more than 25%?

Because pilot-phase costs bear no relationship to production costs. A pilot measures API costs under controlled, low-volume, low-complexity conditions. Production adds data pipelines, monitoring infrastructure, logging, network egress, overprovisioning buffers, and the cost multiplication effect of agentic AI workflows. Companies using pilot cost data to model production consistently underestimate total spend.

Relevant deep dive: Why Your AI Bill Exploded Between Pilot and Production

What does the 23% of revenue inference benchmark mean?

ICONIQ’s State of AI 2026 report found that AI companies at scaling stage spend an average of 23% of revenue on inference costs — making inference a line item roughly equivalent to total engineering headcount cost. This is a benchmark, not a target: some efficient companies spend significantly less; others spend more. The value of the benchmark is calibration — if your inference spend is materially higher, it signals a structural problem worth investigating.

Is it worth buying your own GPUs or staying on cloud AI APIs?

For most companies under 500 people, cloud APIs are the right default. On-premises GPU infrastructure only becomes cost-justified when AI inference represents 60–70% or more of your total cloud spend, your workloads are stable and well-defined, and your team has the operational capacity to manage hardware. Below that threshold, the flexibility and capital efficiency of cloud APIs almost always outweigh the per-token savings of on-premises compute.

Relevant deep dive: Cloud vs On-Premises vs Hybrid AI Inference

Why do token prices keep falling but AI bills keep rising?

This is the Jevons Paradox applied to AI infrastructure: when inference becomes cheaper per token, companies build more AI features, expose more users to AI interactions, and generate more inference calls — at a rate that outpaces the price reduction. Enterprise LLM API spend doubled to $8.4 billion in a single year despite significant token price reductions. Falling prices stimulate demand faster than they reduce total spend.

Relevant deep dive: Why AI Gross Margins Are So Much Lower Than SaaS

What is AI cost governance (FinOps for AI) and do I need it?

AI cost governance is the set of processes and tools for attributing, monitoring, forecasting, and optimising AI infrastructure spend. If you have multiple AI features in production, multiple team members generating inference costs, or a monthly AI bill that varies by more than 20% without a clear explanation, you need some form of cost governance. It does not require a dedicated team — it requires per-feature cost tagging and a basic monthly review process.

Relevant deep dive: How to Build AI Infrastructure Cost Governance Without a Dedicated FinOps Team

How much can optimisation realistically reduce my AI inference costs?

The range is wide because it depends on your current baseline and which techniques you implement. Prompt caching (KV cache) can reduce costs 50–90% for use cases with repeated context. Model routing can reduce costs 30–60% by directing simple queries to cheaper models. Quantisation can deliver 8–15x memory compression for self-hosted workloads — as Dropbox Engineering demonstrated with their low-bit inference work. In practice, companies implementing the full optimisation stack often achieve 60–80% cost reductions, though the gains are front-loaded: the first two or three interventions deliver the majority of the savings.
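The compounding matters more than any individual percentage, as the small worked example below shows. The reduction figures are illustrative mid-points of the ranges above, not guarantees, and each technique applies only to the cost left after the previous one.

```python
# Worked example: how optimisation savings compound. The percentages are
# illustrative mid-points of the ranges quoted above, not guarantees.

baseline = 10_000.0  # monthly inference spend before any optimisation
steps = [("prompt caching", 0.70), ("model routing", 0.40), ("quantisation / serving", 0.20)]

cost = baseline
for name, reduction in steps:
    cost *= 1 - reduction
    print(f"after {name:<22} ${cost:8,.0f}   ({1 - cost / baseline:4.0%} total reduction)")
```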

Relevant deep dive: The AI Inference Optimisation Playbook

The AI Inference Market in 2025 — Hardware Consolidation, Pricing Wars, and What It Means for Buyers

If you’re building AI-enabled products, the AI inference market is where your money goes. Not training — inference. Running models in production. That’s $106 billion in 2025, heading to $255 billion by 2030, and it’s consuming 80–90% of all AI computing power on the planet. Training is a sunk cost you pay once. Inference is the meter running with every user request.

Three things are happening at once: hardware consolidation (NVIDIA just spent $20 billion acquiring Groq), provider economics that vary wildly depending on who you use, and $600 billion in hyperscaler capital expenditure locked in for 2026. The AI inference cost crisis isn’t a blip. It’s built into the economics of running AI in production. Here’s what’s actually going on and what it means for the decisions you need to make.


What is the current size of the AI inference market and where is it headed?

Grand View Research and MarketsandMarkets both put the AI inference market at $106 billion in 2025, growing to $255 billion by 2030. Inference has already overtaken training for the first time, sitting at 55% of cloud AI spend in early 2026. Average enterprise LLM spend hit $7 million per company in 2025 — nearly triple the $2.5 million from 2024. One CIO put it plainly: “What I spent in 2023 I now spend in a week.”

Here’s the dynamic you need to wrap your head around. Per-token inference costs dropped approximately 1,000× in three years — yet total inference spending grew 320% over the same period. Cheaper tokens just create more use cases and higher query volumes. Andreessen Horowitz calls the per-token price collapse “LLMflation” — and total bills still go up, because demand grows faster than costs fall. The $106B-to-$255B trajectory is rising spend, not falling costs. Plan for that.
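A quick back-of-envelope calculation shows why the trajectory points up. Taking the headline figures at face value, and treating 320% growth as spend ending at roughly 4.2x its starting level (an interpretation, not a sourced figure), the implied growth in token volume dwarfs the price decline.

```python
# Back-of-envelope arithmetic using the headline figures above. Interpreting
# "grew 320%" as spend ending at ~4.2x its starting level is an assumption;
# the point is the direction, not the precision.

price_drop = 1_000     # per-token prices fell roughly 1,000x over three years
spend_multiple = 4.2   # total spend grew ~320%, i.e. ended at ~4.2x

# spend = price_per_token * volume, so the volume multiple is spend / price.
implied_volume_growth = spend_multiple * price_drop
print(f"Implied token-volume growth: ~{implied_volume_growth:,.0f}x over the same period")
```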


What did NVIDIA’s acquisition of Groq mean for the inference hardware market?

On 24 December 2025, NVIDIA acquired Groq’s assets and licensed its inference technology for $20 billion — NVIDIA’s largest deal ever and the biggest consolidation event in AI inference hardware history.

It’s worth being precise about what this was. A licensing-and-acquihire, not a full corporate acquisition. NVIDIA took Groq’s chip assets and licensed the LPU (Language Processing Unit) designs, bringing on founder Jonathan Ross and President Sunny Madra. This mirrors Microsoft’s 2024 licensing of Inflection AI and is widely read as a deliberate move to sidestep mandatory antitrust review.

Groq’s LPU is purpose-built for inference. Independent benchmarks recorded Groq delivering 877 tokens/sec on Llama 3 8B — roughly 2× the throughput of the fastest alternatives at the time.

Before this deal, NVIDIA already held 90–95% of the AI accelerator market. Now it controls the most credible alternative inference chip architecture as well. The 2.9× premium NVIDIA paid over Groq’s September 2025 valuation tells you everything — LPU-style architectures genuinely outperform GPUs for specific inference workloads.

If you’re currently on GroqCloud: the service is still nominally operational, but long-term pricing and product roadmap under NVIDIA ownership are genuinely uncertain. Any infrastructure planning beyond 12 months needs to account for this.


How do AI inference gross margins compare across major model providers — and what does that tell buyers?

Gross profit per token varies widely across providers: DeepSeek 85%, Perplexity 60%, Anthropic 55%, Manus 50%, Together AI 45%, Groq 40%. These margins tell you whether current pricing is sustainable or subsidised — which matters when you’re building production systems on top of that pricing.

Traditional SaaS gross margins run 70–90% because software has near-zero marginal cost of delivery. AI margins average around 52% because inference requires continuous GPU compute with every single query. The marginal cost scales with usage in a way that seat-based SaaS simply doesn’t.

DeepSeek’s 85% gross margin is the most instructive number here. It’s achieved through architectural efficiency — a sparse mixture-of-experts design that activates fewer model parameters per inference pass. That’s a structural advantage, not a subsidised pricing scheme. The implication for the market is real: inference-efficient architectures work at production scale, which puts genuine pressure on providers running less-efficient models.

Anthropic’s Series F fundraise valued the company at $183 billion post-money, with run-rate revenue growing from roughly $1 billion to over $5 billion in under eight months at 55% gross margins. Meanwhile OpenAI’s compute margin jumped from around 35% in early 2024 to roughly 70% by October 2025.

Use margin data as a procurement signal. Providers below 45% have limited room to absorb cost increases — expect price pressure as they scale. Cross-reference with valuation multiples: a low-margin provider at a high valuation is pricing for growth rather than stability.


Why does hyperscaler CapEx keep increasing when token prices are already falling?

Hyperscalers committed $600+ billion in AI infrastructure capital expenditure for 2026 — a 36% increase over 2025. Amazon at $200 billion, Google at $175–185 billion, Microsoft at $145 billion, Meta at $115–135 billion.

Falling API token prices and rising infrastructure investment aren’t in conflict — they’re operating on different cost layers. Hyperscalers have to recoup data centre construction, GPU procurement, and energy costs regardless of what they’re charging per token.

Meta and Microsoft are building nuclear plants to power AI data centres. These are decade-scale commitments that have to be recovered through revenue. Energy is a floor cost — US data centres consumed 200 terawatt-hours in 2024, and AI inference is projected to consume 165–326 terawatt-hours annually by 2028. When energy and GPU memory cost more, cloud inference costs more. Simple as that.

AWS raised GPU Capacity Block prices by 15% in January 2026 with no announcement — on a Saturday. Cloud inference pricing does not fall as fast as per-token API rates suggest it should.


Do open-weight models like Meta Llama change the buyer’s negotiating position?

Open-weight models — AI models whose trained weights are publicly released for self-hosted deployment — function as a cost ceiling on proprietary API providers. If API pricing exceeds the all-in cost of self-hosting an equivalent open-weight model at your token volume, you have a rational exit path.

Meta’s Llama 3 series is the obvious example. Llama 3 provides GPT-4-class capability that you can deploy on your own or leased GPU infrastructure. Once workloads are steady and high-volume, self-hosted smaller models can reach cost parity with API-based large models faster than many teams expect.

Self-hosting isn’t free, though. You need GPU infrastructure, operational overhead, and model maintenance capability. Quantify your self-hosting breakeven token volume before you start using open-weight models as a negotiating lever.
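A rough breakeven calculation is usually enough to know whether the lever is real for you. Every figure in the sketch below is an illustrative assumption (the GPU lease, the ops cost, the blended API rate), and it ignores utilisation limits and the variable costs of self-hosting, so treat it as a first filter rather than a business case.

```python
# Rough self-hosting breakeven sketch. Every input is an illustrative
# assumption; it also assumes the leased hardware can actually serve the
# implied throughput, which you must verify separately.

gpu_monthly_cost = 18_000.0          # leased GPU node(s) plus hosting
ops_monthly_cost = 6_000.0           # fraction of an engineer for serving and upkeep
api_cost_per_million_tokens = 3.0    # blended input/output rate from your provider

fixed_monthly = gpu_monthly_cost + ops_monthly_cost
breakeven_million_tokens = fixed_monthly / api_cost_per_million_tokens
sustained_tokens_per_sec = breakeven_million_tokens * 1_000_000 / (30 * 86_400)

print(f"Breakeven volume: ~{breakeven_million_tokens:,.0f}M tokens/month")
print(f"Equivalent sustained throughput: ~{sustained_tokens_per_sec:,.0f} tokens/sec")
```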

AMD’s Instinct MI300X is the primary hardware alternative for buyers who want to avoid NVIDIA lock-in: 192GB HBM3, 5.3 TB/s memory bandwidth, and a 40% latency advantage over the H100 for large models. The CUDA moat is real though — CUDA has nearly two decades of investment baked into PyTorch, TensorFlow, and nearly every major AI framework. AMD’s ROCm 6.x has reached near CUDA parity but still requires more manual tuning. Model the switching costs honestly.


What do these macro market forces mean for a mid-market software company’s infrastructure decisions?

Hardware consolidation, elevated hyperscaler CapEx, provider margin variance, and open-weight model availability are not a transition phase. They are the permanent operating environment.

Don’t assume token prices will fall fast enough to solve the cost problem for you. The Jevons paradox, hyperscaler CapEx recovery requirements, and NVIDIA’s hardware dominance all work against rapid cost deflation at scale. Production costs scale 717× from proof-of-concept to production. That’s not an outlier — it’s the pattern.

Use provider gross margin data as a procurement signal. Providers below 45% are likely to raise prices as they scale; providers above 55% have more structural stability. Open-weight model availability is a real negotiating lever — but only if you’ve already quantified your self-hosting breakeven.

The teams managing inference costs well are treating it as an architectural concern, not a line item to be surprised by at month-end. For a detailed look at how these market forces affect your P&L, see the breakdown of why AI gross margins are structurally lower than SaaS. And for infrastructure decisions shaped by hardware dynamics — particularly how the NVIDIA/Groq consolidation and AMD’s hardware positioning should inform your deployment choices — the cloud vs. on-premises vs. hybrid decision framework covers the real cost data. Deeper guidance across all of these areas is available in what the AI inference cost crisis means for your business.


Frequently asked questions

What is the AI inference market and why does it matter to software companies?

The AI inference market is the ecosystem of hardware, software, and services that enables AI models to run in production. Inference accounts for 80–90% of lifetime AI product costs — training is a sunk cost paid once; inference scales with every user request. At $106 billion in 2025 growing to $255 billion by 2030, it’s the operational cost structure every AI-enabled product faces.

Does the AI inference market growth trajectory mean costs will eventually fall?

Per-token prices dropped 1,000× in three years, yet total inference spending grew 320% over the same period. The Jevons paradox means cheaper tokens drive more consumption, pushing total spend up. Structural factors — hyperscaler CapEx recovery, NVIDIA dominance, rising energy costs — create a floor that limits how fast cloud inference pricing can actually fall.

What exactly did NVIDIA acquire from Groq, and was it a full acquisition?

NVIDIA executed a licensing-and-acquihire of Groq’s core IP and key personnel — LPU architecture designs, founder Jonathan Ross, and President Sunny Madra — for $20 billion. It was not a full corporate acquisition; Groq as an entity continues nominally and GroqCloud remains operational. The structure was designed to reduce antitrust exposure.

What happened to GroqCloud after the NVIDIA acquisition?

GroqCloud remains nominally operational with Groq’s former CFO stepping into the CEO role. Long-term pricing trajectory and product roadmap under NVIDIA ownership are uncertain. If you’re using GroqCloud for production workloads, evaluate alternative providers and revisit pricing assumptions for any planning beyond a 12-month horizon.

Is DeepSeek’s 85% gross margin a sustainable model or an exception?

It appears sustainable. DeepSeek’s margin is driven by a sparse mixture-of-experts architecture that activates fewer model parameters per inference pass — a structural advantage, not subsidised pricing. The implication is that inference-efficient architectures are viable at production scale, which creates competitive pressure on less-efficient providers.

Why does Anthropic’s $183B valuation matter for companies budgeting AI inference costs?

Anthropic grew run-rate revenue from roughly $1 billion to over $5 billion in under eight months, with 55% gross margins and over 300,000 business customers. A provider with that growth trajectory and margin profile is less likely to make sudden pricing changes than a lower-margin competitor under pressure.

Why are AI gross margins lower than traditional SaaS margins?

Traditional SaaS margins run 70–90% because software has near-zero marginal cost of delivery. AI margins average around 52% because inference requires continuous GPU compute with every query — the marginal cost scales with usage. AI companies that price like SaaS face margin compression at scale.

How much are hyperscalers spending on AI infrastructure and should buyers care?

Hyperscalers committed $600+ billion in AI infrastructure CapEx for 2026, a 36% year-on-year increase. Amazon at $200 billion, Google at $175–185 billion, Microsoft at $145 billion. This capital has to be recovered through inference revenue — which is why cloud pricing doesn’t fall as fast as per-token rates suggest it should.

Is AMD a real alternative to NVIDIA for running AI models in production?

AMD is credible but constrained. The Instinct MI300X delivers 192GB HBM3, 5.3 TB/s bandwidth, and a 40% latency advantage over the H100 for large models. ROCm 6.x has reached near CUDA parity for major frameworks but still requires more manual tuning. Model the CUDA switching costs before you make any hardware decisions.

What does the open-weight model trend mean for AI provider pricing power?

Open-weight models function as a cost ceiling for proprietary API providers. If API pricing exceeds the all-in cost of self-hosting an equivalent open-weight model at your token volume, you have a rational exit path. The leverage is conditional on having the infrastructure capability to self-host.

How do I know if my AI provider’s current pricing is sustainable or likely to increase?

The gross profit per token margin is the most accessible signal: providers below 45% (Groq at 40%, Together AI at 45%) have limited room to absorb cost increases. Providers above 55% (Anthropic, DeepSeek, Perplexity) have structural flexibility. Cross-reference with valuation multiples — a low-margin provider at a high valuation is pricing for growth rather than sustainability.

What is the Jevons paradox and how does it apply to AI inference costs?

The Jevons paradox describes how increased efficiency leads to greater total consumption, not less. In AI inference, per-token prices dropped 1,000× in three years while total spending grew 320% because cheap tokens enable more use cases and higher query volumes. Plan for total inference spend to rise even as unit costs fall.

How to Build AI Infrastructure Cost Governance Without a Dedicated FinOps Team

AI is running in production at companies with 50 to 500 employees. The governance playbooks, though, are written for someone else — the G1000 enterprise with a dedicated FinOps team and a data science department large enough to run model efficiency experiments. That’s not you.

IDC's FutureScape 2026 report warns that those G1000 organisations — companies with 1,000+ employees and dedicated FinOps resources — will still underestimate AI infrastructure costs by up to 30%. If companies with dedicated governance functions are getting this wrong, a 200-person SaaS company running AI without a FinOps function is in worse shape, not better. The AI inference cost crisis hits mid-market companies first, and hardest, because they have the least governance infrastructure when costs start compounding.

This article translates enterprise FinOps discipline into a four-pillar framework a CTO can implement without hiring anyone new. It covers real-time cost visibility, cross-functional ownership, budget alert systems, and — the most urgent emerging challenge — agentic AI cost multipliers that are now hitting companies adopting agent-based workflows.

Why Does Traditional Cloud FinOps Fail to Govern AI Infrastructure Costs?

Traditional cloud FinOps works because cloud infrastructure billing is predictable. You provision servers, pay a fixed hourly rate, and forecast from there. Simple enough.

AI inference breaks that model entirely. Costs are consumption-driven — they scale with user behaviour, query complexity, and prompt length, not server headcount. A single product feature update can double your monthly spend overnight.

The numbers are stark. According to the FinOps Foundation, inference accounts for 80 to 90% of total AI spending over a model’s production lifecycle. GPU utilisation during inference can dip as low as 15 to 30%, meaning hardware sits idle while still accruing charges. There’s no equivalent to that in traditional cloud FinOps.

The deeper problem is structural. Traditional cloud FinOps governs capacity. AI FinOps must govern behaviour — your users’ behaviour, your models’ behaviour, and increasingly, your agents’ behaviour. Agentic AI compounds everything: where a single API call might cost $0.001, a multi-step agentic decision cycle can run $0.10 to $1.00. That’s a 100 to 1,000x multiplier before you’ve scaled to any meaningful user volume.

Reserved capacity, committed spend, periodic billing review — the traditional toolkit is no longer sufficient.

What Does the IDC 30% Underestimation Warning Mean for a 200-Person Company?

IDC’s FutureScape 2026 report issued a specific warning: G1000 organisations will underestimate AI infrastructure costs by up to 30% by 2027. IDC calls this the “AI infrastructure reckoning.”

Here’s the nuance: that warning was written for organisations that already have dedicated FinOps teams and cloud governance platforms. For a 200-person SaaS company without any of that, the underestimation risk is not lower than 30% — it’s almost certainly higher.

To make this concrete: $20,000 per month in AI inference at 30% underestimation means $6,000 per month accumulating silently. Over a year, that’s $72,000 in unbudgeted infrastructure costs.

Shadow AI makes the mid-market problem worse in a way the IDC analysis doesn’t address. 91% of AI tools used in companies are completely unmanaged, and 75% of employees now bring their own AI to work. Engineering teams using paid AI tools on personal credit cards, or SaaS subscriptions outside central procurement, can represent a material fraction of total AI spend that governance never sees.

What Are the Four Pillars of AI Cost Governance Without a Dedicated FinOps Team?

The four-pillar framework is a mid-market translation of enterprise FinOps practice. IDC expects leading organisations to integrate FinOps directly into AI governance via cross-functional teams spanning finance, data science, and platform engineering. Here’s the mid-market version.

Pillar 1: Real-time Cost Visibility

Instrument AI workloads to surface per-model, per-feature, per-user inference spend in near real-time. The goal is to know today — not at month-end — what each AI feature is costing.

Pillar 2: Cross-functional Ownership

Distribute FinOps responsibilities across Finance, Engineering, and Data Science without creating a new team. Each function owns a defined slice of the governance mandate. Without a named owner in Finance and Engineering, the other pillars won’t be maintained.

Pillar 3: Budget Alert Systems

Threshold-based monitoring that triggers escalating responses when AI spend crosses green, yellow, and red thresholds. Alerts must fire in near real-time, not at month-end billing review.

Pillar 4: Governance of Agentic Cost Multipliers

Specific policies for multi-step AI agent chains — cost caps per workflow, token budget limits per agent call, escalation gates before expensive frontier model calls. Implement this before agentic workloads reach production, not after a cost incident forces the issue.

The framework is intentionally lightweight. Each pillar can be implemented with existing roles and low-cost or open-source tooling. None of it requires enterprise platform spend to get started.

How Do You Build Real-Time AI Cost Visibility at Mid-Market Scale?

For teams using LLM APIs — OpenAI, Anthropic, Google Gemini — the foundation is per-call cost logging. For every API call, capture: model name, token counts (input and output separately), and user or feature metadata. Any queryable store works — a database table, a cloud storage bucket. Your existing logging stack is sufficient.
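
As a minimal sketch of what that logging can look like — the model names, prices, table schema, and function name are illustrative assumptions, not a reference implementation:

```python
# Minimal per-call cost logging sketch. Prices are assumed blended rates in
# USD per 1,000 tokens (input, output) -- substitute your providers' price sheet.
import sqlite3
import time

PRICE_PER_1K = {
    "gpt-4o-mini": (0.00015, 0.0006),   # assumption
    "gpt-4o":      (0.0025, 0.01),      # assumption
}

conn = sqlite3.connect("inference_costs.db")
conn.execute("""CREATE TABLE IF NOT EXISTS inference_log (
    ts REAL, model TEXT, feature TEXT, user_id TEXT,
    input_tokens INTEGER, output_tokens INTEGER, cost_usd REAL)""")

def log_inference_call(model, feature, user_id, input_tokens, output_tokens):
    """Record one LLM API call with its estimated cost."""
    in_price, out_price = PRICE_PER_1K[model]
    cost = (input_tokens / 1000) * in_price + (output_tokens / 1000) * out_price
    conn.execute("INSERT INTO inference_log VALUES (?, ?, ?, ?, ?, ?, ?)",
                 (time.time(), model, feature, user_id,
                  input_tokens, output_tokens, cost))
    conn.commit()
    return cost
```

Wrap your provider SDK calls with something like this and the table becomes queryable by feature, model, and user from day one.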

Three tiers of tooling, ordered by cost:

Tier 1 — Native cloud tools: AWS Cost Explorer plus CloudWatch tagging; Azure Cost Management for Azure OpenAI Service. Require tagging discipline, no additional cost.

Tier 2 — Open-source/low-cost: LangSmith for deep tracing and real-time monitoring; Helicone for per-call logging and dashboards. Cost-effective for teams that want per-call visibility without building a custom stack.

Tier 3 — Commercial: CloudZero and Datadog’s LLM Observability platform provide out-of-the-box dashboards with model-level attribution. Cost-effective when manual logging overhead exceeds the platform cost.

For governance purposes, push for “cost per decision” as your primary metric rather than raw token counts. Aggregate inference spend to the business outcome level: cost per resolved support ticket, cost per generated report. A CTO can justify $0.12 per resolved ticket. The same argument is much harder to make when it’s framed in tokens-per-thousand.
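
A small aggregation on top of that log gets you to cost per decision. This sketch assumes the inference_log table from the previous example plus a resolved-outcome count from your own product analytics — both are assumptions about your stack:

```python
# Roll per-call spend up to "cost per decision", e.g. cost per resolved ticket.
def cost_per_decision(conn, feature, resolved_count, since_ts):
    """Total inference spend for one feature divided by business outcomes."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cost_usd), 0) FROM inference_log "
        "WHERE feature = ? AND ts >= ?", (feature, since_ts)).fetchone()
    total_spend = row[0]
    return total_spend / resolved_count if resolved_count else None

# e.g. cost_per_decision(conn, "support_triage", resolved_count=4200, since_ts=month_start)
```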

How Do Agentic AI Cost Multipliers Work and How Do You Govern Them?

An agentic AI cost multiplier is what happens when an AI agent executes a multi-step workflow. Each autonomous step triggers its own model calls. What appears to be one user action may involve 5 to 20 separate inference calls, multiplying costs by 5 to 20x compared to a single-call interaction.

Consider a customer support agent: classify the query, retrieve context, draft a response, check for compliance, reformat for the UI. That’s 5 to 6 model calls before any retry logic — and the multiplier is invisible to traditional monthly billing review. If you’re only tracking total monthly API spend, you’ll see costs rising but won’t be able to attribute the increase to specific agentic workflows.

ICONIQ Capital's 2026 State of AI report found that 37% of AI companies plan to change their pricing model in the next 12 months — in most cases because agentic AI costs are higher than the pricing model assumed at design time. Governance can catch this early, but only if you have agentic cost attribution in place.

Four governance levers for agentic cost control — a short sketch of the first two follows the list:

1. Token budget limits per agent call — cap the maximum tokens consumed by each step in an agent chain. This is the most direct lever and makes token budgets an explicit architectural constraint.

2. Cost caps per workflow — set a maximum total inference budget per completed workflow execution. Trigger an alert or fallback to a cheaper path if the cap is exceeded.

3. Escalation gates before frontier model calls — require cheaper model steps to attempt the task first. Escalate to expensive frontier models — OpenAI GPT-4o, Anthropic Claude Sonnet or Opus, Google Gemini Pro — only when the cheaper step fails or falls below a quality threshold.

4. Workflow cost attribution — instrument each agent chain to emit per-step cost metrics so governance can identify which workflows are cost-efficient and which aren’t. Without execution tracing, debugging agentic cost issues turns into days of forensic work pulling engineers off roadmap.
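
The first two levers are straightforward to enforce in code. A minimal sketch — the cap value, the per-step token budget, and the fallback behaviour are assumptions to tune per workflow, not recommendations:

```python
# Sketch of lever 1 (per-step token budget) and lever 2 (per-workflow cost cap).
class WorkflowBudget:
    def __init__(self, max_cost_usd=0.50, max_tokens_per_step=4000):
        self.max_cost_usd = max_cost_usd                 # lever 2: workflow cost cap (assumption)
        self.max_tokens_per_step = max_tokens_per_step   # lever 1: per-step budget (assumption)
        self.spent_usd = 0.0

    def clamp_step(self, requested_max_tokens):
        """Clamp each agent step to the per-step token budget."""
        return min(requested_max_tokens, self.max_tokens_per_step)

    def record_step(self, step_cost_usd):
        """Accumulate spend; signal when the workflow cap is breached."""
        self.spent_usd += step_cost_usd
        if self.spent_usd > self.max_cost_usd:
            # Lever 2 response: alert, then fall back to a cheaper path or stop.
            raise RuntimeError(f"Workflow cost cap exceeded: ${self.spent_usd:.2f}")
```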

How Do You Build a Cross-Functional AI Cost Governance Structure Without a FinOps Team?

The cross-functional governance model replaces a dedicated FinOps team by distributing ownership across three existing roles. Each owns a defined slice of the mandate.

Finance owns the budget thresholds and business impact translation: sets green, yellow, and red alert thresholds in dollar terms; escalates to leadership when red thresholds are breached; maintains visibility into company-wide AI tool spend, including shadow AI from expense reports.

Engineering owns the instrumentation and technical response: implements cost logging and maintains the observability stack; responds to alerts with specific optimisation actions — model routing changes, token limit adjustments, caching layer additions; owns the pre-deployment architecture costing process for new AI features.

Data Science owns model selection and efficiency benchmarking: evaluates whether cheaper models can replace frontier models in specific workflows; monitors output quality to ensure cost reduction doesn’t degrade AI product quality; maintains the model routing policy in conjunction with Engineering. Data Science is also responsible for validating that the inference optimisation techniques you need to monitor are delivering the expected gains without quality regression.

The CTO must be the executive sponsor. FinOps practitioners with C-suite engagement show 2 to 4 times more influence over technology selection decisions. Delegating the entire function to a senior engineer is not enough.

The governance cadence that makes this work:

Weekly (30 minutes): Cross-functional review of the top 5 most expensive AI workflows. Engineering presents per-workflow cost metrics. Finance confirms whether costs are within threshold.

Monthly (60 minutes): Review of budget alert thresholds against actual spend patterns. Shadow AI audit of new SaaS subscriptions and expense report AI spend.

Quarterly (half day): Model selection review. Data Science benchmarks current model assignments against available alternatives. Finance reviews AI cost as a percentage of gross margin per feature.

How Much Should You Budget for AI Inference at a 50-500 Person Company?

The budget methodology is a formula, not a single number:

Daily Active Users (DAU) × AI-assisted actions per user per day × average tokens per action × per-token price × ~30 days = monthly inference cost

Applied to a 200-person SaaS with 100 daily active users, 10 AI-assisted actions per day, and 2,000 tokens per action:

For a mid-tier model (GPT-4o-mini, Gemini Flash, Claude Haiku) at $0.003 per 1,000 tokens: roughly $180 per month.

For a frontier model (GPT-4o, Gemini Pro) at $0.015 per 1,000 tokens: roughly $900 per month.

Add a 5x agentic multiplier to that frontier model scenario and costs jump to around $4,500 per month — with no change in user count.
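
The same formula as a small function, reproducing the numbers above (the function name and defaults are mine):

```python
# The budget formula applied to the 200-person SaaS example.
def monthly_inference_cost(dau, actions_per_user_per_day, tokens_per_action,
                           price_per_1k_tokens, agentic_multiplier=1, days=30):
    daily_tokens = dau * actions_per_user_per_day * tokens_per_action
    return daily_tokens / 1000 * price_per_1k_tokens * days * agentic_multiplier

print(monthly_inference_cost(100, 10, 2000, 0.003))                          # ~$180, mid-tier model
print(monthly_inference_cost(100, 10, 2000, 0.015))                          # ~$900, frontier model
print(monthly_inference_cost(100, 10, 2000, 0.015, agentic_multiplier=5))    # ~$4,500 with agentic multiplier
```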

A practical rule of thumb for mid-market companies: if AI inference costs for a specific feature exceed 10% of the gross margin that feature generates, it needs either pricing adjustment or model optimisation before it scales. The governance framework here is what surfaces that signal — and the decision about whether to adjust your AI pricing model assumptions requires the same cross-functional process.

Two vertical-specific factors worth building into your numbers:

HealthTech: Data residency requirements often prohibit sending patient data to external LLM APIs. Sovereign cloud or on-premise inference can multiply infrastructure costs by 3 to 5x compared to standard API pricing. Budget for this before deployment, not after.

FinTech: Audit logging requirements increase token consumption per interaction — system prompts must include compliance context, and every interaction must be logged in detail. This adds storage costs on top of inference costs.

Model both amplifiers into your budget framework at the design stage.

FAQ

Do I need a dedicated FinOps team to implement AI cost governance?

No. The cross-functional model is sufficient at spending levels below $50,000 per month. A dedicated FinOps hire makes sense once AI infrastructure spend exceeds $50,000 per month, or when governance consumes more than 20% of a senior engineer’s time. The FinOps Foundation provides deeper reference material as you scale.

What does a budget alert system for AI inference look like in practice?

A green/yellow/red threshold system, configured to fire in near real-time rather than at month-end billing review.

One red-level rule matters in particular: any single workflow that exceeds 20% of total AI spend should trigger a red alert, because a runaway agent workflow can consume a disproportionate share of spend while the monthly total stays under the overall budget threshold.
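
In code, the check is small. A minimal sketch — the 80% yellow line is an assumption; the single-workflow red rule is the one described above:

```python
# Illustrative alert check. Finance sets the real budget and thresholds.
def alert_level(month_to_date_spend, monthly_budget, largest_workflow_spend):
    # Red if any single workflow exceeds 20% of total spend, even under budget.
    if month_to_date_spend and largest_workflow_spend > 0.20 * month_to_date_spend:
        return "RED"
    ratio = month_to_date_spend / monthly_budget
    if ratio >= 1.00:
        return "RED"
    if ratio >= 0.80:    # assumption: yellow at 80% of monthly budget
        return "YELLOW"
    return "GREEN"
```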

How do agentic AI cost multipliers differ from standard inference cost management?

Standard inference cost management governs discrete, single-call interactions. Agentic cost multipliers emerge when multi-step autonomous workflows chain multiple model calls per user action — a 5-step agent workflow may trigger 5 to 20 inference calls for what appears to the user as a single interaction. Governance must shift from “cost per user action” to “cost per workflow step” and include per-chain cost attribution.

What tools provide real-time AI cost monitoring without enterprise pricing?

Three tiers: (1) Native cloud tools — AWS Cost Explorer plus CloudWatch tags; Azure Cost Management for Azure OpenAI Service. Free or low marginal cost. (2) Open-source/low-cost platforms — LangSmith for per-call LLM tracing; Helicone for per-call logging and dashboards. (3) Commercial platforms — CloudZero and Datadog LLM Observability, cost-effective when engineering time for manual logging exceeds platform cost.

What is the IDC FutureScape 2026 warning about AI infrastructure costs?

IDC warned that even G1000 organisations with dedicated governance resources will underestimate AI infrastructure costs by up to 30% — the result of non-linear inference cost scaling and agentic AI workloads. IDC labels this the “AI infrastructure reckoning.” For mid-market companies without dedicated FinOps, the underestimation risk is almost certainly higher.

How do I stop Shadow AI from inflating my AI governance costs?

Shadow AI is invisible to standard cost governance. Finance must include AI-related expense reports and SaaS subscriptions in scope — not just centrally provisioned infrastructure. A quarterly Shadow AI audit is the minimum control. Without a named owner responsible for full-scope AI spend visibility, shadow AI will remain a blind spot regardless of how good your central governance is.

When does model routing make sense as a cost reduction strategy?

Model routing is cost-effective when you have distinct task types with different quality thresholds. Route high-volume, lower-complexity tasks — classification, summarisation, simple question-and-answer — to smaller, cheaper models (GPT-4o-mini, Gemini Flash, Claude Haiku). Reserve frontier models for high-complexity tasks where output quality directly affects business outcomes. See our AI inference optimisation playbook for implementation detail.
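
A minimal routing sketch, assuming each request is already tagged with a task type; the routing table and model names are illustrative choices, not recommendations:

```python
# Route low-complexity task types to cheaper models; reserve the frontier model.
ROUTING_TABLE = {
    "classification":    "gpt-4o-mini",   # high-volume, low-complexity (assumption)
    "summarisation":     "gpt-4o-mini",
    "simple_qa":         "gpt-4o-mini",
    "complex_reasoning": "gpt-4o",
}

def route_model(task_type, default="gpt-4o"):
    """Pick the cheapest model that meets the quality bar for this task type."""
    return ROUTING_TABLE.get(task_type, default)
```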

Is it normal for AI inference to cost as much as running an engineering team?

At scaling-stage AI-native companies, yes — ICONIQ Capital’s 2026 data puts inference at 23% of revenue, with talent at 26%. For mid-market companies adding AI features to existing products, spend can drift toward that level surprisingly quickly without governance. The four-pillar framework is specifically designed to prevent inference costs from reaching that threshold.

What is the FinOps Foundation and should I use it?

The FinOps Foundation (finops.org) is the practitioner community for cloud and AI financial operations. It has updated its mission to explicitly include AI FinOps, and offers free resources including an Intro to FinOps course, a Certified Practitioner pathway, and AI-specific frameworks. It’s the right next step once the four-pillar framework is running and you want to go deeper.

What’s the first step to implement AI cost governance this week?

Enable per-call cost logging on all LLM API calls: model name, token counts, and which product feature triggered the call. Store it in any queryable format. Within a week, you’ll have the raw data to identify your top 3 most expensive AI workflows. Nothing else — no alerts, no cross-functional governance, no agentic cost controls — works without this baseline.

Closing: The Governance Capstone

This article is the governance capstone for a series covering the AI inference cost crisis from multiple angles. Understanding why inference costs are rising, how infrastructure choices affect your cost structure, the inference optimisation techniques you need to monitor, and whether your AI product pricing reflects real inference costs all feed into the governance practice described here.

The four-pillar framework closes the loop: real-time cost visibility surfaces the data; cross-functional ownership ensures it gets acted on; budget alert systems prevent reactive fire-fighting; and agentic cost governance addresses the multiplier that will define the next phase of mid-market AI spending.

98% of FinOps Foundation respondents now manage AI spend, up from 31% two years ago. The framework here is designed to get you there before the first unplanned billing quarter — not after.

For the complete picture, the full guide to managing AI inference costs provides the strategic context that ties this governance framework to every other element of AI cost management.

How to Design AI Product Pricing That Survives Variable Inference Costs

Most AI products are priced like SaaS products. But they do not have SaaS cost structures.

SaaS pricing was built for a world where serving one more customer costs you almost nothing. AI does not work that way. Every query, every output, every agent task comes with a real inference bill — GPU compute, API calls, model licensing — and that bill scales directly with usage. The gross margin gap is structural: AI companies are averaging 50–60% gross margins versus the 80–90% that SaaS operators treat as table stakes.

This is not something you can engineer your way out of. The AI inference cost crisis facing AI-native companies is a pricing model problem, not an efficiency problem.

ICONIQ Capital's 2026 State of AI report found that 37% of AI companies are actively planning to change their pricing model in the next 12 months. That shift is happening now.

This article gives you a practical framework for choosing between the three primary AI pricing archetypes — consumption, workflow, and outcome-based — with a worked modelling approach and the Intercom Fin case study as a real-world benchmark.

Why does AI have lower gross margins than SaaS — and what does that mean for pricing?

AI’s 50–60% gross margins versus 80–90% for SaaS is not a startup-phase anomaly. It is structural. ICONIQ’s 2026 data shows AI gross margins at 52% — up from 41% in 2024 — but still nowhere near SaaS territory. Model inference alone averages 23% of total AI product costs at scaling-stage companies.

Bessemer Venture Partners put it plainly: “Companies see 50–60% gross margins vs. 80–90% for SaaS.” Unlike SaaS, where additional customers approach zero marginal cost, every AI inference has a real COGS. As ML lead Jacob Jackson put it: “When you receive $10 from the customer, you can’t just spend 10 cents on AWS. GPUs are expensive.”

Ben Murray at TheSaaSCFO ran the numbers: to reach equivalent EBITDA to a SaaS business, an AI company needs approximately 6x the revenue. A $50,000 SaaS product needs an AI equivalent at roughly $250,000–$300,000 per year to deliver comparable unit economics. Not because the AI delivers six times more value — because its cost structure is fundamentally different. That 5–6x ARPA requirement is not a negotiating position. It is arithmetic.

There is one more COGS item that often gets missed: Forward-Deployed Engineers. ICONIQ’s data shows 32% of AI companies now deploy FDEs to support enterprise customers. If your pricing does not account for that effort, you are building margin compression into every enterprise deal from day one.

For more on why AI gross margins are lower than SaaS, see the foundational article in this cluster.

What are the three AI pricing archetypes — and how do they each handle inference cost variability?

BVP’s AI Pricing and Monetisation Playbook identifies three pricing archetypes — and each one is a different answer to the same question: who bears the cost variability risk? BVP frames the trade-off: “As you move from consumption to workflow to outcome-based pricing, you’re accepting more cost risk in exchange for tighter alignment with customer value.”

Consumption-based pricing (per token / per API call) passes cost variability entirely to the customer. It works well for technical buyers — developers, platform engineers, API integrators. GitHub Copilot and the OpenAI API are the obvious examples.

The problem is non-technical buyers. Metronome found customers avoided using AI features even when free credits were included — they feared unpredictable bills. Leena AI experienced this directly: after charging on consumption, “customers became wary of using the product — the pricing model was counterproductive.”

Workflow-based pricing (per completed task) decouples price from token count. The customer pays per discrete, bounded task — booking a meeting, analysing a document, generating a demand letter. EvenUp captures better margins charging per completed legal demand letter rather than by inference volume.

The catch: cost variability risk shifts to you. One analysis might cost $0.05 in inference. A complex multi-source brief might cost $0.45. If you priced the task at $0.50, your gross margin swings between 90% and 10% depending on what lands in the queue.

Outcome-based pricing (per successful result) is where the industry is heading. The customer pays only when a defined, measurable outcome is achieved — a ticket resolved, a claim processed. ICONIQ’s data: outcome-based pricing jumped from 2% to 18% adoption in six months. Forty-three per cent of enterprise buyers now consider it a significant purchase factor.

The prerequisite is measurement infrastructure. You cannot bill on resolutions if you cannot detect when one has occurred — and that is what trips up most teams attempting the transition.

How does Intercom Fin’s $0.99 per resolution pricing work — and what does it mean for your margins?

Intercom Fin is the most cited proof that outcome-based pricing works at scale: $0.99 per resolved support ticket, 1 million customer issues per week, $100M+ ARR.

Why $0.99? Value-based logic. A human-handled support ticket costs $8–15 or more in most contact centres. At $0.99, Fin is priced at roughly 10% of the cost of the outcome it replaces.

The $0.99 is the variable component of a hybrid model. Customers also pay a base Intercom platform fee. The $0.99 activates on top, for autonomous resolutions only. Add the $1M performance guarantee for customers who do not hit expected resolution rates, and the full structure is: base platform fee + $0.99 per autonomous resolution + performance guarantee.

GTMnow’s interview with Intercom’s president put it well: “Guarantees change buyer psychology more than pricing ever could. The $0.99 price gets attention, but it’s the $1M performance guarantee that builds trust.” That guarantee is a conversion mechanism, not just a risk instrument.

The lesson here is straightforward. The question that produced the $0.99 is the same question you need to answer: what is one resolved outcome worth to my customer, and what is my inference cost per attempt? If the first is materially larger than the second, outcome-based pricing is viable. BVP’s guidance: the platform fee should cover at least 2x your delivery costs before variable pricing activates.

How do you model your inference cost exposure before committing to a pricing model?

Before you pick an archetype, run a cost exposure calculation. BVP’s rule: “If the math doesn’t work at 10 customers, it won’t at 1,000.”

Under consumption-based pricing, cost exposure is essentially zero — inference spikes pass to the customer. Your risk is adoption suppression, not margin compression.

Under workflow-based pricing, exposure comes from task complexity variance. Average inference cost $0.10, task price $0.50: 80% gross margin. Complex task at $0.45 inference, same price: 10% gross margin. What is the realistic complexity range for your tasks, and does your pricing survive the high end?

Under outcome-based pricing, you incur inference costs on all attempts — including failed ones:

Effective cost per charged outcome = cost per attempt ÷ resolution rate

Inference cost $0.20, resolution rate 70%: effective cost per charged outcome is $0.286. At 50% resolution it is $0.40. The failed attempts generate costs with no revenue to offset. Model this accurately before you set the price.
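
The same arithmetic as a small sketch, reproducing the figures above (the function names are mine):

```python
# Margin arithmetic for the workflow and outcome-based archetypes.
def workflow_margin(task_price, inference_cost):
    return (task_price - inference_cost) / task_price

def outcome_effective_cost(cost_per_attempt, resolution_rate):
    # You pay inference on every attempt but bill only on successes.
    return cost_per_attempt / resolution_rate

print(workflow_margin(0.50, 0.10))            # 0.80 -> 80% gross margin
print(workflow_margin(0.50, 0.45))            # 0.10 -> 10% gross margin
print(outcome_effective_cost(0.20, 0.70))     # ~0.286 per charged outcome at 70% resolution
print(outcome_effective_cost(0.20, 0.50))     # 0.40 per charged outcome at 50% resolution
```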

BVP’s hybrid formula handles this cleanly: platform fee at 2x minimum delivery costs, plus included outcome credits, plus variable overage. Their example: $12,000 annual platform fee; 100 included ticket resolutions; additional resolutions at $5,000 per 100. Fixed costs covered before the variable tier activates.

Use your first 50–100 production customers to build cost baselines before committing at scale.

How do you choose which AI pricing model fits your product — consumption, workflow, or outcome?

BVP identifies three selection criteria: value attribution (how clearly can the AI’s contribution be measured?), execution autonomy (does the AI act independently or assist a human?), and workload predictability (how variable is inference cost per unit?).

Choose consumption-based if your buyer is technical and can model their own usage; your product is an API, SDK, or developer tool; you are in early discovery without outcome measurement capability.

Choose workflow-based if your AI completes discrete, bounded tasks with relatively stable complexity; your buyer is non-technical and needs predictable pricing; task complexity variation stays manageable.

Choose outcome-based if the outcome is clearly measurable and attributable to the AI; customers value it highly relative to your inference cost; you have production data — not PoC estimates — to set the price accurately.

Choose hybrid if you are uncertain about outcome rate or cost variability; you need buyer predictability and upside capture simultaneously.

Two examples to make this concrete.

A 100-person FinTech with a document summarisation feature for loan officers is most likely a workflow-based or hybrid candidate. The task is bounded, the customer base is cost-sensitive, and outcome definition is complicated in a compliance context. Workflow-based with a hybrid floor is the right starting point.

A 400-person HealthTech with AI-native workflow automation — appointment booking, claim processing — is an outcome-based or hybrid candidate. Workflows are measurable, value per outcome is high, production data is the prerequisite.

There is also some urgency here. BVP calls out soft ROI products as the category most at risk: “Much of the ‘sexy’ AI products today live in soft ROI territory… As many enter renewal cycles for the first time in 2026, pricing will need to reflect actual value, not merely potential or promise.” If your AI feature was deployed in 2024–2025 under flat-rate SaaS framing and is approaching renewal, run the archetype selection exercise now. Do not wait.

When should you change your AI pricing model — and how do you do it without losing customers?

Four signals say it is time: consistent margin compression per customer; high churn at renewal because customers cannot justify value; rapid usage growth the current model cannot capture economically; new outcome-measurement capability that makes outcome-based pricing viable for the first time.

Three prerequisites:

  1. Real outcome rate data from production — not PoC estimates. OpenView research found 78% of companies successfully using outcome pricing had products on market for 5+ years.
  2. Infrastructure to measure and attribute outcomes — tracking, attribution logic, automated billing triggers.
  3. Communication framing the change as value alignment — “we’re moving to pay-for-performance” is different from “we’re changing our pricing.” Grandfather existing enterprise customers for 6–12 months.

Credits can bridge the transition. Metronome calls them “transitional scaffolding” — useful while you establish the real value metric, not a permanent structure. Avoid announcing the change without production data, migrating enterprise customers mid-contract, or adopting outcome-based pricing before attribution infrastructure is in place.

Once you commit to a new model, real-time cost governance to validate your pricing model becomes an ongoing discipline. The assumptions behind your pricing design — cost per outcome, resolution rate, margin at scale — need continuous validation in production.

FAQ

Is consumption-based pricing always a bad choice for AI products?

No. When your buyer is technical, your product is developer-facing, and inference costs are predictable per unit, it is the right choice. The adoption suppression problem Metronome identified is specific to enterprise non-technical contexts.

How do I migrate from consumption-based to outcome-based pricing without losing customers?

Three steps: collect real outcome rate data from production before announcing anything; build or buy attribution and billing infrastructure first; communicate the change as “we’re moving to pay-for-performance.” Grandfather existing enterprise customers for 6–12 months. Use credits as transitional scaffolding but position them as temporary.

What does the ARPA 5–6x SaaS requirement mean in practice for a $50,000 per year deal?

A $50,000 per year SaaS contract needs an AI equivalent at approximately $250,000–$300,000 to deliver comparable EBITDA. The gross margin is structurally lower — 50–60% versus 80–90% — and the AI company needs approximately 6x the revenue to cover the COGS gap.

How much should I budget for AI inference at a 200-person SaaS company?

Use ICONIQ’s benchmark: inference averages 23% of total AI product costs at scaling-stage companies. Under outcome-based pricing, include inference on failed outcomes — if your resolution rate is 70%, 30% of your inference spend generates no revenue. Never use PoC cost estimates for production budgeting.

What does outcome-based pricing actually require the vendor to build?

Flexprice's five-component minimum: clear contractual outcome definition; data tracking and attribution mapping AI actions to results; aligned internal team structure; risk/reward framework covering failed attempts; automated outcome-linked billing. Skip attribution and you will face billing disputes. When multiple factors influence results simultaneously, you need control groups or baseline comparisons.

How does hybrid pricing protect gross margins in practice?

BVP’s formula — platform fee at 2x minimum delivery costs, plus included outcome credits, plus variable overage — ensures fixed costs are covered before variable pricing activates. The floor removes exposure to near-zero revenue in low-usage months. The credits give enterprise buyers the predictability they need for procurement.

What is the difference between outcome-based and output-based pricing?

Flexprice states the terms are interchangeable. Where a distinction is made: output-based means delivering a specific artefact (a draft, a report); outcome-based means a measurable result (a ticket resolved, a claim processed). For most pricing decisions, treat them as equivalent.

How do I decide whether my AI product is ready for outcome-based pricing?

Four checks: Can you define a successful outcome in a contract? Do you have real production data to set the price accurately? Can you build or buy attribution and billing infrastructure? Is your inference cost per attempt low enough that your price per successful outcome delivers acceptable gross margin? If any are “no,” start with consumption-based or hybrid and migrate once the prerequisites are met.

What is the 2026 renewal cliff and why does it affect AI pricing strategy?

AI pilot contracts signed in 2024–2025 — often under SaaS-style seat pricing — are now approaching their first annual renewal. ICONIQ data shows AI products providing “soft ROI” are at high churn risk because customers cannot quantify the value received. If your product was deployed under a model that does not capture measurable outcomes, the renewal conversation relies on the customer’s tolerance for unquantified value — which tends toward zero under budget pressure.

How does Intercom price Fin AI — is $0.99 per resolution the full story?

No. The $0.99 per resolved ticket is the variable component of a hybrid model. Customers also pay a base platform fee, and Fin activates on top for autonomous resolutions only. Intercom backs it with a $1M performance guarantee. The complete model: base platform fee + $0.99 per autonomous resolution + performance guarantee. The $0.99 gets attention. The guarantee builds trust. The platform fee protects margin.

Pricing strategy is one piece of a larger picture. For a complete resource on understanding AI inference economics — from the financial reality of lower gross margins through infrastructure decisions and governance — the full guide to AI inference economics covers the end-to-end challenge facing AI-native companies.

Cloud vs On-Premises vs Hybrid AI Inference — A Decision Framework Based on Real Cost Data

AI inference — not training — is now the dominant cost line for companies scaling AI products. By early 2026, inference workloads account for over 55% of AI-optimised infrastructure spending. The assumption that you can simply scale API calls indefinitely breaks at production volume. The question is not whether the economics shift, but when.

Three deployment models exist: cloud, on-premises, and hybrid. Deloitte’s Tech Trends 2026 research gives you a specific, actionable decision trigger — the 60-70% cloud threshold — that tells you when the on-premises evaluation is worth running.

This article gives you a structured decision framework: the TCO methodology, the GPU utilisation problem that changes the on-premises calculation, and the three-tier hybrid architecture that most enterprises arrive at as the pragmatic outcome. For broader context on the full scope of this challenge, see our AI inference cost crisis guide.


What are the three AI inference deployment models and how do their cost structures differ?

The three deployment models are not equally suited to all workloads. You need to understand their cost structures before you make any infrastructure decision.

Cloud AI inference (AWS, Azure, GCP, OpenAI, Anthropic) is OpEx-heavy. You pay per GPU-hour or per token, with no upfront capital commitment. The elasticity is real and valuable. What is less visible is the pricing premium — cloud providers charge 2-3x wholesale GPU rates. Data egress adds another layer on top: for data-intensive AI workloads, egress fees typically add 15-30% to your total cloud AI spend. Use reserved instance rates as your comparison baseline, not on-demand rates.

On-premises AI inference (NVIDIA H100/H200, AMD MI300X, Lenovo ThinkSystem servers) flips the economics entirely. The capital cost is substantial — a Lenovo ThinkSystem SR675 V3 with 8× NVIDIA H100 GPUs runs approximately $833,806, with ongoing operational costs around $0.87/hour. No egress fees. Fixed costs that get cheaper per inference as your volume grows. The trade-off is CapEx exposure, operational overhead, and hardware refresh cycles every 3-5 years.

Hybrid AI inference splits workloads across both tiers based on their characteristics, and adds a third tier at the edge for latency-critical use cases. You keep cloud elasticity for burst and experimental workloads while moving consistent high-volume production inference on-premises.

Here is how the three models stack up:

Cloud is OpEx — ongoing and variable. On-premises is CapEx — upfront and fixed. Hybrid combines both, managed by workload classification. The decision is not binary. It is a spectrum.


What is the 60-70% cloud threshold and how do you calculate it for your workload?

The 60-70% threshold is the single most useful decision trigger in AI infrastructure planning. Deloitte’s Tech Trends 2026 research puts it clearly: when your cloud AI costs reach 60-70% of what equivalent on-premises hardware would cost over a comparable period, the economics of on-premises begin to compete — even after accounting for CapEx and operational overhead.

This is a ratio, not an absolute dollar figure. A 100-person company and a 5,000-person company can both hit the threshold at very different spend levels.

Here is how to calculate your threshold ratio (a worked sketch follows the steps):

  1. Establish your monthly cloud AI spend using reserved instance pricing, not on-demand rates.
  2. Add your monthly egress charges from your cloud billing dashboard.
  3. Price equivalent on-premises hardware amortised over 3-5 years. Lenovo’s reference data: for the 8× H100 configuration, the 5-year on-premises total is $871,912 vs $2,362,811 on 3-year reserved cloud pricing.
  4. Add the staffing overhead delta: on-premises adds 0.5-1.5 FTE in DevOps and ML infrastructure ($60,000–$180,000/year at $120,000 fully loaded).
  5. Divide your cloud cost (steps 1+2) by the on-premises equivalent (steps 3+4). If the ratio exceeds 0.60, run the full TCO analysis.
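
A minimal sketch of the calculation: the cloud spend and egress figures are placeholder assumptions, the hardware figure is the 8× H100 reference configuration cited above, and the staffing delta assumes 0.5 FTE at $120,000 fully loaded:

```python
# Threshold-ratio calculation following the five steps above.
def cloud_threshold_ratio(monthly_cloud_spend, monthly_egress,
                          onprem_hardware_cost, amortisation_years,
                          annual_staffing_delta):
    monthly_onprem = (onprem_hardware_cost / (amortisation_years * 12)
                      + annual_staffing_delta / 12)
    return (monthly_cloud_spend + monthly_egress) / monthly_onprem

ratio = cloud_threshold_ratio(
    monthly_cloud_spend=12_000,     # step 1: reserved-instance pricing (placeholder)
    monthly_egress=2_000,           # step 2: ~17% egress (placeholder)
    onprem_hardware_cost=833_806,   # step 3: 8x H100 reference configuration
    amortisation_years=5,
    annual_staffing_delta=60_000,   # step 4: 0.5 FTE at $120k fully loaded
)
print(ratio)  # ~0.74 -> above 0.60, so run the full TCO analysis
```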

One thing to watch: agentic AI workloads will push you toward the threshold faster than you expect. Token consumption per task has jumped 10x-100x since December 2023. A single agentic workflow may make 10-50 API calls per user request versus 1-2 for a simple chatbot. If agentic AI is on your roadmap within the next 12-18 months, model the threshold with 5-10x your current token volumes. If you are arriving at this analysis because your costs unexpectedly surged, see our breakdown of how the PoC-to-production cost explosion happens — understanding the cause clarifies which infrastructure path makes sense.

A useful mid-market rule of thumb: at approximately 10-50 million tokens/day with consistent workload patterns, run the calculation. Below 10 million tokens/day, cloud APIs remain cost-competitive.


What does TCO really mean for AI inference — and what costs are companies missing?

Most cloud vs on-premises comparisons undercount the true cost on at least one side. ICONIQ's 2026 State of AI report found that inference costs average 23% of revenue at scaling-stage AI companies — a figure that holds from pre-launch through scale. If you are underestimating AI infrastructure costs, you are underestimating a 23% slice of your revenue.

A complete TCO analysis requires six cost categories:

1. Compute costs: Cloud — GPU-hours at premium rates (2-3x wholesale) or per-token API pricing. On-premises — hardware amortised over 3-5 years.

2. Storage costs: Model weights, KV cache, vector stores, and data pipelines. Commonly underestimated on-premises.

3. Egress costs: Cloud adds 15-30% of total AI spend. On-premises: zero. This is the most commonly omitted cost in cloud comparisons.

4. GPU premium pricing: Cloud providers charge 2-3x wholesale GPU rates on every GPU-hour, indefinitely.

5. Staffing delta: On-premises inference adds 0.5-1.5 FTE ($60,000-$180,000/year at $120,000 fully loaded). Omitting this is the single most common error in on-premises business cases.

6. Hardware refresh cycles: GPU servers have a 3-5 year economic lifespan. Refresh cycles add approximately 20-30% to the 5-year on-premises cost.

To put numbers on it (8× H100 configuration, Lenovo reference data): cloud on-demand at 5 years costs $4,306,416; cloud 3-year reserved costs $2,362,811; on-premises costs $871,912 total plus a staffing delta of $300k-$900k. Even at 3-year reserved pricing, and after adding staffing costs, on-premises is cheaper for sustained 24/7 workloads at enterprise scale.

The 3-5 year horizon is standard for TCO comparison. Comparing cloud vs on-premises over 12 months produces a misleading analysis that always favours cloud.


Why do GPU clusters operate at only 30-50% utilisation — and why does this matter for the on-premises decision?

Before you evaluate on-premises infrastructure, there is a step zero: understanding GPU utilisation. The 30-50% industry average is a real threat to the on-premises cost case.

At a 64-GPU H100 cluster at $3.50/GPU-hour, 40% utilisation means 60% of your capacity is generating no productive output — annual financial waste exceeding $1.1 million per cluster. At 35% MFU on an $833,806 H100 server, your effective cost-per-inference is nearly 3× what the hardware specification suggests.

There is a catch with how most teams measure GPU performance. nvidia-smi reports kernel scheduling activity, not actual Tensor Core computational efficiency. A GPU showing 95% in nvidia-smi may be achieving only 30-40% Model FLOP Utilisation (MFU). Your TCO calculation must use realistic projected MFU — not peak hardware capacity.
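
A quick way to see why: effective cost per token scales with 1/MFU. In this sketch the hourly hardware cost and peak throughput are illustrative assumptions, not benchmarks:

```python
# Effective per-token cost as a function of realised MFU.
def effective_cost_per_million_tokens(hourly_hw_cost, peak_tokens_per_hour, mfu):
    realised_tokens = peak_tokens_per_hour * mfu
    return hourly_hw_cost / realised_tokens * 1_000_000

hourly_cost = 20.0        # assumption: amortised 8x H100 server plus power, per hour
peak_tph = 50_000_000     # assumption: theoretical peak tokens/hour for the cluster
print(effective_cost_per_million_tokens(hourly_cost, peak_tph, 0.35))  # ~$1.14 per million tokens
print(effective_cost_per_million_tokens(hourly_cost, peak_tph, 0.70))  # ~$0.57 -- roughly half
```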

vLLM addresses this directly. vLLM is an open-source LLM inference serving framework implementing continuous batching and PagedAttention. Continuous batching dynamically groups concurrent requests to maximise throughput, eliminating the sequential idle time that produces the 30-50% MFU problem. At scale, vLLM achieves 793 tokens/second versus Ollama's 41 — MFU can reach 60-80%, effectively halving your per-inference cost.

For a 50-300 person company running 2-4 GPUs on-premises, this matters a lot. The three-tier hybrid architecture is the practical solution: run only consistent, high-volume workloads on-premises, and route variable or experimental workloads to cloud.

For deeper coverage, see our guide to optimisation techniques for your chosen AI inference infrastructure.


What is the three-tier hybrid AI architecture and why do most enterprises end up here?

The three-tier hybrid architecture routes workloads to the infrastructure tier where the unit economics are best. Per Deloitte’s Tech Trends 2026 research, it looks like this:

Tier 1 — Cloud (AWS, Azure, GCP): burst workloads, model training, new model evaluation, unpredictable or experimental inference. This is where you absorb uncertainty without committing CapEx.

Tier 2 — On-Premises (NVIDIA H100/H200 servers, served via vLLM): consistent, high-volume production inference where fixed costs get cheaper per inference at sufficient sustained volume.

Tier 3 — Edge: ultra-low-latency use cases requiring sub-50ms response — real-time fraud detection, on-device inference, industrial automation.

Hybrid is not a compromise. It is the expected architectural trajectory for organisations that have grown past early-stage experimentation.

Workload classification is the implementation task. Assign each workload to the most cost-effective tier using four dimensions: volume (consistent and high → on-premises; variable or burst → cloud), latency (real-time under 100ms → edge or on-premises; batch-tolerant → cloud), data sensitivity (regulated → on-premises preferred; public → cloud acceptable), and cost per inference (run the unit economics at each tier and compare).

When you are ready to migrate from cloud-only to hybrid, follow this sequence:

  1. Identify your highest-volume, most consistent production inference workloads
  2. Model TCO for those workloads on-premises using the six-component framework
  3. If the threshold ratio exceeds 0.60, migrate those workloads first
  4. Retain cloud for everything else
  5. Expand on-premises as volume grows

Organisations that implement hybrid workload routing correctly have documented 40-70% cost reductions versus all-API approaches.

For governance structures that manage multi-tier hybrid infrastructure cost over time, see our guide to AI infrastructure cost governance.


When does on-premises AI inference make sense for a 50-500 person company?

Most published TCO analyses serve either small research setups or Fortune 500 configurations. The 50-500 person SaaS, FinTech, or HealthTech company is underserved. So here is what the numbers actually look like for you.

Minimum viable scale heuristics:

  1. Token volume: 10 million+ tokens/day with consistent patterns. At 10M tokens/day, GPT-4o Mini (approximately $300/month) beats a self-hosted 7B model (approximately $850/month). At 50M tokens/day, self-hosted wins by a wide margin.
  2. GPU hours: 12+ GPU-hours/day of sustained inference — sufficient to achieve 60%+ MFU with vLLM batching.
  3. Time horizon: 3+ year product roadmap with stable model architecture.
  4. Team capacity: 0.5 FTE of DevOps/ML infrastructure already allocated. If it does not exist, add it to the TCO.

The open-source model breakeven is a compelling calculation. A self-hosted 7B model on a single H100 at 70% utilisation costs approximately $0.013 per million tokens. GPT-4o Mini is $0.15–$0.60 per million tokens — that is 10-46× more expensive at volume. At production volumes, breakeven arrives in 3-6 months.
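
To locate the crossover for your own volumes, a rough sketch — the $1-per-million blended API rate and the $850/month self-hosted figure are assumptions derived from the heuristics above, not quotes:

```python
# Rough API-vs-self-hosted crossover at different daily token volumes.
def monthly_api_cost(tokens_per_day, blended_price_per_million=1.00):
    return tokens_per_day * 30 / 1_000_000 * blended_price_per_million

SELF_HOSTED_FIXED = 850  # $/month: single-GPU server amortisation plus ops (assumption)

for tpd in (10_000_000, 30_000_000, 50_000_000):
    api = monthly_api_cost(tpd)
    winner = "self-hosted" if SELF_HOSTED_FIXED < api else "API"
    print(f"{tpd / 1e6:.0f}M tokens/day: API ${api:,.0f}/mo vs self-hosted ${SELF_HOSTED_FIXED}/mo -> {winner}")
```

On these assumptions the crossover sits around 28 million tokens/day; note that self-hosted capacity is not truly fixed — it steps up in GPU-sized increments as volume grows.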

Data sensitivity can accelerate the decision. For regulated HealthTech and FinTech companies, on-premises inference avoids egress AND compliance risk simultaneously. One telehealth company cut monthly AI costs from $48,000 to $32,000 by moving chat triage to a self-hosted LLM, while simplifying its HIPAA compliance posture at the same time.


How do you build a business case for an AI infrastructure decision?

An infrastructure decision involving on-premises GPU hardware or a shift in cloud commitment requires board or CFO-level approval. Your job is to translate a technical and economic analysis into financial language. Dollar figures, CapEx schedules, and break-even timelines. Not MFU percentages.

Here is a six-part structure that works:

1. Current state cost baseline: Monthly cloud AI spend at reserved instance pricing, egress charges, AI infrastructure cost as a percentage of engineering budget. ICONIQ’s 2026 benchmark puts inference at 23% of revenue at scaling-stage AI companies.

2. Threshold analysis: Cloud cost ÷ on-premises equivalent = threshold ratio. If it exceeds 0.60, proceed to full TCO.

3. TCO comparison: Cloud vs on-premises (or hybrid) over 3-year and 5-year horizons. Lenovo reference for 8× H100: breakeven at on-demand pricing is approximately 11.9 months; at 3-year reserved, approximately 21.8 months.

4. Risk and sensitivity analysis: What happens if token volume grows 3×? If agentic AI is on the roadmap, model the TCO with 5-10× current volumes.

5. Operational requirements: Staff cost delta in dollar terms. Translate “0.5 FTE” into: “approximately $60,000 in additional annual staffing cost, included in the TCO.”

6. Recommendation with trigger criteria: Tied to the threshold calculation, with explicit criteria that would change it. For example: “We recommend on-premises for workload X. If monthly token volume drops below 8M tokens/day, we will revisit.”

You will also need to answer these objections:

“Cloud is always more flexible” — Correct for burst workloads. The hybrid architecture preserves that flexibility where it matters, while eliminating cloud costs on predictable production workloads where elasticity provides no benefit.

“We don’t have the staff” — The staffing cost is quantified in the TCO: 0.5-1.0 FTE at $X versus $Y in annual cloud savings. If the payback is unacceptable, narrow the hybrid scope to the highest-volume workloads only.

“What if GPU prices keep dropping?” — Per-token pricing has fallen 10× annually, but total inference spending grew 320% over the same period. AWS raised capacity pricing in January 2026.

For ongoing governance of AI infrastructure costs post-decision, see our guide to AI infrastructure cost governance.


Frequently Asked Questions

What is the 60-70% cloud threshold and how do I measure it?

It is a ratio: cloud AI costs ÷ on-premises equivalent costs. When this ratio reaches 0.60-0.70, on-premises or hybrid economics become competitive. To measure it: (1) calculate monthly cloud AI spend at reserved instance pricing; (2) add egress costs; (3) price equivalent on-premises hardware amortised over 3 years plus staffing delta; (4) divide step 1+2 total by step 3 total. Source: Deloitte Tech Trends 2026, based on research with 60+ global technology leaders.

Does on-premises AI inference require dedicated IT staff?

Yes. The realistic requirement is 0.5-1.5 FTE depending on scale. For a 50-150 person company with an existing DevOps function (2-4 GPUs), 0.5 FTE is realistic. For a 200-500 person company with multiple GPU servers, plan for 1.0-1.5 FTE. Include this at $120,000+ fully loaded per FTE in your TCO — omitting it is the most common error in on-premises business cases.

What is the minimum inference volume that justifies evaluating on-premises?

The evaluation trigger is 10 million+ tokens/day with consistent patterns, or 12+ GPU-hours/day of sustained inference. Below these volumes, cloud reserved instances almost always produce better TCO when staffing costs are included. At volumes above 50 million tokens/day, the on-premises or hybrid case is almost always financially superior.

What is vLLM and why does it matter for the on-premises decision?

vLLM is an open-source LLM inference serving framework implementing continuous batching and PagedAttention — the primary techniques for improving GPU utilisation. Without continuous batching, sequential requests leave GPUs idle, producing the 30-50% MFU industry average. With vLLM, MFU can reach 60-80%, effectively halving per-inference cost. It is the de facto standard for self-hosted open-source model serving (Llama, Qwen, Mistral).

How is GPU utilisation different from what nvidia-smi reports?

nvidia-smi reports kernel scheduling activity, not actual Tensor Core efficiency. Model FLOP Utilisation (MFU) measures how much of the GPU’s theoretical throughput is used for productive work. A GPU can show 75-85% in nvidia-smi while achieving only 30-40% MFU, because memory fetches, attention overhead, and scheduling latency register as “active” without contributing to throughput.

Is it cheaper to run your own AI models or use OpenAI and Anthropic APIs?

At low volume (under 10 million tokens/day): API pricing almost always wins when staffing costs are included. At high volume (50 million+ tokens/day) with consistent workload patterns: self-hosting open-source models via vLLM can break even against API costs in 3-6 months, then produce 60-80% lower per-token costs. Data sensitivity can force the decision regardless of cost: regulated industries that cannot send data to third-party APIs must self-host.

How does agentic AI change the infrastructure decision?

Agentic AI has caused token consumption per task to jump 10-100× since December 2023. A single agentic workflow may make 10-50 API calls per user request versus 1-2 for a simple chatbot. If agentic AI is on your roadmap within 12-18 months, model the TCO with 5-10× higher token volumes — the threshold may be considerably closer than your current spending suggests.

What are data egress costs and why do they matter for cloud AI decisions?

Data egress charges are fees imposed by cloud providers when data moves out of their infrastructure. For data-intensive AI applications, egress typically adds 15-30% to total cloud AI spend; for high-bandwidth applications, it can reach 70%. On-premises inference avoids egress entirely for workloads where data remains within your network. Estimate your monthly data movement in GB, multiply by your provider’s egress rate, and add it to the cloud cost baseline before computing the threshold ratio.

What is the breakeven point for on-premises AI infrastructure?

Formula: (CapEx + cumulative operational costs) ÷ monthly cloud savings = breakeven in months. Lenovo reference data for an 8× H100 server: breakeven at on-demand pricing is approximately 11.9 months; at 3-year reserved pricing, approximately 21.8 months. GPU utilisation is highly sensitive: at 35% MFU, breakeven extends; at 70%+ MFU (achievable with vLLM), breakeven accelerates toward the 12-month end.
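
A minimal sketch of that calculation using the reference figures above: the monthly cloud costs are the 5-year totals spread over 60 months, and the operational cost is the quoted ~$0.87/hour. The results land close to the ~11.9 and ~21.8 month figures cited; small differences come from rounding and the opex assumption:

```python
# Breakeven month: when cumulative cloud spend covers CapEx plus cumulative on-prem opex.
def breakeven_months(capex, monthly_onprem_opex, monthly_cloud_cost):
    monthly_saving = monthly_cloud_cost - monthly_onprem_opex
    return capex / monthly_saving

capex = 833_806               # 8x H100 server
opex = 0.87 * 24 * 30         # ~$626/month operational cost
print(breakeven_months(capex, opex, 4_306_416 / 60))   # on-demand: ~11.7 months
print(breakeven_months(capex, opex, 2_362_811 / 60))   # 3-year reserved: ~21.5 months
```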

What hardware should I evaluate for on-premises AI inference?

Current-generation: NVIDIA H100 (80GB HBM3) and H200 (141GB HBM3e). The Lenovo ThinkSystem SR675 V3 with 8× H100 GPUs is the enterprise reference at approximately $833,806. For mid-market: 2-4 NVIDIA A100 GPUs — $120,000-$300,000, appropriate for 10-30M tokens/day workloads. AMD MI300X is competitive for memory-bound inference but has a less mature software ecosystem. Always model hardware refresh cycles (3-5 year lifespan) with zero recovery value.

How do I classify AI workloads for a three-tier hybrid architecture?

Four dimensions: (1) Volume — consistent, high → on-premises; variable or burst → cloud; (2) Latency — real-time under 100ms → edge or on-premises; batch-tolerant → cloud; (3) Data sensitivity — regulated → on-premises preferred; public → cloud acceptable; (4) Cost per inference — run the unit economics at each tier and compare. On-premises candidates: high-volume consistent APIs (document processing, fraud scoring). Cloud candidates: model training, new model evaluation, burst demand.


The infrastructure decision is one component of managing AI inference costs at scale. For a complete overview of AI inference economics — from why the cost crisis exists through to pricing strategy and governance — see our overview of AI inference economics and the forces driving this crisis.

The AI Inference Optimisation Playbook — Caching, Quantization, and Model Routing in Priority Order

Inference now accounts for 80–90% of total AI compute costs across a model’s production lifetime, yet most guides throw every optimisation technique at you in random order and leave you to work out where to start. If the PoC bill shock has already hit, you don’t need a catalogue — you need a sequence.

This is that sequence. Three tiers, ordered by effort-to-impact ratio. Tier 1: zero-infrastructure API-level changes you can make today. Tier 2: configuration changes that take days to a week. Tier 3: structural engineering work for this quarter. Each tier tells you whether the next one is worth the investment.

One note before we get into it: if you’re building or running agentic AI, the same techniques apply — but at a 5–20x cost multiplier per user action. Optimising at the individual call level misses the point. You need to optimise the chain. More on that at the end.

For the broader economic context behind why inference costs are eating AI budgets, the AI inference cost crisis guide covers the full picture. This article is the action layer on top of that foundation.

Why Does the Order of AI Inference Optimisation Matter as Much as the Technique?

Most resources catalogue inference optimisation techniques without telling you which to do first. You end up with a list of equally-weighted options when what you actually need is a priority stack.

The sequencing axis that matters is effort-to-impact ratio — not technique prestige or theoretical maximum savings. A team that enables prompt caching this week (zero infrastructure change, 50–90% cost reduction for qualifying workloads) gets a faster return than one that spends two months deploying a self-hosted vLLM stack with quantization. The data backs this up: teams that implement caching, routing, and batching before touching their serving infrastructure consistently outperform teams that go straight to structural interventions.

Here’s the effort spectrum: API configuration changes take hours. Serving configuration changes take days to weeks. Quantization pipelines take weeks to months. And the cost reduction spectrum doesn’t map to effort in the direction you’d expect. The lowest-effort interventions often produce the largest percentage reductions for API-heavy workloads.

Reversibility matters too. Tier 1 changes are reversible with a config flag. Tier 3 changes — quantization, model replacement — require rigorous accuracy validation before production. Higher tiers carry higher rollback risk, which matters when you’re touching production systems.

ICONIQ’s analysis of scaling-stage AI companies found that model inference averages 23% of total AI product costs — nearly as expensive as the entire AI team. The pressure to optimise is real. The question is just where to aim first.

What Are the Fastest AI Inference Cost Wins You Can Implement Today?

Three actions qualify as Tier 1 zero-infrastructure wins: enable API-level prompt caching, audit model routing, and shift latency-tolerant workloads to async batch processing. None of these require standing up new infrastructure, hiring ML engineers, or touching model weights.

The starting point for all three is a workload audit. Categorise requests by: (a) prompt repetition rate, (b) complexity requirements, and (c) latency sensitivity. That audit tells you directly which Tier 1 wins apply to your situation.

Prompt caching applies if your workload includes repeated system prompts or shared contexts. Model routing applies if you’re sending simple queries to expensive frontier models. Async batch processing applies if you have analytics jobs, nightly content moderation passes, or embedding generation that doesn’t need a real-time response.

The wins also compound. A workload with 60% cacheable prompts and a 50/50 routing split between simple and complex queries can see 40–60% total cost reduction from configuration changes alone — before touching a single line of infrastructure.

How Does Prompt Caching Reduce LLM Inference Costs by 50–90%?

Prompt caching works by storing the computed key-value (KV) attention representations of repeated prompt prefixes. When the same prefix appears again, the API skips recomputation and charges only for the new tokens — reducing cost on cached tokens by 50–90%.

There are two distinct layers of caching to understand.

API-level prompt caching is exposed by Anthropic and OpenAI as a managed feature. No infrastructure change required. You restructure your prompts and the cost reduction happens automatically. Google’s Gemini API calls the same thing “context caching.” This is the Tier 1 entry point.

Infrastructure-level KV cache is the GPU memory mechanism in self-hosted inference engines. It serves the same purpose but lives at the infrastructure level — storing intermediate attention computations to avoid recomputing them for already-processed tokens. Storing the KV cache for a 500B parameter model over a 20,000-token context requires about 126GB of memory — which gives you a sense of the scale involved. This is a Tier 2 concern for teams running self-hosted inference.

For teams on managed APIs, the only implementation action is prompt structure. Stable, reusable content goes at the beginning — the prefix position where caching applies. Variable content, like the user’s actual query, goes at the end.
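A minimal sketch of that structure using the Anthropic Python SDK's cache_control marker. The model id, system prompt, and client configuration are placeholders; OpenAI's equivalent needs no extra parameter because caching is applied automatically to long repeated prefixes.

```python
# Sketch of the prompt structure that makes API-level caching effective.
# The stable system prompt sits at the start and is marked cacheable;
# the user's query arrives at the end and is always billed in full.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

STABLE_SYSTEM_PROMPT = "You are a support assistant for ExampleCo. ..."  # >1,024 tokens in practice

def answer(user_query: str) -> str:
    response = client.messages.create(
        model="claude-3-5-haiku-latest",            # placeholder model id
        max_tokens=1024,
        system=[
            {
                "type": "text",
                "text": STABLE_SYSTEM_PROMPT,        # stable prefix: cached on repeat calls
                "cache_control": {"type": "ephemeral"},
            }
        ],
        messages=[{"role": "user", "content": user_query}],  # variable suffix
    )
    return response.content[0].text
```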

How do you know if your workload qualifies? High qualification signals include static system prompts longer than 1,024 tokens, RAG retrieval contexts that repeat across users, few-shot examples embedded in every request, and multi-turn conversations where the system prompt is always present.

Cache hit rate is the key metric to monitor. Tensormesh’s LMCache CacheBlend reports 85% cache hit rates for agentic AI workloads with repeated tool descriptions in system prompts — at that hit rate, effective cost per request drops to near the cost of the variable suffix only.

As a rough illustration: a $10,000/month API spend on a RAG system with 70% cache-eligible requests could realistically land at $3,000–$4,500/month after caching. Your workload will vary, but the mechanism is consistent.

How Does Model Routing Stop You Overpaying Frontier Model Prices for Simple Queries?

Model routing inserts a lightweight classification layer between your application and your LLM. Simple queries go to smaller, cheaper models. Complex requests escalate to frontier models only when genuinely needed. ICONIQ identifies this as table-stakes cost management at scaling-stage companies — not an optional nicety.

The core problem is straightforward: most production inference workloads contain a significant fraction of requests that don’t require frontier model capability. FAQs, classification tasks, simple data extraction, templated responses — all priced at frontier rates because routing logic doesn’t exist.

Routing a query to Claude Haiku ($0.25/$1.25 per million input/output tokens) instead of Claude Opus ($15/$75) represents a 60x cost reduction for that request. Even routing 30% of your queries to cheaper models produces meaningful savings at any volume.

LiteLLM and Portkey are the primary open-source routing tools. Portkey supports conditional routing — “use cheaper model for summarisation, premium model for reasoning” — with fine-grained conditions and a unified API across multiple providers.

Routing strategies range from simple to sophisticated: rule-based (prompt length, keyword detection), classifier-based (a small model scores complexity), or threshold-based (use the cheaper model’s confidence score to decide escalation). Start simple. Rule-based routing based on prompt length and task type gets you most of the savings with minimal configuration overhead.
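Here is a minimal rule-based router in that spirit. The model ids, keyword list, and length threshold are illustrative assumptions; measure output quality per task type before trusting any of these rules in production.

```python
# Sketch of rule-based routing by task type and prompt shape.
# Model ids and thresholds are placeholders, not recommendations.

CHEAP_MODEL = "claude-3-5-haiku-latest"   # placeholder id
PREMIUM_MODEL = "claude-opus-latest"      # placeholder id

REASONING_KEYWORDS = ("analyse", "compare", "plan", "reconcile", "multi-step")

def route(prompt: str, task_type: str) -> str:
    """Return the model tier to use for this request."""
    if task_type in {"classification", "extraction", "faq", "summarisation"}:
        return CHEAP_MODEL
    if len(prompt) > 4000 or any(k in prompt.lower() for k in REASONING_KEYWORDS):
        return PREMIUM_MODEL
    return CHEAP_MODEL

print(route("Which plan includes SSO?", "faq"))                          # -> cheap model
print(route("Compare Q3 and Q4 churn drivers and plan fixes", "analysis"))  # -> premium model
```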

One thing to get right: validate routing decisions against a benchmark of accepted output quality for each task type before deploying to production. Route based on measured output quality, not expected output quality.

Multi-provider routing also adds resilience beyond cost. Enterprises processing millions of requests daily almost always hedge across two or more providers — routing across Anthropic, OpenAI, and a self-hosted endpoint gives you both price arbitrage and failover capacity.

How Do You Improve GPU Utilisation from 30–40% to 70–80% and Why Does It Matter?

This is Tier 2 territory — it requires self-hosted infrastructure. If you’re exclusively on managed APIs, your provider handles this. Focus Tier 2 effort on your own inference deployments. (If you haven’t yet settled your deployment model, the cloud vs on-premises deployment decision framework covers that decision before you commit to self-hosted infrastructure investment.)

Enterprise GPU clusters typically operate at only 30–50% utilisation. The reason is static batching: the serving engine waits for a fixed batch size before processing, then works through the whole batch before accepting new requests. The idle time while waiting is paid-for capacity doing nothing.

A 64 H100 GPU cluster at 40% utilisation at $3.50/GPU-hour costs $161,280/month total — of which roughly 60% is waste. Raising utilisation from 40% to 80% effectively doubles compute capacity without spending another cent on hardware.

Continuous batching eliminates the wait. As each sequence in a batch completes, a new request slots in immediately — keeping the GPU continuously occupied. TensorRT-LLM calls this “in-flight batching”; the mechanics are the same. TGI and vLLM both implement it natively. Ollama does not.

vLLM pairs continuous batching with PagedAttention — its KV cache memory management system that treats GPU memory as virtual pages, eliminating fragmentation and enabling efficient memory sharing across concurrent requests. Three vLLM parameters determine whether a GPU saturates or wastes: --max-num-seqs, --gpu-memory-utilization, and --tensor-parallel-size.
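For reference, the same three knobs set through vLLM's Python entrypoint rather than the CLI flags; the model id and values are illustrative and should be tuned against measured TTFT and throughput.

```python
# Sketch of a vLLM configuration exposing the three parameters named above.
# Values are illustrative starting points, not recommendations.

from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    max_num_seqs=256,              # how many sequences the engine will batch concurrently
    gpu_memory_utilization=0.90,   # fraction of VRAM given to weights and the KV cache
    tensor_parallel_size=2,        # shard the model across 2 GPUs
)

outputs = llm.generate(
    ["Summarise the attached incident report."],
    SamplingParams(max_tokens=256),
)
print(outputs[0].outputs[0].text)
```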

One trade-off to monitor: time-to-first-token (TTFT) may increase slightly with continuous batching under high load. Track TTFT alongside GPU utilisation % and throughput (tokens/sec) — expose these through Prometheus/Grafana or your inference engine’s built-in metrics endpoint.

What Are the Best Structural Interventions for Reducing AI Inference Costs Long-Term?

Tier 3 requires dedicated engineering time, accuracy validation pipelines, and staging environments before production deployment. Complete Tier 1 and Tier 2 before committing here; the ROI justification should rest on validated data from those earlier tiers.

Here are the three structural interventions, in order of how frequently they’ll apply.

Model quantization reduces the numerical precision of model weights, shrinking model size and GPU memory footprint substantially. Quantization is the single biggest optimisation you can apply before touching your serving engine — it cuts VRAM requirements by 50–75% and lifts throughput by removing memory bandwidth bottlenecks.

The format decision maps to your hardware:

FP8: NVIDIA H100 (native hardware support, under 1% perplexity delta)

AWQ: INT4 (4-bit) on Ada Lovelace hardware (~3% perplexity delta, outperforms GPTQ at the same bit-width)

GPTQ: Ampere and older GPUs (~6% perplexity delta at 4-bit)

GGUF: Ollama or llama.cpp, for CPU offload and local development

Post-training quantization (PTQ) is the correct entry point — calibrate on a representative dataset, convert to the target format, validate accuracy, deploy. Quantization-aware training (QAT) adds training cost that is usually not justified.

One important caveat: benchmark accuracy drops and task-specific accuracy drops are different things. INT4 formats require task-specific validation — don’t assume sub-1% loss without measuring on your actual workload, particularly for legal, medical, or financial reasoning tasks.
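A sketch of that task-specific validation step, assuming both the full-precision and quantized deployments sit behind OpenAI-compatible endpoints (as vLLM provides); the endpoints, model ids, and eval cases are placeholders.

```python
# Run the same labelled sample through the full-precision and quantized
# deployments and compare exact-match accuracy before promoting the
# quantized model to production.

from openai import OpenAI

eval_set = [
    {"prompt": "Extract the invoice total from: ...", "expected": "$1,284.00"},
    # ... a representative sample of real production queries
]

def accuracy(base_url: str, model: str) -> float:
    client = OpenAI(base_url=base_url, api_key="not-needed-for-local-vllm")
    correct = 0
    for case in eval_set:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": case["prompt"]}],
            max_tokens=64,
            temperature=0,
        )
        correct += case["expected"] in resp.choices[0].message.content
    return correct / len(eval_set)

full = accuracy("http://fp16-host:8000/v1", "llama-3.1-70b-instruct")        # placeholder endpoint
quant = accuracy("http://awq-host:8000/v1", "llama-3.1-70b-instruct-awq")    # placeholder endpoint
print(f"full precision: {full:.1%}  quantized: {quant:.1%}  delta: {full - quant:.1%}")
```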

vLLM with PagedAttention is the production serving standard for teams running self-hosted inference at scale. PagedAttention manages KV cache memory as virtual memory pages, dramatically reducing memory fragmentation and enabling efficient memory sharing across concurrent requests. vLLM supports over 100 model architectures and runs on NVIDIA V100 through current, AMD MI200/MI300, Google TPUs, AWS Inferentia, and Intel Gaudi — this breadth prevents vendor lock-in.

Speculative decoding is a latency optimisation, not primarily a cost optimisation. It pairs a small draft model with the large target model — the draft model generates 4–5 candidate tokens; the target model verifies them in a single forward pass. When draft tokens are correct (70–80% of the time for chat workloads), this delivers 1.8–2.2× speedup on generation throughput. Use it for latency-sensitive applications — real-time chat, voice interfaces, interactive coding assistants. Don’t prioritise it as a cost reduction measure.

vLLM vs Ollama vs LocalAI: Which Inference Serving Stack Is Right for Production?

Here’s an honest breakdown.

vLLM: Production standard for teams running self-hosted inference at scale. Continuous batching, quantization (FP8, AWQ), and speculative decoding in a single stack. Supports high-concurrency multi-user workloads.

The honest limitation: vLLM requires CUDA-compatible GPU hardware, Python inference stack knowledge, and ongoing monitoring configuration. Without a dedicated ML engineering resource, the operational burden is significant. Managed inference APIs have a lower operational cost even if per-token pricing is higher — and at moderate traffic, serverless costs 77% less than a 24/7 dedicated pod. The break-even point occurs when utilisation consistently exceeds 65–70% of an always-on deployment.

Ollama: Best for local development, single-developer environments, and testing model behaviour before production deployment. Uses GGUF format via llama.cpp; supports CPU offload. Does not implement continuous batching natively. Not suitable for multi-user production API backends. In practice, Ollama is frequently deployed in multi-user scenarios where it underperforms significantly.

LocalAI: Best for OpenAI API-compatible self-hosting where provider lock-in is the primary concern — niche regulatory or air-gapped environments. Production-readiness is lower than vLLM. Not the first choice for pure inference cost optimisation.

SGLang (for agentic workloads): optimised for structured generation and multi-call agentic pipelines. SGLang’s RadixAttention caches and reuses KV states across requests sharing common prefixes — in agentic pipelines where every tool-call response starts with the same system prompt, that prefix sharing reduces TTFT by 30–60% compared to naive per-request serving. The guidance from production teams is clear: use vLLM for chat, completions, and RAG; use SGLang when your workflow runs multiple sequential LLM calls.

The decision tree: continuous batching + quantization + speculative decoding at high concurrency → vLLM; local dev and testing → Ollama; OpenAI API compatibility in an air-gapped environment → LocalAI; agentic pipelines with structured output → evaluate SGLang.
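The same decision tree as a small function, purely as a sketch; the boolean inputs are simplifications and the output is a shortlist, not an architecture decision.

```python
# Sketch of the serving-stack decision tree described above.

def choose_stack(local_dev: bool, airgapped_openai_compat: bool,
                 agentic_pipeline: bool, high_concurrency: bool) -> str:
    if local_dev:
        return "Ollama (local dev and testing)"
    if airgapped_openai_compat:
        return "LocalAI"
    if agentic_pipeline:
        return "evaluate SGLang (RadixAttention prefix reuse)"
    if high_concurrency:
        return "vLLM (continuous batching + quantization + speculative decoding)"
    return "re-evaluate: a managed inference provider may beat self-hosting here"

print(choose_stack(local_dev=False, airgapped_openai_compat=False,
                   agentic_pipeline=False, high_concurrency=True))
```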

Before committing to self-hosted vLLM, evaluate managed inference providers (Together AI, Baseten, RunPod). They provide vLLM-like performance without the operational overhead — and for many teams at the 50–500 person scale, that trade-off is worth it.

How Do Agentic AI Workloads Multiply Inference Costs and How Do You Manage Them?

Agentic AI systems multiply inference costs 5–20x per user action compared to single-call interactions. Each agent step is a separate inference call with its own context, tool descriptions, and reasoning output. A customer support agent that looks efficient at 100 tokens per interaction can easily use 2,000–5,000 tokens when a scenario requires multiple tool calls, context retrieval, and multi-step reasoning.

The implication for optimisation: think at the chain level, not the call level. Shaving 20% off individual call cost in a 10-call chain still only saves 20% overall. Restructuring the chain to eliminate three redundant calls removes roughly 30% of the cost outright, and it stacks with any per-call optimisation applied to the calls that remain.

Prompt caching is disproportionately valuable for agentic workloads specifically because agent system prompts include tool descriptions, reasoning instructions, and context that repeat across every step. These are exactly the high-repetition prefixes that caching eliminates — and the same high cache hit rates apply here with even greater effect.

Model routing within the chain is an underexplored lever. Tool selection and simple classification steps can route to a smaller, cheaper model tier. Synthesis and reasoning steps escalate to the full model. Apply the same routing logic you set up in Tier 1, but at each step of the chain.

Track cost per agent execution — the full chain cost per user action — not cost per call. “Dollar-per-decision is a better ROI metric for agentic systems than cost-per-inference because it captures both the cost and the business value of each autonomous decision.”
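A minimal sketch of chain-level cost tracking; the per-million-token prices and the example chain are illustrative placeholders.

```python
# Accumulate token costs across every step of an agent execution and report
# cost per user action rather than cost per call.

PRICE_PER_MTOK = {
    "cheap-model":   {"input": 0.25, "output": 1.25},   # illustrative $/M tokens
    "premium-model": {"input": 15.0, "output": 75.0},
}

def step_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p = PRICE_PER_MTOK[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# One agent execution = many calls; log each step and sum the chain.
chain = [
    ("cheap-model",   1800,  40),   # tool selection
    ("cheap-model",   2200,  60),   # tool-result classification
    ("premium-model", 6500, 700),   # synthesis and final answer
]

cost_per_execution = sum(step_cost(m, i, o) for m, i, o in chain)
print(f"cost per agent execution: ${cost_per_execution:.4f}")
```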

For a full cost governance framework covering monitoring, alerting, and FinOps practice, including how to sustain the gains you make applying this playbook, see our guide to AI FinOps governance for sustaining your optimisation gains. This playbook is the optimisation layer; governance is what keeps it working at scale.

Frequently Asked Questions

Can I apply prompt caching without self-hosting my own models?

Yes. API-level prompt caching is a managed feature from Anthropic (Claude 3.x) and OpenAI (cached input tokens) — no infrastructure change required. Google’s equivalent is called “context caching” in the Gemini API. The only implementation action is restructuring prompts so stable content appears at the start. No GPU server, no model deployment, no DevOps overhead.

How much accuracy do I lose with INT8 quantization?

On standard benchmarks, INT8 quantization produces less than 1% accuracy degradation. FP8 (native on NVIDIA H100) offers higher accuracy with similar compression benefits — prefer FP8 if H100 hardware is available. Always validate on a representative sample of production queries before deploying quantized models, particularly for tasks sensitive to output precision.

Is vLLM suitable for a company with no dedicated ML infrastructure team?

Honestly, it is operationally demanding. Without a dedicated ML engineering resource, the operational burden is significant. Managed inference providers (Together AI, Baseten, RunPod) provide vLLM-like performance without the overhead. The self-hosting break-even typically becomes compelling when monthly managed API spend exceeds approximately $20,000–$50,000. Complete all Tier 1 optimisations first — some teams find that prompt caching and model routing bring managed API costs low enough that self-hosting is never justified.

What is the difference between KV cache and prompt caching?

KV cache is the GPU memory mechanism in inference engines that stores intermediate attention computations to avoid recomputing them — this operates at the infrastructure level in self-hosted serving stacks. Prompt caching is the API-level feature from Anthropic and OpenAI that exposes the same underlying mechanism as a managed service. Both serve the same purpose at different layers.

What is continuous batching and why does it matter more than a GPU hardware upgrade?

Continuous batching dynamically slots new requests into a running batch as completed sequences free up GPU capacity. Typical enterprise GPU utilisation without it: 30–40%. With it: 70–80%+. That improvement effectively halves cost per request on the same hardware — more impactful than a GPU hardware upgrade costing tens of thousands of dollars. Available natively in vLLM, TGI, and TensorRT-LLM. Not available in Ollama.

Which quantization format should I choose — AWQ, GPTQ, or FP8?

FP8 if running NVIDIA H100 hardware — native hardware support, highest accuracy, under 1% perplexity delta. AWQ for INT4 (4-bit) on Ada Lovelace hardware — ~3% perplexity delta, outperforms GPTQ at the same bit-width. GPTQ for Ampere and older GPUs — ~6% perplexity delta at 4-bit. GGUF only for Ollama or llama.cpp in CPU offload or local development environments.

How do I know when to move from managed APIs to self-hosted inference?

The primary signal is when monthly managed API spend reaches the point where self-hosted infrastructure TCO becomes cheaper over a 12–24 month horizon. At moderate traffic, serverless GPU costs 77% less than a 24/7 dedicated pod — the break-even occurs when utilisation consistently exceeds 65–70% of an always-on deployment. Run all Tier 1 optimisations first. Some teams find prompt caching and model routing bring costs low enough that self-hosting is never justified.

What is speculative decoding and when should I use it?

Speculative decoding pairs a small draft model with the large target model. The draft model generates 4–5 candidate tokens; the target model verifies them in a single forward pass. The primary benefit is latency reduction, not cost reduction. Use it for latency-sensitive applications: real-time chat, voice interfaces, interactive coding assistants. Don’t prioritise it as a cost reduction measure.

What is the difference between async batch processing and continuous batching?

Async batch processing (Tier 1): grouping latency-insensitive workloads — document analysis, embeddings, content moderation, nightly jobs — and submitting them to a deferred batch API. OpenAI’s Batch API offers a flat 50% discount for this class of request. No infrastructure change required. Continuous batching (Tier 2): a real-time serving strategy that groups concurrent incoming requests dynamically to maximise GPU utilisation. One is a scheduling strategy; the other is a serving strategy.
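As a sketch of the async path, here is a submission against OpenAI's Batch API; the JSONL contents and model id are placeholders.

```python
# Tier 1 async batch processing via OpenAI's Batch API, which prices this
# class of request at a flat 50% discount.

from openai import OpenAI

client = OpenAI()

# batch_requests.jsonl: one JSON object per line, e.g.
# {"custom_id": "doc-1", "method": "POST", "url": "/v1/chat/completions",
#  "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "..."}]}}

batch_file = client.files.create(file=open("batch_requests.jsonl", "rb"), purpose="batch")

job = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",   # results arrive within 24 hours at the discounted rate
)
print(job.id, job.status)
```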

How do I track whether my optimisations are actually working?

The primary metric is cost per million tokens, measured before and after each optimisation tier. For prompt caching, cache hit rate is the key leading indicator — most API providers expose this in usage dashboards. For model routing, track cost distribution by model tier alongside quality metrics by task type. For GPU utilisation work, monitor GPU utilisation %, TTFT, and throughput through Prometheus/Grafana or your inference engine’s built-in metrics endpoint. For a complete cost governance framework, see the AI infrastructure cost governance guide.

What is the agentic AI cost multiplier and why does it change the optimisation calculus?

Agentic AI systems make multiple sequential LLM calls per user action — tool selection, execution, result interpretation, response generation. A user action in an agent pipeline costs 5–20x more than an equivalent single-call interaction because each pipeline step accumulates token costs including repeated tool descriptions and reasoning context. Optimisations must be evaluated at the chain level. Prompt caching is disproportionately valuable here because the tool description system prompts that repeat across every chain step are exactly the high-repetition prefixes that caching eliminates.

This playbook is one part of a broader series on AI inference economics. For a complete AI inference cost guide covering the financial reality, infrastructure decisions, pricing strategy, and governance practice, see What the AI Inference Cost Crisis Means for Growing Software Companies.

Why AI Gross Margins Are So Much Lower Than SaaS and What That Means for Your Business

Picture this: you’re presenting your AI product’s financial performance to the board. They’ve spent a decade applying the same mental model — 75% gross margins, maybe better. The number on the slide is 52%. The silence that follows isn’t scepticism about your competence. It’s the sound of a financial model colliding with a structural economic reality nobody warned them about.

Every time a user interacts with your AI product — every query, every generation, every agent action — the meter runs. There’s a real compute cost attached to each of those interactions, and SaaS never had that.

ICONIQ Capital's January 2026 State of AI report finds inference averages 23% of total revenue at scaling-stage AI B2B companies. Bessemer Venture Partners documents AI gross margins at 50-60%, against 70-90% for mature SaaS businesses. These aren't outliers from a bad quarter. They're structural characteristics of the asset class.

This article is about why the economics are different — so you can set the right expectations, price correctly, and walk into that board meeting prepared. For cost reduction strategies, see our AI inference cost crisis overview and full guide to AI inference economics.

Why does running AI cost so much more than traditional software?

Here’s the core difference. Traditional SaaS: once you’ve written the software and provisioned the servers, serving an additional user costs almost nothing. The marginal cost approaches zero at scale. AI inference: every single user query runs the model again — consuming GPU compute, memory bandwidth, and energy. Every. Single. Time.

SaaS is like a printed book. Write it once, distribute at near-zero marginal cost. AI is more like a human expert answering questions. Each answer has a labour cost, and the more questions you field, the higher the bill.

Traditional SaaS COGS scales sub-linearly with users. AI COGS scales directly with usage. As Ben Murray at TheSaaSCFO puts it: “Once the product is built, incremental cost to produce a dollar of revenue is very low. AI companies, on the other hand, sell into software and labour budgets.”

Training is a one-time capital expense. Inference is continuous operational expenditure — every query, every agent action, every API call adds to the tab. Training might cost $100,000 once. Inference scales to $10,000 a month at a million queries.

Why does inference account for 80-90% of total AI lifetime compute costs?

Training happens once. Inference happens every time a user interacts — thousands of times, then millions. That inversion is counterintuitive because training headlines dominate media coverage. GPT-4's training costs got extensive coverage. The ongoing inference costs generating revenue around the clock got almost none.

A model is trained once over weeks, then served to users for months or years. Training represents 10-20% of total compute costs over a model’s lifecycle. Inference represents 80-90%.

As your product gains users, training costs stay fixed while inference costs grow linearly. For agentic AI architectures — where a single user action triggers a sequence of model calls — inference costs multiply 5-20x per user action. We dig into this in depth in the article on why AI bills explode between pilot and production.

When 23% of revenue goes to inference: what does the ICONIQ finding actually mean?

ICONIQ’s January 2026 State of AI report finds inference averages 23% of total revenue at scaling-stage AI B2B companies. And this figure doesn’t meaningfully decline as companies grow. In plain P&L terms: for every $1M in AI product revenue, approximately $230,000 is consumed by inference costs before you’ve paid for engineering, sales, or anything else.

To make that concrete:

$1M AI product revenue → ~$230,000 annually in inference costs

$5M AI product revenue (200-person company) → ~$1,150,000 annually

$10M AI product revenue → ~$2,300,000 annually

In traditional SaaS, COGS at scale typically runs 10-25% of revenue. Inference alone at 23% puts AI at the high end of total SaaS COGS before you’ve added anything else. And it rises with scale — inference sits at 20% pre-launch and 23% at scale. As Jason Lemkin frames it: “as you grow, you need ever more inference. You can’t cut it without degrading the product.”

There’s an important nuance here for SaaS companies adding AI features. The ICONIQ benchmark applies to pure AI B2B companies. Only your AI-adjacent revenue should be measured against inference costs — not your total ARR. A $10M ARR SaaS company launching an AI feature tier generating $1M in incremental revenue faces ~$230K in inference costs against that $1M — not against the $10M base. Getting that wrong has real consequences. See how to build AI cost governance without a dedicated FinOps team.

Why token prices are falling but your AI bill keeps growing (Jevons Paradox explained)

Token prices have fallen approximately 1,000x over three years. What cost $60 per million tokens in 2021 now costs around $0.06. And yet enterprise LLM API spending has grown 320% over the same period.

This is Jevons Paradox in action: when a resource becomes significantly cheaper, organisations use substantially more of it, and total consumption rises rather than falls.

When GPT-4 cost $60 per million tokens, companies deployed it carefully. At $3 per million tokens, they expanded to five new use cases. At $0.10 per million tokens, every workflow gets AI. Total tokens multiply faster than the per-token price drops. a16z calls this “LLMflation” — from their OpenRouter analysis, tokens consumed quintupled over the same period that per-token prices dropped to one-third.

There’s also a hardware paradox at work. Even as software-layer token costs fall, AWS raised GPU Capacity Block prices by 15% in January 2026. Physical compute has its own supply constraints — lead times on H100 and H200 clusters exceeding 30 weeks. Agentic workflows multiply token consumption non-linearly, with a single user action triggering 5-20 sequential model calls.

Don’t model falling inference costs as guaranteed budget relief. LLMflation is a signal to build usage governance, not a signal to skip it. Without governance, efficiency gains get absorbed by consumption growth.

The gross margin gap: ~52% AI vs. 70-90% SaaS and what it means for your business

AI companies structurally operate at approximately 50-60% gross margins, compared to 70-90% for mature SaaS companies. This isn’t a temporary inefficiency — it’s an architectural consequence of inference costs. ICONIQ’s January 2026 data shows AI gross margins averaging 52%, up from 41% in 2024, with a ceiling well below SaaS norms.

SaaS at scale runs COGS of roughly 10-25%, yielding gross margins of 75-90%. AI companies run COGS of roughly 40-50% — with inference alone accounting for ~23% — yielding gross margins of 50-60%. The 15-30 percentage point gap cannot be fully recovered through operational efficiency.

AI Shooting Stars — fast-growing, capital-efficient AI startups with strong product-market fit — average approximately 60% gross margins. AI Supernovas — explosively scaling, thin-wrapper products — can sit as low as 25%. AI companies also need higher growth rates to hit Rule of 40 because the profit margin component starts from a lower base. As Murray at TheSaaSCFO frames it: “If SaaS is about margin efficiency, AI is about value density — how much output, productivity, or labour you replace per dollar of cost.”

For SaaS companies adding AI features: those features will compress your margins unless priced to recover inference costs separately. The natural response is in pricing — which we explore in how to design AI product pricing.

What the 6x ARPA requirement means for how you should price your AI product

TheSaaSCFO's financial modelling finds that to match the EBITDA output of an equivalent SaaS company, an AI company would need to be roughly 6x the revenue size. The pricing implication follows directly:

If your AI product replaces $200,000 of annual labour value, pricing at $20,000 — the SaaS-era reflex — destroys margin. BVP’s research indicates AI companies must price 5-6x SaaS equivalents for comparable unit economics.

Consumption-based pricing is transparent but creates buyer anxiety about unpredictable bills. Outcome-based pricing — per successful result, per resolution — aligns AI costs with delivered value. ICONIQ’s data shows this shift is underway: 37% of AI companies plan to change their pricing model in the next 12 months, and outcome-based pricing jumped from 2% to 18% of AI companies in just six months. Intercom’s Fin AI agent is the archetype — per-ticket-resolution pricing grew to 8-figure ARR by tying revenue directly to the value delivered.

Don’t add AI features at zero incremental cost to existing subscriptions. The detailed pricing framework is covered in how to design AI product pricing that survives variable inference costs.

The practical P&L reality for growing software companies

The numbers add up faster than most companies expect. Unlike SaaS COGS, inference costs grow with AI usage and require active management, not passive provisioning.

Here's what the tiered reality looks like across company sizes: roughly $230,000 a year in inference costs on $1M of AI product revenue, $1.15M on $5M, and $2.3M on $10M, scaling in step with usage rather than flattening out.

Your board benchmarks gross margins against SaaS comps — typically 70-90% for software companies. AI-augmented products will show 50-65%. That gap needs proactive framing. AI gross margins at 52-60% are a structural characteristic of the asset class, documented by ICONIQ (January 2026), Bessemer Venture Partners, and TheSaaSCFO. The margin gap is the cost of the competitive moat — your competitors face the same economics.

Three practical starting points:

  1. Know your inference-to-revenue ratio and compare it to the ICONIQ 23% benchmark. Running above it likely signals inefficiency; running below it may mean you're underinvesting in product capability.

  2. Price AI features with inference costs built in, not absorbed from existing SaaS margins. Even modest per-feature pricing lets you track and manage inference costs against attributable revenue.

  3. Set a governance trigger: when inference approaches 30% of AI product revenue, activate a cost review; a minimal ratio check is sketched below. The governance framework for this is detailed in how to build AI cost governance without a dedicated FinOps team.
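A minimal version of that trigger check, with placeholder figures; the 23% benchmark and 30% trigger are the ones discussed above.

```python
# Compare inference spend to AI-attributable revenue and flag the review trigger.
# Input figures are illustrative placeholders.

ICONIQ_BENCHMARK = 0.23
REVIEW_TRIGGER = 0.30

def inference_ratio(monthly_inference_cost: float, monthly_ai_revenue: float) -> float:
    # Measure against AI-attributable revenue only, not total ARR.
    return monthly_inference_cost / monthly_ai_revenue

ratio = inference_ratio(monthly_inference_cost=21_000, monthly_ai_revenue=83_000)
print(f"inference-to-revenue ratio: {ratio:.0%} (benchmark {ICONIQ_BENCHMARK:.0%})")
if ratio >= REVIEW_TRIGGER:
    print("Trigger cost review: inference at or above 30% of AI product revenue")
```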

Understanding the structural difference from SaaS economics is the prerequisite for every financial decision you’ll make as you scale an AI product. For a complete overview of the inference cost crisis and how all of these elements fit together, see the full guide to AI inference economics.

Frequently Asked Questions

Is the 52% AI gross margin figure consistent across all types of AI companies?

No — gross margins vary by position in the AI value chain. Infrastructure-layer companies reselling compute (Groq, Together AI) sit as low as 40-50%. Application-layer companies adding proprietary value above raw compute cost (Perplexity at approximately 60%) sit higher. AI Supernovas — explosively scaling, thin-wrapper products — can be as low as 25% or negative. AI Shooting Stars — Bessemer’s cohort of capital-efficient, strong-PMF startups — average approximately 60%.

The ICONIQ 52% is the industry average for scaling-stage AI B2B companies in January 2026, up from 41% in 2024. Individual margins range from 25-85%+ depending on architecture and infrastructure strategy.

Does the Rule of 40 still apply to AI companies?

Yes, but the targets differ. AI companies with structurally lower gross margins need higher growth rates to hit Rule of 40 because the profit margin component starts from a lower base. Some investors are applying gross-margin-adjusted versions of Rule of 40 for AI companies — the benchmark is evolving, but the underlying principle remains valid.

Can AI gross margins improve over time?

Yes, with caveats. ICONIQ data shows improvement: 41% in 2024, 45% in 2025, 52% in 2026. Optimisation paths — model routing, prompt caching, quantisation, infrastructure migration — are covered in our articles on AI inference cost reduction and building AI cost governance. The structural floor remains: AI gross margins will likely improve toward 60-65% but are unlikely to reach the 80%+ that SaaS companies achieve.

What is “LLMflation” and how does it affect my AI costs?

LLMflation (a16z terminology) describes the approximately 10x annual decline in inference costs for equivalent model performance — what cost $60 per million tokens three years ago now costs around $0.06. The paradox: LLMflation makes individual model calls cheaper, but Jevons Paradox means total AI spend rises as companies deploy AI to more use cases. Tokens consumed quintupled while per-token prices dropped to one-third. The per-unit cost drops; the total bill climbs. LLMflation is a signal to build usage governance, not to assume cost savings will materialise automatically.

What is the “inference tax” and how does it differ from traditional COGS?

The inference tax is the recurring compute cost incurred every time an AI product serves a user — analogous to a raw material cost in manufacturing. Traditional SaaS COGS scales sub-linearly — the 10,000th customer costs little more than the 1,000th. Inference COGS scales directly with usage — the 10,000th AI query costs approximately the same as the 1,000th. The governance implication: treat it as a managed variable cost, not fixed overhead.

How does agentic AI change the inference cost calculation?

Agentic AI workflows multiply inference costs 5-20x per user action. A simple chatbot query: 1 model call. An agentic AI researching, planning, and executing the same resolution: 8-15 model calls. Expect inference costs to increase 5-20x when moving from chatbot to agent architectures before any optimisation — which is why outcome-based pricing makes more sense than seat-based pricing for agentic AI products.

Why do AI startups have lower profit margins than regular software companies?

Because AI products have a perpetual raw material cost that traditional software does not. Traditional SaaS: build once, serve millions at near-zero marginal cost. AI SaaS: every user interaction requires running the model — generating real compute cost. Gross margins don’t improve as easily with scale because inference costs scale with usage. AI companies are partly technology companies and partly compute resellers — purchasing GPU inference and reselling it as intelligent capability.

Is the ICONIQ 23% inference benchmark relevant if I’m adding AI to an existing SaaS product rather than building a pure AI company?

Partially — the benchmark applies to the AI-revenue portion of the business, not total ARR. A $10M ARR SaaS company launching an AI feature tier generating $1M in incremental revenue faces ~$230K in inference costs against that $1M — not $2.3M against the full base. If AI features are priced at zero incremental cost, that inference cost is absorbed silently from existing margins. Always price AI features separately so the costs can be tracked and managed.

What’s the difference between training costs and inference costs for AI?

Training is the one-time process of teaching a model its capabilities — it happens once at model creation and occasionally during fine-tuning cycles. Inference is the ongoing process of generating responses to user queries — running continuously for the lifetime of the product, 24/7 in production. The cost ratio: training represents 10-20% of total compute costs over a model’s lifecycle; inference represents 80-90%, because it runs millions of times while training runs once.

How does the AI gross margin gap affect how I should communicate to my board?

Your board benchmarks gross margins against SaaS comps — typically 70-90%. AI-augmented products will show 50-65%. Three elements of the reframe:

  1. Validate externally: AI gross margins at 52-60% are a structural characteristic documented by ICONIQ (January 2026), Bessemer Venture Partners, and TheSaaSCFO. Bring these sources to the board discussion.

  2. Frame the opportunity: the inference cost burden is what enables the AI capability that differentiates your product. Your competitors face the same economics.

  3. Provide a roadmap: model routing, caching, and quantisation have documented paths toward 60-65% gross margins over 12-24 months — this is not a permanent ceiling.