In February 2025, Andrej Karpathy — OpenAI co-founder, former Tesla AI head, and one of the most credible engineers in the field — posted two sentences that launched a thousand think-pieces: he described a new way of building software he called “vibe coding,” where you “fully give in to the vibes” and let an AI write everything while you “forget that the code even exists.”
By April 2026, the Pentagon had built 103,000 AI agents on its GenAI.mil platform in under five weeks. Lovable had reached 8 million users and $200 million ARR in twelve months. A Q1 2026 assessment of over 200 vibe-coded applications found that 91.5% of them contained at least one security vulnerability traceable to AI hallucination. VECT, a ransomware strain that Check Point Research believes may have been partly AI-generated, contained a logic error that made its own decryption impossible. And security researchers had identified 443 malicious ZIP archives specifically targeting the configuration files of AI coding tools.
Collins Dictionary named “vibe coding” its Word of the Year for 2025. The evidence suggests the security community is still catching up.
This article is the navigation hub for our complete coverage of vibe coding’s security reality. Each section below provides the key context and directs you to the detailed analysis in the cluster articles where the evidence, methodology, and prescriptive frameworks live.
What’s in this cluster:
- Pentagon’s 20,000 AI Agents Per Week and What Institutional Vibe Coding Actually Looks Like — The DoD’s 103,000-agent GenAI.mil deployment as the definitive institutional scale case study; the governance gap between CISA and the Pentagon running simultaneously.
- Lovable’s 48-Day Exposure and the 6.6 Billion Dollar BOLA Vulnerability — The incident timeline, BOLA anatomy, $6.6B at risk, and credential rotation requirements for teams that built on Lovable before November 2025.
- 91.5 Percent of Vibe-Coded Apps Have Vulnerabilities and What the Q1 2026 Research Actually Shows — Four independent studies, their methodologies, and what convergence across Veracode, CodeRabbit, Georgia Tech, and the Q1 2026 assessment actually means.
- VECT Ransomware Was Partly Vibe-Coded and It Accidentally Destroyed Every File Over 128KB — The 128KB file destruction logic error, the structural signature of AI-generated code in malware, and what VECT 2.0 signals about the adversarial trajectory.
- The 2.74x Vulnerability Multiplier and What AI Code Density Means for Security Review — The CodeRabbit 470-PR study methodology, what 2.74x means in review hours and SAST configuration, and how to adjust sprint security capacity.
- 443 Malicious ZIP Files and How Attackers Are Targeting the AI Coding Toolchain — CBSE attack mechanics, the Bitwarden CLI “Butlerian Jihad” incident, HuggingFace and ClawHub malware, and a toolchain security inventory checklist.
- Vibe Coding Governance and Who Owns the Risk When the Code Writes Itself — The complete governance framework: accountability chain, 30/60/90-day implementation plan, platform vendor evaluation criteria, shadow AI inventory, and agentic identity governance.
What is vibe coding and how is it different from traditional software development?
Vibe coding, coined by Andrej Karpathy in February 2025, is the practice of describing desired software functionality in natural language and accepting AI-generated code without closely reviewing its internal structure. Unlike traditional development, where the programmer writes and owns every line, vibe coding externalises code production to a large language model (LLM). The developer’s role shifts from author to director — defining intent, accepting output, and deploying. Collins Dictionary named it Word of the Year in 2025.
The clearest definition comes from Karpathy himself: “It’s not really coding. I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.” He described the practice as “fully giving in to the vibes, embracing exponentials, and forgetting that the code even exists.”
What makes this definitionally distinct from using GitHub Copilot or Cursor is the question of review and ownership. Simon Willison put it precisely: “If an LLM wrote every line of your code, but you’ve reviewed, tested, and understood it all, that’s not vibe coding in my book — that’s using an LLM as a typing assistant.” The distinguishing characteristic is not who generates the code. It is whether the person deploying it understands what it does.
Google Cloud frames this as two modes: pure vibe coding (fully trust AI output; best for throwaway projects) versus responsible AI-assisted development (AI as pair programmer; user reviews, tests, and owns output). The distinction matters for risk. A developer who reviews, tests, and understands every generated line is doing something categorically different from someone who describes a feature and ships whatever the AI produces.
Waseem et al., in a 2025 arXiv paper that remains one of the most rigorous academic treatments of the topic, locate vibe coding precisely: it “moves control from isolated line-level assistance to conversational generation, automatic scaffolding, and rapid restructuring of end-to-end systems.” This is not an incremental change to how developers work. It is a different workflow with a different risk profile.
Karpathy himself updated his framing in February 2026, moving toward “agentic engineering” as the mature successor — the practice of orchestrating AI agents against detailed specifications with human oversight. The point is the same: vibe coding in its purest form externalises architectural decision-making to the AI, while the responsible path keeps architecture and constraints in human hands and delegates implementation.
The entire cluster in this hub is built on that distinction. The evidence in every subsequent section relates specifically to code produced without adequate human review — which is what vibe coding, as originally defined and as widely practised, means.
Why is vibe coding adoption growing so fast in 2025–2026?
Three forces converged in late 2025: step-change improvements in LLM capability, dramatic reductions in deployment friction via platforms like Lovable and Replit, and public validation from figures including Linus Torvalds, DHH, and the Pentagon. Dario Amodei’s March 2025 prediction that AI would write 90% of code within 3–6 months — widely dismissed at the time — was validated by December 2025. The adoption figures are institutional, not anecdotal.
The capability inflection point is well documented. The releases of Gemini 3 (17 November 2025), Opus 4.5 (24 November 2025), and GPT-5.2 (11 December 2025) represented what Simon Willison described as “one of those moments when the models get incrementally better in a way that tips over an invisible capability line.” Boris Cherny, the creator of Claude Code, reported that in the month after these releases, he didn’t open an IDE at all — Opus 4.5 wrote around 200 pull requests, every single line.
This shift in what the models could do unlocked adoption at every level of the industry. DHH, the creator of Ruby on Rails and a long-time sceptic of AI-generated code, reversed his position when the models improved. Linus Torvalds vibe-coded a component of his AudioNoise project using Google Antigravity in January 2026. Malte Ubl, CTO at Vercel, described Opus and Claude Code as behaving “like a senior software engineer whom you can just tell what to do, and it’ll do it.”
The scale numbers reflect this:
- 87% of Fortune 500 companies have deployed at least one vibe coding platform
- 25% of companies in Y Combinator’s Winter 2025 batch had codebases that were 95% or more AI-generated
- Enterprise adoption grew 340% year on year; non-technical user adoption grew 520%
- Gartner projects 60% of new code will be AI-generated by year-end 2026
The pace of adoption is the issue: it is outpacing the security infrastructure built to support it. The governance gap, between how fast vibe coding has moved and how slowly policy has followed, is the thread running through all seven cluster articles in this hub.
For the institutional deployment story, start with Pentagon’s 20,000 AI Agents Per Week and What Institutional Vibe Coding Actually Looks Like, which covers the largest documented single-organisation vibe coding deployment and what it reveals about the velocity of institutional adoption.
What are the key statistics that show how widespread vibe coding has become?
If the previous section frames the forces driving adoption, the specific deployment numbers make the scale concrete. The Pentagon built 103,000+ AI agents on GenAI.mil in under five weeks, running 180,000 agent sessions per week. Lovable reached 8 million users and $200 million ARR in 12 months — the fastest software startup growth by several measures. Y Combinator’s Winter 2025 batch had 25% of companies with 95%+ AI-generated codebases. These figures signal institutional normalisation, not experimentation.
The Pentagon case deserves specific attention. The DoD’s GenAI.mil platform uses Google Gemini’s Agent Designer to allow non-technical personnel to build AI agents without writing any code. Robert Malpass, Pentagon Deputy CDAO, described it as allowing “anybody across the Department” to “start to build out and work with advanced AI in their own context.” These agents are authorised at Impact Level 5 — the most sensitive unclassified tier — handling defence data without the code review process that would apply to traditionally developed systems.
At 1.1 million agent sessions and counting, with a creation rate of roughly 20,000 new agents per week, the Pentagon is not running a pilot programme. It has built an organisational infrastructure on vibe coding at a scale no enterprise deployment has matched.
The financial context is equally instructive. Lovable raised $330 million at a $6.6 billion valuation in December 2025. Its first four weeks of operation produced $4 million in ARR. Two months in, with a team of fifteen people, it reached $10 million ARR. Cursor went from $1 million to $500 million in revenue in under two years. App Store submissions surged 84% driven by vibe coding tools. These are not incremental numbers.
If vibe coding is already in your team without a policy, you are almost certainly affected by the governance gap this entire cluster addresses. The statistics above show why that gap exists at every level of the industry, from individual developer tools to the most sensitive unclassified networks in the US Department of Defense. The Pentagon’s 20,000 AI Agents Per Week article covers the full GenAI.mil deployment story, including what institutional vibe coding looks like at DoD scale and what it reveals about the gap between deployment velocity and governance infrastructure.
What security vulnerabilities does AI-generated code typically introduce?
The most consistent findings across independent studies are broken authorisation (BOLA — Broken Object Level Authorization), missing CSRF protection, absent security headers, server-side request forgery (SSRF), and business logic flaws. In the Tenzai benchmark across five coding agents and 15 applications, all five agents failed to implement CSRF protection and security headers; all introduced SSRF. AI models generate authentication without authorisation — they confirm identity but not what each identity is allowed to do.
The Tenzai research team’s conclusion is worth quoting in full: “Based on our results, consistent with findings from the broader security research community, as of today, it doesn’t really matter which agent you use — vulnerabilities are almost certainly going to be introduced by them.” They tested Cursor, Claude Code, OpenAI Codex, Replit, and Devin. The agents built what was explicitly asked for, but “completely failed to grasp the bigger picture. They lack the security mindset to proactively introduce defensive mechanisms that weren’t explicitly requested.”
The pattern is structural, not incidental. AI coding agents perform well on “solved” vulnerability classes — SQL injection, cross-site scripting — where clear defences exist in training data. They fail consistently on “unsolved” classes where context determines what is safe: authorisation logic, SSRF, business rule enforcement. Zero agents across all Tenzai test runs used CSP, X-Frame-Options, HSTS, X-Content-Type-Options, or proper CORS configuration.
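The header gap described above is easy to test for mechanically. The following is a minimal sketch, assuming you already have a response's headers as a mapping; the function name and the choice to check only these four headers are illustrative, and CORS configuration is not covered because it cannot be judged from header presence alone.

```python
# Header names for the protections named above: CSP, X-Frame-Options,
# HSTS, and X-Content-Type-Options.
REQUIRED_HEADERS = (
    "Content-Security-Policy",
    "X-Frame-Options",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
)


def missing_security_headers(response_headers):
    """Return the required headers absent from a response, in a stable order.

    HTTP header names are case-insensitive, so compare in lower case.
    """
    present = {name.lower() for name in response_headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]


if __name__ == "__main__":
    # A response that sets none of the defensive headers.
    print(missing_security_headers({"Content-Type": "text/html"}))
```

A check like this only proves a header exists, not that its value is sensible, so it is a floor for review rather than a substitute for it.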
The quantified picture from Q1 2026:
- 91.5% of vibe-coded applications contained at least one AI hallucination-related flaw (Q1 2026 assessment, 200+ applications)
- 60% or more exposed API keys or database credentials in public repositories
- AI-co-authored code introduces security vulnerabilities at 2.74x the rate of human-written code (CodeRabbit, 470 GitHub pull requests)
- 40–62% of AI-generated code contains security vulnerabilities depending on the study and measurement methodology
- 35 CVEs directly attributable to AI coding tools were disclosed in March 2026 alone, up from 6 in January — Georgia Tech estimates the actual figure is 5–10x higher than detected
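To make the 2.74x figure concrete, here is a rough arithmetic sketch. It assumes the multiplier applies uniformly per unit of code and scales linearly with the share of AI-co-authored changes; that linearity is an assumption made for illustration, not something the CodeRabbit study is reported to have established.

```python
AI_MULTIPLIER = 2.74  # CodeRabbit figure quoted in the list above


def relative_vulnerability_load(ai_share, multiplier=AI_MULTIPLIER):
    """Expected vulnerability load relative to an all-human codebase.

    ai_share is the fraction of changes that are AI-co-authored (0 to 1).
    A result of 1.0 means "same as the human-only baseline".
    """
    if not 0.0 <= ai_share <= 1.0:
        raise ValueError("ai_share must be between 0 and 1")
    return (1.0 - ai_share) + ai_share * multiplier


if __name__ == "__main__":
    for share in (0.0, 0.25, 0.5, 1.0):
        print(share, round(relative_vulnerability_load(share), 3))
```

Under these assumptions, a team whose changes are half AI-co-authored would plan for a little under twice the baseline number of findings.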
Trend Micro’s framing captures the practical consequence: “The real risk of vibe coding isn’t AI writing insecure code. It’s humans shipping code they never had a chance to secure.”
For the full research picture — methodology, study convergence, and what the numbers mean for assessing your codebase — read 91.5 Percent of Vibe-Coded Apps Have Vulnerabilities and What the Q1 2026 Research Actually Shows. For the operational translation of the 2.74x finding into code review hours, SAST configuration, and sprint capacity, see The 2.74x Vulnerability Multiplier and What AI Code Density Means for Security Review.
What happened at Lovable and what does it reveal about the vibe coding security problem?
Between approximately November 2025 and January 2026, Lovable’s API contained a Broken Object Level Authorization (BOLA) vulnerability that allowed any free-tier account to access another user’s profile, source code, AI chat histories, and database credentials in five API calls, with no offensive hacking required. The vulnerability was reported through HackerOne on 3 March 2026 and remained open for 48 days. The company’s initial response described it as “intentional behaviour.” Projected customer business value of $6.6 billion was at risk. This is the breach that put vibe coding security on the CTO radar.
BOLA has held the #1 position in the OWASP API Security Top 10 since 2019. The mechanics are straightforward: an attacker signs up for a free account, finds an endpoint that serves a project by ID, and swaps the ID for someone else’s. The backend confirms who the requester is, but doesn’t verify whether they are allowed to access that specific resource. This is the authorisation gap that Tenzai’s benchmark identified as universal across all five AI coding agents tested, and it is the gap that made Lovable’s architecture structurally vulnerable.
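The difference between authentication and object-level authorisation fits in a few lines. This is an illustrative sketch only: the in-memory `PROJECTS` store, the `Project` type, and both handler names are hypothetical, and nothing here reflects Lovable's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    project_id: str
    owner_id: str
    source: str


# Hypothetical in-memory store standing in for a database.
PROJECTS = {
    "p1": Project("p1", "alice", "print('alice')"),
    "p2": Project("p2", "bob", "print('bob')"),
}


class Forbidden(Exception):
    """Raised when a request must be refused."""


def get_project_vulnerable(requester_id, project_id):
    # Authentication only: any signed-in user can read any project by ID.
    if requester_id is None:
        raise Forbidden("not signed in")
    return PROJECTS[project_id]


def get_project_checked(requester_id, project_id):
    # Authentication plus an object-level authorisation check.
    if requester_id is None:
        raise Forbidden("not signed in")
    project = PROJECTS.get(project_id)
    # Same error for "missing" and "not yours" so IDs cannot be enumerated.
    if project is None or project.owner_id != requester_id:
        raise Forbidden("no access to this project")
    return project
```

With the first handler, `get_project_vulnerable("alice", "p2")` returns Bob's project; with the second, the same request raises `Forbidden`. A functional test written by the project owner passes against both, which is why this class of flaw survives normal testing.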
The company’s response made a bad situation worse. Lovable first posted that it “did not suffer a data breach,” then called the exposed data “intentional behaviour,” then blamed its own documentation, then blamed HackerOne, and finally issued a partial apology. A researcher named Taimur Khan had separately found 16 vulnerabilities, 6 of them critical, in a single Lovable-featured application with over 100,000 views, including inverted authentication logic that granted anonymous users full access while blocking authenticated ones. His report through Lovable’s support channel was closed without a response.
The structural pattern extends beyond Lovable itself. Approximately 70% of Lovable-generated applications ship with Row-Level Security disabled in their Supabase database configurations. Bolt.new has the same default. This is not a one-off configuration error; it is a structural consequence of vibe coding workflows that prioritise functional output over defensive defaults.
The Moltbook breach confirmed the pattern: a vibe-coded social network breached within three days of launch, with 1.5 million API authentication tokens and 35,000 email addresses exposed through misconfigured Supabase with no row-level security.
When evaluating vibe coding platform vendors, the response cycle is as significant a governance signal as the vulnerability itself — a vendor that deflects, blames the disclosure process, or takes 48 days to address a critical authorisation flaw is a vendor that has not built security into its product culture.
For the complete incident timeline, BOLA anatomy, credential rotation steps for teams affected, and a vendor response evaluation framework, read Lovable’s 48-Day Exposure and the 6.6 Billion Dollar BOLA Vulnerability.
Is AI-generated code safe to put into production?
AI-generated code is not safe to put into production without a structured security review process. The evidence is consistent across independent studies: 91.5% of vibe-coded applications carry at least one vulnerability, AI-co-authored code introduces security flaws at 2.74x the rate of human-written code, and the SecureVibeBench benchmark found the best-performing coding agent achieves only 23.8% correct-and-secure outputs. “Safe” depends entirely on what review process you apply before and after deployment.
The more useful question is not whether AI-generated code is safe, but what risk profile your application carries and what review process matches it. AI-generated code for an internal tool that automates a spreadsheet report, with no external access and no sensitive data, is a different proposition from AI-generated code handling user authentication, payment processing, or personal health records.
The SecureVibeBench result gives the clearest answer to what “unreviewed” actually means in practice: across 41 open-source security benchmarks, the best available coding agent produced correct-and-secure outputs only 23.8% of the time. For the remaining 76.2%, the code was either functionally incorrect, insecure, or both. The Escape security platform’s scan of vibe-coded applications in production found 2,000+ high-impact vulnerabilities and 400+ exposed secrets. These are not hypothetical risks — they exist in live systems.
What review is required depends on the risk classification of the code. The green/yellow/red zone framework covered in the governance section provides the operational decision tool for matching oversight level to risk. Functional testing alone is insufficient at any tier — it does not catch BOLA, SSRF, or missing security headers, none of which produce failures in normal test scenarios.
There is also the problem Lawfare identified as “vibe compliance”: vibe coders can prompt AI to generate compliance documentation — risk assessments, security policies, audit responses — that appears credible but bears no relationship to actual security measures. If your compliance artefacts were generated the same way your code was, your compliance posture may be as fragile as your codebase.
The gap shows up in developer surveys: 84% of developers report using AI tools, but only 46% fully trust AI-generated code. The question is not whether to trust the tools. It is whether to ship before the review is done.
For the statistical backbone of the vibe coding security case, see 91.5 Percent of Vibe-Coded Apps Have Vulnerabilities and What the Q1 2026 Research Actually Shows. For the operational translation into a security review process, see The 2.74x Vulnerability Multiplier and What AI Code Density Means for Security Review.
What is the “flow-debt tradeoff” in vibe coding?
The flow-debt tradeoff, from Waseem et al.’s 2025 arXiv paper, describes how vibe coding creates productive early momentum (developers ship features fast, with high confidence) followed by compounding structural problems as systems scale toward production. The “flow” phase is real: velocity is genuinely high. The “debt” phase is also real: architectural inconsistency, testing gaps, deployment fragility, and missing documentation accumulate faster than they can be addressed by normal maintenance processes.
Waseem et al. identify six debt dimensions:
- Requirements ambiguity: prompts don’t specify architecture, so the AI makes choices you don’t know about.
- Architectural inconsistency: each session starts without full context of previous decisions, producing what they call “a patchwork structure rather than a coherent design.”
- Security vulnerabilities: the subject of most of this cluster.
- Testing gaps: AI-generated tests cover only happy-path cases and omit edge conditions and failure states.
- Deployment fragility: dependencies and configuration assumptions embedded in generated code surface in production.
- Maintainability challenges: code that no human fully understands is code that no human can confidently modify.
The GitClear longitudinal study across 211 million lines of code changes from 2020–2024 shows this trajectory at scale: code refactoring dropped from 25% to under 10% of changed lines; code duplication quadrupled; code churn nearly doubled. These are the signals of a codebase accumulating structural debt faster than it is being addressed.
The METR randomised controlled trial (July 2025) added a useful counterpoint: experienced open-source developers using AI tools were 19% slower, despite predicting they would be 24% faster and believing afterward they had been 20% faster. The productivity case for vibe coding is real in the right contexts. In the wrong contexts (complex, high-stakes production systems), the flow phase gives way to the debt phase faster than the velocity gains can compensate.
The practical implication: the flow-debt tradeoff is most acute at the 6–18 month mark, when early vibe-coded MVPs begin scaling toward production systems and the accumulated debt becomes visible as incident frequency and maintenance burden. Building governance into the workflow from the start — before the debt phase arrives — is the intervention that changes the outcome. The question of who owns the risk when the code writes itself is not rhetorical; it is the governance starting point.
What does the VECT ransomware case tell us about vibe coding and adversarial AI?
VECT ransomware, analysed by Check Point Research on 29 April 2026, bears the structural signature of AI-generated code: a critical logic error that made its own encryption irreversible, string obfuscation routines that cancelled each other out, and a misidentified cipher implementation. These are the same vulnerability classes found in legitimate vibe-coded applications. If defenders are inadvertently deploying vulnerable code because AI generation is unreviewed, threat actors are doing the same — with the same structural consequences.
The 128KB destruction bug is the technically memorable detail. VECT split files over 128KB into four chunks and encrypted each with its own nonce, but every nonce was written to the same shared output buffer, so each new nonce overwrote the previous one. Only the last nonce was preserved. Even if a victim paid the ransom and received the decryption key, every file chunk except the last was irrecoverably destroyed. This is not a deliberate wiper; it is a logic error, plausibly AI-generated, that accidentally produced one.
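The bookkeeping error is easier to see in a toy model. The sketch below is not VECT's code and does not use its cipher; it substitutes a deliberately simple SHA-256 keystream (illustration only, not secure) to show why overwriting a single nonce slot leaves only the final chunk decryptable even when the key is known. All function names are hypothetical.

```python
import hashlib
import os


def keystream_xor(key, nonce, data):
    """Toy stream cipher for illustration only: XOR data with SHA-256 output."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def encrypt_chunks(key, chunks, shared_nonce_slot):
    """Encrypt each chunk under a fresh nonce and return what gets stored.

    shared_nonce_slot=True mimics the reported bug: one slot, overwritten
    on every iteration, so only the final nonce survives.
    """
    ciphertexts, stored_nonces = [], []
    for chunk in chunks:
        nonce = os.urandom(12)
        ciphertexts.append(keystream_xor(key, nonce, chunk))
        if shared_nonce_slot:
            stored_nonces = [nonce]  # overwrites the previous nonce
        else:
            stored_nonces.append(nonce)  # keeps one nonce per chunk
    return ciphertexts, stored_nonces


def recoverable(key, chunks, ciphertexts, stored_nonces):
    """For each chunk, report whether any stored nonce decrypts it correctly."""
    return [
        any(keystream_xor(key, n, ct) == pt for n in stored_nonces)
        for pt, ct in zip(chunks, ciphertexts)
    ]
```

With four chunks and the shared slot enabled, `recoverable` reports that only the last chunk can be restored, matching the behaviour described above: holding the key is not enough once the per-chunk nonces are gone.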
Check Point Research theorised that the group behind VECT “either used AI tools to generate some of its code or relied on an older code base as the starting point.” The additional code flaws support the AI-generation hypothesis: CPU thread mismanagement, three encryption modes parsed into code but never implemented, and Ukraine listed as a CIS member — an error not seen in other modern ransomware.
The more significant signal is VECT 2.0. An updated version appeared and corrected some of the original bugs. Threat actors are iterating on their AI-generated code. The same loop — vibe code → identify flaws → fix and re-release — applies to ransomware development as readily as it does to legitimate applications. VECT 1.0 was buggy; VECT 2.0 is less buggy. VECT 3.0 will be less buggy still.
VECT operates as a ransomware-as-a-service (RaaS) operation with an affiliate network on BreachForums, partnered with threat actor TeamPCP. The coordinated campaign dimension — TeamPCP is linked to both the VECT ransomware development and the AI toolchain supply chain attacks documented in the 443 malicious ZIPs finding — suggests this is not isolated experimentation. It is structured, iterated threat actor activity: when threat actors started using the same tools the development community adopted, the arms race became measurable.
The governance implication is that the vibe coding security question is not one-sided. “Is our code secure?” is necessary but insufficient. “Are adversaries now accelerating malware development using the same tools, and are they closing the gap faster than we can defend?” is the complete question.
For the technical analysis of the 128KB bug and the arms-race framing, read VECT Ransomware Was Partly Vibe-Coded and It Accidentally Destroyed Every File Over 128KB. For the TeamPCP toolchain attack context, see 443 Malicious ZIP Files and How Attackers Are Targeting the AI Coding Toolchain.
How are attackers targeting the AI coding toolchain itself?
Cymulate Research Labs identified 443 malicious ZIP archives and a vulnerability class called Configuration-Based Sandbox Escape (CBSE) targeting AI coding tools as of early 2026. The attack writes malicious instructions to configuration files — .claude/settings.json, .mcp.json — causing the payload to execute on the host operating system when the AI tool restarts. As Cymulate put it: “every new Claude Code session triggers the hook and executes the attacker’s command silently, with no user notification.” Claude Code’s CBSE vulnerability was patched in v2.1.2 (CVE-2026-25725, CVSS 7.7 High); Gemini CLI’s equivalent remained unpatched for 90+ days.
Codex CLI from OpenAI was reported as affected; the issue was closed as “informational, not fixed,” with OpenAI citing prompt injection as out of scope. Cymulate’s conclusion: “the sandbox is treated as the security boundary, while the real boundary — the host-side configuration and execution logic — remains writable from inside the sandbox.”
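One inexpensive precaution is to look inside an archive before extracting it into a workspace an AI tool will open. The sketch below assumes the risky payload arrives as an archive member at one of the two config paths named above; that delivery detail is an assumption for illustration, and the function name and watch list are hypothetical rather than part of any tool's API.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Config paths named in the text above; extend for other tools as needed.
WATCHED_CONFIG_PATHS = (".claude/settings.json", ".mcp.json")


def flag_config_members(archive):
    """Return archive members that would write a watched AI-tool config file.

    `archive` may be a filesystem path or a binary file-like object.
    """
    flagged = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            parts = PurePosixPath(name).parts
            for watched in WATCHED_CONFIG_PATHS:
                tail = PurePosixPath(watched).parts
                # Match the watched path at any depth inside the archive.
                if parts[-len(tail):] == tail:
                    flagged.append(name)
                    break
    return flagged


if __name__ == "__main__":
    # Build a small in-memory archive to demonstrate the check.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("proj/README.md", "hello")
        zf.writestr("proj/.mcp.json", "{}")
    buf.seek(0)
    print(flag_config_members(buf))
```

A flagged member is not proof of malice, since legitimate projects ship these files too. The point is that a file which can trigger host-side execution deserves a human look before the tool next starts.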
The Bitwarden CLI supply chain attack illustrates a different vector. On 22 April 2026, version 2026.4.0 of the Bitwarden CLI was compromised for approximately 1.5 hours. The malicious module was named “Butlerian Jihad” — a deliberate Dune reference that signals the attackers knew their audience. It explicitly targeted authenticated AI coding assistants by name: Claude Code, Gemini CLI, Codex CLI, and others. Any developer who ran an npm install during that 1.5-hour window may have been exposed, along with any CI/CD pipeline that ran during that period.
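A first triage step after an incident like this is to check which repositories pin the affected release. The sketch below reads an npm `package-lock.json` (lockfileVersion 2 or 3, where installs are listed under `packages`). The package name is left as a parameter because the exact registry name should come from the vendor's advisory; `example-cli` in the demo is a placeholder, not a real identifier.

```python
import json


def find_locked_version(lockfile_text, package, bad_versions):
    """Return (path, version) for lockfile entries pinning a bad version."""
    lock = json.loads(lockfile_text)
    suffix = f"node_modules/{package}"
    hits = []
    # lockfileVersion 2 and 3 list installs under "packages", keyed by path.
    for path, meta in lock.get("packages", {}).items():
        is_match = path == suffix or path.endswith("/" + suffix)
        if is_match and meta.get("version") in bad_versions:
            hits.append((path, meta["version"]))
    return hits


if __name__ == "__main__":
    # Placeholder lockfile content; "example-cli" is a hypothetical package.
    demo = json.dumps({
        "lockfileVersion": 3,
        "packages": {
            "": {"name": "app"},
            "node_modules/example-cli": {"version": "2026.4.0"},
        },
    })
    print(find_locked_version(demo, "example-cli", {"2026.4.0"}))
```

A lockfile shows only what is pinned now. It does not show what a developer machine or CI runner actually installed during the exposure window, so install logs and build caches still need checking.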
Pillar Security researchers demonstrated a “rules file backdoor” attack: hidden malicious instructions injected into configuration files used by Cursor and GitHub Copilot. In March 2026, the “Agent Commander” attack showed that prompt injection into AI coding agents could convert autonomous coding tools into remotely controlled malware delivery platforms.
The framing for your team: the attack surface now extends beyond the code itself to the tools your developers use to build it, and those tools are a primary target. An inventory of installed AI tools, their versions, and their permission scopes is a governance requirement.
For the full attack surface mapping, CBSE mechanics, the Bitwarden CLI incident detail, and a practical toolchain security inventory, read 443 Malicious ZIP Files and How Attackers Are Targeting the AI Coding Toolchain.
How should you approach vibe coding governance for your engineering team?
The foundational governance principle is clear: the developer who accepted AI-generated code is the accountable party — not the AI tool, not the platform vendor. From that starting point, governance requires three things: a risk zone classification system (which code can AI produce without extra review, which requires heightened review, which is human-first regardless), a security review pipeline before production deployment, and a shadow AI inventory of tools already in use.
Shadow AI is the realistic starting point for most teams. The numbers are instructive: 80% of Fortune 500 companies have deployed AI agents; only 10% have a management strategy. Teams are already using Lovable, Bolt.new, Cursor, and other tools with or without a policy. If you build a governance policy without first understanding what your team is actually doing, you build a policy they will work around.
The governance gap itself — the distance between vibe coding’s deployment speed and the security infrastructure surrounding it — was made visible at the highest possible institutional level when the Pentagon deployed 103,000+ agents while CISA was simultaneously publishing careful adoption guidance. As one framing put it, “adoption is driven by productivity incentives and competitive pressure; governance policy follows incidents.” The job of governance is to close that gap before an incident forces the conversation.
Three credible anchoring frameworks provide the institutional authority for building a governance programme that will survive board scrutiny:
- CISA Careful Adoption Guidance: US federal framework; establishes the authoritative baseline for what “careful adoption” means in practice.
- Five Eyes Joint Guidance on Agentic AI: Intelligence alliance guidance; frames AI agent governance as a national security concern, not only a compliance matter.
- CIS Controls v8.1 AI Companion: The most operationally specific guidance available; covers MCP integrations explicitly; the framework that gets you from principle to control.
The green/yellow/red zone framework makes governance practical:
Green (AI with standard review): scaffolding, boilerplate, unit test generation, internal tooling with no sensitive data, UI components with no authentication logic.
Yellow (AI with enhanced review and mandatory security scanning): authentication changes, payment processing flows, encryption and key management, API integrations with external systems.
Red (human-first, AI as assistant only): security incident response, compliance flows (KYC/AML), cryptography, anything handling regulated data, anything you cannot fully test or reason about.
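The three zones can be encoded as a simple lookup that a pull request bot or review checklist consults. This is a minimal sketch: the tag names are hypothetical labels for the categories listed above, and the rules that the strictest tag wins and that unknown tags default to red are illustrative design choices, not part of the framework as described.

```python
from enum import Enum


class Zone(Enum):
    GREEN = "AI with standard review"
    YELLOW = "AI with enhanced review and mandatory security scanning"
    RED = "human-first, AI as assistant only"


# Hypothetical change tags mapped to the zones described above.
ZONE_BY_TAG = {
    "scaffolding": Zone.GREEN,
    "boilerplate": Zone.GREEN,
    "unit-tests": Zone.GREEN,
    "internal-tooling": Zone.GREEN,
    "ui-component": Zone.GREEN,
    "authentication": Zone.YELLOW,
    "payments": Zone.YELLOW,
    "key-management": Zone.YELLOW,
    "external-api": Zone.YELLOW,
    "incident-response": Zone.RED,
    "kyc-aml": Zone.RED,
    "cryptography": Zone.RED,
    "regulated-data": Zone.RED,
}

_SEVERITY = {Zone.GREEN: 0, Zone.YELLOW: 1, Zone.RED: 2}


def classify(tags):
    """Return the strictest zone among a change's tags.

    Unknown or missing tags fall back to RED, on the reasoning that code
    you cannot categorise is code you cannot yet reason about.
    """
    zones = [ZONE_BY_TAG.get(tag, Zone.RED) for tag in tags]
    if not zones:
        return Zone.RED
    return max(zones, key=_SEVERITY.__getitem__)
```

For example, a change tagged `ui-component` and `payments` classifies as yellow, because the strictest tag wins. Defaulting to red keeps the safe path the easy one when a change has not been tagged at all.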
The regulatory timeline matters here. EU AI Act full obligations take effect 2 August 2026. GDPR applies to data in prompts today — if your developers are pasting customer data into AI coding tools to provide context, that data handling needs to be in scope for your data protection policy. For FinTech and HealthTech organisations, KYC/AML flows are already red-zone by any governance framework’s logic.
The “vibe compliance” problem identified by Lawfare applies to governance artefacts too: if your compliance posture was produced by the same workflow as your code, it warrants the same scrutiny. Sizing the security review workload requires understanding the density finding and its operational implications; the 2.74x multiplier is the starting point for sprint capacity planning.
The IT Revolution’s observation is worth holding: “Organisations that already operate at DORA elite level are better poised to take advantage of AI. Organisations that still struggle with tension between speed and quality will find they cannot get promised value from AI.” Governance is not the brake on vibe coding’s productivity gains. It is the condition for making those gains sustainable.
For the complete governance framework — the full accountability chain, a 30/60/90-day implementation plan, platform vendor evaluation criteria, shadow AI inventory process, and agentic identity governance — read Vibe Coding Governance and Who Owns the Risk When the Code Writes Itself.
Vibe Coding Resource Library
Foundational Understanding
What is vibe coding and why does it matter now? Vibe Coding’s Reality Check: Security, Scale, and What Happens When AI Writes the Code. This page. Start here for definitions, scale evidence, and navigation to all depth articles.
Institutional scale deployment: Pentagon’s 20,000 AI Agents Per Week and What Institutional Vibe Coding Actually Looks Like. The DoD’s 103,000-agent GenAI.mil deployment as the scale-setting case study; the governance gap between CISA and the Pentagon running simultaneously. Estimated read: 10–12 minutes.
The statistical backbone: 91.5 Percent of Vibe-Coded Apps Have Vulnerabilities and What the Q1 2026 Research Actually Shows. Methodology, convergence across four independent studies, SecureVibeBench’s 23.8% result. Estimated read: 12–14 minutes.
Incidents and Evidence
The platform breach case study: Lovable’s 48-Day Exposure and the 6.6 Billion Dollar BOLA Vulnerability. Incident timeline, BOLA anatomy, $6.6B at risk, and credential rotation requirements for affected teams. Estimated read: 11–13 minutes.
The operational risk multiplier: The 2.74x Vulnerability Multiplier and What AI Code Density Means for Security Review. How the density finding changes code review hours, SAST configuration, and sprint security capacity. Estimated read: 12–14 minutes.
Adversarial Dimension
AI-generated malware: VECT Ransomware Was Partly Vibe-Coded and It Accidentally Destroyed Every File Over 128KB. The 128KB file destruction logic error, arms-race trajectory, and what VECT 2.0 signals for defenders. Estimated read: 10–11 minutes.
Toolchain attacks: 443 Malicious ZIP Files and How Attackers Are Targeting the AI Coding Toolchain. CBSE attack class, Bitwarden CLI incident, HuggingFace and ClawHub malware, and the toolchain security inventory. Estimated read: 12–14 minutes.
Governance and Action
The governance framework: Vibe Coding Governance and Who Owns the Risk When the Code Writes Itself. Accountability chain, 30/60/90-day implementation plan, platform vendor evaluation, shadow AI inventory, and agentic identity governance. Estimated read: 14–16 minutes.
Frequently Asked Questions
What exactly is vibe coding and should my team be using it?
Vibe coding is the practice of using natural language prompts to generate software without reviewing the resulting code. Whether your team should use it depends on what you are building, the risk profile of the application, and what review process you apply. For internal tooling with no sensitive data, the productivity gains are real. For customer-facing applications handling authentication, payments, or personal data, unreviewed vibe-coded output presents material security risk. The answer is not a blanket yes or no — it is a governance framework that matches oversight to risk. See Vibe Coding Governance and Who Owns the Risk When the Code Writes Itself for the framework.
What is spec-driven development and how does it differ from vibe coding?
Spec-driven development (SDD) is the practice of defining a structured specification — mission, tech stack, roadmap, architectural constraints — before engaging AI agents. Andrej Karpathy positioned this as “agentic engineering” in early 2026, the mature successor to vibe coding. The difference: vibe coding externalises both the problem definition and the solution to the AI; spec-driven development keeps architecture, context, and constraints in human hands while delegating implementation. SDD addresses vibe coding’s core weaknesses — context decay across sessions, architectural inconsistency, and the absence of testable acceptance criteria.
What is the difference between using Cursor or GitHub Copilot and full vibe coding?
Tools like Cursor and GitHub Copilot assist the developer, who still writes, reviews, and owns the output. Full vibe coding platforms like Lovable generate complete applications from natural language without developer review of the code structure. The risk profiles differ: the 2.74x vulnerability multiplier from the CodeRabbit study applies to AI-co-authored code at the Cursor/Copilot tier. Full vibe coding platforms introduce additional risk from the platform’s default configurations (row-level security, or RLS, off by default in Lovable and Bolt.new), the absence of developer review, and the generation of complete application architectures rather than code fragments. See The 2.74x Vulnerability Multiplier for the risk spectrum analysis.
Who coined the term “vibe coding” and when did it emerge?
Andrej Karpathy coined “vibe coding” in February 2025, describing it as “fully giving in to the vibes” and “forgetting that the code even exists.” Karpathy — AI researcher, OpenAI co-founder, and former Tesla AI head — used the term to describe his own practice of building software by describing intent to an LLM and accepting the output without close review. By February 2026, Karpathy had moved on to “agentic engineering” as his preferred term for the mature version of the practice. Collins Dictionary named vibe coding its Word of the Year for 2025.
What is the governance gap between what security bodies recommend and what organisations are actually doing?
CISA published careful adoption guidance for AI agents in the same period the Pentagon was deploying 103,000+ of them. The gap is structural: adoption is driven by productivity incentives and competitive pressure; governance policy follows incidents. The governance gap is the distance between the AI tools your team is already using and the oversight, review, and accountability processes in place for the code they produce. The 30/60/90-day governance framework in Vibe Coding Governance and Who Owns the Risk When the Code Writes Itself is designed to close that gap without requiring a dedicated security team.
How does AI-generated code compare to human-written code in terms of security vulnerability rates?
The CodeRabbit analysis of 470 GitHub pull requests found AI-co-authored code introduces security vulnerabilities at 2.74 times the rate of human-written code. The Q1 2026 assessment of 200+ vibe-coded applications found 91.5% contained at least one vulnerability traceable to AI hallucination. The Veracode analysis of 4 million code scans found 45% of AI-generated code samples contained OWASP Top 10 vulnerabilities, with no improvement across 2025–2026 testing cycles. These are not contradictory findings — they measure different dimensions of the same structural problem: AI generation without review produces higher vulnerability density than human authorship. See 91.5 Percent of Vibe-Coded Apps Have Vulnerabilities for the full methodological comparison.
Is vibe coding just a trend or is it actually changing how software gets built?
It is changing how software gets built. The evidence is institutional: the Pentagon’s 103,000-agent deployment, Boris Cherny’s 200 pull requests with every line AI-written, and Gartner’s projection that 60% of new code will be AI-generated by year-end. The productivity gains are real and documented: studies report improvements in the 20–55% range depending on task type and context. The security risks are also real and documented. The question is no longer whether vibe coding is happening; it is whether the governance infrastructure around it is adequate to the risk. For organisations that get the governance right, vibe coding is a capability multiplier. For those that don’t, it is a liability accumulator.
Conclusion
The evidence presented across this cluster adds up to a clear picture: vibe coding has moved faster than the security infrastructure built to support it, and the gap is measurable.
91.5% of vibe-coded applications carry at least one AI hallucination-related vulnerability. AI-co-authored code introduces security flaws at 2.74 times the rate of human-written code. The Pentagon built 103,000 agents in under five weeks. A vibe coding platform with 8 million users exposed source code and database credentials via a five-API-call vulnerability for 48 days. The first AI-generated ransomware contained a logic error that made its own decryption impossible. Attackers are targeting AI coding tool configuration files with 443 documented malicious archives.
None of this is an argument against AI-assisted development. The capability improvements are real, the productivity gains in the right contexts are real, and the institutional adoption is not reversing. The argument is for governance: matching the oversight applied to AI-generated code to the risk profile of what it does, starting with an honest picture of what your team is already using.
The governance framework in Vibe Coding Governance and Who Owns the Risk When the Code Writes Itself is the place to start. It covers the accountability chain, a 30/60/90-day implementation plan for teams without a dedicated security function, platform vendor evaluation criteria, and the shadow AI inventory process that gets you a realistic picture of current tool use before you build a policy.
The cluster articles in this hub exist because each dimension of this picture — the institutional scale, the breach incident, the statistical evidence, the adversarial escalation, the toolchain attack surface — deserves depth that a single article can’t provide. Follow the threads that matter most for your situation. The Resource Library above is organised to help you choose.