May 29, 2026

Spec-Driven Development and the End of Vibe Coding — What Engineering Leaders Need to Know

In April 2026, a production disruption at Amazon — linked by analysts to an agentic coding session that misconfigured access controls — pushed AI coding governance onto engineering leadership’s agenda. Agentic AI coding tools had become powerful enough to do real damage without formal constraints. The response that has gained the most traction is spec-driven development (SDD): a methodology where structured specification documents, written before any code is generated, serve as the binding contract for AI agents.

This page answers the broad questions in plain terms and points you to the dedicated articles for the depth you need.

What is spec-driven development and why is it gaining ground now?

Spec-driven development is a methodology where structured specification documents — written before any code — serve as the source of truth for AI agents that then generate, validate, and iterate on the implementation. Unlike vibe coding, where prompts generate code with minimal formal constraints, SDD enforces scope boundaries, architectural decisions, and verification criteria from the outset. It gained commercial traction in 2025–2026 as autonomous AI agents became powerful enough that undirected prompting started producing costly failures in production.

Its intellectual lineage runs through formal methods (Hoare, Meyer) and industrial practitioners, but the urgency is new — Thoughtworks placed it in the Assess ring of their 2025 Tech Radar as a genuine maturation phase, not a rebranding. The methodology sits in direct lineage with TDD and BDD: specs govern AI agents the way tests govern interfaces.

For the full diagnostic case — why the shift is happening now and what the documented failure modes look like — see the vibe coding failure-mode analysis.

What is the difference between vibe coding and spec-driven development?

Vibe coding — coined by Andrej Karpathy in February 2025 — describes the practice of using natural language prompts to generate complete application code with minimal structured constraints or review. SDD is the counter-pattern: it front-loads the definition of outcomes, scope boundaries, and verification criteria in a formal specification before any code is generated. The spec acts as a persistent contract the agent must satisfy; vibe coding provides no such contract.

Karpathy was candid about what vibe coding was designed for: “I ‘Accept All’ always, I don’t read the diffs anymore” — and he flagged it explicitly as “not too bad for throwaway weekend projects.” The problems emerged when teams applied the same approach to production systems. Hallucinated APIs, mixed library versions, and unintended side effects followed. The distinction is not about tools — you can run a vibe coding session and an SDD session in the same IDE. It is whether a formalised spec governs the agent’s work.

The full failure-mode breakdown covers why production deployments diverge from prompt intent.

What happened with Amazon’s AI coding tools in April 2026?

In April 2026, a 13-hour production disruption at Amazon was linked — by The Register (29 April) and Aragon Research — to an agentic coding session that misconfigured access controls. Amazon officially classified the event as “user error” and denied direct involvement by its Kiro IDE, but the incident prompted an internal mandate: all AI-generated code must be reviewed by an engineer before it is accepted. Amazon’s official position and the analysts’ framing remain in tension.

Aragon Research’s assessment was direct: “The primary driver behind these incidents was the deployment of agentic AI tools… granted broad permissions that allowed autonomous actions to bypass traditional human-in-the-loop safeguards.” The outcome was Amazon’s mandatory review policy: “Nothing ships without someone looking at it and validating it. Spec-driven development helps reduce how much time that takes” (Steve Tarcza, Amazon Stores).

The full timeline, with sourcing, is in Amazon’s Internal Probe — What AI Coding Outages Reveal About Production Risk.

What is AWS Kiro and how does it implement spec-driven development?

AWS Kiro is Amazon’s agentic IDE — built on Code OSS, the VS Code base — that enforces a three-phase spec workflow before any code is generated: requirements.md (user stories in EARS notation), design.md (architecture decisions), and tasks.md (testable implementation units). Agent Hooks extend the IDE with event-driven automations that fire on file save, handling tasks such as test updates and security scans without manual prompting. Kiro replaced Amazon Q Developer (EOL announced April 30, 2026) as AWS’s primary AI coding product.

EARS (Easy Approach to Requirements Syntax) is the structured format Kiro uses for acceptance criteria. Because every requirement follows a fixed template, the output is machine-parseable and edge cases are covered by default. Steering Files embed compliance standards and architectural non-negotiables as persistent context the agent always references. Kiro does not require an AWS account, keeps a familiar environment because it is built on Code OSS (the VS Code base), and has GovCloud availability for regulated verticals.
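To make "machine-parseable" concrete, here is a minimal Python sketch that sorts requirement sentences into the commonly cited EARS templates (ubiquitous, event-driven, state-driven, optional-feature, unwanted-behaviour). The regular expressions, the `classify_ears` helper, and the sample requirements are illustrative assumptions; this is not Kiro's parser, and real tooling would need a more careful grammar.

```python
import re
from typing import Optional

# Simplified EARS templates, checked most specific first.
# Assumption: these regexes are an illustrative approximation, not Kiro's parser.
EARS_PATTERNS = [
    ("unwanted-behaviour", re.compile(r"^if .+?,? then the .+? shall .+", re.I)),
    ("event-driven", re.compile(r"^when .+?, the .+? shall .+", re.I)),
    ("state-driven", re.compile(r"^while .+?, the .+? shall .+", re.I)),
    ("optional-feature", re.compile(r"^where .+?, the .+? shall .+", re.I)),
    ("ubiquitous", re.compile(r"^the .+? shall .+", re.I)),
]


def classify_ears(requirement: str) -> Optional[str]:
    """Return the name of the first EARS template the sentence matches, else None."""
    text = requirement.strip()
    for name, pattern in EARS_PATTERNS:
        if pattern.match(text):
            return name
    return None


# Hypothetical requirements, invented for this example.
requirements = [
    "When a user submits an expired token, the API shall return HTTP 401.",
    "While the system is in maintenance mode, the scheduler shall pause all jobs.",
    "If the payment gateway times out, then the service shall retry once.",
    "The service shall log every access-control change.",
    "Make login faster.",
]

for sentence in requirements:
    print(classify_ears(sentence), "<-", sentence)
```

The last sentence matches no template, which is the point: a requirement that cannot be classified is a candidate for rewriting before an agent is asked to implement it.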

For the full three-phase workflow, Agent Hooks configuration, and a Kiro-versus-Cursor evaluation, see AWS Kiro — Amazon’s Spec-First Bet on Agentic Development.

How does GitHub SpecKit compare to Kiro?

GitHub SpecKit is an IDE-agnostic, open-source Python CLI framework with 93,000+ GitHub stars (v0.8.7, May 2026) that runs a four-phase workflow: Specify, Plan, Tasks, Implement. Its key differentiator is the “constitution” — a persistent, project-wide principles document that governs every agent session across tools, comparable to an ADR or RFC in function. Kiro mandates its three-phase workflow inside a VS Code environment; SpecKit’s governance layer works with any compatible agent, including Claude Code, Gemini CLI, and GitHub Copilot.

The choice between them often resolves on stack. Microsoft-ecosystem teams — Copilot, Azure DevOps — align naturally with SpecKit. AWS-native teams lean toward Kiro. SpecKit’s IDE-agnostic design is its portability argument; Kiro’s deeper IDE integration enables Agent Hooks that SpecKit cannot replicate natively. Both enforce spec-first discipline; neither is objectively superior — they suit different organisational contexts.

The constitution concept, SpecKit’s four-phase workflow in practice, and a full Kiro comparison are all covered in GitHub SpecKit and the Microsoft Approach to AI Coding Governance.

Which SDD framework is right for my team?

The primary evaluation axis is brownfield versus greenfield. For new projects, AWS Kiro and GitHub SpecKit are the vendor-backed starting points. For existing codebases, OpenSpec’s delta-marker workflow (ADDED/MODIFIED/REMOVED) is designed specifically for change-scoped specs. BMAD-METHOD (46,700+ GitHub stars) suits complex multi-agent orchestration; GSD (61,000+ GitHub stars) is a leaner alternative for Claude Code users who want meta-prompting without ceremony. Cursor Plan Mode is a low-friction entry point for teams not yet ready for a full framework.
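The idea behind change-scoped specs can be sketched in a few lines of Python: a change proposal carries only ADDED, MODIFIED, and REMOVED entries, and these are applied to the existing baseline rather than restating the whole system. The `Delta` class, `apply_deltas` function, and `REQ-n` identifiers are hypothetical names for illustration; OpenSpec's actual file format and tooling are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    marker: str     # "ADDED", "MODIFIED" or "REMOVED"
    req_id: str     # hypothetical requirement identifier
    text: str = ""  # new requirement text (unused for REMOVED)


def apply_deltas(baseline: dict, deltas: list) -> dict:
    """Return a new spec with the change-scoped deltas applied to the baseline."""
    spec = dict(baseline)  # copy so the baseline itself is never mutated
    for d in deltas:
        if d.marker == "ADDED":
            if d.req_id in spec:
                raise ValueError(f"{d.req_id} already exists; use MODIFIED")
            spec[d.req_id] = d.text
        elif d.marker == "MODIFIED":
            if d.req_id not in spec:
                raise KeyError(f"{d.req_id} is not in the baseline; use ADDED")
            spec[d.req_id] = d.text
        elif d.marker == "REMOVED":
            if d.req_id not in spec:
                raise KeyError(f"{d.req_id} is not in the baseline")
            del spec[d.req_id]
        else:
            raise ValueError(f"unknown marker: {d.marker}")
    return spec


# Hypothetical baseline spec and change proposal.
baseline = {
    "REQ-1": "The API shall require authentication.",
    "REQ-2": "The API shall rate-limit anonymous clients.",
}
proposal = [
    Delta("MODIFIED", "REQ-1", "The API shall require token-based authentication."),
    Delta("REMOVED", "REQ-2"),
    Delta("ADDED", "REQ-3", "The API shall log every access-control change."),
]
updated = apply_deltas(baseline, proposal)
print(sorted(updated))
```

Rejecting an ADDED entry that already exists, or a MODIFIED entry that does not, is a design choice in this sketch: it forces the proposal to stay honest about what the brownfield codebase already contains.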

Three tiers organise the landscape: vendor-backed (Kiro, SpecKit), community-led (BMAD, GSD, Cursor Plan Mode), and niche-optimised (OpenSpec for brownfield, Tessl for API hallucination prevention). Per-feature cost signals from RanTheBuilder (February 2026): BMAD Full at ~$200, OpenSpec at ~$95, SpecKit at ~$75. Most teams need to assess only the 2–3 frameworks that match their codebase type and existing toolchain.

The three-tier comparison with per-feature cost data and the brownfield/greenfield decision guide are in The 30-Plus Framework Landscape — Navigating Spec-Driven Development Options in 2026.

What is the “A Sufficiently Detailed Spec Is Code” principle?

The principle — articulated most clearly by Prezi engineers and the specdriven.com community — holds that when a specification is detailed enough to constrain an AI agent’s output completely, it is functionally equivalent to code. Code becomes a generated artifact; the spec is the deliverable. This is the spec-as-source end of Martin Fowler’s three-level taxonomy (spec-first, spec-anchored, spec-as-source) and represents the philosophical north star of the SDD movement.

Spec Drift — the divergence between a spec and the actual codebase over time — is the failure mode the principle is designed to prevent. Living specs, which auto-update as agents complete work, are the practical response. See A Sufficiently Detailed Spec Is Code — The Community Principle Behind Spec-Driven Development for the TDD/BDD/MDD lineage and an honest account of where the current frontier sits.
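Spec Drift can be made measurable with a very small check. The Python sketch below assumes a convention that is not in any of the frameworks discussed here: requirements carry identifiers such as `REQ-12`, and code or tests mention the identifier they implement. Under that assumption, drift is simply the difference between the two sets.

```python
import re

# Assumption: requirements are tagged "REQ-<number>" in both spec and code.
REQ_ID = re.compile(r"\bREQ-\d+\b")


def requirement_ids(text: str) -> set:
    """Collect every requirement identifier mentioned in a piece of text."""
    return set(REQ_ID.findall(text))


def find_drift(spec_text: str, code_text: str) -> dict:
    """Report requirements with no implementation and code with no requirement."""
    spec_ids = requirement_ids(spec_text)
    code_ids = requirement_ids(code_text)
    return {
        "unimplemented": sorted(spec_ids - code_ids),
        "unspecified": sorted(code_ids - spec_ids),
    }


# Hypothetical spec and source snippets.
spec = "REQ-1: The API shall require authentication.\nREQ-2: The API shall log access changes."
code = "# implements REQ-2\ndef log_change(): ...\n# implements REQ-7\ndef legacy_export(): ..."

print(find_drift(spec, code))
```

A check like this only detects missing links between spec and code, not whether the code still behaves as the spec says; it is a cheap first signal rather than a substitute for living specs.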

Is spec-driven development just waterfall with a new name?

No — but the concern is legitimate and worth addressing directly. Waterfall front-loads all specification work before any execution begins and treats the spec as a fixed contract. SDD treats the spec as a living document that evolves with the project; implementation begins incrementally from task-level specs, not a completed requirements freeze. The key structural difference: SDD specs govern AI agents continuously, not human developers once at the start of a project.

The “big upfront specification” critique applies to the spec-as-source end of the spectrum — the most radical position. Most practitioners operate at spec-first or spec-anchored, where the process is iterative within a feature or change scope. The TDD parallel is useful: tests drive development iteratively; specs do the same at the architecture and scope level.

A Sufficiently Detailed Spec Is Code addresses the lineage and the antipattern critique in full.

How does spec-driven development satisfy EU AI Act requirements?

The EU AI Act’s full enforcement deadline is August 2, 2026. For organisations deploying high-risk AI systems (a category that covers many AI-assisted development workflows in FinTech, HealthTech, and government), Articles 9–17 (provider obligations), Article 26 (deployer obligations), and Article 50 (AI content disclosure) create documentation and traceability requirements. Spec-driven workflows produce the compliance artifacts these articles require: a structured audit trail, human-oversight records at each review checkpoint, and AI authorship attribution in version control.

The AugmentCode compliance evaluation (May 2026) ranks tools by EU AI Act posture: Intent (Augment Code) and Claude Code at Tier 1; Kiro at Tier 2 (partial); Cursor at Tier 3. ISO/IEC 42001 and SOC 2 Type II certification are the credibility signals to look for when evaluating tools for regulated environments. The April 2026 Amazon incident has elevated this from a compliance checkbox to a board-level accountability question.

EU AI Act article citations, the full compliance matrix, and an August 2026 action checklist are in Spec-Driven Development in Regulated Industries — Governance, Compliance, and Audit Trails.

What should you do before deploying AI coding agents to production?

At minimum: establish a human-in-the-loop review policy for all AI-generated code before it is merged or deployed. Amazon’s internal mandate after the April 2026 incident is the operational baseline — “nothing ships without someone looking at it.” Beyond that, introduce a specification layer (even a lightweight CLAUDE.md or project rules file is a starting point) and select a framework matched to your codebase type and team size before scaling agentic workflows.

Human-in-the-loop (HITL) governance is not just best practice — EU AI Act Article 14 mandates it for high-risk AI systems. Start there regardless of which framework you adopt. A full framework adoption — Kiro, SpecKit, BMAD — is the next step once your team has validated the spec-review loop on a contained project.
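A human-in-the-loop policy is easiest to enforce when it is expressed as a gate rather than a guideline. The Python sketch below shows one possible rule, assuming a hypothetical `Change` record that knows whether a change was AI-generated and who approved it. The field names and the "at least one approver other than the author" rule are assumptions for illustration, not Amazon's internal policy or any platform's API.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    change_id: str
    author: str                 # person who ran the agent session
    ai_generated: bool          # assumption: the workflow records this flag
    approvals: set = field(default_factory=set)  # names of human approvers


def may_merge(change: Change) -> bool:
    """Gate for AI-generated changes: require a human approver other than the author."""
    if not change.ai_generated:
        # Out of scope for this sketch; the team's normal review rules apply.
        return True
    independent_approvers = change.approvals - {change.author}
    return len(independent_approvers) >= 1


# Hypothetical changes.
unreviewed = Change("CH-101", author="dana", ai_generated=True)
self_approved = Change("CH-102", author="dana", ai_generated=True, approvals={"dana"})
reviewed = Change("CH-103", author="dana", ai_generated=True, approvals={"lee"})

for change in (unreviewed, self_approved, reviewed):
    print(change.change_id, may_merge(change))
```

Excluding the author from the approver set is a deliberate choice here: the person who prompted the agent is not an independent check on what it produced.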

For tool evaluation, see The 30-Plus Framework Landscape. For governance and compliance specifics, see Spec-Driven Development in Regulated Industries. For the incident that drove Amazon’s policy, see Amazon’s Internal Probe.

Spec-Driven Development — Article Series

Understanding the Shift (Start Here)

From Vibe to Spec — Why AI Coding Is Growing Up: The diagnostic case — vibe coding’s documented failure modes, context decay, and the April 2026 incident that crystallised the shift.
A Sufficiently Detailed Spec Is Code — The Community Principle: The intellectual foundation — TDD/BDD/MDD lineage, context engineering, and the paradigm-shift argument.
Amazon’s Internal Probe — What AI Coding Outages Reveal About Production Risk: The incident in full — timeline, attribution, Amazon’s mandatory review policy, and the accountability question.

Evaluating Tools

AWS Kiro — Amazon’s Spec-First Bet on Agentic Development: Three-phase workflow (requirements.md → design.md → tasks.md), Agent Hooks, Steering Files, and a Kiro-versus-Cursor evaluation.
GitHub SpecKit and the Microsoft Approach to AI Coding Governance: The constitution concept, four-phase workflow, and IDE-agnostic portability versus Kiro’s VS Code environment.
The 30-Plus Framework Landscape — Navigating SDD Options in 2026: BMAD, GSD, OpenSpec, Cursor Plan Mode, Tessl — three-tier comparison with per-feature cost signals.

Governance and Compliance

Spec-Driven Development in Regulated Industries — Governance, Compliance, and Audit Trails: EU AI Act Articles 9–17, 26, and 50; the August 2026 enforcement deadline; compliance matrix; and the CTO liability angle.

FAQ

What is an “agentic IDE”?

An agentic IDE is a development environment where the AI model operates as an autonomous agent — planning, implementing, and iterating across multi-step tasks with minimal mid-task prompting, rather than responding to individual queries. AWS Kiro is the highest-profile current example. For a full evaluation, see Kiro’s three-phase spec workflow and Agent Hooks.

How does SDD differ from TDD (test-driven development)?

TDD uses unit tests to drive interface design at the code level. SDD operates at a higher architectural layer — the spec defines outcomes, scope, and constraints before any tests or code are written. The two are complementary: an SDD workflow typically produces tests as part of the task list, which are then driven by TDD at implementation. A Sufficiently Detailed Spec Is Code traces the full TDD/BDD/MDD lineage.

What is context engineering and how does it relate to SDD?

Context engineering is the discipline of curating which information AI agents receive — providing precise, task-relevant context rather than exposing them to a full repository. Thoughtworks identifies it as the operational complement to SDD: the specification defines what the agent must achieve; context engineering defines what the agent is allowed to see. It is distinct from prompt engineering, which optimises human-to-LLM interaction.
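As a rough illustration of "what the agent is allowed to see", the Python sketch below filters a repository file list down to a task-scoped context using allow and deny patterns. The `select_context` helper, the patterns, and the file paths are hypothetical; real agent tools expose their own mechanisms for scoping context, which are not modelled here.

```python
from fnmatch import fnmatchcase


def select_context(repo_files, allow_patterns, deny_patterns=()):
    """Keep files that match an allow pattern and no deny pattern, in original order."""
    selected = []
    for path in repo_files:
        allowed = any(fnmatchcase(path, p) for p in allow_patterns)
        denied = any(fnmatchcase(path, p) for p in deny_patterns)
        if allowed and not denied:
            selected.append(path)
    return selected


# Hypothetical repository listing and task scope for a billing change.
repo_files = [
    "src/billing/invoice.py",
    "src/billing/secrets.env",
    "src/auth/login.py",
    "docs/spec/billing.md",
]
context = select_context(
    repo_files,
    allow_patterns=["src/billing/*", "docs/spec/billing.md"],
    deny_patterns=["*.env"],
)
print(context)
```

Deny patterns take precedence over allow patterns in this sketch, so a secrets file stays out of the agent's context even when its directory is in scope.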

Is spec-driven development worth the overhead for small teams?

For teams of fewer than about five engineers on a greenfield project with a contained scope, a lightweight approach (a project constitution file such as CLAUDE.md or equivalent, plus a structured task list) captures most of the benefit without the overhead of a full framework. A framework like BMAD or Kiro pays off when multiple agents run in parallel, when the codebase is large, or when compliance requirements mandate an audit trail.

What are BMAD, OpenSpec, and GSD?

BMAD-METHOD (Build More Architect Dreams) is an open-source multi-agent orchestration framework with 12+ specialised agent roles and 46,700+ GitHub stars. OpenSpec is a proposal-centred workflow designed for brownfield codebases, using delta markers (ADDED/MODIFIED/REMOVED) to scope specs to the change rather than the full system. GSD (Get Shit Done) is a lean, low-ceremony meta-prompting framework built primarily for Claude Code. All three are compared in The 30-Plus Framework Landscape.

Where can I find the AWS Kiro documentation?

Kiro’s official documentation and download are at kiro.dev. For an independent technical walkthrough of the three-phase spec workflow, Agent Hooks, and Steering Files — without the marketing framing — see the independent Kiro technical review.

Whether your starting point is a lightweight project constitution or a full framework adoption, the spec-review loop is the minimum viable governance layer for any team running AI agents in production.