May 29, 2026

A Sufficiently Detailed Spec Is Code — The Community Principle Behind Spec-Driven Development

There is a phrase doing the rounds in developer communities that started life in Haskell type theory and has ended up in everyday AI-assisted development conversations: “a sufficiently detailed spec is code.” It came from Gabriella Gonzalez’s March 2026 post on Haskell for All, where she used it as a critique: when a spec is precise enough to determine a unique implementation, it becomes indistinguishable from thinly veiled code. The SDD community heard the same argument and drew the opposite conclusion: if a spec and its implementation are informationally equivalent, and AI agents are the execution layer, then the spec is the primary artefact.

If you know TDD and BDD but haven’t seen the formal argument behind spec-driven development laid out, this is that piece. It covers the intellectual lineage, names the key operational concepts — Context Engineering, Spec Drift, Living Spec — and closes with Malte Ubl’s “free as in puppies” framing that keeps the whole thing honest. For the full SDD picture, start with our guide to what spec-driven development is and what it means for engineering practice.

What does “a sufficiently detailed spec is code” actually mean?

When a specification is precise enough that only one valid implementation exists, the spec and the code are functionally equivalent — the gap between intent and execution closes. Gonzalez’s original argument from Haskell type theory is that in a sufficiently constrained formal system, the correspondence between a spec and a unique implementation is not an aspiration; it is a theorem.

For AI-assisted development, the same equivalence holds — not as a formal proof, but as an observable engineering property. A well-detailed spec eliminates the guesswork that produces unreliable AI-generated code. The agent does not fill gaps with judgement; it fills them with whatever it can infer, which is not the same thing.

Hillel Wayne has a useful counter-argument worth sitting with: “a specification corresponds to a set of possible implementations, and code is a single implementation in that set.” This is not a refutation — it is a precision requirement. It defines what “sufficiently detailed” actually means: unambiguous data structures, explicit error behaviours, defined state transitions, testable acceptance criteria. Working through that list is where the real effort of SDD lives.
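Wayne's point can be made concrete with a small sketch. The example below is hypothetical (the "sort users by age" requirement, the function names, and the data are all invented for illustration): a loosely worded spec admits two different implementations, and adding one explicit decision about ties narrows the set to one.

```python
# Hypothetical illustration: a spec as a predicate over (input, output) pairs.
User = tuple[str, int]  # (name, age)

def impl_stable(users: list[User]) -> list[User]:
    # Sorts by age only; users with equal ages keep their input order.
    return sorted(users, key=lambda u: u[1])

def impl_by_name(users: list[User]) -> list[User]:
    # Sorts by age, then breaks ties alphabetically by name.
    return sorted(users, key=lambda u: (u[1], u[0]))

def loose_spec(inp: list[User], out: list[User]) -> bool:
    # "The output is the input ordered by ascending age." Silent about ties.
    same_items = sorted(inp) == sorted(out)
    ascending = all(a[1] <= b[1] for a, b in zip(out, out[1:]))
    return same_items and ascending

def tight_spec(inp: list[User], out: list[User]) -> bool:
    # Adds the missing decision: ties are broken by name.
    return loose_spec(inp, out) and out == sorted(inp, key=lambda u: (u[1], u[0]))

users = [("zoe", 30), ("adam", 30), ("kim", 25)]

# Both implementations satisfy the loose spec, yet they disagree with each other.
print(loose_spec(users, impl_stable(users)), loose_spec(users, impl_by_name(users)))
print(impl_stable(users) == impl_by_name(users))

# Only one implementation survives the tightened spec.
print(tight_spec(users, impl_stable(users)), tight_spec(users, impl_by_name(users)))
```

The tie-breaking rule is exactly the kind of detail an agent would otherwise have to infer. Writing it into the spec is what moves the spec toward "sufficiently detailed".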

The distinction worth drawing: “spec as documentation” records what was built; “spec as code” determines what must be built. What separates the two is timing: documentation is written after the implementation, while a spec-as-code exists before it and constrains it.

Where did this idea come from? Tracing the lineage from TDD to SDD

Spec-driven development is the fourth step in a lineage that began with TDD in the late 1990s. Each step moved the source of truth further from code and closer to intent.

TDD (Kent Beck, ~2000): write a failing test before writing code. Tests become the specification proxy — they describe what the code should do before the code exists.

BDD (Dan North, mid-2000s): extends TDD by expressing tests in natural language — Gherkin, Given/When/Then — so that specifications are readable by non-technical stakeholders while staying machine-executable.

MDD (OMG/UML era): the specification becomes a formal model — a UML diagram, a DSL. Code is generated from the model. An artefact other than code becomes the source of truth. MDD largely failed because the spec languages were too rigid and the generators could not handle real-world complexity. But it proved the concept.

SDD (2025–2026): the AI agent replaces the code generator; the specification replaces the model. Agents execute directly from the spec, completing the transfer of authority from code to intent.

The Prezi engineering team named this lineage well. Their post on trying spec-driven development describes how working with SDD made them understand “even aspects that made TDD, BDD and why not even MDD interesting and popular at least for a while.” Each methodology moved the source of truth one step further toward intent, and SDD is where that progression terminates. The arXiv paper on SDD frames it as a “spec spectrum” that runs from Spec-First through Spec-Anchored to Spec-as-Source, where code is entirely generated from specifications.

What is Context Engineering and how does it operationalise the spec-as-code principle?

Context Engineering is the discipline of designing the full information environment — system prompts, memory, retrieved context, tools — that an AI agent operates within. It is the successor to prompt engineering, which was about crafting single-turn instructions.

Tobi Lutke (Shopify CEO) put it directly: “I really like the term ‘context engineering’ over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM.” Short version: prompt engineering is what you do inside the context window; context engineering is how you decide what fills it.

Writing a sufficiently detailed spec is doing context engineering. A vague spec does not produce agent failure — it produces context ambiguity, which produces inconsistent behaviour. Lutke’s framing of AI agents as new-hire contractors is apt: contractors need a full brief to function without constant supervision. The spec is that brief. CLAUDE.md files, Project Rules, and steering files are all lightweight versions of the same principle.
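As a rough sketch of what "deciding what fills the context window" can look like in code, the snippet below packs prioritised sources into a fixed budget. Everything here is an assumption for illustration: the `ContextSource` type, the `build_context` function, the priority scheme, and the use of a character budget as a stand-in for a token budget. It is not the API of any particular agent or tool.

```python
from dataclasses import dataclass

@dataclass
class ContextSource:
    label: str     # e.g. "CLAUDE.md" or "spec.md" (illustrative labels)
    text: str
    priority: int  # lower number = more important

def build_context(sources: list[ContextSource], task: str, budget_chars: int) -> str:
    """Pack the highest-priority sources first, skipping any that do not fit whole."""
    parts = [f"## Task\n{task}"]
    used = len(parts[0])
    for src in sorted(sources, key=lambda s: s.priority):
        block = f"## {src.label}\n{src.text}"
        cost = len(block) + 2  # +2 for the blank-line separator
        if used + cost > budget_chars:
            continue  # skip rather than truncate: half a spec is an ambiguous spec
        parts.append(block)
        used += cost
    return "\n\n".join(parts)

sources = [
    ContextSource("CLAUDE.md", "We use yarn, not npm.", priority=0),
    ContextSource("changelog", "x" * 500, priority=2),
    ContextSource("spec.md", "Dates are ISO 8601.", priority=1),
]
context = build_context(sources, "Add a due-date field", budget_chars=200)
print(context)
```

With this budget the steering file and the spec are included and the oversized changelog is left out. The design choice worth noting is that sources are dropped whole instead of being cut off mid-document.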

What is Spec Drift and why does it matter for engineering teams?

Spec Drift is the progressive divergence between a specification and the codebase it originally described, caused by undocumented implementation decisions accumulating over time. It is the operational failure mode that makes the spec-as-code principle collapse in practice. The academic literature calls the same thing “specification rot” — same phenomenon, different discourse community.

The mechanism is simple: every implementation decision that deviates from the spec without a corresponding spec update creates a gap. Over time, the spec describes a system that no longer exists, and agents working from it reproduce stale intent. AI-assisted development makes this worse — agents generate code faster than spec updates can follow. The classic failure: three months later, the team has adopted Vitest, two packages have been restructured, one library deprecated entirely, and the CLAUDE.md still says “we use Jest.”

JGCarmona’s practitioner framing captures it well: specifications act as “semantic anchors”. When the anchor drags, the agent drifts with it.

The Living Spec is the response — a specification treated as a live document, updated whenever implementation decisions diverge. Conformance tests are the detection mechanism. Without drift detection, SDD collapses back into documentation-driven development. With it, the system becomes self-policing.
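A drift check does not need to be elaborate to be useful. The sketch below, built around the Jest and Vitest example above, compares the tools a steering file claims with the dependencies a `package.json` actually declares. The function names, the "we use" phrase matching, and the `known_tools` list are assumptions made for illustration; a real check would need a more robust way to extract claims.

```python
import json
import re

def claimed_tools(steering_text: str, known_tools: set[str]) -> set[str]:
    """Return the known tool names mentioned on lines that say 'we use'."""
    claimed: set[str] = set()
    for line in steering_text.lower().splitlines():
        if "we use" in line:
            words = set(re.findall(r"[a-z0-9_-]+", line))
            claimed |= words & known_tools
    return claimed

def detect_drift(steering_text: str, package_json: str, known_tools: set[str]) -> set[str]:
    """Tools the steering file claims but the manifest no longer declares."""
    manifest = json.loads(package_json)
    declared = set(manifest.get("dependencies", {})) | set(manifest.get("devDependencies", {}))
    return claimed_tools(steering_text, known_tools) - declared

steering = "# Conventions\nWe use Jest for unit tests.\nWe use yarn, not npm.\n"
manifest = '{"devDependencies": {"vitest": "^1.0.0"}}'

stale = detect_drift(steering, manifest, known_tools={"jest", "vitest"})
print(stale)  # the steering file still claims a tool the project has dropped
```

Run in CI, a check like this turns a stale steering file into a failing build instead of a silent source of outdated instructions.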

For teams operating in regulated environments, the stakes of Spec Drift are considerably higher. The article on why regulated industries find spec-as-code compelling as a compliance strategy covers that case in detail.

What does the frontier evidence tell us about what spec-driven development can and cannot do?

The clearest current demonstration of SDD in a production-grade open-source codebase is whenwords, built by Drew Breunig. He distributed a library without implementation code — just a markdown specification, approximately 750 conformance tests in YAML format, and an installation guide for agent integration. The library attracted over 1,000 GitHub stars. Community members submitted pull requests identifying inconsistencies between specifications and tests, showing that collaborative development was viable without traditional code.

That is the SDD Triangle in practice: Spec drives Implementation; Conformance Tests verify against the Spec; failed tests flag divergence; Spec or Implementation is updated accordingly; the loop runs continuously. Vercel, Anthropic, and Pydantic operate as Spec-Anchored development organisations, maintaining specs as living documents throughout feature lifecycles, which the arXiv paper identifies as “the sweet spot for most production systems.” The frameworks that implement the spec-as-code principle are mapped out in the framework landscape article.
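The verification side of that loop can be sketched in a few lines. The example below is a minimal, hypothetical conformance runner: the cases are written as plain Python dictionaries instead of YAML so the snippet needs no third-party parser, and `humanize_duration` is an invented function under test, not the actual behaviour or test format of whenwords.

```python
def humanize_duration(seconds: int) -> str:
    # Hypothetical implementation under test, with a deliberate pluralisation gap.
    if seconds < 60:
        return f"{seconds} seconds"
    minutes = seconds // 60
    if minutes < 60:
        return f"{minutes} minute" + ("" if minutes == 1 else "s")
    hours = minutes // 60
    return f"{hours} hour" + ("" if hours == 1 else "s")

# Conformance cases derived from the spec, kept as data, separate from the code.
CASES = [
    {"name": "under a minute", "input": 45, "expected": "45 seconds"},
    {"name": "singular minute", "input": 60, "expected": "1 minute"},
    {"name": "plural minutes", "input": 150, "expected": "2 minutes"},
    {"name": "singular second", "input": 1, "expected": "1 second"},
]

def run_conformance(impl, cases):
    """Return (name, expected, actual) for every case the implementation fails."""
    failures = []
    for case in cases:
        actual = impl(case["input"])
        if actual != case["expected"]:
            failures.append((case["name"], case["expected"], actual))
    return failures

for name, expected, actual in run_conformance(humanize_duration, CASES):
    print(f"DIVERGENCE in '{name}': spec says {expected!r}, implementation gives {actual!r}")
```

The single failure is the interesting part: it forces a decision about whether the implementation is wrong or the spec is, which is the point at which the triangle closes.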

The honest limit: Malte Ubl (Vercel CTO) framed it as “Software is free now. (Free as in puppies).” You can ship something fast, but now you have to take care of it. The spec must be kept current or the cost compounds. SDD is not a free lunch.

Hillel Wayne’s counter resurfaces here too: the equivalence holds only for well-bounded problems. For ill-bounded problems — highly ambiguous domains, tasks requiring significant judgement calls, greenfield architecture without precedent — the valid implementation set stays large. And even if code generation from specifications became fully automated, humans would still need to write the specifications.

How does this principle compare to RFCs, ADRs, and existing architecture documentation practices?

RFCs and ADRs share the spec-first intent (document decisions before or alongside implementation), but they were designed for human readers who will then write code. SDD specs are designed for agents that will generate the code directly. That changes the required level of precision substantially.

What RFCs and ADRs do well: record architectural intent, socialise decisions, provide rationale for future maintainers. SDD inherits all of those goals. The difference is tolerance for ambiguity. RFCs tolerate it because human engineers fill gaps with judgement. SDD specs cannot — agents generate from what is present.

The InnoBlog analysis is practical: “Write ADRs. Even two or three short decision records give Claude significantly better context than none. Start with the decisions that would be most expensive to violate.” That is good advice regardless of whether you call it SDD or not.

The gap between a well-written ADR and a good SDD spec is in completeness and testability, not format or philosophy. The SDD equivalent of an ADR is a constitution.md or spec.md — stored in the repo the same way ADRs live in docs/decisions/, with the same immutability principle: supersede rather than overwrite. Teams with existing RFC/ADR cultures have a shorter distance to travel. The discipline is already present — the upgrade is adding testability and completeness.

Where does a team start if they want to apply this principle without adopting a full framework?

The lowest-friction entry point is a CLAUDE.md or equivalent steering file — a structured markdown document at the repository root that describes the codebase’s architecture, conventions, and constraints persistently. Many developers are already practising a form of SDD with Claude Code without formally labelling it as such.

Martin Fowler’s framing of what belongs in a CLAUDE.md is practical: “we use yarn, not npm”; “don’t forget to activate the virtual environment before running anything”; “when we refactor, we don’t care about backwards compatibility.” These are the conventions that would otherwise require repeated explanation.

The three-step progression: write a CLAUDE.md that describes what currently exists; extend it with what must remain true — the invariants the codebase should never violate; add conformance tests that verify those invariants. At step three, the team has adopted SDD without necessarily calling it that.
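Step three can start very small. As one possible sketch, suppose the steering file states the invariant "domain code never imports from the infrastructure layer". The package name `infra`, the function names, and the sample sources below are all hypothetical; the check uses Python's standard `ast` module to verify the invariant mechanically.

```python
import ast

def imported_packages(source: str) -> set[str]:
    """Collect the root package name of every absolute import in a source string."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names |= {alias.name.split(".")[0] for alias in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names

def invariant_violations(source: str, forbidden: set[str]) -> set[str]:
    """Return the forbidden packages that this source imports."""
    return imported_packages(source) & forbidden

# Hypothetical domain modules, inlined as strings to keep the sketch self-contained.
domain_ok = "import json\nfrom domain.models import Order\n"
domain_bad = "from infra.db import session\nimport json\n"

print(invariant_violations(domain_ok, {"infra"}))   # empty: invariant holds
print(invariant_violations(domain_bad, {"infra"}))  # names the forbidden package
```

In a real repository the sources would be read from the files under the relevant directory, and the check would run as part of the ordinary test suite.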

Start with data structures and API contracts — the most constrained parts of any system — before behaviour and error handling. Simple rule: if you explained it twice, write it down. Treat a spec-code divergence as a bug rather than documentation debt, and commit spec changes alongside code changes. When coordination overhead across multiple agents and subsystems becomes a bottleneck, that is when a full SDD framework earns its place. Until then, a well-maintained CLAUDE.md and growing conformance test suite will take you further than most teams expect. For the broader SDD landscape, the full spec-driven development overview maps where this fits.
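The "commit spec changes alongside code changes" rule can also be nudged by tooling. The sketch below is a deliberately blunt, hypothetical check over the list of paths touched by a change; the `src/` and `specs/` prefixes and the spec file names are assumptions, and not every code change genuinely needs a spec change, so a flag like this is better treated as a prompt for review than as a hard gate.

```python
def spec_missing_from_change(
    changed_paths: list[str],
    code_prefixes: tuple[str, ...] = ("src/",),
    spec_files: tuple[str, ...] = ("CLAUDE.md", "spec.md"),
    spec_prefixes: tuple[str, ...] = ("specs/",),
) -> bool:
    """True when a change touches code but no spec or steering file."""
    touches_code = any(p.startswith(code_prefixes) for p in changed_paths)
    touches_spec = any(
        p in spec_files or p.startswith(spec_prefixes) for p in changed_paths
    )
    return touches_code and not touches_spec

print(spec_missing_from_change(["src/billing/invoice.py"]))
print(spec_missing_from_change(["src/billing/invoice.py", "specs/billing.md"]))
```

The list of changed paths would typically come from the version control system; how it is obtained is left out here on purpose.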

Frequently Asked Questions

What is the difference between a spec and a requirement in software development?

Requirements describe what a system must do at a business or user level. A spec in the SDD sense describes how the system must behave at a technical level, with enough precision that an AI agent can implement it without ambiguity. The distinction is precision and testability, not vocabulary.

What is “specification rot” and is it the same as spec drift?

Specification rot is the academic term, appearing in arXiv 2602.00180, for what practitioners call Spec Drift: the progressive divergence between a specification and the implemented codebase. The two terms describe the same phenomenon and come from different discourse communities. Spec Drift is the preferred practitioner-facing term.

Can you use spec-driven development with any AI coding tool, or does it require specific frameworks?

The principle applies to any AI coding agent. Claude Code, for example, “handles large specification documents well within a single session, processing complete requirement sets and generating implementations in one coherent pass” without any dedicated SDD framework. A well-written CLAUDE.md or Project Rules file is sufficient to begin.

Is context engineering just a rebranding of prompt engineering?

No. Anthropic draws a clear line: “Prompt engineering refers to methods for writing and organizing LLM instructions for optimal outcomes… Context engineering refers to the set of strategies for curating and maintaining the optimal set of tokens during LLM inference.” Prompt engineering is a single-session activity. Context engineering is permanent infrastructure, evolving over time. The scope difference is fundamental, not cosmetic.

What are conformance tests and how are they different from unit tests?

Conformance tests verify that the codebase matches the spec. They test the relationship between artefacts (spec and code) rather than the internal correctness of isolated units. Drew Breunig’s whenwords project uses approximately 750 conformance tests in YAML format as the primary SDD verification mechanism. They are functionally equivalent to BDD acceptance tests, but framed around the spec.

Does spec-driven development work for greenfield projects or only for existing codebases?

SDD is well-suited to greenfield projects where the spec can be written before any implementation exists. The arXiv paper on SDD identifies Spec-Anchored Development as “the sweet spot for most production systems,” whether greenfield or existing. For existing codebases, spec-writing requires reverse-engineering current behaviour, which is a useful exercise but an expensive one.

Why do Prezi engineers say SDD is the “culmination” of TDD, BDD, and MDD?

Because each methodology moved the source of truth one step further from code toward intent: TDD made tests the spec proxy; BDD made tests human-readable; MDD made a formal model the source of truth; SDD makes the spec itself the execution directive. The Prezi engineering post describes this as the insight that emerges once you actually try SDD — you begin to understand what each prior methodology was attempting.

What is the “free as in puppies” metaphor and why does it matter for SDD?

Malte Ubl (Vercel CTO) used the phrase to describe AI-generated code: like a free puppy, it has a low acquisition cost but a high ongoing maintenance burden. For SDD, the spec work that makes code generation reliable creates a maintenance obligation — the spec must be kept current or the cost compounds. It is a concise framing of why SDD is not a free lunch.

How does spec-driven development relate to formal methods in computer science?

Formal methods — Z notation, TLA+, Alloy — share the spec-as-code intuition. In a fully formal system, a verified spec is the authoritative artefact. SDD is a pragmatic, lightweight version of the same idea applied to AI agent execution rather than formal verification. The specdriven.com timeline documents the lineage: formal methods “proved something important: specifications could be mathematically verified.” Gabriella Gonzalez’s original argument draws on this tradition from Haskell type theory.

How does Hillel Wayne’s counter-argument affect the case for SDD?

Hillel Wayne argues that a specification corresponds to a set of possible implementations, not a single one, which means a spec is only equivalent to code when it is complete enough to eliminate all implementation ambiguity. This is not a refutation — it is a precision requirement. It defines the threshold of “sufficiently detailed” and identifies the class of problems (well-bounded, high-constraint) where SDD works best.

What is the SDD Triangle and how does it create a self-correcting system?

The SDD Triangle (Drew Breunig) describes the iterative loop: the spec drives implementation; conformance tests verify the implementation against the spec; failed tests identify where spec and implementation have diverged; spec or implementation is updated accordingly; the loop runs continuously. The whenwords project makes this concrete: it is a production-grade demonstration of the triangle as a self-correcting feedback system rather than a linear process.

Can spec-driven development co-exist with agile and iterative development?

Yes. Mark Brooker’s framing: “Spec Driven Development isn’t Waterfall. In specification driven development, the specification is the thing being iterated on, rather than the implementation.” The spec evolves alongside the codebase — what changes is the ordering discipline: spec changes precede or accompany implementation changes rather than following them. Microsoft’s developer blog documents SDD running alongside agile sprints using GitHub issues as locked specifications during delivery.